Apache file-extention mojo (Read 51613 times)
Rad
Radministrator
Sufferin' succotash
Posts: 4090
Newport Beach, California
Reply #15 - Apr 21st, 2008 at 3:12pm
 
Quote:
The thing with a re-organization like this is that it takes time

Hmmm. I was thinking I could make one change and blammo > we're rocking with cool, rad web pages. At this point, I fail to see what will take time. (I'll continue reading.)

Quote:
so that during the transition time either name works

Okay. This makes sense. Either name is what I want, for reasons previously mentioned (pre-existing links and search engines).

Quote:
Now, in UNIX the traditional tool for this kind of process when applied to static content is the symbolic link; you'd migrate in stages by first (using a simple shell script) creating symlinks under the new name for all the existing files, so that during the transition time either name works. Then you can migrate the internal hyperlinks so that you consistently use the new name form, and finally you exchange the real file and the symlink (again, a simple shell script). At that point the symlinks can sit around as long as they need to.

This paragraph seems to be the meat-n-potatoes of what I'm after. I see what you're saying but don't understand the underlying technology to accomplish it.

I'd just like to start with one file (the home page). Is that possible, or does it open a site-wide can o' worms once I start?

Quote:
A word of warning about that; if you want to physically use .rad, remember that not only will you have to teach Apache that .rad has a mime-type of text/html (which is easy enough using the AddType configuration directive),

Uh, "warning," .. that didn't sound good. Maybe what I want to do (a better approach) would be to KEEP using index2.html and work all the Apache mojo from the Apache end. My aim/desire here is to maintain ONLY ONE FILE (not two). However that is accomplished doesn't matter. Whatever you would recommend as the more elegant and practical approach is fine by me.

Quote:
could well be little annoyances that plague you from now until doomsday

I already have plenty of little annoyances in my life. What would you recommend as the best option?
 
Nigel Bree
Ex Member
Reply #16 - Apr 21st, 2008 at 3:17pm
 
Rad wrote on Apr 21st, 2008 at 2:50pm:
Do I need VMWare to use VMWare Player?  

Nope. Player *is* VMWare, it's just a simple edition that doesn't let you make completely new virtual machines from scratch or do some of the sophisticated snapshot operations that the full version does.

Just download Player and install it, then pick up an existing virtual machine (called an "appliance" when set up for Player). Say you want to try Ubuntu: download an Ubuntu appliance (unpack it if it's zipped up), double-click the VMX file to run the virtual machine, and you're off!
 
 
Nigel Bree
Reply #17 - Apr 21st, 2008 at 3:29pm
 
Rad wrote on Apr 21st, 2008 at 3:12pm:
At this point, I fail to see what will take time. (I'll continue reading.)

Well, these things tend to start with wanting to make just one quick change... but it never stops with just one, does it?

As soon as you have an alias like .rad, if you like it you'll want to start preferring it (so it's what appears in bookmarks and the like), which means rewriting all the internal hyperlinks in your pages so that's the "official" name you expose to the external world. That change is the one that takes time and effort to accomplish.

And so, the dominos start to fall....
 
 
Nigel Bree
Reply #18 - Apr 21st, 2008 at 3:34pm
 
Rad wrote on Apr 21st, 2008 at 3:12pm:
I'd just like to start with one file (the home page).

OK, then, just put this into the root .htaccess file:
 RewriteEngine on
 RewriteRule ^index.rad index2.html

Do that, and then you should be able to browse to index.rad and what Apache will serve you up is what's in index2.html in the filesystem.
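
A slightly stricter form of the same rule, as a minimal sketch rather than a required change: in a regular expression an unescaped dot matches any character, and without a trailing $ the pattern matches anything that merely starts with the name. The [L] flag is an addition that is not in the rule above.

```apache
RewriteEngine On

# Escape the dot and anchor the end so that only "index.rad" matches.
# The looser pattern ^index.rad would also match, e.g., "index-rad.bak".
# [L] (an addition, not in the original rule) ends rule processing for this pass.
RewriteRule ^index\.rad$ index2.html [L]
```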

Quote:
Maybe what I want to do (a better approach) would be to KEEP using index2.html and work all the Apache mojo from the Apache end.

That's definitely the best. If a file contains HTML, you should call it .html, meaning no surprises or confusion for you (or any HTML tools you use) in the future.
 
 
Rad
Reply #19 - Apr 21st, 2008 at 3:37pm
 
Quote:
Apache's URL-processing pipeline is that it traverses the URL component by component

Mostly Greek. Read it over several times. Started seeing more light, but still over my head.

As a side note, just thinking out loud, it seems interesting that a browser can read a file named *.ars, or *.rad .. when those files are not natively supported by web browsers. Know what I mean? I mean, SOMETHING must be telling the browser that *.rad = *.html .. right?


 
Nigel Bree
Reply #20 - Apr 21st, 2008 at 4:01pm
 
It just means that as it works through, say, a URL like http://radified.com/cgi-bin/yabb2/YaBB.pl?num=1208670491/15#18 Apache starts at the left-hand side and starts chewing through the URL one part at a time until it's eaten the whole thing or found something to serve up.

cgi-bin
 -> look for <Directory> sections under httpd.conf to find where the root is in the filesystem
 -> check what it is, directory or file under the root
 -> not a file, look in ~/.htaccess
 -> look for rules that apply to cgi-bin/yabb2/YaBB.pl?num=1208670491/15#18
 -> none, tick off cgi-bin
 -> look for <Directory> sections under httpd.conf to find where cgi-bin is in the filesystem

yabb2
 -> check what it is, directory or file under cgi-bin
 -> not a file, look in ~/cgi-bin/.htaccess
 -> look for rules that apply to yabb2/YaBB.pl?num=1208670491/15#18
 -> if none, tick off yabb2
 -> look for <Directory> sections under httpd.conf to find where cgi-bin/yabb2 is in the filesystem

YaBB.pl
 -> check what it is, directory or file under cgi-bin/yabb2
 -> it's a file, look in .htaccess for allow/deny
 -> it's a script, think about giving it to mod_perl
 -> give it to mod_perl, which then starts chewing on num=1208670491/15#18
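
The per-directory .htaccess lookups in the walk above only happen where the main configuration permits them. A rough sketch of the httpd.conf side, assuming a made-up filesystem path (/home/rad/public_html is a placeholder, not the real layout of this host):

```apache
# httpd.conf (main server config). All paths here are hypothetical.
DocumentRoot "/home/rad/public_html"

<Directory "/home/rad/public_html">
    # FileInfo lets .htaccess files use mod_rewrite, AddType and Redirect.
    # With "AllowOverride None" Apache skips .htaccess files entirely.
    AllowOverride FileInfo

    # mod_rewrite rules in .htaccess need symlink following enabled.
    Options +FollowSymLinks
</Directory>
```

On shared hosting this part is normally controlled by the host, which is why everything else in this thread is done from .htaccess.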


Edit: fix cut-n-paste error
 
 
Nigel Bree
Reply #21 - Apr 21st, 2008 at 4:10pm
 
Rad wrote on Apr 21st, 2008 at 3:37pm:
I mean, SOMETHING must be telling the browser that *.rad = *.html .. right?

Yup. That typically goes into the HTTP part of the transaction. There's an HTTP header called Content-Type which describes what kind of content the result of an HTTP GET is: text/plain, text/html, whatever.

If the content is generated by a script, it can write whatever it likes there, but for content coming from the filesystem, this is determined by a part of Apache that uses the file extensions - the mod_mime component, which uses the AddType directive, and there's one of those in effect for .html files.

There isn't a rule for .rad, but because of the way mod_rewrite works when the rewrite rules are in the .htaccess files, if mod_rewrite does anything to the URL it basically re-submits the rewritten URL to Apache to start over again. So on the second (or third, or however many tries it takes) attempt Apache eventually gets to the .html file in the filesystem, and only *then* decides it has to fake up the Content-Type: header. By that stage it's working with something called .html, so that's what it uses to look up in the mime-type registry, and Hey Presto! your browser (which doesn't care about file extensions itself) is being told "trust me, this is really HTML" by a Content-Type: of text/html
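
For reference, the mapping described above looks roughly like this in configuration terms. This is a sketch: many stock installs get the .html mapping from the mime.types file loaded by TypesConfig rather than from an explicit AddType line, and the path shown is an assumption.

```apache
# Main server config: load the standard extension-to-type table.
# (Path is relative to ServerRoot and varies by install.)
TypesConfig conf/mime.types

# Explicit equivalent for HTML; this is what yields
#   Content-Type: text/html
# for any file on disk whose name ends in .html or .htm
AddType text/html .html .htm

# No entry exists for .rad. That does not matter while index.rad is only
# a URL that gets rewritten to index2.html before the type is looked up.
```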
 
 
Rad
Reply #22 - Apr 21st, 2008 at 5:38pm
 
Quote:
OK, then, just put this into the root .htaccess file:
RewriteEngine on
RewriteRule ^index.rad index2.html

Okay, I'm going to try this now.

I am wondering if I load page index2.html, will it still load page index2.html, or will it change to index.rad on-the-fly?

It would be cool if every time somebody loaded index2.html, they got index.rad .. and also (of course) if they loaded index.rad, they got index.rad

I love this stuff. I really appreciate your help. If you need web space or a rad email acct or a MTOS blog .. or something I can do to reciprocate, I am eager to do so. (Same offer applies to all who help make Radified so Rad.)
 
Rad
Reply #23 - Apr 21st, 2008 at 5:42pm
 
Woohoo!!!

http://radified.com/index.rad

That is sooo cool.

Now, can we get it to convert index2.html to index.rad on the fly?

Or is that something we shouldn't want to do?

The index.rad updates as I update index2.html right? (Update, yeah it does. Makes sense. Might have to refresh tho.)

So I can start using index.rad here as my homepage link? .. from now on? .. no foreseeable problems?

I am stoked! (Wonder why these little things get me so excited.)
 
Nigel Bree
Reply #24 - Apr 21st, 2008 at 6:32pm
 
Rad wrote on Apr 21st, 2008 at 5:42pm:
Now, can we get it to convert index2.html to index.rad on the fly?

Let's just be clear what you mean by this, because there are actually three things.

If you want the underlying file in the filesystem called index.rad, that will be doable, but remember my warning about the struggle that will cause? If you flip the rewrite rule around, the underlying file type coming off the filesystem will be .rad, and then Apache won't be able to use the existing mime-type rules, and so the web browsers won't see HTML any more until we fix that, and as I said that's the start of a Sisyphean slippery slope (say it 5 times fast!) with trying to "fix up" the fact that you aren't putting HTML in HTML files.

So, this is one to try in a VM rather than live. But to do it, use this in .htaccess

 RewriteEngine On
 RewriteRule ^index2.html index.rad
 AddType text/html .rad

and then rename the file on the filesystem using

 mv index2.html index.rad

Now, it's the index2.html file that doesn't exist and the .rad one that does.

The other alternative way of reading your question, by the way (which I've thus far pretended doesn't exist) is "can I change all the outgoing URLs served up without having to edit my HTML files"?

The scary fact is that you can. You shouldn't, but you can.

Apache allows arbitrary filter modules to get in between the source content and the actual network phase, and one such thing is called mod_proxy_html - it works by reparsing all the HTML that's in the original content, goes hunting for URLs in all kinds of out-of-the-way places, and then spits out rewritten HTML.

You really, really, *don't* want to do this in your case. It's possible, but this is seriously a tool of the ultimate last resort. I'd bet your web hosts are sane and don't have this module enabled, because there are nasty things that can be done with it if it's misconfigured.

Edit: three things. Let me move on to the third possibility next message, since it's actually useful to know.
 
 
Rad
Reply #25 - Apr 21st, 2008 at 6:44pm
 
By "on-the-fly" I mean .. reader requests (clicks on link to) index2.html, and what they receive is > index.rad

We agreed that we want the file which actually sits in directories (both on my home computer and on the server) ..to have an *.html extension.

In this way, a user would receive index.rad whether they clicked on index.rad or index2.html (but both would look like content contained in index2.html)
 
Nigel Bree
Reply #26 - Apr 21st, 2008 at 7:04pm
 
The third thing you can add to the mix is a capability of mod_alias, which is to send an HTTP-level response for incoming URLs to tell the requesting web browser to use a different URL. Unlike mod_rewrite, which is sneaky and invisible, this exists to be visible (and slow, because it requires end-to-end round trips).

The Redirect directive (which comes in various flavours) matches a URL and sends back a special HTTP status code that says "not here any more, but....". You can say whether the redirection is temporary, or permanent - and if permanent, things like web spiders (oh like say, search engines) take this as a big hint to reindex things under the new preferred URL.

The URL you send people to can't be relative, it has to be a full URL. Which on the one hand has its uses, since you can use it to redirect subdomains, but on the other hand is fragile because if someone comes via www.radified.com or just plain radified.com, it'll end up forcing the issue for them.

If you take the above example which has .rad as the file in the filesystem, and then add this to the .htaccess

 Redirect /index2.html http://www.radified.com/index.rad

then if you type http://radified.com/index2.html into a browser, the browser address bar will switch to http://www.radified.com/index.rad instead (at the latency cost of an extra server round-trip).
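
The temporary-versus-permanent choice mentioned above is made with an optional status keyword on the same directive. A sketch, reusing the URL from the example; pick one of the two lines, not both:

```apache
# Permanent (HTTP 301): crawlers take this as a hint to reindex under the new URL.
Redirect permanent /index2.html http://www.radified.com/index.rad

# Temporary (HTTP 302): this is also what a bare "Redirect" with no keyword sends.
# Redirect temp /index2.html http://www.radified.com/index.rad
```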

If you're thinking next "can I use mod_alias to generate redirects for index2.html to index.rad and then use mod_rewrite to map index.rad back to index2.html" .... well, at that point you're Crossing the Streams. Just don't go there.

It can be done, but you want to do it in a maintainable and robust way and that means you are best advised to take a whole 'nother step up the ladder to the next level on the Tower of Power, and use CGI-type handling instead to completely divorce the URLs from the filesystem, at least as far as Apache proper is aware of it.
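
For study in a VM rather than on the live site: the usual way people attempt the combination described above is with mod_rewrite alone, using a condition on the original request line so that the internal rewrite cannot re-trigger the external redirect. This sketch assumes index2.html stays the real file on disk. It is not a recommendation over the script approach, and visitors arriving via index2.html still pay the extra round trip.

```apache
RewriteEngine On

# 1. Only when the browser itself asked for /index2.html (THE_REQUEST is the
#    raw request line, e.g. "GET /index2.html HTTP/1.1"), send a visible 301.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index2\.html[?\s]
RewriteRule ^index2\.html$ http://www.radified.com/index.rad [R=301,L]

# 2. Quietly serve index.rad from the real file. The internal pass that follows
#    sees "index2.html" again, but THE_REQUEST still says index.rad, so rule 1
#    does not fire and there is no loop.
RewriteRule ^index\.rad$ index2.html [L]
```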
 
 
Rad
Reply #27 - Apr 21st, 2008 at 10:03pm
 
Nigel, thanks for all the help today. Starting to get tired. Just got dark here .. (which means it probably just got light there).

I will review (study) your other posts tomorrow. Feel like I learned a lot today.

You rock! (I gave you + karma points.)
 
Nigel Bree
Reply #28 - Apr 22nd, 2008 at 7:12pm
 
Rad wrote on Apr 21st, 2008 at 10:03pm:
Nigel, thanks for all the help today. Starting to get tired. Just got dark here .. (which means it probably just got light there).

Now that we're past the equinox (daylight savings gone here, in place for you) it's actually only 5 hours different from NZST to PDT. It's 7 hours pre-equinox (southern summer) when we're in NZDT and you're in PST. So, you wrote that at 3pm my time - been in the office 28 hours straight at that point, gotta love the final stretch before a release.

Quote:
You rock! (I gave you + karma points.)

Karma evidently didn't agree. Because I was so tired heading home last night around 8pm I forgot one of the speed traps I know to avoid and got nailed with a speeding fine. Perfect finale to what was already a pretty lousy day.

Aaaaaaaanyway, lemme know when you're ready for the next phase.

What I'd suggest as the ideal approach going forward is that you use a script to effectively provide a mirror world for your existing site. That can give you a better way of doing what mod_proxy_html does, much simpler - effectively you can have two sites in parallel (and you could in fact do this with subdomains) where one has a DocumentRoot with the custom .htaccess that refers to a script that actually grabs the source documents from the other site's DocumentRoot, and not only presents them to the world using a custom URL (where documents are .rad) but can rewrite a few of the inter-document links.

That gives the benefit of having your actual physical site structure stay as-is, perfectly normal, with HTML in .html files and normal inter-document URLs that all the standard web tools - link checkers, structure validators, etc. - are happy with, and you have a script that transforms the documents as they pass through to make the alternate view of things more seamless.

The point of using a script rather than something like mod_proxy_html (which as I said is unlikely to be available to you anyway - reverse proxying is NOT something you want to mess with) is that the latter is complex, and opaque, and designed for dealing with nasty legacy websites. By building it yourself you might miss some of the exotic corner cases, but the result is probably more useful to you and you'll understand it better.
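
As a rough idea of the Apache side of such a script-based view: every .rad URL is handed to one script, which reads the matching .html file from the real site and fixes up links on the way out. The script name radview.pl and the doc parameter are hypothetical placeholders; nothing in this thread defines them.

```apache
RewriteEngine On

# Hand any request ending in .rad to a single script (hypothetical name),
# passing the corresponding .html document name as a parameter.
# The script itself must validate "doc" before opening any file.
RewriteRule ^(.+)\.rad$ /cgi-bin/radview.pl?doc=$1.html [L]
```

Because this is an internal rewrite, the visitor's address bar keeps the .rad URL and no extra round trip is involved.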
 
 
Rad
Reply #29 - Apr 22nd, 2008 at 9:00pm
 
Quote:
it's actually only 5 hours different from NZST to PDT

Seems hard to believe. I'll have to analyze closer on the big, 12-foot tall globe at the bank next time I stop by.

Quote:
been in the office 28 hours straight  

Actually, I'm not surprised, cuz I know you're a true professional. They're lucky to have you.

Quote:
got nailed with a speeding fine

That totally suks. I *hate* speeding tiks. Such a waste of money. Do you have these new cameras at busy traffic lights there .. that take your picture? They set the yellow-light shorter in order to increase revenue.

Quote:
lemme know when you're ready for the next phase

Thanks. I appreciate your help more than I can say. I get the Bug tomorrow AM, and this is my weekend, so I'll be playing dad 'til Sunday PM.

Quote:
What I'd suggest as the ideal approach going forward

When Nigel shares his version of "ideal" I'm listening. I think Magoo mentioned something similar to this when we were discussing the possibility of running a mirror server down-under (to lower response times for those on the other side of the world).

Quote:
a better way of doing what mod_proxy_html does, much simpler  

Hmmm. My eyebrows are raised.

Quote:
but can rewrite a few of the inter-document links

Didn't know that could be done .. altho I should realize there's probably very little that CAN'T be done .. if one knows the necessary mojo (like you).

Quote:
is that you use a script to effectively provide a mirror world for your existing site

I would obviously need help with this.

As a side note, from reading some of your previous posts in this thread, one thing that stuck out in my mind was that I DIDN'T want to do anything that increased response time, as I feel a fast-loading page is important. For example, there was one option you spelled out where it would take *two* round-trips for a reader to get a page. That didn't sound appealing to me.
 