301 redirects - the tricky ones

June 14, 2008 – 9:54 pm

Here’s a tip for anyone who has inherited or upgraded an existing site and has to retain some crucial URLs. I generally hate dealing with this stuff, because it feels like polluting a nice new shiny project with legacy code that fulfills no other purpose than to make your site work with outdated systems.. such as, err… Google! ;)
But then again, at the end of the day sites are only useful if you can find the information you’re looking for, and if the whole world is linking to your site already, we have to use redirects to make sure visitors continue to find the content they expect.

So on with it! Needless to say, the place to put redirects is the .htaccess file at your web root.

Now, for static content, redirects are straightforward - you simply write:

redirect 301 /old-path /new-path (pseudo code)

for example:

redirect 301 /register.php /customer/join
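
A few closely related forms, as a rough sketch. It assumes mod_alias is enabled (it usually is), and every path or host other than /register.php and /customer/join is made up for illustration:

```apache
# Same redirect as above, using the keyword "permanent" instead of the
# numeric status (use one form or the other, not both)
Redirect permanent /register.php /customer/join

# Redirect matches by path prefix, so this also sends
# /old-blog/2008/post to /blog/2008/post
Redirect 301 /old-blog /blog

# The target may be a full URL if the content moved to another host
Redirect 301 /support http://support.example.com/help
```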

Now, where it gets tricky is dynamic urls. Say you have an existing site with a product catalog, and your URLs look something like this:

/products/detail.php?product_id=31

While it is possible to redirect this with the same 301 redirect we used above, you would have to keep working with the same query string format, because a plain redirect only matches the path and passes the query string through unchanged, resulting in, say, /newscript.php?product_id=31. However, what we really want to do here is turn this legacy url into a nice sexy new-school url, such as /products/detail/31. In order to do that, we can’t just use redirects; we gotta do some url rewriting by defining a rewrite rule. Rewrite rules - nomen est omen - take a url and reformat it according to certain rules.

The resulting code is:

RewriteEngine On
RewriteCond %{QUERY_STRING} product_id=([0-9]+) [NC]
# Flags are comma-separated with no spaces; the period in the pattern is escaped
RewriteRule ^products/detail\.php$ /products/detail/%1? [R=301,L]

What’s happening here? Basically, our rewrite condition checks a server variable for a pattern. Server variables that are made available by apache are accessed via %{VARIABLE_NAME_HERE}, and in this case the variable we want is QUERY_STRING, but different circumstances could call for others, such as HTTP_REFERER, REQUEST_METHOD, DOCUMENT_ROOT, etc.
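
To give an idea of how another server variable can be combined with the query string check, here is a small sketch that only redirects GET requests. The restriction to GET is an assumption made for the example, not something the original site needed:

```apache
RewriteEngine On
# Multiple conditions are ANDed by default: all must match for the rule to fire
RewriteCond %{REQUEST_METHOD} =GET
# Keep the capturing condition last: %1 refers to the last matched RewriteCond
RewriteCond %{QUERY_STRING} product_id=([0-9]+) [NC]
RewriteRule ^products/detail\.php$ /products/detail/%1? [R=301,L]
```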

For a great cheat sheet on all the variables and flags you can use, go to http://www.ilovejackdaniels.com/cheat-sheets/mod_rewrite-cheat-sheet. It’s one of the best cheat-sheets I’ve ever seen since college ;)

Next, the pattern we check QUERY_STRING for is product_id=([0-9]+), as product_id is the name of the GET parameter the site was using to determine which product to show. In this case we know that this id is numeric, so we’re using a pattern that matches 1 or more digits (+ for 1 or more). The rewrite rule that follows applies only if this condition is met. Also, the first captured group (the part in parentheses) is stored in a temporary variable %1. For subsequent groups the variable name is %2, %3 etc..
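
As a sketch of how %1 and %2 work together, imagine a hypothetical legacy listing page at /products/list.php?category=shoes&page=2 that should end up at /products/shoes/page/2. The script name, parameter names and target url are all invented for this example:

```apache
RewriteEngine On
# %1 captures the category name, %2 the page number
RewriteCond %{QUERY_STRING} ^category=([a-z]+)&page=([0-9]+)$ [NC]
RewriteRule ^products/list\.php$ /products/%1/page/%2? [R=301,L]
```

Note that this pattern expects the parameters in exactly this order; a request with page before category would not match and would need its own condition.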

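The rewrite rule itself simply rewrites /products/detail.php into /products/detail, but we still need to stick the id into the url, so we append the previously matched pattern from the rewrite condition %1 and get /products/detail/%1? The trailing ? matters: it tells mod_rewrite to drop the original query string, otherwise product_id=31 would be appended to the new url again.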
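
To make the effect of that trailing question mark visible, here is the same rule with the alternatives left in as comments. The QSD variant is only available on Apache 2.4 and later, so treat it as an option rather than a drop-in:

```apache
RewriteEngine On
RewriteCond %{QUERY_STRING} product_id=([0-9]+) [NC]

# Without the trailing ?, the old query string is carried over:
#   /products/detail.php?product_id=31  ->  /products/detail/31?product_id=31
# RewriteRule ^products/detail\.php$ /products/detail/%1 [R=301,L]

# With the trailing ?, the query string is dropped:
#   /products/detail.php?product_id=31  ->  /products/detail/31
RewriteRule ^products/detail\.php$ /products/detail/%1? [R=301,L]

# On Apache 2.4 and later, the QSD flag does the same thing more explicitly:
# RewriteRule ^products/detail\.php$ /products/detail/%1 [R=301,L,QSD]
```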

Note that a period (.) has a special meaning in patterns (it matches any character), so you need to escape it via a backslash (\.) to say you mean a literal period.

Also note the flags we set on the condition and rule. The [NC] stands for “no case,” or “caSeInseNsitiVe.” Not required in this case, but you never know..

The R=301 tells the requesting browser to interpret this redirect as a 301 permanent redirect, and the L states that this is the Last rule that should be applied to this url. This may or may not apply to your case.. nothing prevents you from applying further rules below, in which case you need to remove the L.
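
Here is a rough sketch of how several redirect rules can sit next to each other, each ending in L. The contact page in the second rule is a made-up example:

```apache
RewriteEngine On

# Rule 1: legacy product pages.
# The RewriteCond applies only to the RewriteRule right after it.
RewriteCond %{QUERY_STRING} product_id=([0-9]+) [NC]
RewriteRule ^products/detail\.php$ /products/detail/%1? [R=301,L]

# Rule 2: a hypothetical legacy contact page, checked only if rule 1 did not match
RewriteRule ^contact\.php$ /contact [R=301,L]
```

Since each rule here already produces the final url, stopping at the first match with L keeps the rules from interfering with one another.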

That’s it, good rewriting!


Filed under: Rnadom Sftuf, by Richtermeister
