Harness The Power Of The HTACCESS File For SEO With These 12 Code Examples!

By Lee Kane


HTACCESS stands for hypertext access. The file lets you override parts of the main server's configuration with your own settings for your website's folders. If you are using shared hosting like me, you must first check with your host that it is OK to use the htaccess file on their server, as some hosts restrict or disable it.

But in my research I found that most hosts do allow the file; they just limit what it can and cannot configure. When I approached my host, streamline.net, they informed me that I was able to use it for mod_rewrite, which is basically all you will want it to do.

What Can The htaccess File Do?

Well, you want your website to run faster and smoother, and you most definitely want it to get spotted by the search engine spiders who, I have been secretly informed, love websites which correctly -

  • deliver clean URLs (uniform resource locators)
  • send 404 status codes for missing pages
  • use a 301 redirect to the correct address of your website
  • deliver your web pages super fast with Gzip compression
  • and handle duplicate content issues by dishing out your pages either with the trailing forward slash ( / ) or without it

It does some other stuff too, but the above are the most important tricks your htaccess file can do, and will directly result in better SEO.

How Is A htaccess File Created?

You have probably been pulling your hair out over this one, as I did, because I was unable to find a clear example of how it was done.

The trick is the quotation marks ( "" )

First open a new file in Notepad.

Then click on Save As.

Then type this

".htaccess"

Include the quotation marks or else it will not work, and then save.

Tip! Type ".htaccess" into the file name box; don't copy and paste this part.

If done correctly you should be left with a file named .htaccess which Microsoft Windows does not recognize as a known file type. To open and edit this file you will need to use a text editor such as Notepad or Dreamweaver.

Where Do I Put The htaccess File?

There appears to be some confusion out there on the world wide web about where you should place the htaccess file; the answer depends on what you are using it for. If, as in my case, you are using it to configure your website's rewrites, then you should place it in your root folder, i.e. the htdocs folder, which is the same folder as your index file.

YES! I hear you say. Well, stay tuned, as here comes the good stuff.

First Up! RewriteEngine On

This piece of code will go above the first rewrite rule and only has to be used once inside the htaccess file.

RewriteEngine on
RewriteBase /

And to comment inside the file you use the # symbol.

# below code to be inserted once at the top of the file
RewriteEngine on
RewriteBase /

How Do I Redirect Visitors To The Correct Version Of My Website?

If you don't use a rel="canonical" then the chances are there are several versions of your website's home page festooned all over the SERPs (search engine results pages). Google have stated that this is not really an issue, but if you are going to use analytics then using the htaccess file to correct it is the way to go.

If you do not tell those lovable search engine spiders which home page is the correct one they will get confused, as they may come across -

www.your-website.com (the www. version)

your-website.com (the naked version)

and the index page of both versions

www.your-website.com/index.html

your-website.com/index.html

WHAT! I know, I know.

First you have to decide which website address you want visitors to end up at.

Personally I went for the www version of my website as opposed to the naked version, but the choice is entirely down to your own preference and makes no difference to the recognition of your website by search engines or its rank. Secondly you want to lose the index part of your website address. You can keep it if you want, but removing it goes hand in hand with clean URLs.

The htaccess code to redirect all versions of your website address to the www version is

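# Send the naked domain to the www version
RewriteCond %{HTTP_HOST} ^your-website\.com$ [NC]
RewriteRule ^(.*)$ http://www.your-website.com/$1 [R=301,L]
# Drop a trailing index or index.html that the visitor actually requested
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^\s?]*/index(\.html)?[\s?]
RewriteRule ^(.*/)?index(\.html)?$ http://www.your-website.com/$1 [R=301,L]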

If you would like to go naked with your website address then it's exactly the same code, just switch the website address around like below.

RewriteCond %{HTTP_HOST} ^www\.your-website\.com$ [NC]
RewriteRule ^(.*)$ http://your-website.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^\s?]*/index(\.html)?[\s?]
RewriteRule ^(.*/)?index(\.html)?$ http://your-website.com/$1 [R=301,L]
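
If you would rather not type your domain into the rules, the www redirect can be sketched in a more generic form. This is only a sketch and assumes every host name pointing at your site should gain the www. prefix, which will not be true if you also run subdomains such as blog.your-website.com (a made-up example) -

# Assumption: any host name that does not already start with www. gets it added
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

%{HTTP_HOST} is the host name the visitor asked for, so the same two lines can be reused on another website without editing.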

Clean Up Your URLs And Make Them Prettier With The htaccess File

Yep you read that right. Surprisingly this is one of the most satisfying of htaccess tricks. Here is what it does.

You have a webpage URL and at the end of that webpage URL you have an ugly looking extension like this -

www.your-website.com/living-life-in-the-fast-lane.html

or

www.your-website.com/living-life-in-the-fast-lane.php

the .html or .php being the ugly part.

Why is it deemed ugly? Well, in my personal opinion it isn't, and in this case the URL is fine and would be OK even if you left it that way. The idea is that simplifying the URL makes it easier for people to remember, but the whole issue really relates to big dynamic websites that serve PHP-generated URLs ending in long strings of parameters.

Still, cleaning up that URL and removing the extension is good practice and at the end of the day will make your website look better.

To achieve this just insert the code below into your htaccess file -

# Serve page.html when /page is requested
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html [L]

# Serve page.php when /page is requested
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php [L]

The first block of code lets a URL without the .html extension serve the matching .html file, and the second block does the same for .php. Just insert the block which relates to the format of your website.

NOTE! The clean URLs will only show up on your website once you remove the extension from all your internal links, as in the example below. This applies to relative and absolute URLs.

<ul>
<li><a href="home.html">Home</a></li>
<li><a href="services.html">Services</a></li>
<li><a href="about.html">About</a></li>
<li><a href="contact.html">Contact</a></li>
</ul>

should look like -

<ul>
<li><a href="home">Home</a></li>
<li><a href="services">Services</a></li>
<li><a href="about">About</a></li>
<li><a href="contact">Contact</a></li>
</ul>

Of course there are other extensions apart from php and html, and this code would probably work for them too; you will just have to experiment.
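
One gap the rules above leave open is that the old addresses ending in .html still work, so both versions of a page can be reached. A possible way to close it, sketched here on the assumption that your pages use .html and that these lines sit above the extension rules, is to 301 redirect any address the visitor typed with the extension to the clean one -

# Assumption: only redirect when the visitor's own request contained .html
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^\s?]+)\.html[\s?]
RewriteRule ^ /%1 [R=301,L]

Testing against THE_REQUEST, which holds the original request line, should stop the redirect from looping with the internal rewrite that adds .html back on.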

Duplicate Content And The Trailing Slash

Another well known issue with URLs is the trailing slash. Like the canonical address issue, search engines will sometimes list the same web page in the SERPs both with and without the trailing slash, creating what could be viewed as duplicate content. Again there is lots of debate about whether you should have the slash at the end of your clean URL or not. The answer depends on what type of website you have, and gets more complicated with larger and more dynamic websites.

It is your own preference, as there is no evidence yet to suggest it makes any difference. The most important thing is that you choose to return one or the other. For small service and information websites it's not really an issue, as your URLs are likely to contain only words pertaining to your web page content, which is more than enough for them spiders to get acquainted with.

The rule is - a slash at the end of a URL denotes a directory, whereas no slash denotes a file.

And a file with content is what this web page is, so to eliminate any problems search engines might have with indexing your web pages you can insert this code, which will remove the trailing slash with a 301 redirect and stop any duplicate content. The first line leaves real directories alone, as they need their slash.

RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ http://%{HTTP_HOST}/$1 [R=301,L]

Redirect 404 Errors To A Custom Page

You can also utilize the htaccess file to redirect 404 and other errors like -

  • 400 - Bad request
  • 401 - Authorization Required
  • 403 - Forbidden directory
  • 404 - Page not found
  • 500 - Internal Server Error

to custom pages.

This short piece of code will make sure any lost net surfers will be set on the right path -

ErrorDocument 404 /temp/page-temporarily-unavailable404

You can add further error pages like this -

ErrorDocument 400 /temp/page-temporarily-unavailable400
ErrorDocument 401 /temp/page-temporarily-unavailable401
ErrorDocument 403 /temp/page-temporarily-unavailable403
ErrorDocument 500 /temp/page-temporarily-unavailable500

Of course you can name your custom error pages whatever you like; page-temporarily-unavailable404 is the name I gave the page. Make sure you also add the directory the file is in: in the above example my custom error page is in a directory called temp.
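
A point worth knowing here, going by the Apache documentation for ErrorDocument, is that the path should be a local one starting with a slash, as above. If you give a full address starting with http:// the server sends a redirect instead, and the visitor's browser (and the spiders) no longer receive the original 404 status. The directive also accepts a short message in quotes in place of a page. A sketch of both, with made-up wording -

# Local path: the custom page is shown and the 404 status is kept
ErrorDocument 404 /temp/page-temporarily-unavailable404
# Plain text message instead of a page (the wording is only an example)
ErrorDocument 403 "Sorry, you are not allowed to view this folder."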

You could also place the following piece of JavaScript at the bottom of the custom page, which will automatically redirect the visitor after 10 seconds (1 second = 1000 milliseconds). You can set the redirect time and the landing page to whatever you want.

<script type="text/javascript">
// Send the visitor to the home page after 10 seconds
function ourRedirect() {
    location.href = 'http://www.your-website.com/';
}
setTimeout(ourRedirect, 10000);
</script>

Unleash Your Website With The Gzip Code For The htaccess File

One of the first SEO tools I used was the SEOquake toolbar, and among the tests its diagnostic tool runs was the server test, which for my website came up with -

gzip - No

Under Show Advice it said:

Deserves moderate attention
Solution is difficult
We suggest that you use Gzip.
You can boost the speed of your website by using Gzip.

I spent ages researching gzip, and I have to admit I couldn't figure out how to implement it. Turns out I was looking at it all wrong. Still, no matter, as I learnt from the experience.

Gzip has to be implemented on the server. Some website hosts do this automatically but others don't. Eventually I found the answer and with much relief discovered it can be switched on from the htaccess file.

What gzip does is compress your web page and negotiate with your visitor's browser to serve the contents of your page in its smaller compressed state. The Google Developers article "Optimizing encoding and transfer size of text-based assets" explains it all, along with some other techniques. Add the code below to initiate gzip.

AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
AddType font/woff .woff
AddType image/x-icon .ico
AddType image/png .png
AddType image/svg+xml .svg
AddType image/jpeg .jpg
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
Header append Vary User-Agent

The htaccess code for this is pretty self explanatory and you can add types to it if required.

However, the above code may not compress pages served as PHP on every host. If yours still come back uncompressed, you can initiate gzip from PHP itself by adding this to the top of every page.

<?php if (isset($_SERVER['HTTP_ACCEPT_ENCODING']) && substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start(); ?>

Conquer The 404 With A 301 Redirect

404s can't always be helped. If you have a whopping website and several people working on it, web pages will get moved about. But in your case there is no excuse! With your new found friend htaccess you can banish them 404s and custom pages so no one will ever see them again, except for you, maybe.

301 redirects are real easy to implement and real handy. There have been occasions where I had some strange pages turn up on my website, and until I had time to investigate what had gone wrong the 301 redirect took care of them. I also changed the URLs on quite a few of my web pages to bring them up to SEO scratch, and used the 301 to redirect the old pages to the new ones. It's a lot easier than the fetch tool at Google Webmaster Tools, which of course only takes care of Google.

The 301 redirect takes care of all search engines, until they index your new pages. It also takes care of websites that have linked to those pages, which is a really good thing, as when those websites check for broken links yours will not be one of them.

Insert the code below to redirect old URLs to the new URLs -

Redirect 301 /information/free-seo-tools http://www.your-website.co.uk/articles/free-seo-tools
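
If you have moved a whole folder rather than a single page, writing one line per page soon gets tedious. A sketch using RedirectMatch, which belongs to the same Apache module as Redirect and accepts a pattern, assuming every page under /information/ now lives under /articles/ with the same name -

# Assumption: the page names are unchanged, only the folder has moved
RedirectMatch 301 ^/information/(.*)$ http://www.your-website.co.uk/articles/$1

The part matched inside the brackets is dropped into the new address in place of $1.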

Cash In On Them Caches For Some Extra Speed

Caching is a process in which a browser stores resources such as images and text files on a PC, laptop, tablet or smartphone when it downloads a web page.

Unless the visitor cleans their cache on a regular basis, the next time they visit your site the browser will first look to see if its cache already contains files from your website, and use them to speed up the download of the web page.

Within the htaccess file you can control the way the browser reacts when visiting your website by using mod_expires and mod_headers.

Using mod_headers gives you greater control of the caching process, as it tells the browser which files should be revalidated to see whether they need to be downloaded again, which files may also be stored by shared caches between your server and the visitor ("public"), and which only by the visitor's own browser ("private"). mod_expires sets the different expiry times for different file types.

# The Expires lines need mod_expires and the Header lines need mod_headers
<IfModule mod_expires.c>
<IfModule mod_headers.c>
# Turn on Expires and set default expires to 3 days
ExpiresActive On
ExpiresDefault A259200

# Images and media files: 4 weeks
<FilesMatch "\.(ico|gif|jpg|jpeg|png|flv|pdf|swf|mov|mp3|wmv|ppt|svg)$">
ExpiresDefault A2419200
Header append Cache-Control "public"
</FilesMatch>

# Text files: 1 week
<FilesMatch "\.(xml|txt|html|js|css)$">
ExpiresDefault A604800
Header append Cache-Control "private, must-revalidate"
</FilesMatch>

# Force no caching for dynamic files
<FilesMatch "\.(php|cgi|pl|htm)$">
ExpiresDefault A0
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
Header set Pragma "no-cache"
</FilesMatch>
</IfModule>
</IfModule>

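You can adjust the expiry times to your own needs. They are given in seconds, and the A in front of each number means the time is counted from the moment the visitor accesses the file, i.e.

1 minute = 60
1 hour = 3600
1 day = 86400
1 week = 604800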

You can add or remove the file types you require from the example above.
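
If counting seconds is not your thing, mod_expires also understands a wordier format and can set expiry times by file type rather than by extension. A minimal sketch, with expiry times picked only as examples -

<IfModule mod_expires.c>
ExpiresActive On
# Example times only, adjust to suit your website
ExpiresDefault "access plus 3 days"
ExpiresByType image/png "access plus 4 weeks"
ExpiresByType image/jpeg "access plus 4 weeks"
ExpiresByType text/css "access plus 1 week"
ExpiresByType text/html "access plus 0 seconds"
</IfModule>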

Using AddCharset For Your Webpages And AddType For RSS

Your htaccess file can also set the charset of your webpages, so instead of using the Meta Tag -

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

you could insert this into the htaccess file -

AddCharset utf-8 .html .css .php

The only down side to this is that if a visitor views your page offline, the browser may have difficulty displaying certain characters, as the charset will not be declared in the page itself. The W3C gives some good advice and explanations in Setting charset information in .htaccess.

This next piece of code cheered me up, though, and works for RSS feeds. I like my feed and it gets submitted wherever I can. It validated but I always got a warning -
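
If most of your pages share one charset, Apache also has the AddDefaultCharset directive, which according to the Apache documentation adds the charset to text/html and text/plain responses that do not already carry one in their header. A one line sketch, assuming utf-8 throughout -

# Assumption: every HTML and plain text page on the site is saved as utf-8
AddDefaultCharset utf-8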

Your feed appears to be encoded as "utf-8", but your server is reporting "US-ASCII"

Puzzled me for ages but I eventually found the answer, which is -

AddType application/rss+xml;charset=utf-8 .rss

after adding this there was no more confusion, and i got a perfectly valid feed.

How Do I Stop People Viewing My htaccess File?

Finally, after all that, you want to hide this file from the prying eyes of those who may wish to use the information contained within it for nefarious purposes. What kind of purposes would they be? Well, there is not much in this article that you would want to hide, but the htaccess file can also be used to password protect web pages on your website. So to protect the file you can use this piece of code, with strong pattern matching for extra measure.

<Files ~ "^.*\.([Hh][Tt][Aa])">
 order allow,deny
 deny from all
 satisfy all
</Files>

You will find that most hosts automatically protect the file anyway, but just in case, best add it in there.

The htaccess file is one of those website tools which is truly invaluable, and free! With ease it clears up so many problems, often with just a few lines of code. It's simply super duper!
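
The order, deny and satisfy lines above are written in the Apache 2.2 style. On hosts running Apache 2.4 they only work through a compatibility module, so a sketch of the newer equivalent, assuming your host runs 2.4, would be -

# Assumption: Apache 2.4 or later
<Files ~ "^.*\.([Hh][Tt][Aa])">
 Require all denied
</Files>

Use one version or the other rather than both, and ask your host if you are not sure which Apache they run.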
