Should Your URLs Point to the Directory or the Index Page?

URLs with Directory Name Vs Full URLs with Filename



by Christopher Heng, thesitewizard.com

I recently received a query from a visitor attempting to create a Sitemaps file using the Sitemaps protocol as described in the tutorial How to Get Search Engines to Discover (Index) All the Web Pages on Your Site. He wanted to know whether he should refer to a page on his site as (say) www.example.com/about/index.html, as www.example.com/about/ or as both in his site map. Both web addresses ("URLs") point to the same file. This brief article attempts to answer that question. My answer, however, as you will see, applies to more than just the site map.

Background Information

Web servers are configured to deliver a default web page (if it exists) whenever a browser requests a directory name. For example, if you were to ask for www.example.com/about/, a typical web server will look for a file called index.html in the about folder of your website. If it exists, the server will deliver that page's content to the browser. The browser's address bar, however, will still show the URL you requested, which is www.example.com/about/ in this example. If the page does not exist, and the server is not configured to look for any other index page, it will just show a directory listing of the about folder (unless you have disabled that facility on your site).

This means that for special pages like the "index.html" of your directory, there are actually two ways of accessing the file.
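To make the lookup concrete, here is a minimal Python sketch of how a server might map a requested path to a file. It is only an illustration, not how any particular web server is implemented. The file list, the `INDEX_NAMES` setting and the `resolve` function are all hypothetical names invented for this example.

```python
from typing import Optional

# Hypothetical set of files on the site, for illustration only.
SITE_FILES = {"/index.html", "/about/index.html", "/about/team.html"}

# Index filenames the server tries, in order (an assumed configuration).
INDEX_NAMES = ["index.html"]


def resolve(path: str, files: set = SITE_FILES) -> Optional[str]:
    """Return the file that would be served for a requested path, or None."""
    if path.endswith("/"):
        # Directory request: try each configured index filename in turn.
        for name in INDEX_NAMES:
            candidate = path + name
            if candidate in files:
                return candidate
        # No index file found. A real server might show a directory
        # listing or an error here, depending on its configuration.
        return None
    # Ordinary file request: serve it only if it exists.
    return path if path in files else None


print(resolve("/about/"))            # /about/index.html
print(resolve("/about/index.html"))  # /about/index.html
```

Both requests end up at the same file, which is exactly why the page has two valid addresses.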

Multiple URLs and Content Duplication

The problem of having more than one URL pointing to the same file is not primarily a human usability problem (since humans can easily figure out they're looking at the same page). It is a search engine problem. I have written about this problem at length elsewhere, such as in the article How to Create a Search Engine Friendly Website. If you have not read that article, please read it now before continuing further. I shall assume you understand the issues of content duplication in the rest of this article.

In view of the problems discussed in that tutorial, if there are two or more ways of referring to a particular web page on your site, you should always decide on one URL and consistently use that on your site. For example, decide whether you want to refer to a page as www.example.com/about/ or www.example.com/about/index.html. Once you've made that decision, make sure that all web pages on your site link to the page using the form of URL that you've settled on. Your site map, whether a normal site map or the search engine specific site map using the sitemaps protocol, should also refer to that page with the same URL.
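If you build your list of URLs with a script, one way to enforce a single form is to convert every address before it goes into your links or site map. The following is a rough Python sketch of that idea. It assumes that your index file is called index.html, and the function names `to_directory_form` and `to_filename_form` are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

INDEX_FILENAME = "index.html"  # assumed name of the index file


def to_directory_form(url: str) -> str:
    """Rewrite .../index.html as .../ and leave other URLs unchanged."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("/" + INDEX_FILENAME):
        path = path[: -len(INDEX_FILENAME)]
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def to_filename_form(url: str) -> str:
    """Rewrite .../ as .../index.html and leave other URLs unchanged."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith("/"):
        path += INDEX_FILENAME
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Hypothetical list of page addresses collected from a site.
urls = [
    "http://www.example.com/about/index.html",
    "http://www.example.com/about/",
    "http://www.example.com/about/team.html",
]

# Convert everything to the chosen form, then drop the duplicates.
sitemap_urls = sorted({to_directory_form(u) for u in urls})
print(sitemap_urls)
# ['http://www.example.com/about/', 'http://www.example.com/about/team.html']
```

Whichever of the two functions you pick, the point is to apply the same one everywhere, so that each page appears under exactly one address.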

Does It Matter Which URL I Use?

Which form of URL should you use? The one with the directory name alone, or the one with the filename? There are a few ways to look at this.

  1. Some people argue that using the directory name alone (like www.example.com/about/) is superior to using the actual filename. When you use the directory name, the web server will transparently find the index file and deliver it to the user (or search engine). In theory, this means that if you ever want to change to use a different filename for your index page, such as if you want to use a script like index.php to display the page instead of a static page like index.html, you can easily do that without changing any of the URLs on your site. All you have to do is to modify your server configuration file accordingly.

  2. In practice, however, the above advantage is not significant. If you currently directly refer to index.html and later want to use a script file named index.php to generate the content, it's also possible to modify your server configuration file so that the web server invokes index.php when the index.html file is requested. The technique for this is given in my article How to Masquerade Your CGI/PHP Scripts as Static HTML Pages and it involves no more work than that required to deliver index.php for a directory name.

  3. If you have a brand new site that has not been indexed yet, and cannot decide which method to use, use the directory name form (like example.com/about/). I personally think it is marginally better because the URL is shorter. Short URLs have some advantages: besides being easier to remember, they also avoid some of the mangling that third party sites and forum software inflict on long URLs, as mentioned in my article How to Create Good Filenames for Your Web Pages.

  4. If your site has already been in existence for some time, you should look for the form of URL that is most frequently used, both by your website and by others linking to your site, and use that URL consistently throughout your site. This is the method I adopted on thesitewizard.com. The site had already been in existence for a while, with its subdirectory index pages referred to by name, before I realised I preferred the shorter form. Since there were already many links pointing directly to these folder index pages, changing them would cause more problems than it solves. As a result, I decided to be consistent and stick to the form I had been using in the past.

    If your site is in the same boat, and you have a change of heart about what constitutes a prettier URL, you too may have to resign yourself to your current form for practical reasons, as I did.
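For an existing site, the check described in point 4 can be approximated with a short script. The sketch below is only one possible approach. It assumes you have already gathered a list of links to your pages (from your own pages, your server logs or elsewhere) and that your index file is named index.html. The sample links and the `classify` and `dominant_form` functions are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit

INDEX_FILENAME = "index.html"  # assumed name of the index file


def classify(url: str) -> str:
    """Label a URL as 'filename', 'directory' or 'other'."""
    path = urlsplit(url).path
    if path.endswith("/" + INDEX_FILENAME):
        return "filename"
    if path.endswith("/") or path == "":
        return "directory"
    return "other"  # an ordinary page, not an index page


def dominant_form(links: Iterable[str]) -> Optional[str]:
    """Return the form used most often for index pages, or None if there are none."""
    counts = Counter(classify(u) for u in links)
    counts.pop("other", None)  # ordinary pages do not affect the decision
    if not counts:
        return None
    # Note: if the two forms tie, the one seen first is returned.
    return counts.most_common(1)[0][0]


# Hypothetical links pointing at a site's index pages.
links = [
    "http://www.example.com/about/index.html",
    "http://www.example.com/products/index.html",
    "http://www.example.com/about/",
    "http://www.example.com/about/team.html",
]

print(dominant_form(links))  # filename
```

A simple count like this treats every link equally. You may prefer to give more weight to links from other sites, since those are the ones you cannot change yourself.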

Don't spend too much time mulling over whether to use the directory name or the index file form of URL. In practice, it probably does not matter which you use. Just decide on one form and stick to it. The important thing is to be consistent. If you have referred to a file as example.com/about/ in the past, continue to refer to it as such and don't link to it in other places as example.com/about/index.html. This applies to both your web pages and your site map.

Copyright © 2008 by Christopher Heng. All rights reserved.
Get more free tips and articles like this, on web design, promotion, revenue and scripting, from http://www.thesitewizard.com/.


Please Do Not Reprint This Article

This article is copyrighted. Please do not reproduce this article in whole or part, in any form, without obtaining my written permission.
