For URLs The Future is Flat
by Thomas | July 12th, 2007
Last week I discussed domains a bit, and in the past I have written about new URL forms such as dirty URLs. Today I present a corollary to those ideas, which I dub the “flat URL.” The idea is simple, but don’t bail out right away: there are some payoffs to be had if you read and think carefully.
Basically it boils down to this: instead of using the traditional URL form, http://www.companyname.com/products/category/productname.html, or following the ideas of the linked article, http://company.com/products/category/productname, you change your URL structure to something like http://company.com/productname.
OK, you might think I am just suggesting putting everything in a site’s root directory. No, not quite; that would actually be a tremendously bad idea for managing the content of the site. In fact, beyond the organizational chaos, it could have a serious delivery impact: with thousands of documents in a single directory, the operating system may have trouble locating and fetching any one of them in a timely fashion.
What I am really getting at here is the difference between logical and physical address spaces. From a physical point of view, go ahead and use your content management system, or even just the file system, to organize your content in a hierarchy. Thus you would still have your /products/productname style organization. However, do not expose that to the user. Show them a simple logical system, a flat overlay, with the idea being simply http://company.com/Keyword-or-unique-ID. Think of the slash as a search query trigger against our special site content keywords.
So on the site I might have the keyword ‘Thomas’, which would have synonyms like Thomas+Powell, Thomas+A+Powell, Tom, Tom+Powell and, if you were my aunt Nancy, Tommy; but Aunt Nancy fortunately is not part of our target audience, so we won’t map that one. The physical location this keyword translates into would be /about/staff/thomas.html or something similar, likely without a file extension. So if the user typed in http://pint.com/thomas or one of the variations, they would receive the content of interest. We would have a server filter that maps this keyword entry to the appropriate URL. This isn’t anything that new; many Wiki systems do this, but they put something in the way to do the translation, so their links look like http://pint.com/keyword/thomas. That is a fine first step towards flat URLs if you want to take it, but once you understand the idea, technically speaking it isn’t much harder to remove the “/keyword/” part of the URL.
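To make the server filter idea concrete, here is a minimal sketch in Python of the keyword lookup. The table names, the normalize and resolve helpers, and the choice to treat "+" as a space are all assumptions made for illustration; a real filter would live inside the Web server or content management system and would likely draw its tables from a database.

```python
from typing import Optional
from urllib.parse import unquote_plus, urlsplit

# Hypothetical keyword table: canonical keyword -> physical path.
PHYSICAL_PATHS = {
    "thomas": "/about/staff/thomas.html",
}

# Hypothetical synonym table: alternate form -> canonical keyword.
# "tommy" is deliberately left unmapped, as in the text.
SYNONYMS = {
    "thomas powell": "thomas",
    "thomas a powell": "thomas",
    "tom": "thomas",
    "tom powell": "thomas",
}


def normalize(segment: str) -> str:
    """Decode a path segment, lower-case it, and collapse '+' and spaces."""
    return " ".join(unquote_plus(segment).lower().split())


def resolve(url: str) -> Optional[str]:
    """Map a flat URL to a physical path, or None if nothing matches."""
    path = urlsplit(url).path.strip("/")
    if "/" in path:
        return None  # more than one segment, so not a flat URL
    keyword = normalize(path)
    keyword = SYNONYMS.get(keyword, keyword)
    return PHYSICAL_PATHS.get(keyword)


if __name__ == "__main__":
    for candidate in ("http://pint.com/thomas", "http://pint.com/Tom+Powell"):
        print(candidate, "->", resolve(candidate))
```

Keeping synonyms in their own table means a new spelling can be added without touching the physical paths, and an unknown keyword simply falls through to whatever the server does for a missing page.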
Now you might say, “Wait a minute, how many people would really use this kind of direct URL mapping?” Well, likely few, but you are missing the point a bit. This form wouldn’t just be used in direct type-in URLs; I would use it in my XHTML markup itself, so a link to my page wouldn’t be <a href="http://pint.com/about/staff/thomas.html">, it would be <a href="http://pint.com/thomas">. Bloggers should be familiar with this: it is the Perm-URI concept. However, what is suggested is that you use that scheme in all of your internal links as well. By doing this you have provided a degree of abstraction that is quite valuable, since you are now hiding internal content organization. You can now move your files around freely, as long as you maintain that mapping from logical to physical.
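The payoff of that abstraction can be shown in a few lines of Python. This is only a sketch: site_map, link_to, move and the new path /company/people/thomas.html are hypothetical names invented for the example, not part of any particular platform.

```python
# Hypothetical logical-to-physical table; only this table knows where files live.
site_map = {"thomas": "/about/staff/thomas.html"}


def link_to(keyword: str, host: str = "pint.com") -> str:
    """Return the flat URL to put in an href; it never exposes the physical path."""
    if keyword not in site_map:
        raise KeyError(f"no such keyword: {keyword}")
    return f"http://{host}/{keyword}"


def move(keyword: str, new_path: str) -> None:
    """Relocate content by updating the table, not the links that point at it."""
    site_map[keyword] = new_path


if __name__ == "__main__":
    before = link_to("thomas")
    move("thomas", "/company/people/thomas.html")  # hypothetical new location
    print(before == link_to("thomas"))
```

Every page that links to the keyword keeps working after the move, because the only thing that changed is one entry in the table.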
Let’s summarize some of the pros and cons of flat URLs that come to mind.
Pros:
- Short and easy to type, both for the end user and for the site maintainer doing markup
- Better for marketing purposes: easy to type and remember
- Hides physical organization from the end user
- Allows for changing the file system structure
- Has the small side effect of making it difficult for intruders to map the information space
- Promotes keyword-focused thinking and taxonomy building, or at least folksonomy building
Cons:
- Complexity of implementing the system that drives the keywords
- Effort of planning the keywords
- Collision of keywords with directory names or other keywords
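The collision problem in the last item is at least easy to detect mechanically. The following Python sketch assumes the keywords and synonyms are available as (name, physical path) pairs and that the set of real top-level directory names is known; find_collisions and the sample data are hypothetical and only illustrate the check.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_collisions(
    entries: Iterable[Tuple[str, str]], reserved_dirs: Set[str]
) -> Dict[str, List[str]]:
    """Report every name claimed by more than one target.

    entries holds (keyword_or_synonym, physical_path) pairs;
    reserved_dirs holds the names of real top-level directories.
    """
    targets = defaultdict(set)
    for name, path in entries:
        targets[name.lower()].add(path)

    collisions = {}
    for name, paths in targets.items():
        claims = set(paths)
        if name in reserved_dirs:
            claims.add(f"/{name}/ (directory)")
        if len(claims) > 1:
            collisions[name] = sorted(claims)
    return collisions


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        ("thomas", "/about/staff/thomas.html"),
        ("tom", "/about/staff/thomas.html"),
        ("tom", "/clients/tom.html"),
        ("products", "/products/index.html"),
    ]
    for name, claims in find_collisions(sample, {"about", "products"}).items():
        print(name, claims)
```

Running a check like this whenever the keyword list changes turns collisions into a planning task rather than a surprise in production.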
There is no doubt that this will take some more effort to set up, but it is also clear that hierarchies fail at very large sizes, and a keyword overlay, which is really what search does, is the only route to sanity. Historically speaking, too, there is much reason to believe this approach to organization is the correct one. DNS, for example, works this way: cdn.pint.com converts to some underlying IP address that is more implementation related. Why not consider using abstractions that are friendlier and more portable than the location of content? Once you do this you can see all sorts of possibilities for content-based load balancing and a whole host of other (at this point) seemingly bizarre ideas.
It is my personal take that URLs as created today are the same as Pennsylvania 6-5000 – likely to be misunderstood by our children or even ourselves as an archaic information access format. Though I guess we may have to wait until after someone creates a catchy ditty that is about a Web address. Jenny Jenny how can I link to you.. w-w-w-dot-jenny-dot-com, er…maybe not. I guess little rhymes with .COM or www so we might have to wait a long while.
<plug alert>Do you want this now? More capable content management systems and things like PINT’s Web platform are architected to do just what this article describes. So talk to whoever runs your site, your vendor, or us to flatten your URL world.</plug alert>