New Technologies

Monday, December 15, 2008

What is the maximum length of a URL?

Microsoft Internet Explorer (Browser)

Microsoft states that the maximum length of a URL in Internet Explorer is 2,083 characters, with no more than 2,048 characters in the path portion of the URL. In my tests, attempts to use URLs longer than this produced a clear error message in Internet Explorer.

Firefox (Browser)

After 65,536 characters, the location bar no longer displays the URL in Windows Firefox 1.5.x. However, longer URLs will work. I stopped testing after 100,000 characters.

Safari (Browser)

At least 80,000 characters will work. I stopped testing after 80,000 characters.

Opera (Browser)

At least 190,000 characters will work. I stopped testing after 190,000 characters. Opera 9 for Windows continued to display a fully editable, copyable and pasteable URL in the location bar even at 190,000 characters.

Apache (Server)

My early attempts to measure the maximum URL length in web browsers bumped into a server URL length limit of approximately 4,000 characters, after which Apache produces a “413 Entity Too Large” error. I used the current, up-to-date Apache build found in Red Hat Enterprise Linux 4. The official Apache documentation mentions only an 8,192-byte limit on an individual field in a request.

Microsoft Internet Information Server (Server)

The default limit is 16,384 characters (yes, Microsoft’s web server accepts longer URLs than Microsoft’s web browser). This is configurable.

Perl HTTP::Daemon (Server)

Up to 8,000 bytes will work. Those constructing web application servers with Perl’s HTTP::Daemon module will encounter a 16,384-byte limit on the combined size of all HTTP request headers. This does not include POST-method form data, file uploads, etc., but it does include the URL. In practice this resulted in a 413 error when a URL was significantly longer than 8,000 characters. This limitation can be easily removed: look for all occurrences of 16*1024 in Daemon.pm and replace them with a larger value. Of course, this does increase your exposure to denial-of-service attacks.

Recommendations

Extremely long URLs are usually a mistake. URLs over 2,000 characters will not work in the most popular web browser. Don’t use them if you intend your site to work for the majority of Internet users.

When you wish to submit a form containing many fields, which would otherwise produce a very long URL, the standard solution is to use the POST method rather than the GET method:

<form action="myscript.php" method="POST">
...
</form>

The form fields are then transmitted as part of the HTTP transaction body, not as part of the URL, and are not subject to the URL length limit.

Short-lived information should not be stored in URLs. As a rule of thumb, if a piece of information isn’t needed to regenerate the same page as a result of returning to a favorite or bookmark, then it doesn’t belong in the URL.
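The GET-versus-POST decision can also be made in code. The following is a minimal sketch, not a definitive implementation: the function name `choose_method` and the constant `MAX_SAFE_URL_LENGTH` are hypothetical, and the 2,000-character ceiling is simply the rule of thumb from this article, not a protocol constant.

```python
from urllib.parse import urlencode

# Rule-of-thumb ceiling taken from the recommendation above (an assumption,
# not a value defined by any standard).
MAX_SAFE_URL_LENGTH = 2000


def choose_method(base_url, fields, limit=MAX_SAFE_URL_LENGTH):
    """Return ("GET", full_url) if the encoded URL fits within `limit`,
    otherwise ("POST", base_url) so the fields travel in the request body."""
    query = urlencode(fields)
    full_url = base_url + "?" + query if query else base_url
    if len(full_url) <= limit:
        return "GET", full_url
    return "POST", base_url


if __name__ == "__main__":
    print(choose_method("http://example.com/search", {"q": "maps"}))
    print(choose_method("http://example.com/search", {"q": "x" * 3000})[0])
```

The length is measured after URL encoding, since that is the string the browser actually sends.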

The Bookmark Problem

In very rare cases, it may be useful to keep a large amount of “state” information in a URL. For instance, users of a map-navigating website might wish to add the currently displayed map to their “bookmarks” or “favorites” list and return later. If you must do this and your URLs are approaching 2,000 characters in length, keep your representation of the information as compact as you can, squeezing out as much “air” as possible. If your field names take up too much space, use a fixed field order instead. Squeeze out any field that doesn’t really need to be bookmarked. And avoid large decimal numbers - use only as much accuracy as you must, and consider a base-64 representation using letters and digits (I didn’t say this was easy).

In extreme cases, consider using the gzip algorithm to compress your pretty but excessively long URL. Then reencode that binary data in base64 using only characters that are legal in URLs. This can yield a 3-4x space gain, at the cost of some CPU time when you unzip the URL again on the next visit. Again, I never said it was easy!
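The compress-then-base64 idea can be sketched in a few lines. This is a sketch under stated assumptions: the helper names `pack_state` and `unpack_state` are hypothetical, the state is assumed to be JSON-serializable, and Python's `zlib` module is used in place of a full gzip container (it applies the same DEFLATE compression with a smaller header). Actual savings depend on how repetitive the data is; very short inputs can even grow.

```python
import base64
import json
import zlib


def pack_state(state):
    """Serialize `state` compactly, compress it, and return a URL-safe token."""
    # Compact JSON: no spaces after separators, stable key order.
    raw = json.dumps(state, separators=(",", ":"), sort_keys=True).encode("utf-8")
    compressed = zlib.compress(raw, 9)
    # The URL-safe alphabet uses "-" and "_" instead of "+" and "/".
    token = base64.urlsafe_b64encode(compressed).decode("ascii")
    return token.rstrip("=")  # padding can be restored on decode


def unpack_state(token):
    """Reverse pack_state: restore padding, decode, decompress, parse."""
    padded = token + "=" * (-len(token) % 4)
    raw = zlib.decompress(base64.urlsafe_b64decode(padded))
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    state = {"lat": 47.6062, "lon": -122.3321, "zoom": 12,
             "layers": ["roads", "transit"]}
    token = pack_state(state)
    print(token)
    print(unpack_state(token) == state)
```

The token would then be placed in a single query parameter, and the server would call the unpacking step on the next visit.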

An alternative is to store the state information in a file or a database. Then you can store only the identifier needed to look up that information again in the URL. The disadvantage here is that you will have many state files or database records, some of which might be linked to on websites run by others. One solution to this problem is to delete the state files or database records for the URLs that have not been revisited after a certain amount of time.
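The identifier-plus-expiry approach might look something like the sketch below. Everything here is illustrative: the class name `StateStore`, its methods, and the in-memory dictionary are assumptions standing in for whatever file or database layer a real site would use.

```python
import secrets
import time


class StateStore:
    """Keeps state server-side; only a short token goes into the URL."""

    def __init__(self, max_idle_seconds):
        self.max_idle_seconds = max_idle_seconds
        self._records = {}  # token -> (state, last_access_time)

    def save(self, state, now=None):
        """Store `state` and return the short identifier to put in the URL."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(8)
        self._records[token] = (state, now)
        return token

    def load(self, token, now=None):
        """Return the stored state, or None if the token is unknown or purged."""
        now = time.time() if now is None else now
        record = self._records.get(token)
        if record is None:
            return None
        state, _ = record
        self._records[token] = (state, now)  # a revisit resets the idle clock
        return state

    def purge(self, now=None):
        """Delete records not revisited within max_idle_seconds; return the count."""
        now = time.time() if now is None else now
        stale = [t for t, (_, seen) in self._records.items()
                 if now - seen > self.max_idle_seconds]
        for t in stale:
            del self._records[t]
        return len(stale)
```

The `now` parameter exists only so the expiry logic can be exercised without waiting; in normal use it is left out and the current time is used.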

“What happens if the URL is too long for the server?”

What exactly happens if a browser that supports very long URLs (such as Firefox) submits a long URL to a web server that does not support very long URLs (such as a standard build of Apache)?

The answer: nothing dramatic. Apache responds with a “413 Entity Too Large” error, and the request fails.

This response is preferable to cutting the URL short, because the results of cutting the URL short are unpredictable. What would that mean to the web application? It varies. So it’s better for the request to fail.
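To see why silent truncation is unpredictable, consider this small illustration. It is a sketch only: the query string, the field names, and the helper `parse_truncated` are made up for the example, and the cut-off point is arbitrary.

```python
from urllib.parse import parse_qs


def parse_truncated(query, limit):
    """Simulate a server that silently cuts the query string at `limit` characters."""
    return parse_qs(query[:limit])


query = "account=12345&action=view&limit=10"

intact = parse_qs(query)
cut = parse_truncated(query, 11)  # the cut lands in the middle of the account number

if __name__ == "__main__":
    print(intact)
    print(cut)
```

The truncated query still parses cleanly, but it now names a different account and has lost two fields, and the application has no way to know anything went wrong. A failed request is at least an honest one.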

In the bad old days, some web servers and web browsers failed to truncate or ignore long URLs, resulting in dangerous “buffer overflow” situations. These could be used to insert executable code where it didn’t belong… resulting in a security hole that could be exploited to do bad things.

These days, the major browsers and servers are secure against such obvious attacks - although more subtle security flaws are often discovered (and, usually, promptly fixed).

While it’s true that modern servers are themselves well-secured against long URLs, there are still badly written CGI programs out there. Those who write CGI programs in C and other low-level languages must take responsibility for paying close attention to potential buffer overflows. The CGIC library can help with this.

In any case, if you’re a web developer and you’re still asking this question, then you probably haven’t paid attention to my advice about how to avoid the problem completely.

 
