Title: What are those strange URL prefixes (e.g. %7E) in my log files?
Question:
I'm looking at my personal log file information and some of the entries
contain the string %7E right before my UW NetID. What's with this?
Answer:
The %7E represents a tilde character (~) that has been percent-encoded
so that it is safe to place in a Web address. For example, you might see
this:
/%7Ewebdemo/index.html
where you might expect to see this:
/~webdemo/index.html
Note: the tilde used to be required, but it is no longer used on the
central UW Web servers for students, staff, faculty, courses, and
departments.
The strange %7E is a result of the official Internet specification for Web
addresses or URLs. Perusing the specification reveals a set of characters
deemed "unsafe" within URLs. The tilde (~) character is one of these
characters. Here is the rationale for this, straight from the source
[Internet RFC 1738]:
    Other characters are unsafe because gateways and other
    transport agents are known to sometimes modify such
    characters. These characters are "{", "}", "|", "\",
    "^", "~", "[", "]", and "`".
Under strict adherence to the specification, unsafe characters are
encoded in URLs. The encoding scheme replaces each unsafe character with a
percent sign followed by the character's two-digit hexadecimal value. For
example, in the ASCII character set, the tilde character has the
hexadecimal value 7E. So, the string "%7E" denotes an encoded tilde
character, one that is safe for use in a URL.
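
As a rough illustration of the encoding described above, here is a short
Python sketch. The helper name percent_encode_char is made up for this
example; unquote comes from Python's standard urllib.parse module. The
encoding is done by hand because Python's own quote function leaves the
tilde unencoded.

```python
from urllib.parse import unquote

def percent_encode_char(ch: str) -> str:
    """Return the %XX form of a single ASCII character (hypothetical helper)."""
    # ord() gives the character's numeric code; format it as two uppercase hex digits.
    return "%{:02X}".format(ord(ch))

# Encode a tilde by hand, then decode a path like the one from the log example.
encoded_tilde = percent_encode_char("~")
decoded_path = unquote("/%7Ewebdemo/index.html")

print(encoded_tilde)
print(decoded_path)
```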
How often do people and browsers actually follow a strict interpretation
of this rule with regard to the tilde character? Not very often, it seems.
You should expect to see a few of these entries in your log file
information, but most of the time tildes are left unencoded. It's
impractical to type the encoded version, and, besides, the unencoded
version seems to work just fine.
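
If you tally requests from your log file yourself, you may want the encoded
and unencoded forms to count as the same page. The following Python sketch
shows one possible way to do that. The function name count_paths and the
sample paths are assumptions made for illustration; they are not taken from
any actual log format.

```python
from collections import Counter
from urllib.parse import unquote

def count_paths(paths):
    """Count requests per path, treating %7E (or %7e) and ~ as the same."""
    # unquote() decodes every percent escape, so both spellings collapse to one key.
    return Counter(unquote(p) for p in paths)

# Hypothetical request paths, as they might appear in a log file.
sample = [
    "/%7Ewebdemo/index.html",
    "/~webdemo/index.html",
    "/%7ewebdemo/index.html",
]
counts = count_paths(sample)
print(counts)
```

Note that unquote decodes all percent escapes, not just %7E, so a path
containing something like %2F would also be changed.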
For more thorough information on this topic, refer to the URL
specification itself:
http://www.w3.org/Addressing/
Don't be surprised if you find yourself chuckling over the germane
qualifier: "This specification does not necessarily describe WWW
practice." People using the tilde notation understand this well.