Friday, January 13, 2012

 

JavaScript's encodeURIComponent and character sets

The ECMAScript 5 standard (section 15.1.3.4) and the Mozilla Developer Network page on encodeURIComponent both say that the string is encoded as UTF-8, regardless of the containing page's character set. That seemed entirely too convenient to believe, so I wrote a page that would let me exercise it. I tested it in modern browsers as well as Firefox 2 and Internet Explorer 5.5 and 6. Amazingly, they all handled it correctly, even for pages in obscure character sets like GBK.
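For anyone who wants to repeat the check, here is a minimal sketch of the kind of test involved. It is not the original test page: the sample characters and the helper name checkUtf8Encoding are made up for illustration. The characters are written as \u escapes so that the encoding of the script file itself cannot affect the result.

```javascript
// Hypothetical sample table (not from the original test page):
// each character maps to the percent-escaped form of its UTF-8 bytes.
var samples = {
  "\u00E9": "%C3%A9",     // e with acute accent, two UTF-8 bytes
  "\u4E2D": "%E4%B8%AD",  // a CJK character, three UTF-8 bytes (in GBK it would be the bytes D6 D0)
  "\u20AC": "%E2%82%AC"   // euro sign, three UTF-8 bytes
};

// Hypothetical helper: returns a list of mismatch descriptions,
// which is empty when every sample is encoded as UTF-8.
function checkUtf8Encoding(table) {
  var failures = [];
  for (var ch in table) {
    if (Object.prototype.hasOwnProperty.call(table, ch)) {
      var actual = encodeURIComponent(ch);
      if (actual !== table[ch]) {
        failures.push("expected " + table[ch] + " but got " + actual);
      }
    }
  }
  return failures;
}

var failures = checkUtf8Encoding(samples);
```

Dropping this into pages served with different charset declarations and checking that failures stays empty is one way to reproduce the experiment.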

Sleep well tonight in the knowledge that your characters are safe in the hands of encodeURIComponent.
