Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

In answering another question I realised that my JavaScript/DOM knowledge is a bit out of date: I am still using escape/unescape to encode the contents of URL components, whereas it appears I should now be using encodeURIComponent/decodeURIComponent instead.

What I want to know is: what is wrong with escape/unescape? There are some vague suggestions that there is some sort of problem around Unicode characters, but I can't find any definite explanation.

My web experience is fairly biased: almost all of it has been writing big intranet apps tied to Internet Explorer. That has involved a lot of use of escape/unescape, and the apps involved have fully supported Unicode for many years now.

So what are the Unicode problems that escape/unescape are supposed to have? Does anyone have any test cases to demonstrate the problems?


1 Answer

What I want to know is what is wrong with escape/unescape?

They're not “wrong” as such; they just use their own special string format, which looks a bit like URI parameter encoding but actually isn't. In particular:

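  • ‘+’ is left unencoded and means a literal plus, not a space
  • there is a special “%uNNNN” format for encoding UTF-16 code units above 0xFF, instead of encoding UTF-8 bytes (code units from 0x80 to 0xFF become a single “%XX”, which is not valid UTF-8 either)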

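So if you use escape() to create URI parameter values, you will get the wrong results for strings containing a plus (which the receiving end will typically decode as a space) or any non-ASCII characters (which the receiving end will not be able to decode as UTF-8).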
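Since the question asked for test cases, here is a small sketch of the differences described above. It assumes an environment where escape() and URLSearchParams are available as globals (current browsers and Node.js); URLSearchParams is only standing in for whatever form-style decoder sits on the receiving end, and the helper name canDecode is made up for the example.

```javascript
// Inputs on which the two encoders disagree: a+b, é, €, 😀
const inputs = ["a+b", "\u00e9", "\u20ac", "\u{1F600}"];

const rows = inputs.map((s) => ({
  input: s,
  viaEscape: escape(s),
  viaEncodeURIComponent: encodeURIComponent(s),
}));

for (const row of rows) {
  console.log(row.viaEscape, "vs", row.viaEncodeURIComponent);
}
// a+b            vs  a%2Bb
// %E9            vs  %C3%A9
// %u20AC         vs  %E2%82%AC
// %uD83D%uDE00   vs  %F0%9F%98%80

// A form-style decoder reads the unencoded '+' as a space.
const plusViaEscape = new URLSearchParams("q=" + escape("a+b")).get("q");
const plusViaUri = new URLSearchParams("q=" + encodeURIComponent("a+b")).get("q");
console.log(JSON.stringify(plusViaEscape), JSON.stringify(plusViaUri)); // "a b" "a+b"

// The standard decoder rejects escape()'s output for non-ASCII input.
function canDecode(encoded) {
  try {
    decodeURIComponent(encoded);
    return true;
  } catch (e) {
    return false; // URIError: malformed URI sequence
  }
}
console.log(canDecode(escape("\u00e9")), canDecode(escape("\u20ac"))); // false false
```

The plain ASCII case (letters, digits, spaces) is where the two functions agree, which is probably why the problem goes unnoticed in many apps.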

escape() could be used as an internal JavaScript-only encoding scheme, for example to escape cookie values. However, now that all browsers support encodeURIComponent (which wasn't originally the case), there's no reason to use escape in preference to that.
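As a rough illustration of the cookie case, the sketch below encodes and decodes cookie values with encodeURIComponent/decodeURIComponent instead of escape/unescape. The helper names serializeCookie and readCookie are hypothetical, and the functions work on a plain "name=value; name=value" string rather than on document.cookie, so that the example can run outside a browser.

```javascript
// Hypothetical helper: build one "name=value" pair with both parts encoded.
function serializeCookie(name, value) {
  return encodeURIComponent(name) + "=" + encodeURIComponent(value);
}

// Hypothetical helper: find a cookie by name in a "a=1; b=2" style string.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split("; ")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // skip fragments without a value
    if (decodeURIComponent(pair.slice(0, eq)) === name) {
      return decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return undefined;
}

// Values containing cookie delimiters and non-ASCII characters survive the round trip.
const cookieString = [
  serializeCookie("theme", "dark; mode=1"),
  serializeCookie("price", "\u20ac5"), // €5
].join("; ");

console.log(cookieString); // theme=dark%3B%20mode%3D1; price=%E2%82%AC5
console.log(readCookie(cookieString, "theme")); // dark; mode=1
console.log(readCookie(cookieString, "price")); // €5
```

In a browser the same string would come from document.cookie; that wiring is left out here.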

There is only one modern use for escape/unescape that I know of, and that's as a quick way to implement a UTF-8 encoder/decoder, by leveraging the UTF-8 processing in URIComponent handling:

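// UTF-16 string -> "binary string" holding one character (0-255) per UTF-8 byte
utf8bytes = unescape(encodeURIComponent(unicodecharacters));
// UTF-8 "binary string" -> UTF-16 string (throws URIError on invalid UTF-8)
unicodecharacters = decodeURIComponent(escape(utf8bytes));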
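To make the trick concrete, here is a minimal sketch that wraps the two one-liners in functions and checks the result for a short string. The function names are invented for the example, and the cross-check against TextEncoder assumes an environment that provides it as a global (current browsers and Node.js do).

```javascript
// encodeURIComponent emits %XX per UTF-8 byte; unescape turns each %XX into one char.
function toUtf8BinaryString(text) {
  return unescape(encodeURIComponent(text));
}

// escape emits %XX per byte-valued char; decodeURIComponent reads them as UTF-8.
function fromUtf8BinaryString(bytes) {
  return decodeURIComponent(escape(bytes));
}

const text = "\u20ac5"; // "€5": two UTF-16 code units
const bytes = toUtf8BinaryString(text);

// Each character of `bytes` carries one UTF-8 byte value.
const byteValues = Array.from(bytes, (c) => c.charCodeAt(0));
console.log(byteValues); // [ 226, 130, 172, 53 ]  (E2 82 AC 35)

// Cross-check against the platform's own UTF-8 encoder.
const expected = Array.from(new TextEncoder().encode(text));
console.log(expected.join() === byteValues.join()); // true

// And the decoder restores the original string.
console.log(fromUtf8BinaryString(bytes) === text); // true
```

Note that the result is a string of byte-valued characters, not a typed array; where a Uint8Array is wanted, TextEncoder/TextDecoder do the same job directly.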
