Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

A Java char is 2 bytes (max size of 65,536) but there are 95,221 Unicode characters. Does this mean that you can't handle certain Unicode characters in a Java application?

Does this boil down to what character encoding you are using?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
232 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can handle them all if you're careful enough.

Java's char is a UTF-16 code unit. For characters with code-point > 0xFFFF it will be encoded with 2 chars (a surrogate pair).

See http://www.oracle.com/us/technologies/java/supplementary-142654.html for how to handle those characters in Java.

(BTW, in Unicode 5.2 there are 107,154 assigned characters out of 1,114,112 slots.)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...