I have a string containing a sentence written in Chinese.
It contains Chinese characters along with filler such as spaces, commas, and exclamation marks, all encoded in UTF-8.
With a Latin-1 string, I could use preg_replace
with a character class such as [a-zA-Z]
to clean it and strip the filler.
How can I keep only the Chinese characters in the string while removing all the filler?
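For reference, here is a minimal sketch of one possible approach, assuming PHP's PCRE build has Unicode property support (the default in stock builds). It removes everything that is not in the Han script by using the \p{Han} property together with the u modifier, which makes the pattern and the subject be treated as UTF-8. The function name keep_han_only is hypothetical and is used only for illustration.

```php
<?php
// Hypothetical helper (not a built-in): keeps only Han-script characters.
function keep_han_only(string $text): string
{
    // [^\p{Han}]+ matches any run of characters outside the Han script.
    // The u modifier tells PCRE to read pattern and subject as UTF-8.
    $result = preg_replace('/[^\p{Han}]+/u', '', $text);

    // preg_replace returns null on failure, for example on malformed UTF-8.
    if ($result === null) {
        throw new InvalidArgumentException('preg_replace failed; is the input valid UTF-8?');
    }

    return $result;
}

echo keep_han_only('你好, 世界! Hello 123'), "\n"; // prints 你好世界
```

Note that \p{Han} covers Han ideographs only. Fullwidth punctuation such as "，" and "。" belongs to the Common script, so it is stripped along with ASCII filler. If the text may also contain Bopomofo or other scripts worth keeping, the character class would need to be extended accordingly.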