Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

In my string I have utf-8 non-breaking space (0xc2a0) and I want to replace it with something else.

When I use

$str=preg_replace('~xc2xa0~', 'X', $str);

it works OK.

But when I use

$str=preg_replace('~x{C2A0}~siu', 'W', $str);

non-breaking space is not found (and replaced).

Why? What is wrong with second regexp?

The format x{C2A0} is correct, also I used u flag.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
423 views
Welcome To Ask or Share your Answers For Others

1 Answer

Actually the documentation about escape sequences in PHP is wrong. When you use xc2xa0 syntax, it searches for UTF-8 character. But with x{c2a0} syntax, it tries to convert the Unicode sequence to UTF-8 encoded character.

A non breaking space is U+00A0 (Unicode) but encoded as C2A0 in UTF-8. So if you try with the pattern ~x{00a0}~siu, it will work as expected.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...