Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I need to remove all /*...*/ style comments from JSON data. How do I do it with regular expressions so that string values like this

{
    "propName": "Hello \" /* hi */ there."
}

remain unchanged?

1 Answer

First, skip over everything that is inside double quotes, using the backtracking control verbs (*SKIP) and (*FAIL) (or a capture group):

$string = <<<'LOD'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

// Quoted strings are matched first and discarded with (*SKIP)(*FAIL),
// so only comments outside strings are replaced.
$result = preg_replace('~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '', $string);

// The same with a capture:

$result = preg_replace('~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '$1', $string);

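Pattern details:

"(?:[^\\\"]+|\\\.)*+"

This part describes the possible content inside quotes. It is shown as written in the PHP single-quoted string; the regex engine receives [^\\"]+ and \\.

"              # literal quote
(?:            # open a non-capturing group
    [^\\\"]+   # all characters that are not \ or "
  |            # OR
    \\\.       # escaped char (that can be a quote)
)*+            # repeat the group zero or more times (possessive)
"              # literal quote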

You can then make this subpattern fail with (*SKIP)(*FAIL) or (*SKIP)(?!). (*SKIP) forbids backtracking past this point if the pattern fails afterwards: the next match attempt starts at the position where (*SKIP) was reached, not inside the text already matched. (*FAIL) forces the pattern to fail. Quoted parts are therefore skipped, and they can't be in the result since the subpattern is made to fail after them.
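To see what (*SKIP) contributes, here is a small sketch that runs the answer's pattern three ways on the sample string: with (*SKIP)(*FAIL), with the (*SKIP)(?!) spelling, and, for contrast only, with (*FAIL) alone. The variable names are made up for this illustration.

```php
<?php
$string = <<<'LOD'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

// The two sub-patterns from the answer, kept in single-quoted strings.
$quoted  = '"(?:[^\\\"]+|\\\.)*+"';
$comment = '/\*(?:[^*]+|\*+(?!/))*+\*/';

$withSkip    = preg_replace('~' . $quoted . '(*SKIP)(*FAIL)|' . $comment . '~s', '', $string);
$withSkipAlt = preg_replace('~' . $quoted . '(*SKIP)(?!)|' . $comment . '~s', '', $string);
// Contrast case: without (*SKIP) the engine retries from inside the quoted text.
$failOnly    = preg_replace('~' . $quoted . '(*FAIL)|' . $comment . '~s', '', $string);

var_dump($withSkip === $withSkipAlt);                      // bool(true)
var_dump(strpos($withSkip, "don't remove") !== false);     // bool(true)
var_dump(strpos($failOnly, "don't remove") !== false);     // bool(false)
```

Without (*SKIP), the engine starts new attempts at positions inside the quoted string, eventually reaches the /* inside it, and removes that comment-like text too.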

Alternatively, put the quoted part in a capturing group and add its reference ($1) to the replacement pattern, so that quoted strings are replaced by themselves.
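As a quick check that the two approaches are interchangeable, the following sketch wraps each pattern in a helper and compares the results on the sample string. The function names stripCommentsSkipFail and stripCommentsCapture are hypothetical, not part of any library.

```php
<?php
// Hypothetical helper: quoted strings are matched, then rejected with (*SKIP)(*FAIL).
function stripCommentsSkipFail(string $json): string
{
    return preg_replace(
        '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s',
        '',
        $json
    );
}

// Hypothetical helper: quoted strings are captured in group 1 and written back.
function stripCommentsCapture(string $json): string
{
    return preg_replace(
        '~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s',
        '$1',
        $json
    );
}

$string = <<<'LOD'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

$a = stripCommentsSkipFail($string);
$b = stripCommentsCapture($string);

var_dump($a === $b);                          // bool(true): both variants agree
var_dump(json_decode($a, true)['propName']);  // the string value, comment-like text included
```

When a comment is matched, group 1 does not participate, so $1 expands to an empty string and the comment disappears.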

/\*(?:[^*]+|\*+(?!/))*+\*/

This part describes the content inside comments.

/\*           # open the comment
(?:           # open a non-capturing group
    [^*]+     # all characters except *
  |           # OR
    \*+(?!/)  # * not followed by / (note that you can't use
              # a possessive quantifier here: \*+ must be able to
              # give back the last * so that the closing */ can match)
)*+           # repeat the group zero or more times
\*/           # close the comment
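The note about the possessive quantifier can be checked with a comment that ends in two stars. This is only a sketch: $possessive is a deliberately broken variant written for comparison, not something the answer proposes.

```php
<?php
// Comment sub-pattern from the answer: \*+ may give a star back.
$greedy = '~/\*(?:[^*]+|\*+(?!/))*+\*/~';
// Comparison variant: \*++ is possessive and keeps every star it matched.
$possessive = '~/\*(?:[^*]+|\*++(?!/))*+\*/~';

$comment = '/* ends with two stars **/';

var_dump(preg_match($greedy, $comment));      // int(1)
var_dump(preg_match($possessive, $comment));  // int(0)
```

With \*++, both stars are consumed, the lookahead sees / and fails, and no star can be released for the closing \*/, so the whole comment fails to match.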

The s modifier is needed here only for the case where a backslash comes before a newline inside quotes: it lets the dot in \\\. match that newline.
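A minimal sketch of that edge case, assuming an input (invented for this illustration) in which a quoted value contains a backslash directly followed by a newline:

```php
<?php
// A quoted value in which a backslash is directly followed by a newline.
$input = "\"line one \\\nline two /* keep */\" /* drop */";

$pattern = '"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/';

$withS    = preg_replace('~' . $pattern . '~s', '', $input);
$withoutS = preg_replace('~' . $pattern . '~', '', $input);

var_dump(strpos($withS, '/* keep */') !== false);     // bool(true)
var_dump(strpos($withoutS, '/* keep */') !== false);  // bool(false)
```

Without the s modifier the dot cannot consume the newline, the quoted string is no longer recognized as a whole, and the comment-like text inside it gets removed.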

