Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have content that contains HTML tags. I am trying to identify <ins></ins> and <del></del> elements under the conditions shown in this image:

http://i.stack.imgur.com/8iNWl.png

The regex is https://regex101.com/r/cE4mE3/30

It fails in only one case: when there is an HTML tag or a special character inside <ins></ins>, the match is not identified correctly. In the regex above there is an <ins></ins> nested inside another <ins></ins> (ending in </ins></ins>), so the match breaks before the opening <ins> tag. The match must stop only at a full stop, comma, or space between the <ins></ins> elements. If there is an HTML tag, or another <ins></ins> element, inside an <ins></ins>, the match must continue.

In the regex example above, the groups that should be matched are:

 1. <ins class="ins">ff</ins><del class="del">C</del>om<del class="del"> </del><ins class="ins"><ins class="ins">g</ins></ins><del class="del"> g</del>gp<del class="del">a</del>n<del class="del">y</del>

and

 2. test<del class="del">test</del><ins class="ins">tik</ins><del class="del">peop</del>man<del class="del"> </del></i><del class="del"> g</del>gp<del class="del">a</del>n<del class="del">y</del>

But because there are HTML tags in between, the match stops at the HTML tag in both group 1 and group 2.
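
Python's built-in `re` module has no recursion, so arbitrarily nested <ins> elements are hard to capture with a single pattern. One possible alternative, shown here only as a sketch and not as the regex from the link above, is to tokenize the input into tags and text, keep a nesting depth for <ins>/<del>, and split at a space, full stop, or comma only when the depth is zero. The function name `changed_runs` and the reading of the rule (delimiters count only outside <ins>/<del> elements and outside tags) are assumptions made for illustration.

```python
import re

# A token is either a whole tag, a run of text, or a stray "<".
TOKEN = re.compile(r"<[^>]+>|[^<]+|<")
# Assumption: a run ends at whitespace, full stop, or comma outside <ins>/<del>.
DELIMS = re.compile(r"[\s.,]+")


def changed_runs(html):
    """Return each run of text and tags that contains at least one <ins> or <del>."""
    runs = []
    current = []        # pieces of the run being built
    has_change = False  # True once the current run has seen <ins> or <del>
    depth = 0           # nesting depth of open <ins>/<del> elements

    def flush():
        nonlocal current, has_change
        if has_change:
            runs.append("".join(current))
        current = []
        has_change = False

    for tok in TOKEN.findall(html):
        if tok.startswith("<"):
            m = re.match(r"<(/?)\s*(\w+)", tok)
            if m and m.group(2).lower() in ("ins", "del"):
                if m.group(1):              # closing tag
                    depth = max(depth - 1, 0)
                else:                       # opening tag
                    depth += 1
                    has_change = True
            current.append(tok)             # any tag keeps the run going
        elif depth > 0:
            current.append(tok)             # text inside <ins>/<del> never splits
        else:
            # Text outside <ins>/<del>: every delimiter ends the current run.
            for i, part in enumerate(DELIMS.split(tok)):
                if i > 0:
                    flush()
                current.append(part)
    flush()
    return runs
```

Calling `changed_runs` on a sentence that contains group 1 should return that group as a single string, because the nested <ins> only raises the depth counter and the spaces inside <del> are ignored as delimiters. Malformed markup (for example a ">" inside an attribute value) is not handled; a real HTML parser would be more robust there.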


1 Answer

Waiting for answers
