Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a sed command that works fine, except when the text it should match contains a newline. Here is my command:

sed -i 's,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g'

It works perfectly on single-line links, but I just ran across a file that has the <a> tag split across two lines, like so:

<a href="link">Click
        here now</a>

Of course it didn't find this one. So I need to modify the command somehow to allow for line breaks in the search. But I have no clue how to do that, unless I go over the entire file first and remove all the newlines beforehand. The problem there is that I lose all formatting in the file.

1 Answer

You can do this by inserting a loop into your sed script:

sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile
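As a quick check, here is a minimal sketch that feeds the two-line link from the question to that command on standard input instead of a file. It assumes GNU sed, which accepts the one-line `{;:next;...}` form; the `out` variable is only there for illustration.

```shell
# Feed the two-line anchor from the question to sed on stdin.
# Assumes GNU sed (labels terminated by ';' are a GNU convenience).
out=$(printf '%s\n' '<a href="link">Click' '        here now</a>' |
  sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}')

# Print the result; the line break inside the link text is still there.
printf '%s\n' "$out"
# Click
#         here now - link
```

The link text and the URL are swapped as intended, but the newline that was inside the tag is kept.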

As-is, that will leave an embedded newline in the output, and it wasn't clear if you wanted it that way or not. If not, just substitute out the newline:

sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile

And maybe clean up extra spaces:

sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s/\s\{2,\}/ /g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile

Explanation: The /<a href/{...} address lets us skip lines we don't care about. Once we find a line that matches, we check whether it also contains the end marker. If not (/<\/a>/!), we append a newline and the next input line to the pattern space (N) and branch (b) back to :next to check again. Once the end marker is found, we continue on with the substitutions.
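The same script can be written one command per line, which makes the loop easier to follow. This is a sketch rather than a drop-in replacement: the `input` and `out` variables and the extra `<p>intro</p>` line are made up for illustration, and `[[:space:]]` stands in for `\s`, which is a GNU extension.

```shell
# Hypothetical input: one line that should pass through untouched,
# followed by the two-line anchor from the question.
input='<p>intro</p>
<a href="link">Click
        here now</a>'

out=$(printf '%s\n' "$input" | sed -e '
/<a href/ {
    # Loop: keep appending lines until the closing tag is in the pattern space.
    :next
    /<\/a>/! {
        N
        b next
    }
    # Remove the embedded newlines, then squeeze runs of whitespace to one space.
    s/\n//g
    s/[[:space:]]\{2,\}/ /g
    # Rewrite the anchor as "text - link".
    s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g
}')

printf '%s\n' "$out"
# <p>intro</p>
# Click here now - link
```

Note that the newline is deleted, not replaced with a space, so the words stay separated here only because the second line is indented. If continuation lines in your files start at column one, replacing the newline with a space (s/\n/ /g) is probably the safer choice.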

