Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have some extracted text that I want to clean up with a regex.

I have learned basic regex but am not sure how to build this one:

str = '''
this is 
a line that has been cut.
This is a line that should start on a new line
'''

should be converted to this:

str = '''
this is a line that has been cut.
This is a line that should start on a new line
'''

This r'\w\n\w' seems to catch it, but I'm not sure how to replace the newline with a space without touching the end and beginning of the surrounding words.


1 Answer

You can use this negative lookbehind regex with re.sub:

>>> import re
>>> str = '''
... this is 
... a line that has been cut.
... This is a line that should start on a new line
... '''
>>> print(re.sub(r'(?<!\.)\n', '', str))
this is a line that has been cut.
This is a line that should start on a new line
>>>


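(?<!\.)\n matches every line break that is not preceded by a dot. Because the cut line already ends with a space ("this is "), replacing the match with an empty string is enough to join the two halves.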
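
For Python 3, where print is a function, the same substitution might be written as the short script below. This is only a sketch: the names text and joined are made up for illustration, and it assumes, as in the question, that the cut line keeps its trailing space.

```python
import re

# Sample text from the question; note the trailing space after "this is".
text = (
    "\n"
    "this is \n"
    "a line that has been cut.\n"
    "This is a line that should start on a new line\n"
)

# Remove every newline that is NOT immediately preceded by a literal dot.
# The leading and trailing newlines are removed too, since no dot precedes them.
joined = re.sub(r'(?<!\.)\n', '', text)

print(joined)
```

The newline after "cut." survives because the lookbehind sees the dot, so the second sentence still starts on its own line.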

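If you don't want the match to depend on the presence of a dot, use a positive lookbehind that requires a word character followed by a whitespace character before the line break: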

re.sub(r'(?<=\w\s)\n', '', str)
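
A quick check of what this second pattern does, followed by a hedged variant for text where the cut line has no trailing space. The variant's pattern and the names text, kept_space, no_space and rejoined are assumptions for illustration, not part of the answer above; it swaps the break for a single space only when a word character sits on both sides of it.

```python
import re

# Sample text from the question; the cut line ends with a space.
text = (
    "\n"
    "this is \n"
    "a line that has been cut.\n"
    "This is a line that should start on a new line\n"
)

# Pattern from the answer: drop a newline that follows a word character
# plus one whitespace character. Other newlines are left alone.
kept_space = re.sub(r'(?<=\w\s)\n', '', text)

# Hypothetical input where the extraction also stripped the trailing space.
no_space = (
    "this is\n"
    "a line that has been cut.\n"
    "This is a line that should start on a new line\n"
)

# Assumed variant: replace optional spaces/tabs plus the newline with one
# space, but only between two word characters, so "cut.\nThis" is untouched.
rejoined = re.sub(r'(?<=\w)[ \t]*\n(?=\w)', ' ', no_space)

print(repr(kept_space))
print(repr(rejoined))
```

The variant relies on the same heuristic the question hints at: a line that ends in a word character and is followed by one was probably cut mid-sentence. Text whose lines legitimately end without punctuation (headings, list items) would be joined as well.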
