Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I want to extract the lines in a file that follow a particular sequence. For example, the file contains many lines, and I want only the lines that match the sequence:

journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station 

Let's say the above is the input, and I want to extract the following information from the first two lines:

The sequence is: first the word journey, then brackets containing two words, then the word from, then either the word station or city, then any string, then the word to, and then again either the word station or city.

What would be the regular expression for that? Note: the words in brackets may contain special characters, e.g. - or _.

1 Answer

This will return the elements you want:

import re

s = '''journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station
journey (c,d) from station ANYSTRING jammu katra to ANYSTRING city punjab chandigarh
'''

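# Capture the bracket contents, the origin, and the destination as separate groups.
matches_single = re.findall(r'journey \(([^,]+,[^,]+)\) from (\S+ \S+\s{0,1}\S*) to (\S+ \S+\s{0,1}\S*)', s)
for match in matches_single:
    print(match)
# Capture each matching line as a whole.
matches_line = re.findall(r'(journey \([^,]+,[^,]+\) from \S+ \S+\s{0,1}\S* to \S+ \S+\s{0,1}\S*)', s)
for match in matches_line:
    print(match)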
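
As an alternative, here is a minimal sketch that anchors each match to a single line and checks for the words station or city explicitly, as the question describes. The name JOURNEY_RE, the group names, and the fourth sample line are made up for illustration. It also assumes the bracketed words contain only letters, digits, - and _, and that a place name never itself contains " to station " or " to city ".

```python
import re

# Hypothetical pattern: one journey per line, two bracketed words,
# and an explicit station/city keyword before each place name.
JOURNEY_RE = re.compile(
    r"^journey \((?P<first>[\w-]+),(?P<second>[\w-]+)\) "
    r"from (?P<src_kind>station|city) (?P<src>.+?) "
    r"to (?P<dst_kind>station|city) (?P<dst>.+)$",
    re.MULTILINE,  # let ^ and $ match at every line boundary
)

text = """journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station
journey (x-1,y_2) from city new delhi to station agra
"""  # the last line is invented sample data with - and _ in the brackets

# "." never matches a newline here, so a match cannot spill into the next line.
matches = [m.groupdict() for m in JOURNEY_RE.finditer(text)]
for match in matches:
    print(match)
```

The incomplete line (journey (e) from station) is skipped because it has no second bracketed word and no "to" part. Named groups make it easy to pick out just the pieces you need, e.g. match["src"] or match["dst_kind"].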
