Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
I have many log files with the format like:

2012-09-12 23:12:00 other logs here

and I need to extract the time string and compute the time delta between two log records. I did that with this:

import re

for line in log:
    l = line.strip().split()
    # join the date and time fields, then split on "-", " " and ":" into six integers
    timelist = [int(n) for n in re.split("[- :]", l[0] + ' ' + l[1])]
    # now timelist looks like [2012, 9, 12, 23, 12, 0]

Then, when I have two records:

d1 = datetime.datetime(timelist1[0], timelist1[1], timelist1[2],
                       timelist1[3], timelist1[4], timelist1[5])
d2 = datetime.datetime(timelist2[0], timelist2[1], timelist2[2],
                       timelist2[3], timelist2[4], timelist2[5])
delta = (d2 - d1).seconds  # careful: .seconds drops any whole days in the difference

The problem is that this runs slowly. Is there any way to improve the performance? Thanks in advance.
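A side note on the snippet above: .seconds on a timedelta returns only the seconds component (0 to 86399), so any whole days in the difference are silently dropped; total_seconds() returns the complete difference. A quick illustration:

```python
import datetime

d1 = datetime.datetime(2012, 9, 12, 23, 12, 0)
d2 = datetime.datetime(2012, 9, 14, 23, 12, 30)  # two days and 30 seconds later

delta = d2 - d1
print(delta.seconds)          # 30 -- only the seconds component; the days are dropped
print(delta.total_seconds())  # 172830.0 -- the full difference in seconds
```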

1 Answer

You could get rid of the regex and use map:

import datetime

# look up the constructor once instead of on every iteration
date_time = datetime.datetime

for line in log:
    # split on single spaces, at most twice, and keep the date and time fields
    date, time = line.strip().split(' ', 2)[:2]

    # "2012-09-12" -> [2012, 9, 12] and "23:12:00" -> [23, 12, 0]
    timelist = map(int, date.split('-') + time.split(':'))
    d = date_time(*timelist)
  • I think .split(' ', 2) will be faster than a plain .split() because it splits at most twice and only on spaces, not on any whitespace.
  • map(int, l) was faster than [int(x) for x in l] the last time I checked. (In Python 3, map returns a lazy iterator; the * unpacking in date_time(*timelist) consumes it.)
  • If you can, get rid of the .strip() call.
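Since the timestamp always occupies a fixed-width prefix of each line ("YYYY-MM-DD HH:MM:SS", 19 characters), a further option not covered in the answer is to slice the digit positions directly and skip splitting altogether. A sketch, with parse_ts as a made-up helper name:

```python
import datetime

def parse_ts(line):
    # the timestamp is the first 19 characters: "YYYY-MM-DD HH:MM:SS"
    return datetime.datetime(
        int(line[0:4]),    # year
        int(line[5:7]),    # month
        int(line[8:10]),   # day
        int(line[11:13]),  # hour
        int(line[14:16]),  # minute
        int(line[17:19]),  # second
    )

d1 = parse_ts("2012-09-12 23:12:00 other logs here")
d2 = parse_ts("2012-09-12 23:13:30 more logs here")
delta = (d2 - d1).total_seconds()  # 90.0
```

Whether this beats split-based parsing is worth measuring on your data, but it avoids building the intermediate lists entirely.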
