Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

This piece of code is reading a large file line by line, process each line then end the process whenever there is no new entry:

file = open(logFile.txt', 'r')
count = 0

     while 1:
         where = file.tell()
         line = file.readline()
         if not line:
             count = count + 1
             if count >= 10:
               break
             time.sleep(1)
             file.seek(where)
         else:
            #process line 

In my experence, reading line by line takes very long time, so I tried to improve this code to read chunk of lines each time:

from itertools import islice
N = 100000
with open('logFile.txt', 'r') as file:
    while True:
       where = file.tell()
       next_n_lines = list(islice(file, N)).__iter__()
       if not next_n_lines:
          count = count + 1
          if count >= 10:
             break
          time.sleep(1)
          file.seek(where)
       for line in next_n_lines:
        # process next_n_lines

This works fine except for the ending part, it doen't end the process (break the while loop) even if there is no more lines in file. Any suggestions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
163 views
Welcome To Ask or Share your Answers For Others

1 Answer

The original code already reads large chunks of the file at a time, it just returns one line of the data at a time. You've just added a redundant generator that takes 10 lines at a time, using the read line functionality of the file object.

With only a few exceptions, the best way to iterate over the lines in a file is as follows.

with open('filename.txt') as f:
    for line in f:
        ...

If you need to iterate over preset numbers of lines at time then try the following:

from itertools import islice, chain

def to_chunks(iterable, chunksize):
    it = iter(iterable)
    while True:
        first = next(it)
        # Above raises StopIteration if no items left, causing generator
        # to exit gracefully.
        rest = islice(it, chunksize-1)
        yield chain((first,), rest)


with open('filename.txt') as f:
    for chunk in to_chunks(f, 10):
        for line in chunk:
            ...

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...