Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have an input file as such:

This is a text block start
This is the end

And this is another
with more than one line
and another line.

The desired task is to read the files by section delimited by some special line, in this case it's an empty line, e.g. [out]:

[['This is a text block start', 'This is the end'],
['And this is another','with more than one line', 'and another line.']]

I have been getting the desired output by doing so:

def per_section(it):
    """ Read a file and yield sections using empty line as delimiter """
    section = []
    for line in it:
        if line.strip('
'):
            section.append(line)
        else:
            yield ''.join(section)
            section = []
    # yield any remaining lines as a section too
    if section:
        yield ''.join(section)

But if the special line is a line that starts with # e.g.:

# Some comments, maybe the title of the following section
This is a text block start
This is the end
# Some other comments and also the title
And this is another
with more than one line
and another line.

I have to do this:

def per_section(it):
    """ Read a file and yield sections using empty line as delimiter """
    section = []
    for line in it:
        if line[0] != "#":
            section.append(line)
        else:
            yield ''.join(section)
            section = []
    # yield any remaining lines as a section too
    if section:
        yield ''.join(section)

If i were to allow the per_section() to have a delimiter parameter, I could try this:

def per_section(it, delimiter== '
'):
    """ Read a file and yield sections using empty line as delimiter """
    section = []
    for line in it:
        if line.strip('
') and delimiter == '
':
            section.append(line)
        elif delimiter= '#' and line[0] != "#":
            section.append(line)
        else:
            yield ''.join(section)
            section = []
    # yield any remaining lines as a section too
    if section:
        yield ''.join(section)

But is there a way so that i don't hard-code all possible delimiters?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
526 views
Welcome To Ask or Share your Answers For Others

1 Answer

How about pass a predicate?

def per_section(it, is_delimiter=lambda x: x.isspace()):
    ret = []
    for line in it:
        if is_delimiter(line):
            if ret:
                yield ret  # OR  ''.join(ret)
                ret = []
        else:
            ret.append(line.rstrip())  # OR  ret.append(line)
    if ret:
        yield ret

Usage:

with open('/path/to/file.txt') as f:
    sections = list(per_section(f))  # default delimiter

with open('/path/to/file.txt.txt') as f:
    sections = list(per_section(f, lambda line: line.startswith('#'))) # comment

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...