Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have a list of tuples that has strings in it For instance:

[('this', 'is', 'a', 'foo', 'bar', 'sentences')
('is', 'a', 'foo', 'bar', 'sentences', 'and')
('a', 'foo', 'bar', 'sentences', 'and', 'i')
('foo', 'bar', 'sentences', 'and', 'i', 'want')
('bar', 'sentences', 'and', 'i', 'want', 'to')
('sentences', 'and', 'i', 'want', 'to', 'ngramize')
('and', 'i', 'want', 'to', 'ngramize', 'it')]

Now I wish to concatenate each string in a tuple to create a list of space separated strings. I used the following method:

NewData=[]
for grams in sixgrams:
       NewData.append( (''.join([w+' ' for w in grams])).strip())

which is working perfectly fine.

However, the list that I have has over a million tuples. So my question is that is this method efficient enough or is there some better way to do it. Thanks.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
970 views
Welcome To Ask or Share your Answers For Others

1 Answer

For a lot of data, you should consider whether you need to keep it all in a list. If you are processing each one at a time, you can create a generator that will yield each joined string, but won't keep them all around taking up memory:

new_data = (' '.join(w) for w in sixgrams)

if you can get the original tuples also from a generator, then you can avoid having the sixgrams list in memory as well.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...