Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a test.txt file. Find strings and substrings from the word list

<aardwolf>
<Aargau>
<Aaronic>
<aac>
<akac>
<abaca>
<abactinal>
<abacus>  

test.py file

import sys  # the sys module
import os
import re
def hasattr(str,list):
    expr = re.compile(str)
    # yield the elements
    return [elem for elem in list if expr.match(elem)]

isword = {}
FH = open(sys.argv[1],'r',encoding="ISO-8859-1")
for strLine in FH.readlines():  isword.setdefault(''.join(sorted(strLine[1:strLine.find('>')].upper())),[]).append(strLine[:-1])
print (isword)
basestring=str()
for ARGV in sys.argv[2:]:
    print ("\n*** %s\n" %ARGV )#print Argv

diffpatletters = re.compile(u'[a-zA-Z]').findall(ARGV.upper())
#print (diffpatletters)
diffpat = '.*' + '(.*)'.join(sorted(diffpatletters)) + '.*'
#print (diffpat)
for KEY in hasattr(diffpat,isword.keys()):
#       print (KEY)
       SUBKEY = KEY
       for X in diffpatletters:
         #print (X)
         SUBKEY1 = SUBKEY.replace(X,'')
          #print (SUBKEY)
       if SUBKEY1 in isword:
           #print (SUBKEY)
           basestring+=  "%s -> %s" %(isword[KEY], isword[SUBKEY1])
print (basestring + "\n")

This is how I run the file from the command line:

python test.py test.txt  aack aadfl

The expected output is the matching strings and substrings from the word list for each argument after the file name. My basestring is not printing.


1 Answer

Do you have to use regular expressions? If not, is this the kind of result you want?

with open('test.txt', 'r') as f:
    s = f.read()
s = s.split('\n')
s

Out[1]:
['<aardwolf>',
 '<Aargau>',
 '<Aaronic>',
 '<aac>',
 '<akac>',
 '<abaca>',
 '<abactinal>',
 '<abacus>  ']
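
Note that the last entry keeps the trailing spaces from the file, and every entry still carries its angle brackets. Below is a minimal sketch of one way to normalise the lines before matching, assuming every word is wrapped in <...> as in the sample. The helper name clean_words is hypothetical and not part of the code above.

```python
def clean_words(lines):
    """Strip surrounding whitespace and the enclosing <...> from each line."""
    words = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. from a trailing newline
        # Remove one leading '<' and one trailing '>' if both are present.
        if line.startswith('<') and line.endswith('>'):
            line = line[1:-1]
        words.append(line)
    return words


sample = ['<aardwolf>', '<Aargau>', '<abacus>  ', '']
print(clean_words(sample))
# ['aardwolf', 'Aargau', 'abacus']
```

The examples that follow keep the raw lines, so their output still shows the brackets and the trailing spaces.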

For a list-type result:

ARGVs = ['aard', 'onic', 'abacu']

matches = [x for x in s for arg in ARGVs if arg.lower() in x.lower()]
print(matches)

Out[2]:
['<aardwolf>', '<Aaronic>', '<abacus>  ']
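
One thing to watch with the nested comprehension: a word that contains more than one of the arguments is listed once per matching argument. Below is a sketch that reports each word at most once by using any(). The name find_matches is hypothetical, and the data is copied from the sample above.

```python
def find_matches(words, args):
    """Return each word that contains at least one argument, ignoring case."""
    lowered = [a.lower() for a in args]
    return [w for w in words if any(a in w.lower() for a in lowered)]


s = ['<aardwolf>', '<Aargau>', '<Aaronic>', '<aac>', '<akac>',
     '<abaca>', '<abactinal>', '<abacus>  ']

print(find_matches(s, ['aard', 'onic', 'abacu']))
# ['<aardwolf>', '<Aaronic>', '<abacus>  ']

# 'ac' and 'abac' both occur in '<abacus>  ', but it is still listed once.
print(find_matches(s, ['ac', 'abac']))
# ['<aac>', '<akac>', '<abaca>', '<abactinal>', '<abacus>  ']
```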

For a dict-type result:

ARGVs = ['aard', 'onic', 'abacu', 'aaro', 'ac']

{key:[x for x in s if key in x] for key in ARGVs if len([x for x in s if key in x]) != 0}

Out[3]:

{'aard': ['<aardwolf>'],
 'onic': ['<Aaronic>'],
 'abacu': ['<abacus>  '],
 'ac': ['<aac>', '<akac>', '<abaca>', '<abactinal>', '<abacus>  ']}
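
In the output above, 'aaro' is missing because `key in x` is case-sensitive and the file has '<Aaronic>' with a capital A. Below is a sketch of a variant that lower-cases both sides and builds each list only once, assuming case should be ignored as in the list version. The name group_matches is hypothetical.

```python
def group_matches(words, args):
    """Map each argument to the words containing it, ignoring case.

    Arguments with no matching word are left out of the result.
    """
    result = {}
    for arg in args:
        hits = [w for w in words if arg.lower() in w.lower()]
        if hits:
            result[arg] = hits
    return result


s = ['<aardwolf>', '<Aargau>', '<Aaronic>', '<aac>', '<akac>',
     '<abaca>', '<abactinal>', '<abacus>  ']

groups = group_matches(s, ['aard', 'onic', 'abacu', 'aaro', 'ac', 'zzz'])
for key, hits in groups.items():
    print(key, '->', hits)
```

Here 'aaro' maps to ['<Aaronic>'], while 'zzz' matches nothing and is dropped.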

With a regular expression:

import re

with open('test.txt', 'r') as f:
    s = f.read()

ARGVs = ['wol','ac']
cond = '|'.join([rf'\w*{patt}\w*' for patt in ARGVs])
re.findall(cond,s)

Out[4]:
['aardwolf', 'aac', 'akac', 'abaca', 'abactinal', 'abacus']
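
If the search fragments come from the command line, they may contain characters that are special in a regular expression. Below is a sketch that escapes each fragment with re.escape and ignores case, assuming the fragments are meant as literal text. The name find_words is hypothetical, and the text is rebuilt from the sample word list instead of being read from test.txt.

```python
import re


def find_words(text, fragments):
    """Return every word in text that contains one of the fragments."""
    # re.escape keeps characters such as '.' or '*' in a fragment literal.
    alternatives = '|'.join(re.escape(f) for f in fragments)
    pattern = re.compile(rf'\w*(?:{alternatives})\w*', re.IGNORECASE)
    return pattern.findall(text)


text = '\n'.join(['<aardwolf>', '<Aargau>', '<Aaronic>', '<aac>', '<akac>',
                  '<abaca>', '<abactinal>', '<abacus>  '])

print(find_words(text, ['wol', 'ac']))
# ['aardwolf', 'aac', 'akac', 'abaca', 'abactinal', 'abacus']
```

In a script run as in the question, the fragments could be taken from sys.argv[2:] and the text read from the file named by sys.argv[1].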
