I have working code that takes a directory of CSV files, hashes one column of each line, and then combines all the files into one. The problem is that the output shows the same hash value on every line; the hash is not recomputed for each line. Here is the code:
import glob
import hashlib

files = glob.glob('*.csv')
output = "combined.csv"
with open(output, 'w') as result:
    for thefile in files:
        f = open(thefile)
        m = f.readlines()
        for line in m[1:]:
            fields = line.split()
            hash_object = hashlib.md5(b'(fields[2])')
            newline = fields[0], fields[1], hash_object.hexdigest(), fields[3]
            joined_line = ','.join(newline)
            result.write(joined_line + '\n')
        f.close()
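
For what it's worth, `b'(fields[2])'` is a bytes literal containing the text `(fields[2])`, not the contents of the variable, so `md5` receives identical input on every line. Below is a minimal sketch of one way to hash the actual field, assuming the fields are text that can be encoded as UTF-8. The helper names `hash_field` and `combine` are hypothetical, and the whitespace `split()` is kept from the code above (the `csv` module may be a better fit if the files are really comma-separated).

```python
import hashlib


def hash_field(value):
    """Return the MD5 hex digest of one field's text."""
    # Encode the variable's contents. b'(fields[2])' is a fixed literal,
    # so it hashes the same bytes on every line.
    return hashlib.md5(value.encode('utf-8')).hexdigest()


def combine(files, output):
    """Hash the third field of each data line and write every line to output."""
    with open(output, 'w') as result:
        for thefile in files:
            with open(thefile) as f:
                next(f, None)  # skip the header line, like m[1:] above
                for line in f:
                    fields = line.split()  # whitespace split, as in the original
                    if len(fields) < 4:
                        continue  # assumption: skip blank or short lines
                    newline = fields[0], fields[1], hash_field(fields[2]), fields[3]
                    result.write(','.join(newline) + '\n')
```

It could be called as `combine(glob.glob('*.csv'), 'combined.csv')`. Note that on a second run the `*.csv` pattern would also match `combined.csv` itself, so it may be worth writing the output elsewhere or filtering it out of the list.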