I'm trying to create categories based on multiple columns in pandas, but it is taking forever to run, so I'm not sure it is correct. I left it for 30 minutes and it was still running, so I stopped it. I'm trying to create a new column based on several other columns (in my actual data, about 15 columns). However, when I try it on a smaller dataset, it is very quick. Any suggestions?
other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']

def labels(row):
    if ((row['col 6'] > 1) & (row[other_cols] < 1)).all():
        return 'Yes'
    if ((row['col 6'] > 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        return 'Maybe'
    if ((row['col 6'] < 1) & (row['col 7'] > 1) & (row[other_cols] < 1)).all():
        return 'no'

df['category'] = df.apply(lambda row: labels(row), axis=1)
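
For comparison, here is a minimal sketch of the same three rules written as whole-column boolean masks and combined with `np.select`, which avoids calling a Python function once per row. The sample `df` is made up purely for illustration, and the mask names (`others_low`, `col6_high`, and so on) are my own. `np.select` takes the first matching condition, like the chain of `if` statements. Note that, as the conditions are written, every row that satisfies the 'Maybe' rule also satisfies the 'Yes' rule, so 'Maybe' is never assigned; the sketch keeps that order unchanged.

```python
import numpy as np
import pandas as pd

other_cols = ['col1', 'col2', 'col3', 'col4', 'col5']

# Hypothetical sample data, only to make the sketch runnable.
df = pd.DataFrame({
    'col1': [0, 0, 0, 2],
    'col2': [0, 0, 0, 0],
    'col3': [0, 0, 0, 0],
    'col4': [0, 0, 0, 0],
    'col5': [0, 0, 0, 0],
    'col 6': [2, 2, 0, 2],
    'col 7': [0, 2, 2, 2],
})

# One boolean Series per sub-condition, computed for all rows at once.
others_low = (df[other_cols] < 1).all(axis=1)
col6_high = df['col 6'] > 1
col6_low = df['col 6'] < 1
col7_high = df['col 7'] > 1

# Same order as the if-statements; the first true condition wins.
conditions = [
    col6_high & others_low,              # 'Yes'
    col6_high & col7_high & others_low,  # 'Maybe' (shadowed by 'Yes' as written)
    col6_low & col7_high & others_low,   # 'no'
]
choices = ['Yes', 'Maybe', 'no']

# Rows matching no rule get None, as in the row-wise function.
df['category'] = np.select(conditions, choices, default=None)
```

If 'Maybe' is meant to take priority when `col 7` is also above 1, moving its condition ahead of the 'Yes' condition in both lists would be one way to express that.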
Question from: https://stackoverflow.com/questions/65940888/creating-new-column-based-on-multiple-columns-pandas