I have written a short script that moves some information out of one column and into a newly created column, 'Trainers'. This part works fine, and the new column is populated with the values. However, I am also trying to filter the df so that it does not show any rows that have a program status of inactive. I have created a filter, which also works on its own. I want both of these changes reflected in the newly saved file, 'Trainers4', but I am unsure how to do this. I have tried to merge the two separate dataframes, but I get an error: AttributeError: 'NoneType'. What would be an easier way to apply these changes and then output the new df to a new CSV file?
import re
from tkinter import messagebox

import pandas as pd

df = pd.read_csv('Trainers.csv')
df['swag'] = None
# Defining indexes for desired columns
index_description = df.columns.get_loc('DESCRIPTION')
index_swag = df.columns.get_loc('swag')
# Creating a pattern to be extracted
# A single capture group, so findall returns plain strings and str.extract returns one column
swag_pattern = r"\s*(61-150|1-1,999 SF)\s*"
# For loop to iterate through rows to find and extract pattern to 'swag' column
for row in range(0, len(df)):
    score = re.findall(swag_pattern, df.iat[row, index_description])
    df.iat[row, index_swag] = score
# Based on the client's requirements, send the matched values to the new column 'swag'
# and remove them from the DESCRIPTION column
df['swag'] = df['DESCRIPTION'].str.extract(swag_pattern, expand=False)
df['DESCRIPTION'] = df['DESCRIPTION'].str.replace(swag_pattern, ' ', regex=True)
# Message box to inform user that a new file with parameters changed has been created
messagebox.showinfo("Saving new File", "A new file with column 'Swag' has been created")
# Creating output so that all rows with value 'PROGRAM STATUS INACTIVE' are not outputted
df_filtered = df[df['PROGRAM STATUS'] != 'INACTIVE'].reset_index(drop=True)
# Print the first 15 rows of the filtered dataframe
print(df_filtered.head(15))
# Attempt to merge the two dataframes (this is the step that leads to the AttributeError)
dfmf = pd.merge(df_filtered, df).reset_index(drop=True, inplace=True)
print(dfmf.head(15))
# Saving a new file with the name 'Trainers4.csv'
dfmf.to_csv('Trainers4.csv')
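
A minimal sketch of one way to get both changes into the saved file without a merge. Filtering the dataframe after the new column has been created already gives a result that carries both changes, so no merge should be needed. The `'NoneType'` error most likely comes from `reset_index(..., inplace=True)`, which modifies its object and returns `None`, so `dfmf` ends up as `None`. The sample rows and the helper name `extract_and_filter` below are hypothetical stand-ins for `Trainers.csv` and are not part of the original script.

```python
import pandas as pd

# Hypothetical sample rows standing in for Trainers.csv (assumed data).
df = pd.DataFrame({
    "DESCRIPTION": [
        "Gym A 61-150 members",
        "Studio B 1-1,999 SF space",
        "Club C no size given",
    ],
    "PROGRAM STATUS": ["ACTIVE", "INACTIVE", "ACTIVE"],
})

# One capture group, so str.extract(expand=False) returns a single Series.
swag_pattern = r"\s*(61-150|1-1,999 SF)\s*"


def extract_and_filter(frame: pd.DataFrame) -> pd.DataFrame:
    """Move the matched text into 'swag', then drop INACTIVE rows."""
    out = frame.copy()
    out["swag"] = out["DESCRIPTION"].str.extract(swag_pattern, expand=False)
    out["DESCRIPTION"] = out["DESCRIPTION"].str.replace(
        swag_pattern, " ", regex=True
    )
    # No inplace=True here: reset_index returns a new dataframe,
    # so the result can be returned or assigned directly.
    return out[out["PROGRAM STATUS"] != "INACTIVE"].reset_index(drop=True)


result = extract_and_filter(df)

# For comparison: with inplace=True, reset_index returns None.
returned = df.copy().reset_index(drop=True, inplace=True)

# index=False keeps the row index out of the output. Passing a path such as
# 'Trainers4.csv' as the first argument would write the file instead.
csv_text = result.to_csv(index=False)
print(csv_text)
```

In the original script the same idea would amount to dropping the `pd.merge` line and calling `to_csv` on `df_filtered` directly.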