I have the following data frame:
data <- data.frame(id = c(1, 2, 3, 4, 5, 6),
                   exposure = c("BMI", "BMI etc.", "BMI neuronal", "WHRadjBMI", "WHR", "BF"))
id exposure
1 1 BMI
2 2 BMI etc.
3 3 BMI neuronal
4 4 WHRadjBMI
5 5 WHR
6 6 BF
I want to remove all rows from this data frame that contain "BMI" but not "adj" in the exposure column, so that I can group all of the BMI-related rows into a single factor level called "BMI". The real data frame is ~2500 rows by 50 columns.
Subsetting would therefore result in the following data frame. Rows 1, 2, and 3 have been removed because they contain "BMI" but do not contain "adj":
id exposure
4 4 WHRadjBMI
5 5 WHR
6 6 BF
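One way this subsetting might be done is with a logical mask built from two `grepl()` calls. This is only a sketch on the example data above; the names `is_bmi`, `bmi_rows`, and `other_rows` are made up for illustration, and `fixed = TRUE` assumes the match should be a literal, case-sensitive substring rather than a regular expression.

```r
data <- data.frame(id = c(1, 2, 3, 4, 5, 6),
                   exposure = c("BMI", "BMI etc.", "BMI neuronal", "WHRadjBMI", "WHR", "BF"),
                   stringsAsFactors = FALSE)

# TRUE for rows whose exposure contains "BMI" but does not contain "adj"
# (is_bmi is a hypothetical helper name)
is_bmi <- grepl("BMI", data$exposure, fixed = TRUE) &
  !grepl("adj", data$exposure, fixed = TRUE)

bmi_rows   <- data[is_bmi, ]   # the rows to be grouped as "BMI"
other_rows <- data[!is_bmi, ]  # everything else, original row names kept
```

Because the mask is a plain logical vector, the same approach should scale to the ~2500-row data frame without changes.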
I can then combine the rows that contain "BMI" but not "adj" into a single factor level, so that rows 1, 2, and 3 become:
id exposure
1 1 BMI
2 2 BMI
3 3 BMI
I can do this final part as follows (applied to the subset that contains only those rows):
data$exposure <- "BMI"
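If the goal is only to collapse those rows into one level, it may be possible to skip the subsetting step and recode in place with a logical index. The following is a sketch under that assumption, using the example data; `is_bmi` is a hypothetical name, and the final `factor()` call assumes a factor column is wanted at the end.

```r
data <- data.frame(id = c(1, 2, 3, 4, 5, 6),
                   exposure = c("BMI", "BMI etc.", "BMI neuronal", "WHRadjBMI", "WHR", "BF"),
                   stringsAsFactors = FALSE)

# Rows whose exposure contains "BMI" but not "adj" (literal substring match)
is_bmi <- grepl("BMI", data$exposure, fixed = TRUE) &
  !grepl("adj", data$exposure, fixed = TRUE)

# Overwrite only the matching rows, leaving all other rows untouched
data$exposure[is_bmi] <- "BMI"

# Convert to a factor so the BMI-related rows share a single level
data$exposure <- factor(data$exposure)
```

Keeping `exposure` as character until after the assignment avoids introducing `NA` values, which can happen when a value that is not an existing level is assigned into a factor.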