Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

For a dataframe in the below format:

Name    Surename    A   B   C   D   E   F   G   H   I   J   K   L   
John    Rose        2   3   4   5   3   4   5   6   80  3   3   0
Smith   Red         4   5   2   4   5   5   2   4   4   0   3   56
Karl    Joe         2   33  4   44  3   4   0   6   80  3   2   5

How can I use lapply to run the code below on each row (columns A:L) and add the result as a new column named "New" at the end of each row?

H <-  3 * IQR(x, na.rm = T)
out1 <- round(median(x) - H)
out2 <- round(median(x) + H)
x[x < out1] <- out1
x[x > out2] <- out2
x$`New` <- round(mean(x))

So the expected output would be as below:

Name    Surename    A   B   C   D   E   F   G   H   I   J   K   L   New
John    Rose        2   3   4   5   3   4   5   6   80  3   3   0   4.4
Smith   Red         4   5   2   4   5   5   2   4   4   0   3   56  4.3
Karl    Joe         2   33  4   44  3   4   0   6   80  3   2   5   14.5

1 Answer

You can wrap the logic in a function:

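change_outlier <- function(x) {
   # Fence half-width: three times the interquartile range of the row
   H <-  3 * IQR(x, na.rm = TRUE)
   # Lower and upper limits around the median, rounded to whole numbers
   out1 <- round(median(x) - H)
   out2 <- round(median(x) + H)
   # Pull values outside the limits back to the nearest limit
   x[x < out1] <- out1
   x[x > out2] <- out2
   # Return the mean of the adjusted values
   mean(x)
}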
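
To see what the function does to a single row, the steps can be traced by hand on John's values from the question. This is only an illustrative sketch: the names `lower`, `upper` and `clipped` are not from the original code, and `pmin()`/`pmax()` are used here as a stand-in for the two indexed assignments.

```r
# Values of columns A:L for "John" in the question's data
x <- c(2, 3, 4, 5, 3, 4, 5, 6, 80, 3, 3, 0)

H <- 3 * IQR(x, na.rm = TRUE)   # IQR is 2 for this row, so H is 6
lower <- round(median(x) - H)   # median is 3.5
upper <- round(median(x) + H)   # round(9.5) gives 10

# Clip to [lower, upper]; only the 80 is affected and becomes 10
clipped <- pmin(pmax(x, lower), upper)
mean(clipped)                   # 4
```

Only the value 80 lies outside the limits, so it is the only one replaced before the mean is taken.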

Then apply it to each row, using every column except the first two (Name and Surename):

df$new <- apply(df[-c(1:2)], 1, change_outlier)
df
#   Name Surename A  B C  D E F G H  I J K  L   new
#1  John     Rose 2  3 4  5 3 4 5 6 80 3 3  0  4.00
#2 Smith      Red 4  5 2  4 5 5 2 4  4 0 3 56  4.08
#3  Karl      Joe 2 33 4 44 3 4 0 6 80 3 2  5 10.83
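
Since the question asks about lapply specifically, the same result can probably be obtained by looping over row indices instead of calling apply(). The following is a sketch under a few assumptions: the data is rebuilt with data.frame() so the block runs on its own, `num_cols` and `res` are names introduced here, and the result is rounded to two decimals to match the printed output above.

```r
# Sample data from the question (two label columns plus numeric columns A:L)
df <- data.frame(
  Name = c("John", "Smith", "Karl"),
  Surename = c("Rose", "Red", "Joe"),
  A = c(2, 4, 2), B = c(3, 5, 33), C = c(4, 2, 4), D = c(5, 4, 44),
  E = c(3, 5, 3), F = c(4, 5, 4), G = c(5, 2, 0), H = c(6, 4, 6),
  I = c(80, 4, 80), J = c(3, 0, 3), K = c(3, 3, 2), L = c(0, 56, 5)
)

# Same function as in the answer
change_outlier <- function(x) {
  H <- 3 * IQR(x, na.rm = TRUE)
  out1 <- round(median(x) - H)
  out2 <- round(median(x) + H)
  x[x < out1] <- out1
  x[x > out2] <- out2
  mean(x)
}

num_cols <- LETTERS[1:12]  # "A" through "L"

# lapply() over row indices; unlist() turns each one-row data frame
# into a plain numeric vector before it is passed to the function
res <- lapply(seq_len(nrow(df)), function(i) {
  change_outlier(unlist(df[i, num_cols]))
})
df$New <- round(unlist(res), 2)
df$New
```

Selecting the columns by name keeps the label columns out of the calculation even if their position in the data frame changes.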

data

df <- structure(list(Name = structure(c(1L, 3L, 2L), .Label = c("John", 
"Karl", "Smith"), class = "factor"), Surename = structure(3:1, .Label = c("Joe", 
"Red", "Rose"), class = "factor"), A = c(2L, 4L, 2L), B = c(3L, 
5L, 33L), C = c(4L, 2L, 4L), D = c(5L, 4L, 44L), E = c(3L, 5L, 
3L), F = c(4L, 5L, 4L), G = c(5L, 2L, 0L), H = c(6L, 4L, 6L), 
I = c(80L, 4L, 80L), J = c(3L, 0L, 3L), K = c(3L, 3L, 2L), 
L = c(0L, 56L, 5L)), class = "data.frame", row.names = c(NA, -3L))
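
As a more readable alternative to the structure() call, the table from the question can likely be read in directly with read.table(). This is a sketch, assuming the columns are whitespace-separated as shown in the question; stringsAsFactors = TRUE is set so that Name and Surename come out as factors, as in the data above.

```r
# Read the question's table from an inline string
df <- read.table(text = "
Name    Surename    A   B   C   D   E   F   G   H   I   J   K   L
John    Rose        2   3   4   5   3   4   5   6   80  3   3   0
Smith   Red         4   5   2   4   5   5   2   4   4   0   3   56
Karl    Joe         2   33  4   44  3   4   0   6   80  3   2   5
", header = TRUE, stringsAsFactors = TRUE)

str(df)  # 3 observations of 14 variables
```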
