Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I have the following problem: I have a data frame df with many variables.

One variable is df$size (non-numeric).

Now I want to replace every size that has fewer than 20 observations with the term "other".

sort(table(df$size))

This gives me an overview of the values I want to replace.

But how do I replace them in my df?

df$size[sort(table(df$size))<20]="other"

That does not work.

Thank you!

  Asked by Christopher (translated from Stack Overflow)


1 Answer

Something along these lines should work:

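set.seed(123)
# Example data: 100 sizes drawn from "1" to "5", stored as character (not factor)
df <- data.frame(size = as.character(sample(1:5, size = 100, replace = TRUE)),
                 stringsAsFactors = FALSE)
tabs <- sort(table(df$size))  # number of observations per size
tab <- tabs[tabs < 20]        # keep only sizes with fewer than 20 observations

# Overwrite every row whose size is one of the rare values
df$size[df$size %in% names(tab)] <- "other"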
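
The attempt in the question most likely fails because `sort(table(df$size)) < 20` yields one logical value per distinct size rather than one per row, so as a row index it gets recycled and does not line up with the rows. The same idea as above can be wrapped in a small function, shown here as a minimal sketch. The name `lump_rare`, its arguments, and the sample sizes "S", "M", "XL", "XXS" are made up for illustration and are not part of the original answer. It also converts to character first, on the assumption that `df$size` might be a factor, in which case assigning a new value such as "other" would produce NA.

```r
# Hypothetical helper (not from the original answer): replace values that
# occur fewer than `min_n` times with the label given in `other`.
lump_rare <- function(x, min_n = 20, other = "other") {
  x <- as.character(x)                   # a factor would not accept the new label
  counts <- table(x)                     # one count per distinct value
  rare <- names(counts)[counts < min_n]  # values below the threshold
  x[x %in% rare] <- other                # row-wise match, so lengths line up
  x
}

# Illustrative data: two frequent sizes and two rare ones
df <- data.frame(size = c(rep("S", 25), rep("M", 30), rep("XL", 3), rep("XXS", 2)),
                 stringsAsFactors = FALSE)
df$size <- lump_rare(df$size, min_n = 20)
table(df$size)
```

Here "XL" and "XXS" have 3 and 2 observations, so their 5 rows become "other", while "S" and "M" are left untouched.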
