Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

What is the fastest way to perform a concatenation-like operation over a data.frame in R? Suppose I have the following table:

df <- data.frame(content = c("c1", "c2", "c3", "c4", "c5"),
                 groups = c("g1", "g1", "g1", "g2", "g2"),
                 stringsAsFactors = FALSE)

df$groups <- as.factor(df$groups)

I want to concatenate the cells of the content column, by groups, efficiently, to get the equivalent of:

df2 <- data.frame(content = c("c1 c2 c3", "c4 c5"),
                  groups = c("g1", "g2"),
                  stringsAsFactors = FALSE)

df2$groups <- as.factor(df2$groups)

I would prefer a dplyr solution, but I am not sure how to apply it here.

1 Answer

A close relative of tapply is aggregate, which lets you do this:

aggregate(content ~ groups, df, paste, collapse = " ")
#   groups  content
# 1     g1 c1 c2 c3
# 2     g2    c4 c5

Factors are retained:

str(.Last.value)
# 'data.frame':  2 obs. of  2 variables:
#  $ groups : Factor w/ 2 levels "g1","g2": 1 2
#  $ content: chr  "c1 c2 c3" "c4 c5"
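
Since aggregate is described as a relative of tapply, here is a minimal base R sketch of what the tapply route might look like for the same data. The names pasted and res_tapply are illustrative only and are not part of the original answer; the data frame is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the question.
df <- data.frame(content = c("c1", "c2", "c3", "c4", "c5"),
                 groups = factor(c("g1", "g1", "g1", "g2", "g2")),
                 stringsAsFactors = FALSE)

# tapply() splits `content` by `groups` and applies paste() to each piece,
# giving a named character array with one element per factor level.
pasted <- tapply(df$content, df$groups, paste, collapse = " ")

# Reassemble a data.frame (hypothetical name); `groups` stays a factor
# with the same levels as in df.
res_tapply <- data.frame(content = as.vector(pasted),
                         groups = factor(names(pasted),
                                         levels = levels(df$groups)),
                         stringsAsFactors = FALSE)
```

Unlike aggregate, tapply returns an array rather than a data.frame, so the extra reassembly step is needed; that is probably why aggregate reads more naturally here.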

Since you mention that you are looking for a dplyr approach, you can try something like this:

library(dplyr)
df %>% group_by(groups) %>% summarise(content = paste(content, collapse = " "))
# Source: local data frame [2 x 2]
# 
#   groups  content
# 1     g1 c1 c2 c3
# 2     g2    c4 c5
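
To compare the dplyr result against the df2 target from the question, something like the following sketch could be used. It assumes dplyr 1.0.0 or later (for the .groups argument of summarise); res_dplyr is an illustrative name, and the select() and as.data.frame() steps are additions whose only purpose is to match df2's column order and class.

```r
library(dplyr)

# Example data and target from the question.
df <- data.frame(content = c("c1", "c2", "c3", "c4", "c5"),
                 groups = factor(c("g1", "g1", "g1", "g2", "g2")),
                 stringsAsFactors = FALSE)
df2 <- data.frame(content = c("c1 c2 c3", "c4 c5"),
                  groups = factor(c("g1", "g2")),
                  stringsAsFactors = FALSE)

res_dplyr <- df %>%
  group_by(groups) %>%
  # Collapse each group's strings into one space-separated string.
  summarise(content = paste(content, collapse = " "), .groups = "drop") %>%
  select(content, groups) %>%  # same column order as df2
  as.data.frame()              # plain data.frame instead of a tibble

# Compare with the target; all.equal() reports any differences it finds.
all.equal(res_dplyr, df2)
```

Note that summarise puts the grouping column first, so without the select() step the columns would come out as groups, content.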
