Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a problem with a data frame in R. It looks something like this:

ID  TIME    DAY        URL_NAME      VALUE  TIME_SPEND
1    12:15  Monday      HOME         4        30
1    13:15  Tuesday     CUSTOMERS    5        21  
1    15:00  Thursday    PLANTS       8        8    
1    16:21  Friday      MANAGEMENT   1        6
....

I want to combine all rows that share the same "ID" into one single row, like this:

ID  TIME    DAY         URL_NAME     VALUE    TIME_SPEND  TIME1  DAY1        URL_NAME1      VALUE1  TIME_SPEND1  TIME2    DAY2        URL_NAME2      VALUE2  TIME_SPEND2  TIME3    DAY3        URL_NAME3      VALUE3  TIME_SPEND3
1    12:15  Monday      HOME         4        30          13:15  Tuesday     CUSTOMERS      5       21           15:00    Thursday    PLANTS         8       8            16:21    Friday      MANAGEMENT     1       6

My second problem is that there are about 1.500.00 unique IDs, and I would like to do this for the whole data frame.

I have not found a solution that fits my problem. I would appreciate any solutions or links that help.

1 Answer

I'd recommend using dcast from the "data.table" package, which would allow you to reshape multiple measure variables at once.

Example:

library(data.table)
# rowid(ID) numbers the rows within each ID; names(mydf)[-1] selects every column except ID
as.data.table(mydf)[, dcast(.SD, ID ~ rowid(ID), value.var = names(mydf)[-1])]
#    ID TIME_1 TIME_2 TIME_3   DAY_1   DAY_2    DAY_3 URL_NAME_1 URL_NAME_2 URL_NAME_3 VALUE_1 VALUE_2
# 1:  1  12:15  13:15  15:00  Monday Tuesday Thursday       HOME  CUSTOMERS     PLANTS       4       5
# 2:  2  14:15  10:19     NA Tuesday  Monday       NA  CUSTOMERS  CUSTOMERS         NA       2       9
#    VALUE_3 TIME_SPEND_1 TIME_SPEND_2 TIME_SPEND_3
# 1:       8           30           19           40
# 2:      NA           21            8           NA
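
The one-liner above packs several steps into a single call. As a rough sketch, the same reshape can be written step by step to show what each part contributes. The names `dt`, `visit`, `measure_cols` and `wide` are made up for this illustration and do not come from the original answer; the sample data is repeated so the block runs on its own.

```r
library(data.table)

# Sample data copied from this answer so the block is self-contained.
mydf <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  TIME = c("12:15", "13:15", "15:00", "14:15", "10:19"),
  DAY = c("Monday", "Tuesday", "Thursday", "Tuesday", "Monday"),
  URL_NAME = c("HOME", "CUSTOMERS", "PLANTS", "CUSTOMERS", "CUSTOMERS"),
  VALUE = c(4, 5, 8, 2, 9),
  TIME_SPEND = c(30, 19, 40, 21, 8)
)

dt <- as.data.table(mydf)

# Number the rows within each ID: 1, 2, 3 for ID 1 and 1, 2 for ID 2.
# "visit" is a hypothetical column name used only in this sketch.
dt[, visit := rowid(ID)]

# Every column except the key and the counter is treated as a measure variable.
measure_cols <- setdiff(names(dt), c("ID", "visit"))

# One row per ID, one column per (measure, visit) pair, e.g. TIME_1, TIME_2.
wide <- dcast(dt, ID ~ visit, value.var = measure_cols)
wide
```

The number of output columns is set by the ID with the most rows. IDs with fewer rows get NA in the remaining columns, as ID 2 does here.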

Here's the sample data used:

mydf <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  TIME = c("12:15", "13:15", "15:00", "14:15", "10:19"),
  DAY = c("Monday", "Tuesday", "Thursday", "Tuesday", "Monday"),
  URL_NAME = c("HOME", "CUSTOMERS", "PLANTS", "CUSTOMERS", "CUSTOMERS"),
  VALUE = c(4, 5, 8, 2, 9),
  TIME_SPEND = c(30, 19, 40, 21, 8)
)
mydf
#   ID  TIME      DAY  URL_NAME VALUE TIME_SPEND
# 1  1 12:15   Monday      HOME     4         30
# 2  1 13:15  Tuesday CUSTOMERS     5         19
# 3  1 15:00 Thursday    PLANTS     8         40
# 4  2 14:15  Tuesday CUSTOMERS     2         21
# 5  2 10:19   Monday CUSTOMERS     9          8
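
For comparison, here is a sketch of the same reshape using only base R's `reshape()`. This is an alternative to the `dcast` approach recommended above, not part of it. The names `visit` and `wide_base` are hypothetical, and how this scales to a very large number of IDs is not something this sketch measures.

```r
# Sample data copied from this answer so the block is self-contained.
mydf <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  TIME = c("12:15", "13:15", "15:00", "14:15", "10:19"),
  DAY = c("Monday", "Tuesday", "Thursday", "Tuesday", "Monday"),
  URL_NAME = c("HOME", "CUSTOMERS", "PLANTS", "CUSTOMERS", "CUSTOMERS"),
  VALUE = c(4, 5, 8, 2, 9),
  TIME_SPEND = c(30, 19, 40, 21, 8)
)

# Within-ID row counter; "visit" is a hypothetical helper column.
mydf$visit <- ave(seq_along(mydf$ID), mydf$ID, FUN = seq_along)

# All columns other than idvar and timevar are spread into <name>_<visit> columns.
wide_base <- reshape(
  mydf,
  idvar = "ID",
  timevar = "visit",
  direction = "wide",
  sep = "_"
)
wide_base
```

Unlike the `dcast` result, the columns here are grouped by visit (`TIME_1`, `DAY_1`, ..., then `TIME_2`, ...) rather than by variable.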
