Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
I am working in R. I typed the command:

table(shoppingdata$Identifier, shoppingdata$Coupon)

and got the following output:

           FALSE TRUE
  197386     0    5
  197388     0    2
  197390     2    0
  197392     0    3
  197394     1    0
  197397     0    1
  197398     1    1
  197400     0    4
  197402     1    5
  197406     0    5

  1. First of all, I cannot rename the columns FALSE and TRUE to something else, e.g. couponused.

  2. Most importantly, I want to create a third column which is the sum of FALSE + TRUE (coupon used + coupon not used = number of visits). The actual columns contain hundreds of entries.

The solution is not obvious at all.


1 Answer

You have stumbled into the abyss of R data types, through no fault of your own.

Assuming that shoppingdata is a data frame,

table(shoppingdata$Identifier, shoppingdata$Coupon)

creates an object of class "table". One would think that using, e.g.

as.data.frame(table(shoppingdata$Identifier, shoppingdata$Coupon))

would turn this into a data frame with the same format as in the printout, but, as the example below shows, it does not!

# example (the counts shown below come from one random draw; yours will differ)
data <- data.frame(ID = rep(1:5, each = 10), coupon = sample(c(TRUE, FALSE), 50, replace = TRUE))
# creates a "contingency table", not a data frame.
t <- table(data)
t
#    coupon
# ID  FALSE TRUE
#   1     5    5
#   2     3    7
#   3     4    6
#   4     6    4
#   5     3    7

as.data.frame(t)  # not useful: long format, one row per ID/coupon pair
#    ID coupon Freq
# 1   1  FALSE    5
# 2   2  FALSE    3
# 3   3  FALSE    4
# 4   4  FALSE    6
# 5   5  FALSE    3
# 6   1   TRUE    5
# 7   2   TRUE    7
# 8   3   TRUE    6
# 9   4   TRUE    4
# 10  5   TRUE    7

# this works...
coupons <- data.frame(ID = rownames(t), not.used = t[, 1], used = t[, 2])
# add two columns to make a third
coupons$total <- coupons$used + coupons$not.used
# or, less typing
coupons$total <- with(coupons, not.used + used)
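
For what it's worth, here is a possible alternative using only base R, covering both the renaming and the total. It is a sketch rather than the one right way: the toy shoppingdata values are made up for illustration, and it assumes Coupon contains both TRUE and FALSE so that the table has two columns.

```r
# Toy data; column names mirror the question, the values are invented.
shoppingdata <- data.frame(
  Identifier = c(197386, 197386, 197390, 197398, 197398),
  Coupon     = c(TRUE, TRUE, FALSE, FALSE, TRUE)
)

tab <- table(shoppingdata$Identifier, shoppingdata$Coupon)

# Rename the FALSE/TRUE columns directly on the table
# (assumes both values occur, so there are exactly two columns).
colnames(tab) <- c("not.used", "used")

# as.data.frame.matrix() keeps the wide layout of the printout,
# with the identifiers as row names.
visits <- as.data.frame.matrix(tab)

# rowSums() adds the two counts to give the number of visits.
visits$total <- rowSums(tab)
visits
```

If you prefer the identifier as a regular column rather than as row names, something like visits$ID <- rownames(visits) should do it.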

FWIW, I think yours is a perfectly reasonable question. The reason more people don't use R is that it has an extremely steep learning curve, and the documentation is not very good. On the other hand, once you've climbed that learning curve, R is astonishingly powerful.

