Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I would like to plot each of "x1 by grpA", "x2 by grpA", "x3 by grpA", "x1 by grpB", "x2 by grpB", and "x3 by grpB" using ggplot2::ggplot() in a for loop.

So far I can almost get it to work, but the grouping-variable column in facet_grid() does not resolve correctly when I try to use tidy evaluation. It does work when I type the column name explicitly, but then I cannot change the grouping variable dynamically.

To give context to my question, the following snippet creates the data set:

library(dplyr)    # tibble(), %>%, group_by(), count(), mutate()
library(tidyr)    # drop_na()
library(ggplot2)

set.seed(1)
dfr <- tibble(x1 = factor(sample(letters[1:7], 50, replace = TRUE), levels = letters[1:7]),
              x2 = factor(sample(letters[1:7], 50, replace = TRUE), levels = letters[1:7]),
              x3 = factor(sample(letters[1:7], 50, replace = TRUE), levels = letters[1:7]),
              grpA = factor(sample(c("grp1", "grp2"), 50, prob = c(0.3, 0.7), replace = TRUE), levels = c("grp1", "grp2")),
              grpB = factor(sample(c("grp1", "grp2"), 50, prob = c(0.6, 0.4), replace = TRUE), levels = c("grp1", "grp2"))
              )

head(dfr)

I also provide a function that creates the plotting data I need to make the grouped plots. It accepts strings as arguments for the parameters 'groupvar' and 'mainvar':

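plot_data_prepr <- function(dat, groupvar, mainvar){
  
  # Convert the column-name strings to symbols so they can be unquoted with !!
  groupvar <- sym(groupvar)
  mainvar <- sym(mainvar)
  
  plot_data <- dat %>% 
    group_by(!!groupvar) %>% 
    # .drop = FALSE keeps factor levels that have zero observations
    count(!!mainvar, .drop = FALSE) %>% drop_na() %>% 
    mutate(pct = n/sum(n),
           # give empty levels a small non-zero height so a bar is still drawn
           pct2 = ifelse(n == 0, 0.005, n/sum(n)),
           grp_tot = sum(n),
           pct_lab = paste0(format(pct*100, digits = 1),'%'),
           pct_pos = pct2 + .02)
  
  return(plot_data)
}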

Here is normal usage of the function:


plot_data_prepr(dat = dfr, groupvar = "grpA", mainvar = "x1")

Here is my for loop, which fails when I try to use tidy evaluation in facet_grid() within the ggplot() call. The error returned is "Error in !sgvar : invalid argument type".

"FAILING EXAMPLE:"

for (i in seq_along(names(dfr)[1:3])){
  mvar <- names(dfr)[i]
  print(mvar)
  
  gvar <- names(dfr[4])
  print(gvar)
  
  smvar <- sym(mvar)
  sgvar <- sym(gvar)
  
  plot <- ggplot(data=plot_data_prepr(dfr, gvar, mvar),
         mapping = aes(x=!!smvar, y = pct2, fill = !!smvar)) +
    geom_bar(stat = 'identity') +
    ylim(0,1) +
    geom_text(aes(x=!!smvar, label=pct_lab, y = pct_pos + .02)) +
    facet_grid(. ~ !!sgvar) +
    ggtitle(paste0(mvar," by ",gvar))

  print(plot)
  
}

When I run the loop by explicitly typing grpA in place of !!sgvar in the facet_grid() function, it works for some reason:

"FUNCTIONING BUT NOT WHAT I WANT EXAMPLE:"

for (i in seq_along(names(dfr)[1:3])){
  mvar <- names(dfr)[i]
  print(mvar)
  
  gvar <- names(dfr[4])
  print(gvar)
  
  smvar <- sym(mvar)
  sgvar <- sym(gvar)
  
  plot <- ggplot(data=plot_data_prepr(dfr, gvar, mvar),
         mapping = aes(x=!!smvar, y = pct2, fill = !!smvar)) +
    geom_bar(stat = 'identity') +
    ylim(0,1) +
    geom_text(aes(x=!!smvar, label=pct_lab, y = pct_pos + .02)) +
    facet_grid(. ~ grpA) +
    ggtitle(paste0(mvar," by ",gvar))

  print(plot)
  
}

Of course, if I wanted to loop through a set of grouping variables, having to type each one explicitly would rule out looping. Could someone explain why my code with the 'bang bang' operator inside facet_grid() doesn't work in the 'FAILING EXAMPLE', and also suggest how to remedy this error?

Thank you.


1 Answer

It's difficult to piece together exactly what you're looking for from the example code as posted. However, I think you want the loop to print a plot for every pairing of grouping variable and main variable by cycling through the column names of your data frame.
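
As for why the 'FAILING EXAMPLE' errors: as far as I can tell, `!!` only acts as an unquoting operator inside functions that support quasiquotation (such as `aes()`, `group_by()` or `count()`). In a plain formula like `. ~ !!sgvar` it seems to be treated as two ordinary logical negations, and negating a symbol gives the kind of error reported in the question. A minimal sketch of this, independent of ggplot2 (the names `msg`, `f` and `g` are only for illustration):

```r
library(rlang)

sgvar <- sym("grpA")

# Outside a quasiquoting function, `!!sgvar` is just `!(!sgvar)`:
# base R tries to negate a symbol and signals an error, which we capture.
msg <- tryCatch(!!sgvar, error = function(e) conditionMessage(e))
print(msg)

# A formula stores its right-hand side unevaluated, so nothing is
# substituted: the formula still refers to `sgvar`, not to `grpA`.
f <- . ~ !!sgvar
print(all.vars(f))

# Building the formula from the string puts the actual column name in.
g <- as.formula(paste0(". ~ ", "grpA"))
print(all.vars(g))
```

This is why the loop further down builds the facet formula from the column name as a string.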

So that there is no dubiety, here is a full reprex:

Load packages and create reproducible data:

library(dplyr)
library(ggplot2)

set.seed(1)
df <- tibble(x1 = factor(sample(letters[1:7], 50, replace = TRUE)),
             x2 = factor(sample(letters[1:7], 50, replace = TRUE)),
             x3 = factor(sample(letters[1:7], 50, replace = TRUE)),
             grpA = factor(sample(c("grp1", "grp2"), 50, 
                                  prob = c(0.3, 0.7), replace=TRUE)),
             grpB = factor(sample(c("grp1", "grp2"), 50, 
                                  prob = c(0.6, 0.4), replace=TRUE)))

Define data preparation function:

plot_data_prepr <- function(dat, groupvar, mainvar)
{
  groupvar <- sym(groupvar)
  mainvar <- sym(mainvar)
  
  plot_data <- dat %>% 
    group_by(!!groupvar) %>% 
    count(!!mainvar, .drop = FALSE) %>% tidyr::drop_na() %>% 
    mutate(pct = n/sum(n),
           pct2 = ifelse(n == 0, 0.005, n/sum(n)),
           grp_tot = sum(n),
           pct_lab = paste0(format(pct*100, digits = 1),'%'),
           pct_pos = pct2 + .02)
  
  return(plot_data)
}

Loop to create all 6 plots:

for (gvar in names(df)[4:5]) {
  for (mvar in names(df)[1:3]) {
    print(
      ggplot(plot_data_prepr(df, gvar, mvar),
             aes(x = !!sym(mvar), y = pct2, fill = !!sym(mvar))) +
        geom_bar(stat = 'identity') +
        ylim(0, 1) +
        geom_text(aes(label = pct_lab, y = pct_pos + .02)) +
        # build the facet formula from the column name, e.g. ".~grpA"
        facet_grid(as.formula(paste0(".~", gvar))) +
        ggtitle(paste0(mvar, " by ", gvar))
    )
  }
}

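Output: six faceted bar charts, one per pairing of main variable and grouping variable (images not reproduced here).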

Created on 2020-06-30 by the reprex package (v0.3.0)
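
If you would rather not paste a formula together, `facet_grid()` also accepts `rows` and `cols` arguments built with `vars()`, and `vars()` does support unquoting with `!!`. The following is only a sketch of that alternative, not part of the reprex above: it uses a small made-up data frame of the same shape, a simplified `count()` step standing in for `plot_data_prepr()`, and a list called `plots` that I introduce here purely for illustration.

```r
library(dplyr)
library(ggplot2)

# Made-up data of the same shape as in the question (fewer columns).
set.seed(1)
df <- tibble(x1 = factor(sample(letters[1:7], 50, replace = TRUE)),
             x2 = factor(sample(letters[1:7], 50, replace = TRUE)),
             grpA = factor(sample(c("grp1", "grp2"), 50, replace = TRUE)),
             grpB = factor(sample(c("grp1", "grp2"), 50, replace = TRUE)))

plots <- list()
for (gvar in c("grpA", "grpB")) {
  for (mvar in c("x1", "x2")) {
    # Simplified stand-in for plot_data_prepr(): counts per group and level.
    counts <- df %>%
      count(!!sym(gvar), !!sym(mvar), .drop = FALSE)

    # vars() is a quasiquoting function, so !! works here,
    # unlike inside a bare formula.
    plots[[paste(mvar, "by", gvar)]] <-
      ggplot(counts, aes(x = !!sym(mvar), y = n, fill = !!sym(mvar))) +
      geom_col() +
      facet_grid(cols = vars(!!sym(gvar))) +
      ggtitle(paste(mvar, "by", gvar))
  }
}
```

Calling `print(plots[["x1 by grpA"]])` should then draw a single plot. I would expect `vars(.data[[gvar]])` to work in the same place, but I have not checked it against this data.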

