I have a grouped tibble. I would like to use map() to iterate over the columns of the tibble, and within each column, I would like map() to act separately on each group. In other words, I would like map() to respect the grouping structure of the tibble.
But map() doesn't seem to respect the grouping structure of tibbles. Here is a minimal example:
library(dplyr)
library(purrr)
data(iris)
iris %>%
  group_by(Species) %>%
  map(length)
In the iris dataset, there are three species and four columns (not counting "Species"). I would therefore like map() to return a list of 3 × 4 = 12 lengths, or else a nested list that contains 12 lengths in total. Instead, it returns a list of 5 elements: one for each column, including the grouping column. Each of these five elements is simply the total length of a column (150). How can I adapt the code above to get the result that I want?
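For concreteness, here is a rough sketch of the nested result described above. It assumes dplyr's group_map() is an acceptable way to apply a function once per group; the names grouped and nested_lengths are only illustrative. By default group_map() passes each group to the function without the grouping column, so the inner map() sees just the four measurement columns.

```r
library(dplyr)
library(purrr)

# Keep the grouped tibble so its group keys can be reused as names.
grouped <- iris %>%
  group_by(Species)

# Outer level: one element per species (via group_map()).
# Inner level: one length per non-grouping column (via map()).
nested_lengths <- grouped %>%
  group_map(~ map(.x, length)) %>%
  set_names(as.character(group_keys(grouped)$Species))

# Inspect the 3 x 4 structure.
str(nested_lengths)
```

This gives a list of three elements (one per species), each of which is a list of four lengths, for 12 lengths in total.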
In this minimal example, a satisfactory alternative to using map() is
iris %>%
  group_by(Species) %>%
  summarize(across(everything(), length))
which returns
# A tibble: 3 x 5
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
* <fct>             <int>       <int>        <int>       <int>
1 setosa               50          50           50          50
2 versicolor           50          50           50          50
3 virginica            50          50           50          50
But in most cases, this alternative won't work. The problem is that I'll usually want summarize() and mutate() to return loess objects, not integers. And when I try to get them to return loess objects, they choke with errors like
Error: Problem with `summarise()` input `..1`.
x Input must be a vector, not a `loess` object.
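The error message says that summarize() needs each result to be a vector. A list is a vector, so one possible workaround is to wrap each loess object in list() and store it in a list-column. The sketch below is only an illustration: fit_loess() is a hypothetical helper, and regressing each column on its within-group row index is an assumption made purely so that there is something to fit.

```r
library(dplyr)

# Hypothetical helper: fit a loess curve of a column against its row index.
# The choice of predictor (row index) is an assumption for illustration only.
fit_loess <- function(y) {
  d <- data.frame(idx = seq_along(y), y = y)
  loess(y ~ idx, data = d)
}

# Wrapping each fit in list() makes every summary value a length-1 list,
# which summarize() accepts and stores as a list-column.
fits <- iris %>%
  group_by(Species) %>%
  summarize(across(everything(), ~ list(fit_loess(.x))))

# One loess object per species and column, e.g. the first species' Sepal.Length fit:
class(fits$Sepal.Length[[1]])
```

Each of the four non-grouping columns in fits is then a list-column holding one loess object per species.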
Question from: https://stackoverflow.com/questions/66065866/map-over-tibble-columns-while-preserving-groups