Difference between if_any(any_of(vars)) and if_any(all_of(vars))

Take the following MWE:
library(dplyr)
df <- data.frame(a=c(TRUE, TRUE, FALSE), b=c(FALSE, TRUE, FALSE))
myvars <- c("a","b")
The aim is to build a column c which is row-wise TRUE if one or both of a and b are TRUE.
The list of variables to use must be held in the character vector myvars.
With
df %>% mutate(c=if_any(myvars))
I get:
! Using an external vector in selections was deprecated in tidyselect 1.1.0.
ℹ Please use `all_of()` or `any_of()` instead.
# Was:
data %>% select(myvars)
# Now:
data %>% select(all_of(myvars))
See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
Given that, I have to use one of the following:
df %>% mutate(c=if_any(any_of(myvars)))
df %>% mutate(c=if_any(all_of(myvars)))
But I don't understand the difference between those two. if_any should imply any_of.
What is the difference between the two?
Answer
The difference is whether or not those columns are required to be in the data frame. all_of() will throw an error if you include any columns that aren't in the data frame. any_of() will happily work with whatever columns happen to be present in the data frame, ignoring any that aren't.
For code safety, you should use all_of() whenever you can. any_of() is very useful when, for example, you're writing scripts that work on similar inputs with slightly different columns.
To illustrate, let's add the non-existent "x" column to myvars:
## any_of still works
df %>% mutate(c=if_any(any_of(c(myvars, "x"))))
# a b c
# 1 TRUE FALSE TRUE
# 2 TRUE TRUE TRUE
# 3 FALSE FALSE FALSE
## all_of throws an error: ✖ Element `x` doesn't exist.
df %>% mutate(c=if_any(all_of(c(myvars, "x"))))
# Error in `mutate()`:
# ℹ In argument: `c = if_any(all_of(c(myvars, "x")))`.
# Caused by error in `if_any()`:
# ℹ In argument: `all_of(c(myvars, "x"))`.
# Caused by error in `all_of()`:
# ! Can't subset elements that don't exist.
# ✖ Element `x` doesn't exist.
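It may also help to see that the selection helper and the row-wise predicate are independent choices: any_of()/all_of() decide which columns are selected, while if_any()/if_all() decide how the selected columns are combined per row. A minimal sketch, reusing df and myvars from the question (the names res_any_of, res_all_of and res_if_all are only illustrative):

```r
library(dplyr)

df <- data.frame(a = c(TRUE, TRUE, FALSE), b = c(FALSE, TRUE, FALSE))
myvars <- c("a", "b")

# When every name in myvars exists in df, both helpers select the same
# columns, so the two results are identical.
res_any_of <- df %>% mutate(c = if_any(any_of(myvars)))
res_all_of <- df %>% mutate(c = if_any(all_of(myvars)))
identical(res_any_of, res_all_of)
# [1] TRUE

# The row-wise logic comes from if_any() vs if_all(), not from the helper:
# if_all() is TRUE only where both a and b are TRUE.
res_if_all <- df %>% mutate(c = if_all(all_of(myvars)))
res_if_all$c
# [1] FALSE  TRUE FALSE
```

So if_any() does not imply any_of(): the two only behave differently once myvars contains a name that is missing from the data frame.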
This is explained on the shared help page, accessible via ?any_of or ?all_of:
all_of() is for strict selection. If any of the variables in the character vector is missing, an error is thrown.
any_of() doesn't check for missing variables. It is especially useful with negative selections, when you would like to make sure a variable is removed.
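The negative-selection case mentioned in the help page could look like the sketch below. It assumes the same df as in the question and uses select() rather than mutate(); the names dropped and strict are only illustrative.

```r
library(dplyr)

df <- data.frame(a = c(TRUE, TRUE, FALSE), b = c(FALSE, TRUE, FALSE))

# "x" is not a column of df: any_of() ignores it, so only "b" is dropped.
dropped <- df %>% select(-any_of(c("b", "x")))
names(dropped)
# [1] "a"

# The strict version fails because "x" does not exist; try() captures
# the error so the script can continue.
strict <- try(df %>% select(-all_of(c("b", "x"))), silent = TRUE)
inherits(strict, "try-error")
# [1] TRUE
```

With any_of(), the outcome is the same whether or not "x" was present to begin with, which is usually what you want when removing columns.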