dplyr 1.1.0 join_by functionality introduction

packages
tidyverse
dplyr
Author

Jakob Koch

Published

February 9, 2023

dplyr update 1.1.0

Introduction

The recent dplyr update introduced a few interesting changes to the *_join functions.
The general usage of these functions is:

dplyr::left_join(x = , y = ,
                 by = "<specify here how to join, e.g. which comparisons>")

See also the associated help page, ?dplyr::left_join.

The new feature is the function join_by(), which is passed to the by argument.

dplyr::left_join(x = , y = ,
                 by = dplyr::join_by("<specify here how to join,
                                     e.g. which comparisons>"))

Changes / comparisons

Comparison of the old and new syntax for standard equality joins.

The new dplyr::join_by() function reduces the amount of quoting needed.

dplyr::left_join(x = ,y = , by = c("compA", "compB", "compC"))
dplyr::left_join(x = ,y = , by = join_by(compA, compB, compC))
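To make the two calls above concrete, here is a minimal runnable sketch. The tibbles x and y and their value_x / value_y columns are made up for illustration; only the key column names compA, compB and compC come from the snippet above. It assumes dplyr >= 1.1.0 is installed.

```r
library(dplyr)

# Hypothetical example data: compA, compB, compC are the shared key columns
x <- tibble(compA = c(1, 2), compB = c("a", "b"), compC = c(TRUE, FALSE),
            value_x = c(10, 20))
y <- tibble(compA = c(1, 2), compB = c("a", "b"), compC = c(TRUE, TRUE),
            value_y = c(100, 200))

# Old syntax: quoted column names in a character vector
old <- left_join(x, y, by = c("compA", "compB", "compC"))

# New syntax: bare column names inside join_by()
new <- left_join(x, y, by = join_by(compA, compB, compC))

# Check that both calls return the same table
identical(old, new)
```

In this toy data the second row of x has no partner in y (compC differs), so its value_y is filled with NA in both results.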

If the column names differ between x and y, they can be given either as a named character vector, c("nameA" = "nameB"), or inside the join_by() function. The latter follows general R syntax and uses == as the equality operator.

dplyr::left_join(x = ,y = ,by = c("nameA"="nameB"))
dplyr::left_join(x = ,y = ,by = join_by(nameA == nameB))
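A small sketch of the differently named key case, again with made-up data (x, y, value_x and value_y are assumptions, only nameA and nameB come from the snippet above) and assuming dplyr >= 1.1.0:

```r
library(dplyr)

# Hypothetical example data: the key is called nameA in x and nameB in y
x <- tibble(nameA = c(1, 2, 3), value_x = c("a", "b", "c"))
y <- tibble(nameB = c(1, 2), value_y = c("u", "v"))

# Old syntax: named character vector, x column on the left of "="
old <- left_join(x, y, by = c("nameA" = "nameB"))

# New syntax: an ordinary R comparison, x column on the left of "=="
new <- left_join(x, y, by = join_by(nameA == nameB))

names(new)
```

For an equality join like this, only the key column from x (nameA) is kept in the output by default; the row with nameA equal to 3 has no match and gets NA in value_y.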

For comparisons on multiple columns:

dplyr::left_join(x = ,y = ,by = c("compA", "compB", "compC"))
dplyr::left_join(x = ,y = ,by = join_by(compA, compB, compC))
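Several conditions can also be mixed in one join_by() call, for example one shared column name plus one pair of differently named columns. The following is only a sketch with hypothetical data (x, y, value_x, value_y are invented here), assuming dplyr >= 1.1.0:

```r
library(dplyr)

# Hypothetical example data: compA exists in both tables,
# while the second key is nameA in x and nameB in y
x <- tibble(compA = c("g1", "g1", "g2"), nameA = c(1, 2, 1),
            value_x = c(10, 20, 30))
y <- tibble(compA = c("g1", "g2"), nameB = c(1, 1),
            value_y = c("u", "v"))

# Conditions are separated by commas and must all hold for a match
res <- left_join(x, y, by = join_by(compA, nameA == nameB))

res
```

Each comma-separated expression is one condition, so a row of y is only attached when compA is equal and nameA equals nameB.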

This also makes inequality joins possible, using the comparison operators R already defines (>=, >, <=, <):

dplyr::left_join(x = ,y = ,by = join_by(nameA == nameB))
dplyr::left_join(x = ,y = ,by = join_by(nameA >= nameB))
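To see what an inequality join returns, here is a minimal sketch with invented data (x, y, value_x and value_y are assumptions for illustration), assuming dplyr >= 1.1.0:

```r
library(dplyr)

# Hypothetical example data
x <- tibble(nameA = c(1, 3), value_x = c("a", "b"))
y <- tibble(nameB = c(1, 2, 4), value_y = c("u", "v", "w"))

# Each row of x is paired with every row of y where nameA >= nameB
res <- left_join(x, y, by = join_by(nameA >= nameB))

res

# Every returned pair satisfies the join condition
all(res$nameA >= res$nameB)
```

Unlike the equality join, one row of x can match several rows of y here, so the result can have more rows than x. By default both key columns (nameA and nameB) are kept in the output, which makes it easy to check which values were matched.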

For further information, I can recommend the great video by Davis Vaughan.