Unify two columns skipping NAs

Unify two columns skipping NAs

Having a data.table like the following:

a <- data.table(col1 = c(1, 2, 3, NA, NA),
                col2 = c(NA, NA, NA, 4, 5),
                col3 = c("a", "b", "c", NA, NA),
                col4 = c(NA, NA, NA, "d", "e"))

I would like to find a way to unify col1 with col2, and col3 with col4 by skipping the NAs and keeping only the values, with an output like the following:

    col1   col2
   <num> <char>
1:     1      a
2:     2      b
3:     3      c
4:     4      d
5:     5      e

Is there any way to achieve that? I was thinking to use the sum, but of course it doesn't work with character columns then.

Answer

Using coalesce over 2 columns:

a[, .(col1 = fcoalesce(col1, col2),
      col2 = fcoalesce(col3, col4)) ]

#     col1   col2
#    <num> <char>
# 1:     1      a
# 2:     2      b
# 3:     3      c
# 4:     4      d
# 5:     5      e

A bit more automated way to use coalesce on every n number of columns:

cc <- split(colnames(a), seq(ncol(a)) %/% 3)

for(i in seq_along(cc)){
  a[, (paste0("newCol", i)) := fcoalesce( .SD ), .SDcols = cc[[ i ]] ]
  }

#     col1  col2   col3   col4 newCol1 newCol2
#    <num> <num> <char> <char>   <num>  <char>
# 1:     1    NA      a   <NA>       1       a
# 2:     2    NA      b   <NA>       2       b
# 3:     3    NA      c   <NA>       3       c
# 4:    NA     4   <NA>      d       4       d
# 5:    NA     5   <NA>      e       5       e

Enjoyed this article?

Check out more content on our blog or follow us on social media.

Browse more articles