Question Description
I’ve got two data.tables, both of which share one variable; I’m trying to add a variable that’s missing from the second, but which is tied one-for-one to the shared variable.
This is clearly a merge, but because the shared variable has multiple instances, I’m having to use what feels like a workaround to merge the new variable.
Let’s get specific.
library(data.table)
x <- data.table(let = rep(letters[1:3], 2:4),
                num = rep(1:3, 2:4), other = rnorm(9))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))
x:
let num other
1: a 1 0.41695882
2: a 1 0.59875888
3: b 2 0.19433915
4: b 2 0.58406046
5: b 2 0.33922321
6: c 3 0.63076561
7: c 3 1.06987710
8: c 3 0.08869372
9: c 3 1.31196123
y:
let
1: a
2: a
3: a
4: a
5: a
6: a
7: a
8: a
9: a
10: a
11: c
12: c
13: c
14: c
15: c
16: c
I just want to add the num column to y; since num is matched 1-1 with let, it doesn't really matter that there are duplicates.
Here's an approach that works; I've just got a feeling there's something simpler.
setkey(x, let)
setkey(y, let)
y <- x[!duplicated(let), c("let", "num"), with = FALSE][y]
Practice As Follows
The only improvements I can think of are:

You could skip the setkey(x, let) part.
You could also update y by reference (rather than creating a copy using <- and then assigning back to y).
If you are using the current stable version of data.table (v <= 1.9.4), you will have to use allow.cartesian = TRUE:
setkey(y,let)
y[x[!duplicated(let)], num := i.num, allow.cartesian = TRUE][]
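To see what updating by reference means in practice, here is a minimal sketch using the question's data (without the other column, which plays no part in the join). The name y_alias is hypothetical and introduced only for illustration: it is a second name bound to the same table, so if := modifies y in place, the new column should be visible through it as well.

```r
library(data.table)

# Same shape as the question's data; `other` is left out because the join never uses it.
x <- data.table(let = rep(letters[1:3], 2:4), num = rep(1:3, 2:4))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))
setkey(y, let)

# A second name for the same object, not a copy (hypothetical name, for illustration).
y_alias <- y

# Update join: look up each row of the deduplicated x in y and fill num in the matching rows.
# allow.cartesian = TRUE is kept only for v <= 1.9.4, as described above.
y[x[!duplicated(let)], num := i.num, allow.cartesian = TRUE]

# The alias shows the new column too, so no copy of y was made.
print(names(y_alias))
print(y)
```

If an untouched copy of y is needed alongside the updated one, copy(y) has to be called before the join; plain assignment with <- only creates another name for the same table.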
You could alternatively use unique instead of duplicated (they both have data.table methods):
y[unique(x, by = "let"), num := i.num, allow.cartesian = TRUE]
Here's another possibility using the new .EACHI method, although there's no need for by = .EACHI here. I'm showing it just to expose you to the feature. Have a look at this post for a detailed explanation of what it does and when it's useful.
y[x, num := unique(i.num), by = .EACHI, allow.cartesian = TRUE]
Edit: (Thanks to @Arun for pointing this out)
We shouldn't need the allow.cartesian argument here, as there are no duplicates in i. In fact, it's a bug (#742) that has been fixed in the current development version (1.9.5). So you just need to do:
y[x[!duplicated(let)], num := i.num]
# or
y[unique(x, by = "let"), num := i.num]
# or (though not recommended in this specific case)
y[x, num := unique(i.num), by = .EACHI]
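
As a rough check that the three forms above agree, the sketch below runs each one on its own copy of y and compares the results. It assumes a data.table version that includes the fix mentioned above; the names y1, y2, y3 and y_unkeyed are hypothetical. The last variant goes beyond what is described above: it assumes data.table 1.9.6 or later, where the on argument lets the join run without calling setkey at all.

```r
library(data.table)

# Same shape as the question's data; `other` is left out because the join never uses it.
x <- data.table(let = rep(letters[1:3], 2:4), num = rep(1:3, 2:4))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))
setkey(y, let)

# copy() gives each variant its own table, since := would otherwise modify y itself.
y1 <- copy(y)[x[!duplicated(let)], num := i.num]
y2 <- copy(y)[unique(x, by = "let"), num := i.num]
y3 <- copy(y)[x, num := unique(i.num), by = .EACHI]

print(all.equal(y1, y2))
print(all.equal(y1, y3))

# Assumption: data.table >= 1.9.6, where `on` names the join column directly,
# so neither table needs a key.
y_unkeyed <- data.table(let = rep(c("a", "c"), c(10, 6)))
y_unkeyed[unique(x, by = "let"), num := i.num, on = "let"]
print(identical(y_unkeyed$num, y1$num))
```

Deduplicating x before the join keeps i down to one row per value of let, which is why the first two forms are the ones recommended here.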