R: Creates a Join data table

J {data.table}

R Documentation

Creates a Join data table

Description

Creates a data.table to be passed in as the i to a [.data.table join.

Usage

# DT[J(...)]                           # J() only for use inside DT[...].
SJ(...)                                # DT[SJ(...)]
CJ(..., sorted = TRUE, unique = FALSE)  # DT[CJ(...)]

Arguments

`...`	Each argument is a vector. Generally the vectors are all the same length; if they are not, the usual silent recycling is applied.
`sorted`	logical. Should the result be sorted (and keyed)? `TRUE`, the default, sorts the result; `FALSE` retains the input order.
`unique`	logical. When `TRUE`, only the unique values of each vector are used (they are computed automatically).

Details

SJ and CJ are convenience functions for creating a data.table in the context of a data.table 'query' on x.

x[data.table(id)] is the same as x[J(id)] but the latter is more readable. Identical alternatives are x[list(id)] and x[.(id)].

x must have a key when passing in a join table as the i. See [.data.table.
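
The equivalence of these forms can be checked with a minimal sketch. The table x, its columns id and val, and the result names below are hypothetical, chosen only for illustration; the sketch assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical keyed table, for illustration only
x <- data.table(id = c("a", "b", "c"), val = 1:3, key = "id")

r_dt   <- x[data.table(id = "b")]  # join table built explicitly
r_J    <- x[J("b")]                # J() is only valid inside x[...]
r_list <- x[list("b")]             # list() form
r_dot  <- x[.("b")]                # .() form

# All four forms should select the same single row
stopifnot(nrow(r_dt) == 1L,
          r_dt$val == 2L, r_J$val == 2L,
          r_list$val == 2L, r_dot$val == 2L)
```

The key is set at construction here via the key argument of data.table(); calling setkey(x, id) afterwards would serve the same purpose.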

Value

J : the same result as calling list. J is a direct alias for list but results in clearer, more readable code.

SJ : (S)orted (J)oin. The same value as J() but additionally setkey() is called on all the columns in the order they were passed in to SJ. This is for efficiency: it lets the join use a binary merge rather than a repeated full binary search for each row of i.

CJ : (C)ross (J)oin. A data.table is formed from the cross product of the vectors. For example, given 10 ids and 100 dates, CJ returns a 1000-row table containing all the dates for all the ids. The sorted argument is TRUE by default for backwards compatibility; FALSE retains the input order.
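
The values described above can be illustrated with a short sketch. The variable names (sj, ids, dates, cj, cj_unsorted) and the column names id, date and flag are assumptions made for this example, and the start date is arbitrary.

```r
library(data.table)

# SJ(): the values are sorted and the result is keyed on all its columns
sj <- SJ(c(3L, 1L, 2L))
key(sj)                      # the key columns, in the order passed to SJ

# CJ(): cross product of 10 ids and 100 dates
ids   <- 1:10
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 100)
cj <- CJ(id = ids, date = dates)
nrow(cj)                     # 10 * 100 rows
key(cj)                      # keyed, because sorted = TRUE by default

# sorted = FALSE keeps the input order and leaves the result unkeyed
cj_unsorted <- CJ(id = c(2L, 1L), flag = c("b", "a"), sorted = FALSE)
cj_unsorted$id               # the first vector varies slowest: 2 2 1 1
is.null(key(cj_unsorted))
```

Named arguments are used in the CJ() calls so that the column names do not depend on how unnamed arguments are labelled.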

Examples

DT = data.table(A=5:1,B=letters[5:1])
setkey(DT,B)    # re-orders table and marks it sorted.
DT[J("b")]      # returns the 2nd row
DT[.("b")]      # same. Style of package plyr.
DT[list("b")]   # same

# CJ usage examples
CJ(c(5,NA,1), c(1,3,2)) # sorted and keyed data.table
do.call(CJ, list(c(5,NA,1), c(1,3,2))) # same as above
CJ(c(5,NA,1), c(1,3,2), sorted=FALSE) # same order as input, unkeyed
# use of the 'unique=' argument
x = c(1,1,2)
y = c(4,6,4)
CJ(x, y, unique=TRUE) # unique(x) and unique(y) are computed automatically

[Package data.table version 1.10.4-3 Index]