| Grouping-class {IRanges} | R Documentation |
In this man page, we call "grouping" the action of dividing a collection of NO objects into NG groups (some of which may be empty). The Grouping class and subclasses are containers for representing groupings.
Let's give a formal description of the Grouping core API:
Groups G_i are indexed from 1 to NG (1 <= i <= NG).
Objects O_j are indexed from 1 to NO (1 <= j <= NO).
Every object must belong to one group and only one.
Given that empty groups are allowed, NG can be greater than NO.
Grouping an empty collection of objects (NO = 0) is supported. In that case, all the groups are empty. And only in that case, NG can be zero too (meaning there are no groups).
If x is a Grouping object:
length(x):
Returns the number of groups (NG).
names(x):
Returns the names of the groups.
nobj(x):
Returns the number of objects (NO). Equivalent to length(togroup(x)).
Going from groups to objects:
x[[i]]:
Returns the indices of the objects (the j's) that belong to G_i.
The j's are returned in ascending order.
This provides the mapping from groups to objects (one-to-many mapping).
grouplength(x, i=NULL):
Returns the number of objects in G_i.
Works in a vectorized fashion (unlike x[[i]]).
grouplength(x) is equivalent to
grouplength(x, seq_len(length(x))).
If i is not NULL, grouplength(x, i) is equivalent to
sapply(i, function(ii) length(x[[ii]])).
members(x, i):
Equivalent to x[[i]] if i is a single integer.
Otherwise, if i is an integer vector of arbitrary length, it's
equivalent to sort(unlist(sapply(i, function(ii) x[[ii]]))).
vmembers(x, L):
A version of members that works in a vectorized fashion with
respect to the L argument (L must be a list of integer
vectors). Returns lapply(L, function(i) members(x, i)).
Going from objects to groups:
togroup(x, j=NULL):
Returns the index i of the group that O_j belongs to.
This provides the mapping from objects to groups (many-to-one mapping).
Works in a vectorized fashion. togroup(x) is equivalent to
togroup(x, seq_len(nobj(x))): both return the entire mapping in
an integer vector of length NO.
If j is not NULL, togroup(x, j) is equivalent to
y <- togroup(x); y[j].
togrouplength(x, j=NULL):
Returns the number of objects that belong to the same group as O_j
(including O_j itself).
Equivalent to grouplength(x, togroup(x, j)).
Given that length, names and [[ are defined
for Grouping objects, those objects can be considered List
objects. In particular, as.list works out-of-the-box on them.
One important property of any Grouping object x is
that unlist(as.list(x)) is always a permutation of
seq_len(nobj(x)). This is a direct consequence of the fact
that every object in the grouping belongs to one group and only
one.
[DOCUMENT ME]
A Partitioning container represents a block-grouping, i.e. a grouping
where each group contains objects that are neighbors in the original
collection of objects. More formally, a grouping x is a
block-grouping iff togroup(x) is sorted in increasing order
(not necessarily strictly increasing).
A block-grouping object can also be seen (and manipulated) as a Ranges object where all the ranges are adjacent starting at 1 (i.e. it covers the 1:NO interval with no overlap between the ranges).
Note that a Partitioning object is both: a particular type of Grouping
object and a particular type of Ranges object. Therefore all the
methods that are defined for Grouping and Ranges objects can also
be used on a Partitioning object. See ?Ranges for a description of
the Ranges API.
The Partitioning class is virtual with 2 concrete subclasses: PartitioningByEnd (only stores the end of the groups, allowing fast mapping from groups to objects), and PartitioningByWidth (only stores the width of the groups).
H2LGrouping(high2low=integer()):
[DOCUMENT ME]
Dups(high2low=integer()):
[DOCUMENT ME]
PartitioningByEnd(x=integer(), NG=NULL, names=NULL):
x must be either a list-like object or a sorted integer vector.
NG must be either NULL or a single integer.
names must be either NULL or a character vector of
length NG (if supplied) or length(x) (if NG
is not supplied).
Returns the following PartitioningByEnd object y:
If x is a list-like object, then the returned object
y has the same length as x and is such that
width(y) is identical to elementLengths(x).
If x is an integer vector and NG is not supplied,
then x must be sorted (checked) and contain non-NA
non-negative values (NOT checked).
The returned object y has the same length as x
and is such that end(y) is identical to x.
If x is an integer vector and NG is supplied,
then x must be sorted (checked) and contain values
>= 1 and <= NG (checked).
The returned object y is of length NG and is
such that togroup(y) is identical to x.
If the names argument is supplied, it is used to name the
partitions.
PartitioningByWidth(x=integer(), NG=NULL, names=NULL):
x must be either a list-like object or an integer vector.
NG must be either NULL or a single integer.
names must be either NULL or a character vector of
length NG (if supplied) or length(x) (if NG
is not supplied).
Returns the following PartitioningByWidth object y:
If x is a list-like object, then the returned object
y has the same length as x and is such that
width(y) is identical to elementLengths(x).
If x is an integer vector and NG is not supplied,
then x must contain non-NA non-negative values (NOT
checked).
The returned object y has the same length as x
and is such that width(y) is identical to x.
If x is an integer vector and NG is supplied,
then x must be sorted (checked) and contain values
>= 1 and <= NG (checked).
The returned object y is of length NG and is
such that togroup(y) is identical to x.
If the names argument is supplied, it is used to name the
partitions.
Note that these constructors don't recycle their names argument
(to remain consistent with what `names<-` does on standard
vectors).
H. Pages
List-class, Ranges-class, IRanges-class, successiveIRanges, cumsum, diff
showClass("Grouping") # shows (some of) the known subclasses
## ---------------------------------------------------------------------
## A. H2LGrouping OBJECTS
## ---------------------------------------------------------------------
high2low <- c(NA, NA, 2, 2, NA, NA, NA, 6, NA, 1, 2, NA, 6, NA, NA, 2)
h2l <- H2LGrouping(high2low)
h2l
## The Grouping core API:
length(h2l)
nobj(h2l) # same as 'length(h2l)' for H2LGrouping objects
h2l[[1]]
h2l[[2]]
h2l[[3]]
h2l[[4]]
h2l[[5]]
grouplength(h2l) # same as 'unname(sapply(h2l, length))'
grouplength(h2l, 5:2)
members(h2l, 5:2) # all the members are put together and sorted
togroup(h2l)
togroup(h2l, 5:2)
togrouplength(h2l) # same as 'grouplength(h2l, togroup(h2l))'
togrouplength(h2l, 5:2)
## The List API:
as.list(h2l)
sapply(h2l, length)
## ---------------------------------------------------------------------
## B. Dups OBJECTS
## ---------------------------------------------------------------------
dups1 <- as(h2l, "Dups")
dups1
duplicated(dups1) # same as 'duplicated(togroup(dups1))'
### The purpose of a Dups object is to describe the groups of duplicated
### elements in a vector-like object:
x <- c(2, 77, 4, 4, 7, 2, 8, 8, 4, 99)
x_high2low <- high2low(x)
x_high2low # same length as 'x'
dups2 <- Dups(x_high2low)
dups2
togroup(dups2)
duplicated(dups2)
togrouplength(dups2) # frequency for each element
table(x)
## ---------------------------------------------------------------------
## C. Partitioning OBJECTS
## ---------------------------------------------------------------------
pbe1 <- PartitioningByEnd(c(4, 7, 7, 8, 15), names=LETTERS[1:5])
pbe1 # the 3rd partition is empty
## The Grouping core API:
length(pbe1)
nobj(pbe1)
pbe1[[1]]
pbe1[[2]]
pbe1[[3]]
grouplength(pbe1) # same as 'unname(sapply(pbe1, length))' and 'width(pbe1)'
togroup(pbe1)
togrouplength(pbe1) # same as 'grouplength(pbe1, togroup(pbe1))'
names(pbe1)
## The Ranges core API:
start(pbe1)
end(pbe1)
width(pbe1)
## The List API:
as.list(pbe1)
sapply(pbe1, length)
## Replacing the names:
names(pbe1)[3] <- "empty partition"
pbe1
## Coercion to an IRanges object:
as(pbe1, "IRanges")
## Other examples:
PartitioningByEnd(c(0, 0, 19), names=LETTERS[1:3])
PartitioningByEnd() # no partition
PartitioningByEnd(integer(9)) # all partitions are empty
x <- c(1L, 5L, 5L, 6L, 8L)
pbe2 <- PartitioningByEnd(x, NG=10L)
stopifnot(identical(togroup(pbe2), x))
pbw2 <- PartitioningByWidth(x, NG=10L)
stopifnot(identical(togroup(pbw2), x))
## ---------------------------------------------------------------------
## D. RELATIONSHIP BETWEEN Partitioning OBJECTS AND successiveIRanges()
## ---------------------------------------------------------------------
mywidths <- c(4, 3, 0, 1, 7)
## The 3 following calls produce the same ranges:
ir <- successiveIRanges(mywidths) # IRanges instance.
pbe <- PartitioningByEnd(cumsum(mywidths)) # PartitioningByEnd instance.
pbw <- PartitioningByWidth(mywidths) # PartitioningByWidth instance.
stopifnot(identical(as(ir, "PartitioningByEnd"), pbe))
stopifnot(identical(as(ir, "PartitioningByWidth"), pbw))