| GRanges-class {GenomicRanges} | R Documentation |
The GRanges class is a container for the genomic locations and their associated annotations.
The GRanges class stores sequences of genomic locations and their associated annotations. Each element in the sequence consists of a sequence name, an interval, a strand, and optional metadata columns (e.g. score, GC content, etc.). This information is stored in the following components:
seqnames: a 'factor' Rle object containing the sequence names.
ranges: an IRanges object containing the ranges.
strand: a 'factor' Rle object containing the strand information.
mcols: a DataFrame object containing the metadata columns. Columns cannot be named
"seqnames", "ranges", "strand",
"seqlevels", "seqlengths", "isCircular",
"start", "end", "width", or "element".
seqinfo: a Seqinfo object containing information about the set of genomic sequences present in the GRanges object.
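To make the component layout above concrete, the following minimal sketch builds a small object and checks the class of each component. The object and its values are hypothetical and chosen only for illustration; the sketch assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Hypothetical three-element object, for illustration only.
gr <- GRanges(seqnames = c("chr1", "chr1", "chr2"),
              ranges   = IRanges(start = c(1, 20, 5), width = 10),
              strand   = c("+", "-", "*"),
              score    = c(0.5, 1.2, 3.4))

is(seqnames(gr), "Rle")            # sequence names are held in an Rle ...
is.factor(runValue(seqnames(gr)))  # ... whose run values form a factor
is(ranges(gr), "IRanges")          # the intervals
is(strand(gr), "Rle")              # the strand information
is(mcols(gr), "DataFrame")         # the metadata columns (here: "score")
is(seqinfo(gr), "Seqinfo")         # information about the underlying sequences
```

Each call is expected to return TRUE. Using is() rather than class() keeps the check valid when a component is an instance of a subclass.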
GRanges(seqnames = Rle(), ranges = IRanges(),
strand = Rle("*", length(seqnames)),
...,
seqlengths = NULL, seqinfo = NULL):
Creates a GRanges object.
seqnames: Rle object, character vector, or factor containing the sequence names.
ranges: IRanges object containing the ranges.
strand: Rle object, character vector, or factor containing the strand information.
...: Optional metadata columns.
These columns cannot be named
"start", "end", "width", or
"element". A named integer vector seqlengths
can be supplied instead of seqinfo.
seqlengths: an integer vector named with the
sequence names and containing the lengths (or NA) for each
element of levels(seqnames).
seqinfo: a Seqinfo object containing the allowed
sequence names and lengths (or NA) for each
element of levels(seqnames).
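As a rough illustration of the constructor arguments described above, the sketch below passes a named seqlengths vector instead of a Seqinfo object and supplies one metadata column through the ... argument. The sequence names, lengths, and scores are made up for the example.

```r
library(GenomicRanges)

# seqnames given as an Rle: "chr1" twice, then "chr2" once.
gr <- GRanges(seqnames   = Rle(c("chr1", "chr2"), c(2, 1)),
              ranges     = IRanges(start = c(10, 50, 5), end = c(20, 60, 15)),
              strand     = c("+", "+", "-"),
              score      = c(1L, 2L, 3L),                    # metadata column via '...'
              seqlengths = c(chr1 = 1000L, chr2 = 2000L))    # used in place of seqinfo

seqlengths(gr)   # the lengths supplied above, named by sequence
mcols(gr)        # a DataFrame holding the 'score' column
```

Passing seqlengths is convenient when only the lengths are known; a full Seqinfo object can also carry circularity flags and genome identifiers.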
In the code snippets below, x is a GRanges object.
as(from, "GRanges"): Creates a GRanges object from a
RangedData, RangesList, RleList or RleViewsList object.
as(from, "RangedData"):
Creates a RangedData object from a GRanges
object. The strand and metadata columns become columns
in the result. The seqlengths(from), isCircular(from),
and genome(from) vectors are stored in the metadata columns
of ranges(rd), where rd is the resulting RangedData object.
as(from, "RangesList"):
Creates a RangesList object from a GRanges
object. The strand and metadata columns become inner
metadata columns (i.e. metadata columns on the ranges).
The seqlengths(from), isCircular(from), and
genome(from) vectors become the metadata columns.
as(from, "GappedAlignments"):
Creates a GappedAlignments object from a GRanges object. The metadata
columns are propagated. cigar values are created from the sequence
width unless a "cigar" metadata column already exists in from.
as.data.frame(x, row.names = NULL, optional = FALSE, ...):
Creates a data.frame with columns seqnames (factor),
start (integer), end (integer), width (integer),
strand (factor), as well as the additional metadata columns
stored in mcols(x). Pass an explicit
stringsAsFactors=TRUE/FALSE argument via ... to
override the default conversions for the metadata columns in
mcols(x).
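The as.data.frame coercion described above can be sketched as follows. The two-element object is hypothetical; the example only inspects the column names and types of the result.

```r
library(GenomicRanges)

# Hypothetical two-element object with one metadata column.
gr <- GRanges(seqnames = c("chr1", "chr1"),
              ranges   = IRanges(start = c(1, 11), width = 5),
              strand   = c("+", "-"),
              score    = c(10, 20))

df <- as.data.frame(gr)
colnames(df)     # core columns first, then the metadata columns
sapply(df, class)
```

The result is an ordinary data.frame, so it can be passed to functions that do not know about GRanges objects.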
In the following code snippets, x is a GRanges object.
length(x):
Get the number of elements.
seqnames(x), seqnames(x) <- value:
Get or set the sequence names.
value can be an Rle object, a character vector,
or a factor.
ranges(x), ranges(x) <- value:
Get or set the ranges. value can be a Ranges object.
names(x), names(x) <- value:
Get or set the names of the elements.
strand(x), strand(x) <- value:
Get or set the strand. value can be an Rle object, character
vector, or factor.
mcols(x, use.names=FALSE), mcols(x) <- value:
Get or set the metadata columns.
If use.names=TRUE and the metadata columns are not NULL,
then the names of x are propagated as the row names of the
returned DataFrame object.
When setting the metadata columns, the supplied value must be NULL
or a data.frame-like object (i.e. DataTable or data.frame)
holding element-wise metadata.
elementMetadata(x), elementMetadata(x) <- value,
values(x), values(x) <- value:
Alternatives to mcols functions. Their use is discouraged.
seqinfo(x), seqinfo(x) <- value:
Get or set the information about the underlying sequences.
value must be a Seqinfo object.
seqlevels(x), seqlevels(x, force=FALSE) <- value:
Get or set the sequence levels.
seqlevels(x) is equivalent to seqlevels(seqinfo(x))
or to levels(seqnames(x)); these two expressions are
guaranteed to return identical character vectors on a GRanges object.
value must be a character vector with no NAs.
See ?seqlevels for more information.
seqlengths(x), seqlengths(x) <- value:
Get or set the sequence lengths.
seqlengths(x) is equivalent to seqlengths(seqinfo(x)).
value can be a named non-negative integer or numeric vector,
possibly with NAs.
isCircular(x), isCircular(x) <- value:
Get or set the circularity flags.
isCircular(x) is equivalent to isCircular(seqinfo(x)).
value must be a named logical vector, possibly with NAs.
genome(x), genome(x) <- value:
Get or set the genome identifier or assembly name for each sequence.
genome(x) is equivalent to genome(seqinfo(x)).
value must be a named character vector, possibly with NAs.
seqnameStyle(x):
Get or set the seqname style for x.
Note that this information is not stored in x but inferred
by looking up seqnames(x) against a seqname style database
stored in the seqnames.db metadata package (required).
seqnameStyle(x) is equivalent to seqnameStyle(seqinfo(x))
and can return more than 1 seqname style (with a warning)
in case the style cannot be determined unambiguously.
score(x): Get the “score” column from the element
metadata, if any.
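The getters and setters listed above can be exercised together in a short session. This is a minimal sketch on a hypothetical two-element object; the names, lengths, and the genome label "mock" are invented for the example.

```r
library(GenomicRanges)

# Hypothetical two-element object, for illustration only.
gr <- GRanges(seqnames = c("chr1", "chr2"),
              ranges   = IRanges(start = c(1, 100), width = 50),
              strand   = c("+", "-"))

names(gr)      <- c("a", "b")                        # element names
strand(gr)     <- c("-", "+")                        # replace the strand
mcols(gr)      <- DataFrame(score = c(1.5, 2.5))     # replace the metadata columns
seqlengths(gr) <- c(chr1 = 1000L, chr2 = 500L)       # named by sequence level
isCircular(gr) <- c(chr1 = FALSE, chr2 = FALSE)
genome(gr)     <- c(chr1 = "mock", chr2 = "mock")

# seqlevels(x) and levels(seqnames(x)) should agree.
identical(seqlevels(gr), levels(seqnames(gr)))

# With use.names=TRUE the element names become row names of the DataFrame.
rownames(mcols(gr, use.names = TRUE))

score(gr)   # shortcut for the "score" metadata column
```

The seqlengths, isCircular, and genome setters all take vectors named by sequence level, so the same names appear in each assignment.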
In the following code snippets, x is a GRanges object.
start(x), start(x) <- value:
Get or set start(ranges(x)).
end(x), end(x) <- value:
Get or set end(ranges(x)).
width(x), width(x) <- value:
Get or set width(ranges(x)).
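Because start, end, and width delegate to the underlying ranges, changing one coordinate affects the others. The sketch below, on a hypothetical object, sets new end positions and then reads back the start and width.

```r
library(GenomicRanges)

# Hypothetical object with ranges 10-19 and 30-49 on chr1.
gr <- GRanges(seqnames = c("chr1", "chr1"),
              ranges   = IRanges(start = c(10, 30), end = c(19, 49)))

start(gr)
end(gr)
width(gr)

# Setting new ends keeps the starts fixed and adjusts the widths.
end(gr) <- c(25L, 60L)
start(gr)
width(gr)

# The accessors mirror those of the ranges component.
identical(start(gr), start(ranges(gr)))
```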
In the code snippets below, x is a GRanges object.
append(x, values, after = length(x)):
Inserts the values into x at the position given by
after, where x and values are of the same
class.
c(x, ...):
Combines x and the GRanges objects in ... together.
Any object in ... must belong to the same class as x,
or to one of its subclasses, or must be NULL.
The result is an object of the same class as x.
c(x, ..., ignore.mcols=FALSE):
If the GRanges objects have metadata columns (represented as one
DataFrame per object), each such DataFrame must have the
same columns in order to combine successfully. To circumvent
this restriction, pass ignore.mcols=TRUE, which combines
all the objects into one and drops all of their
metadata columns.
split(x, f, drop=FALSE):
Splits x according to f to create a
GRangesList object.
If f is a list-like object then drop is ignored
and f is treated as if it was
rep(seq_len(length(f)), sapply(f, length)),
so the returned object has the same shape as f (it also
receives the names of f).
Otherwise, if f is not a list-like object, empty list
elements are removed from the returned object if drop is
TRUE.
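The combining and splitting operations above might be used as in the following sketch. The three small objects are hypothetical; gr3 deliberately has no metadata columns, so it is combined with ignore.mcols=TRUE. The split factor is passed as a plain character vector, which is an illustrative choice rather than a requirement.

```r
library(GenomicRanges)

# Hypothetical objects sharing the same sequence names.
gr1 <- GRanges(seqnames = c("chr1", "chr2"),
               ranges   = IRanges(start = c(1, 10), width = 5),
               score    = 1:2)
gr2 <- GRanges(seqnames = c("chr1", "chr2"),
               ranges   = IRanges(start = c(20, 30), width = 5),
               score    = 3:4)
gr3 <- GRanges(seqnames = c("chr1", "chr2"),
               ranges   = IRanges(start = c(40, 50), width = 5))  # no metadata

both <- c(gr1, gr2)                            # same metadata columns
all3 <- c(gr1, gr2, gr3, ignore.mcols = TRUE)  # metadata columns are dropped
app  <- append(gr1, gr2, after = 1)            # insert gr2 after element 1

# Split by sequence name into a GRangesList.
grl <- split(both, as.character(seqnames(both)))
names(grl)
```

After the split, each list element holds the ranges for one sequence name, so grl[["chr1"]] contains the two chr1 ranges from both.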
In the code snippets below, x is a GRanges object.
x[i, j], x[i, j] <- value:
Get or set elements i with optional metadata columns
mcols(x)[,j], where i can be missing; an NA-free
logical, numeric, or character vector; or a 'logical' Rle object.
x[i,j] <- value:
Replaces elements i and optional metadata columns j
with value.
head(x, n = 6L):
If n is non-negative, returns the first n elements of the
GRanges object.
If n is negative, returns all but the last abs(n) elements
of the GRanges object.
rep(x, times, length.out, each):
Repeats the values in x through one of the following conventions:
times: Vector giving the number of times to repeat each
element if of length length(x), or to repeat the whole vector
if of length 1.
length.out: Non-negative integer. The desired length of the output vector.
each: Non-negative integer. Each element of x is
repeated each times.
seqselect(x, start=NULL, end=NULL, width=NULL):
Similar to window, except that multiple consecutive subsequences
can be requested for concatenation. As such two of the three start,
end, and width arguments can be used to specify the
consecutive subsequences. Alternatively, start can take a Ranges
object or something that can be converted to a Ranges object like an
integer vector, logical vector or logical Rle. If the concatenation of
the consecutive subsequences is undesirable, consider using
Views.
seqselect(x, start=NULL, end=NULL, width=NULL) <- value:
Similar to window<-, except that multiple consecutive subsequences
can be replaced with a value whose length is a divisor of the
number of elements it is replacing. As such two of the three start,
end, and width arguments can be used to specify the
consecutive subsequences. Alternatively, start can take a Ranges
object or something that can be converted to a Ranges object like an
integer vector, logical vector or logical Rle.
subset(x, subset):
Returns a new object of the same class as x made of the subset
using logical vector subset, where missing values are taken as
FALSE.
tail(x, n = 6L):
If n is non-negative, returns the last n elements of the
GRanges object.
If n is negative, returns all but the first abs(n) elements
of the GRanges object.
window(x, start = NA, end = NA, width = NA, frequency = NULL, delta = NULL, ...):
Extracts the subsequence window from the GRanges object using:
start, end, width: The start, end, or width of the window. Two of the three are required.
frequency, delta: Optional arguments that specify the sampling frequency and increment within the window.
In general, this is more efficient than using the "[" operator.
window(x, start = NA, end = NA, width = NA, keepLength = TRUE) <- value:
Replaces the subsequence window specified on the left (i.e. the subsequence
in x specified by start, end and width)
by value.
value must either be of class class(x), belong to a subclass
of class(x), be coercible to class(x), or be NULL.
If keepLength is TRUE, the elements of value are
repeated to create a GRanges object with the same number of elements
as the width of the subsequence window it is replacing.
If keepLength is FALSE, this replacement method can modify
the length of x, depending on how the length of the left
subsequence window compares to the length of value.
x$name, x$name <- value:
Shortcuts for mcols(x)$name and mcols(x)$name <- value,
respectively. Provided as a convenience, for GRanges objects *only*,
and as the result of strong popular demand.
Note that those methods are not consistent with the other $
and $<- methods in the IRanges/GenomicRanges infrastructure,
and might confuse some users by making them believe that a GRanges
object can be manipulated as a data.frame-like object.
Therefore we recommend using them only interactively, and we discourage
their use in scripts or packages. For the latter, use
mcols(x)$name and mcols(x)$name <- value, instead
of x$name and x$name <- value, respectively.
show(x):
By default the show method displays 5 head and 5 tail
lines. The number of lines can be altered by setting the global
options showHeadLines and showTailLines. If the
object length is less than the sum of the options, the full object
is displayed. These options affect GRanges, GappedAlignments,
Ranges, DataTable and XString objects.
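Several of the subsetting operations above are illustrated in the sketch below on a hypothetical six-element object. It covers only x[i, j], head, tail, rep, window, and the $ shortcut; seqselect and subset are left out.

```r
library(GenomicRanges)

# Hypothetical six-element object with two metadata columns.
gr <- GRanges(seqnames = rep("chr1", 6),
              ranges   = IRanges(start = seq(1, 51, by = 10), width = 5),
              score    = 1:6,
              GC       = seq(0.1, 0.6, by = 0.1))

sub1 <- gr[2:3, "score"]       # elements 2-3, keeping only the 'score' column
sub2 <- gr[start(gr) > 20]     # logical subsetting on a range property

first2 <- head(gr, 2)          # first two elements
last2  <- tail(gr, -4)         # all but the first four elements

rep_times <- rep(gr[1:2], times = 2)   # whole object repeated twice
rep_each  <- rep(gr[1:2], each = 2)    # each element repeated twice

win <- window(gr, start = 2, end = 4)  # elements 2 through 4

# Interactive shortcut for mcols(gr)$score.
identical(gr$score, mcols(gr)$score)
```

Following the recommendation above, mcols(gr)$score is the form to prefer in scripts and packages, with gr$score kept for interactive use.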
P. Aboyoun
GRangesList-class,
seqinfo,
Vector-class,
Ranges-class,
Rle-class,
DataFrame-class,
intra-range-methods,
inter-range-methods,
setops-methods,
findOverlaps-methods,
nearest-methods,
coverage-methods
seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
gr <-
  GRanges(seqnames =
            Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
          ranges = IRanges(
            1:10, width = 10:1, names = head(letters, 10)),
          strand = Rle(
            strand(c("-", "+", "*", "+", "-")),
            c(1, 2, 2, 3, 2)),
          score = 1:10,
          GC = seq(1, 0, length = 10),
          seqinfo = seqinfo)
gr
## Summarizing elements
table(seqnames(gr))
sum(width(gr))
summary(mcols(gr)[,"score"])
## Renaming the underlying sequences
seqlevels(gr)
seqlevels(gr) <- sub("chr", "Chrom", seqlevels(gr))
gr
seqlevels(gr) <- sub("Chrom", "chr", seqlevels(gr)) # revert
## Combining objects
gr2 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 3, 4)),
               IRanges(1:10, width=5), strand='-',
               score=101:110, GC = runif(10),
               seqinfo=seqinfo)
gr3 <- GRanges(seqnames=Rle(c('chr1', 'chr2', 'chr3'), c(3, 4, 3)),
               IRanges(101:110, width=10), strand='-',
               score=21:30,
               seqinfo=seqinfo)
some.gr <- c(gr, gr2)
## all.gr <- c(gr, gr2, gr3) ## (This would fail)
all.gr <- c(gr, gr2, gr3, ignore.mcols=TRUE)
## The number of lines displayed by the 'show' method
## is controlled by two global options.
longGR <- c(gr[,"score"], gr2[,"score"], gr3)
longGR
options("showHeadLines"=7)
options("showTailLines"=2)
longGR
## Revert to default values
options("showHeadLines"=NULL)
options("showTailLines"=NULL)