| OverlapEncodings-class {IRanges} | R Documentation |
The OverlapEncodings class is a container for storing the
"overlap encodings" returned by the encodeOverlaps
function.
## OverlapEncodings accessors:

## S4 method for signature 'OverlapEncodings'
length(x)
## S4 method for signature 'OverlapEncodings'
Loffset(x)
## S4 method for signature 'OverlapEncodings'
Roffset(x)
## S4 method for signature 'OverlapEncodings'
encoding(x)

## Coercing an OverlapEncodings object:

## S4 method for signature 'OverlapEncodings'
as.data.frame(x, row.names=NULL, optional=FALSE, ...)
| x | An OverlapEncodings object. |
| row.names | |
| optional, ... | Ignored. |
Given a query and a subject of the same length, both
list-like objects with top-level elements typically containing multiple
ranges (e.g. RangesList objects), the "overlap encoding" of the
i-th element in query and i-th element in subject is a
character string describing how the ranges in query[[i]] are
qualitatively positioned relative to the ranges in
subject[[i]].
The encodeOverlaps function computes those overlap
encodings and returns them in an OverlapEncodings object of the same
length as query and subject.
In the following code snippets, x is an OverlapEncodings object
typically obtained by a call to encodeOverlaps(query, subject).
length(x):
Get the number of elements (i.e. encodings) in x.
This is equal to length(query) and length(subject).
Loffset(x), Roffset(x):
Get the "left-offsets" and "right-offsets" of the encodings,
respectively. Both are integer vectors of the same length as x.
Let Qi = query[[i]] and Si = subject[[i]], and let [q1,q2] be the
range covered by Qi, i.e. q1 = min(start(Qi)) and q2 = max(end(Qi)).
Then Loffset(x)[i] is the number L of ranges at the
head of Si that are strictly to the left of all
the ranges in Qi, i.e. L is the greatest value such that
end(Si)[k] < q1 - 1 for all k in seq_len(L).
Similarly, Roffset(x)[i] is the number R of ranges at the
tail of Si that are strictly to the right of all
the ranges in Qi, i.e. R is the greatest value such that
start(Si)[length(Si) + 1 - k] > q2 + 1 for all k
in seq_len(R).
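The definition of the two offsets can be illustrated with a minimal base R sketch. It is not the IRanges implementation: count_offsets is a hypothetical helper, and the ranges are assumed to be given as plain integer vectors of starts and ends, with the subject ranges in the order in which they appear in Si.

```r
# Hypothetical helper, not part of IRanges. It only mirrors the definition
# of L and R given above, using plain vectors of starts and ends.
count_offsets <- function(q_start, q_end, s_start, s_end) {
  q1 <- min(q_start)   # leftmost position covered by Qi
  q2 <- max(q_end)     # rightmost position covered by Qi
  n <- length(s_start)

  # L: length of the leading run of subject ranges with end < q1 - 1
  is_left <- s_end < q1 - 1L
  L <- if (all(is_left)) n else which(!is_left)[1L] - 1L

  # R: length of the trailing run of subject ranges with start > q2 + 1
  is_right <- rev(s_start) > q2 + 1L
  R <- if (all(is_right)) n else which(!is_right)[1L] - 1L

  c(Loffset = L, Roffset = R)
}

# Made-up ranges: Qi covers [20,50]; the first two subject ranges end
# before 19 and the last two start after 51.
count_offsets(q_start = c(20L, 40L), q_end = c(30L, 50L),
              s_start = c(1L, 10L, 25L, 60L, 70L),
              s_end   = c(5L, 18L, 45L, 65L, 80L))
# Loffset = 2, Roffset = 2
```

Note that a subject range ending at q1 - 1 (i.e. adjacent to Qi) is not counted in L, because the comparison is strict.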
encoding(x):
Factor of the same length as x where the i-th element is
the encoding obtained by comparing each range in Qi with
all the ranges in tSi = Si[(1+L):(length(Si)-R)] (tSi
stands for "trimmed Si").
More precisely, here is how this encoding is obtained:
1. All the ranges in Qi are compared with tSi[1],
then with tSi[2], etc.
At each step (one step per range in tSi), comparing
all the ranges in Qi with tSi[k] is done with
rangeComparisonCodeToLetter(compare(Qi, tSi[k])).
So at each step, we end up with a vector of M
single letters (where M is length(Qi)).
2. Each vector obtained previously (1 vector per range in
tSi, all of them of length M) is turned
into a single string by pasting its individual letters together.
3. All the strings obtained previously (1 per range in tSi)
are pasted together into a single long string and separated
by colons (":"). An additional colon is prepended to
the long string and another one appended to it.
4. Finally, the value of M is prepended to the long
string. The final string is the encoding.
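The pasting steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions: build_encoding is a hypothetical helper, not an IRanges function, and it takes the per-step letter vectors as input rather than computing them with compare. The letters in the usage example are made-up placeholders, not actual comparison results.

```r
# Hypothetical helper, not part of IRanges. 'letters_per_step' is assumed
# to be a list with one character vector per range in tSi, each of
# length M (one letter per range in Qi).
build_encoding <- function(letters_per_step) {
  M <- unique(lengths(letters_per_step))
  stopifnot(length(M) == 1L)  # all vectors must have the same length M

  # Turn each vector into a single string by pasting its letters together
  blocks <- vapply(letters_per_step, paste, character(1), collapse = "")

  # Join the strings with colons, then add a leading and a trailing colon
  long_string <- paste0(":", paste(blocks, collapse = ":"), ":")

  # Prepend M to obtain the final encoding
  paste0(M, long_string)
}

# Placeholder letters for M = 2 query ranges and 2 ranges in tSi
build_encoding(list(c("j", "a"), c("m", "f")))
# "2:ja:mf:"
```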
In the following code snippets, x is an OverlapEncodings object.
as.data.frame(x):
Return x as a data frame with columns "Loffset",
"Roffset" and "encoding".
H. Pages
encodeOverlaps, compare, RangesList-class
example(encodeOverlaps)  # to make 'ovenc'
length(ovenc)
Loffset(ovenc)
Roffset(ovenc)
encoding(ovenc)
as.data.frame(ovenc)