| AlignedXStringSet-class {Biostrings} | R Documentation |
The AlignedXStringSet and QualityAlignedXStringSet classes are
containers for storing an aligned XStringSet.
Before we define the notion of alignment, we introduce the notion of "filled-with-gaps subsequence". A "filled-with-gaps subsequence" of a string string1 is obtained by inserting zero or more gaps into a subsequence of string1. For example, L-A--ND and A--N-D are "filled-with-gaps subsequences" of LAND. An alignment between two strings string1 and string2 results in two strings (align1 and align2) that have the same length and are "filled-with-gaps subsequences" of string1 and string2, respectively.
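The definition can be checked mechanically. The following base R sketch is only an illustration: the helper name is_gapped_subsequence is hypothetical (not part of Biostrings), the gap letter is assumed to be "-", and "subsequence" is assumed to mean a contiguous substring.

```r
# Hypothetical helper, not a Biostrings function.
# Assumption: a "subsequence" is a contiguous substring and gaps are written "-".
is_gapped_subsequence <- function(aligned, original, gap = "-") {
  # Remove every gap letter, then look for what is left inside the original.
  degapped <- gsub(gap, "", aligned, fixed = TRUE)
  grepl(degapped, original, fixed = TRUE)
}

is_gapped_subsequence("L-A--ND", "LAND")  # TRUE
is_gapped_subsequence("A--N-D", "LAND")   # TRUE
is_gapped_subsequence("L-E-A", "LAND")    # FALSE: LEA does not occur in LAND
```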
For example, this is an alignment between LAND and LEAVES:
L-A
LEA
An alignment can be seen as a compact representation of one set of basic operations that transforms string1 into string2. There are 3 different kinds of basic operations: "insertions" (gaps in align1), "deletions" (gaps in align2), and "replacements". The above alignment represents the following basic operations:
insert E at pos 2
insert V at pos 4
insert E at pos 5
replace by S at pos 6 (N is replaced by S)
delete at pos 7 (D is deleted)
Note that "insert X at pos i" means that all letters at a position >= i are moved 1 place to the right before X is actually inserted.
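The operations listed above can be replayed step by step. This is a minimal base R sketch; insert_at, replace_at and delete_at are hypothetical helper names introduced here for illustration and are not part of Biostrings.

```r
# Hypothetical helpers (not Biostrings functions) implementing the three
# kinds of basic operations on a plain character string.

# Insert `letter` at position `pos`; letters at positions >= pos move right.
insert_at <- function(s, letter, pos) {
  paste0(substr(s, 1, pos - 1), letter, substr(s, pos, nchar(s)))
}

# Replace the letter at position `pos` by `letter`.
replace_at <- function(s, letter, pos) {
  substr(s, pos, pos) <- letter
  s
}

# Delete the letter at position `pos`.
delete_at <- function(s, pos) {
  paste0(substr(s, 1, pos - 1), substr(s, pos + 1, nchar(s)))
}

s <- "LAND"
s <- insert_at(s, "E", 2)   # LEAND
s <- insert_at(s, "V", 4)   # LEAVND
s <- insert_at(s, "E", 5)   # LEAVEND
s <- replace_at(s, "S", 6)  # LEAVESD
s <- delete_at(s, 7)        # LEAVES
s
```

Applying the five operations in order turns LAND into LEAVES.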
There are many possible alignments between two given strings string1 and string2, and a common problem is to find the one (or ones) with the highest score, i.e. with the lowest total cost in terms of basic operations.
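To make the notion of score concrete, here is a deliberately simplified base R sketch. The function score_alignment and its match, mismatch and gap values are assumptions chosen for illustration. They use a flat per-gap cost, which is not the substitution-matrix and gap-opening/gap-extension scheme used by pairwiseAlignment.

```r
# Hypothetical scoring function with made-up costs (not the Biostrings scheme).
score_alignment <- function(align1, align2, match = 1, mismatch = -1, gap = -2) {
  stopifnot(nchar(align1) == nchar(align2))
  a <- strsplit(align1, "")[[1]]
  b <- strsplit(align2, "")[[1]]
  # A column containing a gap in either string counts as an insertion/deletion.
  is_gap <- a == "-" | b == "-"
  sum(ifelse(is_gap, gap, ifelse(a == b, match, mismatch)))
}

# The alignment shown above: match, gap, match
score_alignment("L-A", "LEA")          # 0
# The full-length alignment implied by the operations listed above
score_alignment("L-A--ND", "LEAVES-")  # -7
```

With such a function, two candidate alignments of the same strings can be compared and the higher-scoring one retained.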
In the code snippets below,
x is an AlignedXStringSet or QualityAlignedXStringSet object.
unaligned(x):
The original string.
aligned(x, degap = FALSE):
If degap = FALSE, the "filled-with-gaps subsequence" representing
the aligned substring. If degap = TRUE, the "gap-less subsequence"
representing the aligned substring.
ranges(x):
The bounds of the aligned substring.
start(x):
The start of the aligned substring.
end(x):
The end of the aligned substring.
width(x):
The width of the aligned substring, ignoring gaps.
indel(x):
The positions, in the form of an IRanges object, of the insertions or
deletions (depending on what x represents).
nindel(x):
A two-column matrix containing, for each of the elements returned by
indel, its length (the number of indels) and the sum of their widths.
length(x):
The length of aligned(x).
nchar(x):
The nchar of aligned(x).
alphabet(x):
Equivalent to alphabet(unaligned(x)).
as.character(x):
Converts aligned(x) to a character vector.
toString(x):
Equivalent to toString(as.character(x)).
x[i]:
Returns a new AlignedXStringSet or QualityAlignedXStringSet
object made of the selected elements.
rep(x, times):
Returns a new AlignedXStringSet or QualityAlignedXStringSet
object made of the repeated elements.
P. Aboyoun
pairwiseAlignment,
PairwiseAlignments-class,
XStringSet-class
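library(Biostrings)  # newer Bioconductor releases provide pairwiseAlignment() in the pwalign package

pattern <- AAString("LAND")
subject <- AAString("LEAVES")
# Align the two strings using BLOSUM50 and the given gap penalties
nw1 <- pairwiseAlignment(pattern, subject, substitutionMatrix = "BLOSUM50",
                         gapOpening = 3, gapExtension = 1)
alignedPattern <- pattern(nw1)  # the AlignedXStringSet for the pattern
unaligned(alignedPattern)       # the original string
aligned(alignedPattern)         # the aligned string, with gaps
as.character(alignedPattern)    # aligned(alignedPattern) as a character vector
nchar(alignedPattern)           # the nchar of aligned(alignedPattern)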
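The remaining accessors described above can be exercised in the same way. This sketch assumes a Biostrings installation in which pairwiseAlignment() is available (in recent Bioconductor releases it is provided by the pwalign package); the variable names nw, ap and sp are arbitrary, and no output is shown because it depends on the alignment found.

```r
library(Biostrings)

nw <- pairwiseAlignment(AAString("LAND"), AAString("LEAVES"),
                        substitutionMatrix = "BLOSUM50",
                        gapOpening = 3, gapExtension = 1)
ap <- pattern(nw)  # aligned pattern
sp <- subject(nw)  # aligned subject

ranges(ap)                 # bounds of the aligned substring
start(ap)                  # start of the aligned substring
end(ap)                    # end of the aligned substring
width(ap)                  # width of the aligned substring, ignoring gaps
aligned(ap, degap = TRUE)  # the "gap-less subsequence"
indel(ap)                  # positions of the indels
nindel(ap)                 # number of indels and sum of their widths
alphabet(ap)               # same as alphabet(unaligned(ap))
toString(ap)               # same as toString(as.character(ap))
ap[1]                      # subsetting returns an object of the same kind
rep(ap, 2)                 # repeating the elements
```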