Reference Sequence
The reference sequence is saved as a RefseqIO object.
Reference sequence: Structure
A RefseqIO object stores a reference sequence as a space-efficient
CompressedSeq object in the private attribute _s.
(The original sequence is available via the cached property refseq.)
A CompressedSeq object has the following attributes:
b(bytes): sequence of bytes, each byte encoding 4 nucleotidess(int): length of the sequence (number of nucleotides)n(tuple[int, ...]): 0-indexed position of eachNr(bool):Trueif the sequence is RNA,Falseif DNA
Reference sequence: Example
Suppose that a DNA sequence is CAGNTTCGAN.
This sequence would be compressed into
b = b'!\x9f\x00'
s = 10
n = (3, 9)
r = False
For details on the algorithm, see Algorithm for Sequence Compression/Decompression.