seismicrna.core.ngs package



seismicrna.core.ngs.phred.decode_phred(quality_code: str, phred_encoding: int)

Decode the ASCII character for a Phred quality score to an integer.

  • quality_code (str) – The Phred score encoded as an ASCII character.

  • phred_encoding (int) – The encoding offset for Phred scores. A Phred score is encoded as the character whose ASCII value is the sum of the Phred score and the encoding offset.


The Phred quality score represented by the ASCII character.

Return type: int
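The decoding rule described above (ASCII value minus the encoding offset) can be sketched as follows. This is a minimal illustration, not the package's source: the name `decode_phred_sketch` is hypothetical, and the offset of 33 is simply the value commonly used in modern FASTQ files.

```python
def decode_phred_sketch(quality_code: str, phred_encoding: int) -> int:
    """Illustrative stand-in for decode_phred (hypothetical name)."""
    # ord() gives the ASCII value of the character; subtracting the
    # encoding offset recovers the numeric Phred score.
    return ord(quality_code) - phred_encoding


# "I" has ASCII value 73, so with an offset of 33 it decodes to 40.
score = decode_phred_sketch("I", 33)

# Decode a whole quality string one character at a time.
scores = [decode_phred_sketch(char, 33) for char in "!5I"]
```

Here `score` is 40 and `scores` is `[0, 20, 40]`, because `!`, `5`, and `I` have ASCII values 33, 53, and 73.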


seismicrna.core.ngs.phred.encode_phred(phred_score: int, phred_encoding: int)

Encode a numeric Phred quality score as an ASCII character.

  • phred_score (int) – The Phred score as an integer.

  • phred_encoding (int) – The encoding offset for Phred scores. A Phred score is encoded as the character whose ASCII value is the sum of the Phred score and the encoding offset.


The character whose ASCII code, under the given encoding offset, represents the Phred score.

Return type: str
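Encoding is the inverse operation: add the offset to the score and take the character with that ASCII value. The sketch below assumes nothing about the package's actual implementation; `encode_phred_sketch` is a hypothetical name, and the offsets 33 and 64 are used only as examples.

```python
def encode_phred_sketch(phred_score: int, phred_encoding: int) -> str:
    """Illustrative stand-in for encode_phred (hypothetical name)."""
    # chr() returns the character whose ASCII value is the sum of the
    # Phred score and the encoding offset.
    return chr(phred_score + phred_encoding)


# The same score maps to different characters under different offsets.
char_33 = encode_phred_sketch(40, 33)  # ASCII 73
char_64 = encode_phred_sketch(40, 64)  # ASCII 104

# Subtracting the offset from the ASCII value recovers the score.
round_trip = ord(char_33) - 33
```

With these inputs, `char_33` is `"I"`, `char_64` is `"h"`, and `round_trip` is 40.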


exception seismicrna.core.ngs.xam.DuplicateSampleReferenceError

Bases: DuplicateValueError

A sample-reference pair occurred more than once.

seismicrna.core.ngs.xam.calc_extra_threads(num_cpus: int)

Calculate the number of extra threads to use (option -@).
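In samtools, `-@` sets the number of additional threads beyond the main one. One plausible way to derive that value from a CPU count is sketched below; the formula and the name `extra_threads_sketch` are assumptions for illustration, not taken from the package.

```python
def extra_threads_sketch(num_cpus: int) -> int:
    """Hypothetical helper: threads to request in addition to the main one."""
    # Assumption: one CPU serves the main thread, so the rest are "extra".
    # Clamp at zero so a count of 0 or 1 never yields a negative value.
    return max(num_cpus - 1, 0)


# Build an argument list that passes the result to the -@ option.
cmd = ["samtools", "sort", "-@", str(extra_threads_sketch(4)), "in.bam"]
```

With `num_cpus=4`, the sketch requests 3 extra threads.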

seismicrna.core.ngs.xam.collate_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, tmp_pfx: Path | None = None, fast: bool = False, num_cpus: int = 1)

Collate a SAM or BAM file using samtools collate.

seismicrna.core.ngs.xam.count_single_paired(flagstats: dict)

Count the records in a SAM/BAM file given an output dict from get_flagstats().

seismicrna.core.ngs.xam.count_total_reads(flagstats: dict)

Count the total records in a SAM/BAM file.

seismicrna.core.ngs.xam.flagstat_cmd(xam_inp: Path | None, *, num_cpus: int = 1)

Compute alignment statistics for a SAM/BAM file with samtools flagstat.

seismicrna.core.ngs.xam.idxstats_cmd(xam_inp: Path)

Count the number of reads aligning to each reference.

seismicrna.core.ngs.xam.index_xam_cmd(bam: Path, *, num_cpus: int = 1)

Build an index of a XAM file using samtools index.

seismicrna.core.ngs.xam.parse_flagstat(process: CompletedProcess)

Convert the output of samtools flagstat into a dict with one entry per line.
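To illustrate the one-entry-per-line idea, the sketch below parses the default text output of samtools flagstat, where each line has the form `<QC-passed> + <QC-failed> <description>`. The function name, the choice of keys, and the tuple values are assumptions; the real function takes a CompletedProcess, and its dict layout may differ.

```python
import re

# Each flagstat line looks like "10 + 0 in total (QC-passed reads + ...)".
FLAGSTAT_LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")


def parse_flagstat_sketch(text: str) -> dict[str, tuple[int, int]]:
    """Hypothetical parser: map each description to (passed, failed)."""
    stats = {}
    for line in text.splitlines():
        match = FLAGSTAT_LINE.match(line.strip())
        if match is None:
            continue  # Skip blank or unrecognized lines.
        passed, failed, label = match.groups()
        # Drop any trailing parenthetical such as "(80.00% : N/A)".
        key = label.split(" (")[0]
        stats[key] = (int(passed), int(failed))
    return stats


example = "\n".join([
    "10 + 0 in total (QC-passed reads + QC-failed reads)",
    "8 + 0 mapped (80.00% : N/A)",
    "6 + 0 paired in sequencing",
])
stats = parse_flagstat_sketch(example)
```

In practice the text would come from the captured standard output of the process, assuming it was run with output capture enabled in text mode.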

seismicrna.core.ngs.xam.parse_idxstats(process: CompletedProcess)

Map each reference to the number of reads aligning to it.
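samtools idxstats prints one tab-delimited line per reference: name, sequence length, number of mapped read segments, and number of unmapped read segments, with a final `*` line for reads that have no reference. A minimal sketch of turning that text into a mapping follows; the name `parse_idxstats_sketch` and the decision to skip the `*` line are assumptions, not the package's documented behavior.

```python
def parse_idxstats_sketch(text: str) -> dict[str, int]:
    """Hypothetical parser: map each reference name to its mapped count."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # Ignore blank lines.
        ref, _length, mapped, _unmapped = line.split("\t")
        if ref == "*":
            continue  # Assumption: skip reads without a reference.
        counts[ref] = int(mapped)
    return counts


example = "ref1\t100\t25\t0\nref2\t200\t0\t0\n*\t0\t0\t3\n"
counts = parse_idxstats_sketch(example)
```

For the example text, `counts` is `{"ref1": 25, "ref2": 0}`.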

seismicrna.core.ngs.xam.parse_ref_header(process: CompletedProcess)

Map each reference to its header line.

seismicrna.core.ngs.xam.ref_header_cmd(xam_inp: Path, *, num_cpus: int)

Get the header line for each reference.

seismicrna.core.ngs.xam.sort_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, tmp_pfx: Path | None = None, name: bool = False, num_cpus: int = 1)

Sort a SAM or BAM file using samtools sort.

seismicrna.core.ngs.xam.view_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, sam: bool = False, bam: bool = False, cram: bool = False, with_header: bool = False, only_header: bool = False, min_mapq: int = 0, flags_req: int = 0, flags_exc: int = 0, ref: str | None = None, end5: int | None = None, end3: int | None = None, refs_file: Path | None = None, num_cpus: int = 1)

Convert between SAM and BAM formats, extract reads aligning to a specific reference/region, and filter by flag and mapping quality using samtools view.

seismicrna.core.ngs.xam.xam_paired(flagstats: dict)

Determine if the reads are single-end or paired-end.

seismicrna.core.ngs.xam.xam_to_fastq_cmd(xam_inp: Path | None, fq_out: Path | None, *, flags_req: int | None = None, flags_exc: int | None = None, label_12: bool = False, num_cpus: int = 1)

Convert XAM format to FASTQ format, and filter by flags.