seismicrna.core.ngs package

Subpackages

Submodules

seismicrna.core.ngs.phred.decode_phred(quality_code: str, phred_encoding: int)

Decode the ASCII character for a Phred quality score to an integer.

Parameters:
  • quality_code (str) – The Phred score encoded as an ASCII character.

  • phred_encoding (int) – The encoding offset for Phred scores. A Phred score is encoded as the character whose ASCII value is the sum of the Phred score and the encoding offset.

Returns:

The Phred quality score represented by the ASCII character.

Return type:

int
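The decoding rule described above amounts to subtracting the offset from the character's ASCII value. The following is a minimal sketch of that arithmetic, not the package's actual implementation; the name decode_phred_sketch is hypothetical, and the offset 33 used in the usage lines is the common Phred+33 convention, assumed here for illustration.

```python
def decode_phred_sketch(quality_code: str, phred_encoding: int) -> int:
    """Hypothetical stand-in illustrating the documented decoding rule."""
    # The character's ASCII value equals score + offset, so subtract the offset.
    return ord(quality_code) - phred_encoding


# With a Phred+33 offset, "I" has ASCII value 73, so the score is 73 - 33.
score_high = decode_phred_sketch("I", 33)
# "!" has ASCII value 33, which is the lowest score under Phred+33.
score_low = decode_phred_sketch("!", 33)
```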

seismicrna.core.ngs.phred.encode_phred(phred_score: int, phred_encoding: int)

Encode a numeric Phred quality score as an ASCII character.

Parameters:
  • phred_score (int) – The Phred score as an integer.

  • phred_encoding (int) – The encoding offset for Phred scores. A Phred score is encoded as the character whose ASCII value is the sum of the Phred score and the encoding offset.

Returns:

The character whose ASCII code, in the encoding scheme of the FASTQ file, represents the given Phred quality score.

Return type:

str
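Encoding is the inverse operation: add the offset to the score and take the character with that ASCII value. The sketch below only illustrates the documented rule; encode_phred_sketch is a hypothetical name, and the offsets 33 and 64 are the two conventional FASTQ encodings, used here as example inputs.

```python
def encode_phred_sketch(phred_score: int, phred_encoding: int) -> str:
    """Hypothetical stand-in illustrating the documented encoding rule."""
    # The encoded character has ASCII value score + offset.
    return chr(phred_score + phred_encoding)


# Score 40 under Phred+33 maps to ASCII 73.
char_33 = encode_phred_sketch(40, 33)
# The same score under Phred+64 maps to ASCII 104.
char_64 = encode_phred_sketch(40, 64)
```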

exception seismicrna.core.ngs.xam.DuplicateSampleReferenceError

Bases: ValueError

A sample-reference pair occurred more than once.

seismicrna.core.ngs.xam.calc_extra_threads(n_procs: int)

Calculate the number of extra threads to use (option -@).
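In samtools, the -@ option gives the number of additional threads beyond the main one, so a plausible calculation is one less than the number of processors, floored at zero. This is an assumption about the behavior, not a copy of the package's code, and the name extra_threads_sketch is hypothetical.

```python
def extra_threads_sketch(n_procs: int) -> int:
    """Hypothetical illustration: threads to pass to samtools via -@."""
    # Assumption: one thread is the main thread, so the extras are
    # n_procs - 1, and the result is never negative.
    return max(n_procs - 1, 0)
```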

seismicrna.core.ngs.xam.collate_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, tmp_pfx: Path | None = None, fast: bool = False, n_procs: int = 1)

Collate a SAM or BAM file using samtools collate.

seismicrna.core.ngs.xam.count_single_paired(flagstats: dict)

Count the records in a SAM/BAM file given an output dict from get_flagstats().

seismicrna.core.ngs.xam.count_total_reads(flagstats: dict)

Count the total records in a SAM/BAM file.

seismicrna.core.ngs.xam.flagstat_cmd(xam_inp: Path | None, *, n_procs: int = 1)

Compute the statistics with samtools flagstat.

seismicrna.core.ngs.xam.idxstats_cmd(xam_inp: Path)

Count the number of reads aligning to each reference.

seismicrna.core.ngs.xam.index_xam_cmd(bam: Path, *, n_procs: int = 1)

Build an index of a XAM file using samtools index.

seismicrna.core.ngs.xam.parse_flagstat(process: CompletedProcess)

Convert the output into a dict with one entry per line.
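For orientation, each line of samtools flagstat output has the form "N + M description", where N counts QC-passed and M counts QC-failed records. The sketch below parses that text into a dict with one entry per line. It is an illustration under assumptions: it takes the output text directly rather than a CompletedProcess, the name parse_flagstat_sketch is hypothetical, and the key and value shapes the real function returns may differ.

```python
import re


def parse_flagstat_sketch(text: str) -> dict[str, tuple[int, int]]:
    """Hypothetical parser: map each description to (passed, failed)."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"(\d+) \+ (\d+) (.+)", line.strip())
        if match is None:
            # Skip blank or unrecognized lines.
            continue
        passed, failed, description = match.groups()
        # Drop any parenthetical suffix such as "(QC-passed reads + ...)".
        key = description.split(" (")[0].strip()
        stats[key] = (int(passed), int(failed))
    return stats


example_output = (
    "200 + 0 in total (QC-passed reads + QC-failed reads)\n"
    "100 + 0 read1\n"
    "100 + 0 read2\n"
)
flagstats_example = parse_flagstat_sketch(example_output)
```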

seismicrna.core.ngs.xam.parse_idxstats(process: CompletedProcess)

Map each reference to the number of reads aligning to it.
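The output of samtools idxstats is tab-delimited, with one line per reference: name, sequence length, number of mapped read segments, and number of unmapped read segments. A final line named "*" holds reads with no reference. The sketch below maps each reference to its mapped count. It assumes the text is passed in directly rather than as a CompletedProcess, and that the "*" line is skipped; the name parse_idxstats_sketch is hypothetical and the real function may handle these details differently.

```python
def parse_idxstats_sketch(text: str) -> dict[str, int]:
    """Hypothetical parser: map each reference name to its mapped count."""
    counts = {}
    for line in text.splitlines():
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            # Skip blank or malformed lines.
            continue
        ref, _length, mapped, _unmapped = fields[:4]
        if ref == "*":
            # Assumption: reads without a reference are not reported.
            continue
        counts[ref] = int(mapped)
    return counts


idxstats_example = parse_idxstats_sketch(
    "ref1\t100\t50\t0\nref2\t200\t7\t1\n*\t0\t0\t12\n"
)
```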

seismicrna.core.ngs.xam.parse_ref_header(process: CompletedProcess)

Map each reference to its header line.

seismicrna.core.ngs.xam.ref_header_cmd(xam_inp: Path, *, n_procs: int)

Get the header line for each reference.

seismicrna.core.ngs.xam.sort_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, tmp_pfx: Path | None = None, name: bool = False, n_procs: int = 1)

Sort a SAM or BAM file using samtools sort.

seismicrna.core.ngs.xam.view_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, sam: bool = False, bam: bool = False, cram: bool = False, with_header: bool = False, only_header: bool = False, min_mapq: int = 0, flags_req: int = 0, flags_exc: int = 0, ref: str | None = None, end5: int | None = None, end3: int | None = None, refs_file: Path | None = None, n_procs: int = 1)

Convert between SAM and BAM formats, extract reads aligning to a specific reference/region, and filter by flag and mapping quality using samtools view.

seismicrna.core.ngs.xam.xam_paired(flagstats: dict)

Determine if the reads are single-end or paired-end.

seismicrna.core.ngs.xam.xam_to_fastq_cmd(xam_inp: Path | None, fq_out: Path | None, *, flags_req: int | None = None, flags_exc: int | None = None, label_12: bool = False, n_procs: int = 1)

Convert XAM format to FASTQ format, and filter by flags.