seismicrna.core.ngs package
Subpackages
Submodules
- seismicrna.core.ngs.phred.decode_phred(quality_code: str, phred_encoding: int)
Decode the ASCII character for a Phred quality score to an integer.
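- Parameters:
quality_code (str) – ASCII character that encodes the Phred quality score.
phred_encoding (int) – Phred encoding scheme of the FASTQ file.
- Returns:
The Phred quality score represented by the ASCII character.
- Return type:
int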
- seismicrna.core.ngs.phred.encode_phred(phred_score: int, phred_encoding: int)
Encode a numeric Phred quality score as an ASCII character.
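- Parameters:
phred_score (int) – Numeric Phred quality score to encode.
phred_encoding (int) – Phred encoding scheme of the FASTQ file.
- Returns:
The character whose ASCII code, in the encoding scheme of the FASTQ file, represents the given quality score.
- Return type:
str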
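The two functions above are inverses of each other. A minimal sketch of the conversion, assuming that phred_encoding is the ASCII offset of the encoding scheme (33 for the common Phred+33 scheme); the `_sketch` helpers are hypothetical stand-ins written for illustration, not the package's own code:

```python
def decode_phred_sketch(quality_code: str, phred_encoding: int) -> int:
    """Subtract the encoding offset from the character's ASCII code."""
    return ord(quality_code) - phred_encoding


def encode_phred_sketch(phred_score: int, phred_encoding: int) -> str:
    """Add the encoding offset to the score and convert it to a character."""
    return chr(phred_score + phred_encoding)


# Under Phred+33, the character "I" (ASCII code 73) stands for a score of 40.
print(decode_phred_sketch("I", 33))  # 40
print(encode_phred_sketch(40, 33))   # I
```

Under this assumption, encoding a score and then decoding the result with the same offset returns the original score.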
- exception seismicrna.core.ngs.xam.DuplicateSampleReferenceError
Bases: DuplicateValueError
A sample-reference pair occurred more than once.
- seismicrna.core.ngs.xam.calc_extra_threads(num_cpus: int)
Calculate the number of extra threads to use (option -@).
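The samtools option -@ sets the number of additional threads, on top of the main thread. One plausible reading of calc_extra_threads, shown here as a hypothetical sketch rather than the package's actual implementation, is therefore "all CPUs except the one already in use":

```python
def calc_extra_threads_sketch(num_cpus: int) -> int:
    """Number of threads beyond the main one (assumed behavior)."""
    # The main thread already occupies one CPU, so only the remainder
    # counts as "extra"; never return a negative number.
    return max(num_cpus - 1, 0)


print(calc_extra_threads_sketch(4))  # 3
print(calc_extra_threads_sketch(1))  # 0
```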
- seismicrna.core.ngs.xam.collate_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, tmp_pfx: Path | None = None, fast: bool = False, num_cpus: int = 1)
Collate a SAM or BAM file using samtools collate.
- Parameters:
xam_inp (Path | None) – Input SAM/BAM file; reads from stdin (“-”) if None.
xam_out (Path | None) – Output SAM/BAM file; writes to stdout if None.
tmp_pfx (Path | None) – Prefix for temporary files; defaults to xam_out without its extension if xam_out is not None.
fast (bool = False) – Use fast mode, which outputs primary alignments only.
num_cpus (int = 1) – Number of threads to use.
- Returns:
Shell command string.
- Return type:
str
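The parameter rules above (stdin and stdout fallbacks, the default temporary prefix) can be sketched as follows. The helper `collate_cmd_sketch` is hypothetical, and the mapping onto the samtools collate options -f, -T, -o, -O and -@ is an assumption about how such a command could be assembled, not a copy of the package's code:

```python
import shlex
from pathlib import Path


def collate_cmd_sketch(xam_inp, xam_out, *, tmp_pfx=None, fast=False, num_cpus=1):
    """Assemble a samtools collate command string (illustrative only)."""
    # Assumption: -@ receives the number of threads beyond the main one.
    args = ["samtools", "collate", "-@", str(max(num_cpus - 1, 0))]
    if fast:
        args.append("-f")  # fast mode: primary alignments only
    if tmp_pfx is None and xam_out is not None:
        # Default prefix: the output path without its extension.
        tmp_pfx = Path(xam_out).with_suffix("")
    if tmp_pfx is not None:
        args.extend(["-T", str(tmp_pfx)])
    if xam_out is not None:
        args.extend(["-o", str(xam_out)])
    else:
        args.append("-O")  # write to stdout
    # "-" tells samtools to read from stdin.
    args.append(str(xam_inp) if xam_inp is not None else "-")
    return shlex.join(args)


print(collate_cmd_sketch(Path("in.bam"), Path("out.bam"), fast=True, num_cpus=4))
```

Returning a string rather than running the command keeps the function free of side effects, so the caller decides how and when to execute it.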
- seismicrna.core.ngs.xam.count_single_paired(flagstats: dict)
Count the records in a SAM/BAM file given an output dict from get_flagstats().
- seismicrna.core.ngs.xam.count_total_reads(flagstats: dict)
Count the total records in a SAM/BAM file.
- seismicrna.core.ngs.xam.flagstat_cmd(xam_inp: Path | None, *, num_cpus: int = 1)
Compute the flag statistics of a SAM/BAM file with samtools flagstat.
- seismicrna.core.ngs.xam.idxstats_cmd(xam_inp: Path)
Count the number of reads aligning to each reference.
- seismicrna.core.ngs.xam.index_xam_cmd(bam: Path, *, num_cpus: int = 1)
Build an index of a XAM file using samtools index.
- seismicrna.core.ngs.xam.parse_flagstat(process: CompletedProcess)
Convert the output of samtools flagstat into a dict with one entry per line.
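As an illustration of what parse_flagstat has to do, the sketch below parses the default text output of samtools flagstat, in which each line has the form "passed + failed description". The helper name, the choice of keys (the description up to the first parenthesis), and the tuple values are assumptions made for this example; the dict produced by the package may be laid out differently. The text argument stands in for the stdout of the completed process.

```python
import re

# Each flagstat line looks like: "<QC-passed> + <QC-failed> <description>"
FLAGSTAT_LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")


def parse_flagstat_sketch(text: str) -> dict[str, tuple[int, int]]:
    """Map each description to its (QC-passed, QC-failed) counts."""
    stats = {}
    for line in text.splitlines():
        match = FLAGSTAT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        passed, failed, label = match.groups()
        # Drop trailing details such as "(80.00% : N/A)".
        key = label.split(" (")[0]
        stats[key] = int(passed), int(failed)
    return stats


example = (
    "10 + 0 in total (QC-passed reads + QC-failed reads)\n"
    "8 + 0 mapped (80.00% : N/A)\n"
    "6 + 0 paired in sequencing\n"
)
print(parse_flagstat_sketch(example))
```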
- seismicrna.core.ngs.xam.parse_idxstats(process: CompletedProcess)
Map each reference to the number of reads aligning to it.
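samtools idxstats prints one tab-separated line per reference: name, sequence length, number of mapped read segments, and number of unmapped read segments, followed by a final "*" line for reads with no reference. A minimal sketch of the mapping described above, using a hypothetical helper that takes the text output directly instead of a CompletedProcess:

```python
def parse_idxstats_sketch(text: str) -> dict[str, int]:
    """Map each reference name to its number of mapped read segments."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        ref, _length, mapped, _unmapped = line.split("\t")
        if ref != "*":  # "*" collects reads without a reference
            counts[ref] = int(mapped)
    return counts


example = "ref1\t120\t35\t0\nref2\t90\t0\t0\n*\t0\t0\t7\n"
print(parse_idxstats_sketch(example))  # {'ref1': 35, 'ref2': 0}
```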
- seismicrna.core.ngs.xam.parse_ref_header(process: CompletedProcess)
Map each reference to its header line.
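In the SAM format, each reference sequence is declared by an @SQ header line whose SN tag holds the reference name. The sketch below shows one way to build the reference-to-header mapping from header text; the helper is hypothetical and operates on a plain string rather than a CompletedProcess:

```python
def parse_ref_header_sketch(text: str) -> dict[str, str]:
    """Map each reference name (SN tag) to its full @SQ header line."""
    headers = {}
    for line in text.splitlines():
        if not line.startswith("@SQ"):
            continue  # only @SQ lines describe references
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                headers[field[3:]] = line
                break
    return headers


example = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:ref1\tLN:120\n@SQ\tSN:ref2\tLN:90\n"
print(parse_ref_header_sketch(example))
```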
- seismicrna.core.ngs.xam.ref_header_cmd(xam_inp: Path, *, num_cpus: int)
Get the header line for each reference.
- seismicrna.core.ngs.xam.sort_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, tmp_pfx: Path | None = None, name: bool = False, num_cpus: int = 1)
Sort a SAM or BAM file using samtools sort.
- Parameters:
xam_inp (Path | None) – Input SAM/BAM file; reads from stdin if None.
xam_out (Path | None) – Output SAM/BAM file; writes to stdout if None.
tmp_pfx (Path | None) – Prefix for temporary files; defaults to xam_out without its extension if xam_out is not None.
name (bool = False) – Sort by read name instead of by coordinate.
num_cpus (int = 1) – Number of threads to use.
- Returns:
Shell command string.
- Return type:
str
- seismicrna.core.ngs.xam.view_xam_cmd(xam_inp: Path | None, xam_out: Path | None, *, sam: bool = False, bam: bool = False, cram: bool = False, with_header: bool = False, only_header: bool = False, min_mapq: int = 0, flags_req: int = 0, flags_exc: int = 0, ref: str | None = None, end5: int | None = None, end3: int | None = None, refs_file: Path | None = None, num_cpus: int = 1)
Convert between SAM and BAM formats, extract reads aligning to a specific reference/region, and filter by flag and mapping quality using samtools view.
- Parameters:
xam_inp (Path | None) – Input SAM/BAM file; reads from stdin if None.
xam_out (Path | None) – Output SAM/BAM file; writes to stdout if None.
sam (bool = False) – Output in SAM format.
bam (bool = False) – Output in BAM format.
cram (bool = False) – Output in CRAM format (requires refs_file).
with_header (bool = False) – Include the header in the output.
only_header (bool = False) – Output only the header.
min_mapq (int = 0) – Minimum mapping quality score required.
flags_req (int = 0) – Bitwise flag; only output reads with all these bits set.
flags_exc (int = 0) – Bitwise flag; skip reads with any of these bits set.
ref (str | None) – Only output reads aligning to this reference.
end5 (int | None) – 5’ coordinate of the region to extract (requires ref and end3).
end3 (int | None) – 3’ coordinate of the region to extract (requires ref and end5).
refs_file (Path | None) – FASTA file of reference sequences (required for CRAM output).
num_cpus (int = 1) – Number of threads to use.
- Returns:
Shell command string.
- Return type:
str
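To show how several of these parameters could translate into a samtools view call, here is a reduced sketch covering only BAM output, header inclusion, mapping quality, flag filters and region extraction. The helper `view_cmd_sketch` is hypothetical; the mapping onto the samtools view options -b, -h, -q, -f, -F and -o, and onto a region of the form ref:end5-end3, is an assumption about one reasonable way to build the command:

```python
import shlex


def view_cmd_sketch(xam_inp, xam_out, *, bam=False, with_header=False,
                    min_mapq=0, flags_req=0, flags_exc=0,
                    ref=None, end5=None, end3=None):
    """Assemble a samtools view command string (illustrative subset)."""
    args = ["samtools", "view"]
    if bam:
        args.append("-b")  # BAM output; SAM is the samtools default
    if with_header:
        args.append("-h")
    if min_mapq:
        args.extend(["-q", str(min_mapq)])
    if flags_req:
        args.extend(["-f", str(flags_req)])  # all of these bits must be set
    if flags_exc:
        args.extend(["-F", str(flags_exc)])  # none of these bits may be set
    if xam_out is not None:
        args.extend(["-o", str(xam_out)])
    args.append(str(xam_inp) if xam_inp is not None else "-")
    if ref is not None:
        # The region is either a whole reference or ref:end5-end3.
        region = ref
        if end5 is not None and end3 is not None:
            region = f"{ref}:{end5}-{end3}"
        args.append(region)
    return shlex.join(args)


print(view_cmd_sketch("in.bam", "out.bam", bam=True, min_mapq=25,
                      flags_exc=4, ref="ref1", end5=10, end3=50))
```

Note that samtools can only extract a region from an input file that is sorted by coordinate and indexed.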
- seismicrna.core.ngs.xam.xam_paired(flagstats: dict)
Determine if the reads are single-end or paired-end.
- seismicrna.core.ngs.xam.xam_to_fastq_cmd(xam_inp: Path | None, fq_out: Path | None, *, flags_req: int | None = None, flags_exc: int | None = None, label_12: bool = False, num_cpus: int = 1)
Convert a XAM file to FASTQ format and filter reads by flag.
- Parameters:
xam_inp (Path | None) – Input SAM/BAM file; reads from stdin if None.
fq_out (Path | None) – Output FASTQ file; writes to stdout if None.
flags_req (int | None) – Bitwise flag; only output reads with all these bits set.
flags_exc (int | None) – Bitwise flag; skip reads with any of these bits set.
label_12 (bool = False) – Add /1 and /2 suffixes to the names of paired-end reads.
num_cpus (int = 1) – Number of threads to use.
- Returns:
Shell command string.
- Return type:
str
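The semantics of flags_req and flags_exc can be stated as a small predicate on the SAM FLAG field of a read. The function below is a hypothetical illustration of that rule in plain Python; it does not call samtools or the package:

```python
def passes_flag_filter(flag: int, flags_req: int = 0, flags_exc: int = 0) -> bool:
    """Return True if a read with this SAM flag would be kept."""
    # Every required bit must be set in the read's flag.
    has_all_required = (flag & flags_req) == flags_req
    # No excluded bit may be set in the read's flag.
    has_no_excluded = (flag & flags_exc) == 0
    return has_all_required and has_no_excluded


# 99 = paired (1) + proper pair (2) + mate reverse (32) + first in pair (64)
print(passes_flag_filter(99, flags_req=3, flags_exc=4))  # True
print(passes_flag_filter(99, flags_req=16))              # False
```

With both filters left at 0, every read passes, which matches the idea of "no filtering".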