seismicrna.align package

Subpackages

Submodules

class seismicrna.align.fqunit.FastqUnit(*, fastqz: Path | None = None, fastqy: Path | None = None, fastq1: Path | None = None, fastq2: Path | None = None, phred_enc: int, one_ref: bool)

Bases: object

Unified interface for the following sets of sequencing reads:

  • One FASTQ file of single-end reads from one sample

  • One FASTQ file of interleaved, paired-end reads from one sample

  • Two FASTQ files of mate 1 and 2 paired-end reads from one sample

  • One FASTQ file of single-end reads originating from one reference sequence in one sample

  • One FASTQ file of interleaved, paired-end reads originating from one reference sequence in one sample

  • Two FASTQ files of mate 1 and mate 2 paired-end reads originating from one reference sequence in one sample

BOWTIE2_FLAGS = {'fastq1': '-1', 'fastq2': '-2', 'fastqy': '--interleaved', 'fastqz': '-U'}
KEY_DINTER = 'dmfastqy'
KEY_DMATED = 'dmfastqx'
KEY_DSINGLE = 'dmfastqz'
KEY_INTER = 'fastqy'
KEY_MATE1 = 'fastq1'
KEY_MATE2 = 'fastq2'
KEY_MATED = 'fastqx'
KEY_SINGLE = 'fastqz'
MAX_PHRED_ENC = 127
property bowtie2_inputs

Return input file arguments for Bowtie2.

fields(key: str)
classmethod from_paths(*, phred_enc: int, **fastq_args: Iterable[str | Path])

Yield a FastqUnit for each FASTQ file (or each pair of mate 1 and mate 2 FASTQ files) whose paths are given as strings.

Parameters:
  • phred_enc (int) – ASCII offset for encoding Phred scores

  • fastq_args (list[Path]) – FASTQ files, given as iterables of paths: - fastqz: single-end - fastqy: interleaved paired-end - fastqx: mated paired-end - dmfastqz: demultiplexed single-end - dmfastqy: demultiplexed interleaved paired-end - dmfastqx: demultiplexed mated paired-end

Yields:

FastqUnit – FastqUnit representing the FASTQ or pair of FASTQ files. The order is determined primarily by the order of keyword arguments; within each keyword argument, by the order of file or directory paths; and for directories, by the order in which os.path.listdir returns file paths.

get_sample_ref_exts()

Return the sample and reference of the FASTQ file(s).

property kind
property n_reads: int

Number of reads in the FASTQ file(s).

property parent

Return the parent directory of the FASTQ file(s).

property phred_arg
property seg_types: dict[str, tuple[Segment, ...]]
to_new(*new_segments: Segment, **new_fields)

Return a new FASTQ unit with updated path fields.

exception seismicrna.align.fqunit.MissingFastqMate

Bases: FileNotFoundError

Missing a file in a pair of paired-end FASTQ files.

exception seismicrna.align.fqunit.MissingFastqMate1

Bases: MissingFastqMate

Missing mate 1 in a pair of paired-end FASTQ files.

exception seismicrna.align.fqunit.MissingFastqMate2

Bases: MissingFastqMate

Missing mate 2 in a pair of paired-end FASTQ files.

seismicrna.align.fqunit.count_fastq_reads(fastq_file: Path)

Count the reads in a FASTQ file.

seismicrna.align.fqunit.fastq_gz(fastq_file: Path)

Return whether a FASTQ file is compressed with gzip.

seismicrna.align.fqunit.format_phred_arg(phred_enc: int)

Format a Phred score argument for Bowtie2.

seismicrna.align.fqunit.get_args_count_fastq_reads(fastq_file: Path)

Count the reads in a FASTQ file.

seismicrna.align.fqunit.parse_stdout_count_fastq_reads(process: CompletedProcess)

Parse the output of word count to find the number of reads.

seismicrna.align.main.run(fasta: str | Path, *, fastqz: Iterable[str | Path] = (), fastqy: Iterable[str | Path] = (), fastqx: Iterable[str | Path] = (), dmfastqz: Iterable[str | Path] = (), dmfastqy: Iterable[str | Path] = (), dmfastqx: Iterable[str | Path] = (), phred_enc: int = 33, out_dir: str | Path = './out', keep_tmp: bool = False, fastp: bool = True, fastp_5: bool = False, fastp_3: bool = True, fastp_w: int = 6, fastp_m: int = 25, fastp_poly_g: str = 'auto', fastp_poly_g_min_len: int = 10, fastp_poly_x: bool = False, fastp_poly_x_min_len: int = 10, fastp_adapter_trimming: bool = True, fastp_adapter_1: str = '', fastp_adapter_2: str = '', fastp_adapter_fasta: str | None = None, fastp_detect_adapter_for_pe: bool = True, fastp_min_length: int = 9, bt2_local: bool = True, bt2_discordant: bool = False, bt2_mixed: bool = False, bt2_dovetail: bool = False, bt2_contain: bool = True, bt2_score_min_e2e: str = 'L,-1,-0.8', bt2_score_min_loc: str = 'L,1,0.8', bt2_i: int = 0, bt2_x: int = 600, bt2_gbar: int = 4, bt2_l: int = 20, bt2_s: str = 'L,1,0.1', bt2_d: int = 4, bt2_r: int = 2, bt2_dpad: int = 2, bt2_orient: str = 'fr', bt2_un: bool = True, min_mapq: int = 25, min_reads: int = 1000, sep_strands: bool = False, f1r2_fwd: bool = False, rev_label: str = '-rev', max_procs: int = 4, force: bool = False, tmp_pfx='./tmp') list[Path]

Trim FASTQ files and align them to reference sequences.

Parameters:
  • fastqz (Iterable) – FASTQ file(s) of single-end reads [keyword-only, default: ()]

  • fastqy (Iterable) – FASTQ file(s) of paired-end reads with mates 1 and 2 interleaved [keyword-only, default: ()]

  • fastqx (Iterable) – FASTQ files of paired-end reads with mates 1 and 2 in separate files [keyword-only, default: ()]

  • dmfastqz (Iterable) – Demultiplexed FASTQ files of single-end reads [keyword-only, default: ()]

  • dmfastqy (Iterable) – Demultiplexed FASTQ files of paired-end reads interleaved in one file [keyword-only, default: ()]

  • dmfastqx (Iterable) – Demultiplexed FASTQ files of mate 1 and mate 2 reads [keyword-only, default: ()]

  • phred_enc (int) – Specify the Phred score encoding of FASTQ and SAM/BAM/CRAM files [keyword-only, default: 33]

  • out_dir (str | pathlib._local.Path) – Write all output files to this directory [keyword-only, default: ‘./out’]

  • keep_tmp (bool) – Keep temporary files after finishing [keyword-only, default: False]

  • fastp (bool) – Use fastp to QC, filter, and trim reads before alignment [keyword-only, default: True]

  • fastp_5 (bool) – Trim low-quality bases from the 5’ ends of reads [keyword-only, default: False]

  • fastp_3 (bool) – Trim low-quality bases from the 3’ ends of reads [keyword-only, default: True]

  • fastp_w (int) – Use this window size (nt) for –fastp-5 and –fastp-3 [keyword-only, default: 6]

  • fastp_m (int) – Use this mean quality threshold for –fastp-5 and –fastp-3 [keyword-only, default: 25]

  • fastp_poly_g (str) – Trim poly(G) tails (two-color sequencing artifacts) from the 3’ end [keyword-only, default: ‘auto’]

  • fastp_poly_g_min_len (int) – Minimum number of Gs to consider a poly(G) tail for –fastp-poly-g [keyword-only, default: 10]

  • fastp_poly_x (bool) – Trim poly(X) tails (i.e. of any nucleotide) from the 3’ end [keyword-only, default: False]

  • fastp_poly_x_min_len (int) – Minimum number of bases to consider a poly(X) tail for –fastp-poly-x [keyword-only, default: 10]

  • fastp_adapter_trimming (bool) – Trim adapter sequences from the 3’ ends of reads [keyword-only, default: True]

  • fastp_adapter_1 (str) – Trim this adapter sequence from the 3’ ends of read 1s [keyword-only, default: ‘’]

  • fastp_adapter_2 (str) – Trim this adapter sequence from the 3’ ends of read 2s [keyword-only, default: ‘’]

  • fastp_adapter_fasta (str | None) – Trim adapter sequences in this FASTA file from the 3’ ends of reads [keyword-only, default: None]

  • fastp_detect_adapter_for_pe (bool) – Automatically detect the adapter sequences for paired-end reads [keyword-only, default: True]

  • fastp_min_length (int) – Discard reads shorter than this length [keyword-only, default: 9]

  • bt2_local (bool) – Align reads in local mode rather than end-to-end mode [keyword-only, default: True]

  • bt2_discordant (bool) – Output paired-end reads whose mates align discordantly [keyword-only, default: False]

  • bt2_mixed (bool) – Attempt to align individual mates of pairs that fail to align [keyword-only, default: False]

  • bt2_dovetail (bool) – Consider dovetailed mate pairs to align concordantly [keyword-only, default: False]

  • bt2_contain (bool) – Consider nested mate pairs to align concordantly [keyword-only, default: True]

  • bt2_score_min_e2e (str) – Discard alignments that score below this threshold in end-to-end mode [keyword-only, default: ‘L,-1,-0.8’]

  • bt2_score_min_loc (str) – Discard alignments that score below this threshold in local mode [keyword-only, default: ‘L,1,0.8’]

  • bt2_i (int) – Discard paired-end alignments shorter than this many bases [keyword-only, default: 0]

  • bt2_x (int) – Discard paired-end alignments longer than this many bases [keyword-only, default: 600]

  • bt2_gbar (int) – Do not place gaps within this many bases from the end of a read [keyword-only, default: 4]

  • bt2_l (int) – Use this seed length for Bowtie2 [keyword-only, default: 20]

  • bt2_s (str) – Seed Bowtie2 alignments at this interval [keyword-only, default: ‘L,1,0.1’]

  • bt2_d (int) – Discard alignments if over this many consecutive seed extensions fail [keyword-only, default: 4]

  • bt2_r (int) – Re-seed reads with repetitive seeds up to this many times [keyword-only, default: 2]

  • bt2_dpad (int) – Pad the alignment matrix with this many bases (to allow gaps) [keyword-only, default: 2]

  • bt2_orient (str) – Require paired mates to have this orientation [keyword-only, default: ‘fr’]

  • bt2_un (bool) – Output unaligned reads to a FASTQ file [keyword-only, default: True]

  • min_mapq (int) – Discard reads with mapping qualities below this threshold [keyword-only, default: 25]

  • min_reads (int) – Discard alignment maps with fewer than this many reads [keyword-only, default: 1000]

  • sep_strands (bool) – Separate each alignment map into forward- and reverse-strand reads [keyword-only, default: False]

  • f1r2_fwd (bool) – With –sep-strands, consider forward mate 1s and reverse mate 2s to be forward-stranded [keyword-only, default: False]

  • rev_label (str) – With –sep-strands, add this label to each reverse-strand reference [keyword-only, default: ‘-rev’]

  • max_procs (int) – Run up to this many processes simultaneously [keyword-only, default: 4]

  • force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]

  • tmp_pfx – Write all temporary files to a directory with this prefix [keyword-only, default: ‘./tmp’]

class seismicrna.align.report.AlignRefReport(ref: str, **kwargs)

Bases: AlignReport

classmethod fields()

All fields of the report.

classmethod file_seg_type()

Type of the last segment in the path.

class seismicrna.align.report.AlignReport(**kwargs: Any | Callable[[Report], Any])

Bases: Report, ABC

classmethod auto_fields()

Names and automatic values of selected fields.

classmethod dir_seg_types()

Types of the directory segments in the path.

abstract classmethod fields()

All fields of the report.

class seismicrna.align.report.AlignSampleReport(ref: str | None = None, **kwargs)

Bases: AlignReport

classmethod fields()

All fields of the report.

classmethod file_seg_type()

Type of the last segment in the path.

seismicrna.align.split.run(fasta: str | Path, *, input_path: Iterable[str | Path], phred_enc: int = 33, out_dir: str | Path = './out', keep_tmp: bool = False, bt2_local: bool = True, bt2_discordant: bool = False, bt2_mixed: bool = False, bt2_dovetail: bool = False, bt2_contain: bool = True, bt2_score_min_e2e: str = 'L,-1,-0.8', bt2_score_min_loc: str = 'L,1,0.8', bt2_i: int = 0, bt2_x: int = 600, bt2_gbar: int = 4, bt2_l: int = 20, bt2_s: str = 'L,1,0.1', bt2_d: int = 4, bt2_r: int = 2, bt2_dpad: int = 2, bt2_orient: str = 'fr', min_mapq: int = 25, min_reads: int = 1000, sep_strands: bool = False, f1r2_fwd: bool = False, rev_label: str = '-rev', max_procs: int = 4, force: bool = False, tmp_pfx='./tmp') list[Path]

Trim FASTQ files and align them to reference sequences.

Parameters:
  • phred_enc (int) – Specify the Phred score encoding of FASTQ and SAM/BAM/CRAM files [keyword-only, default: 33]

  • out_dir (str | pathlib._local.Path) – Write all output files to this directory [keyword-only, default: ‘./out’]

  • keep_tmp (bool) – Keep temporary files after finishing [keyword-only, default: False]

  • bt2_local (bool) – Align reads in local mode rather than end-to-end mode [keyword-only, default: True]

  • bt2_discordant (bool) – Output paired-end reads whose mates align discordantly [keyword-only, default: False]

  • bt2_mixed (bool) – Attempt to align individual mates of pairs that fail to align [keyword-only, default: False]

  • bt2_dovetail (bool) – Consider dovetailed mate pairs to align concordantly [keyword-only, default: False]

  • bt2_contain (bool) – Consider nested mate pairs to align concordantly [keyword-only, default: True]

  • bt2_score_min_e2e (str) – Discard alignments that score below this threshold in end-to-end mode [keyword-only, default: ‘L,-1,-0.8’]

  • bt2_score_min_loc (str) – Discard alignments that score below this threshold in local mode [keyword-only, default: ‘L,1,0.8’]

  • bt2_i (int) – Discard paired-end alignments shorter than this many bases [keyword-only, default: 0]

  • bt2_x (int) – Discard paired-end alignments longer than this many bases [keyword-only, default: 600]

  • bt2_gbar (int) – Do not place gaps within this many bases from the end of a read [keyword-only, default: 4]

  • bt2_l (int) – Use this seed length for Bowtie2 [keyword-only, default: 20]

  • bt2_s (str) – Seed Bowtie2 alignments at this interval [keyword-only, default: ‘L,1,0.1’]

  • bt2_d (int) – Discard alignments if over this many consecutive seed extensions fail [keyword-only, default: 4]

  • bt2_r (int) – Re-seed reads with repetitive seeds up to this many times [keyword-only, default: 2]

  • bt2_dpad (int) – Pad the alignment matrix with this many bases (to allow gaps) [keyword-only, default: 2]

  • bt2_orient (str) – Require paired mates to have this orientation [keyword-only, default: ‘fr’]

  • min_mapq (int) – Discard reads with mapping qualities below this threshold [keyword-only, default: 25]

  • min_reads (int) – Discard alignment maps with fewer than this many reads [keyword-only, default: 1000]

  • sep_strands (bool) – Separate each alignment map into forward- and reverse-strand reads [keyword-only, default: False]

  • f1r2_fwd (bool) – With –sep-strands, consider forward mate 1s and reverse mate 2s to be forward-stranded [keyword-only, default: False]

  • rev_label (str) – With –sep-strands, add this label to each reverse-strand reference [keyword-only, default: ‘-rev’]

  • max_procs (int) – Run up to this many processes simultaneously [keyword-only, default: 4]

  • force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]

  • tmp_pfx – Write all temporary files to a directory with this prefix [keyword-only, default: ‘./tmp’]

seismicrna.align.split.split_xam_file(xam_file: Path, out_dir: Path, tmp_dir: Path, fasta: Path, phred_enc: int, force: bool, n_procs: int, **kwargs)
seismicrna.align.write.align_samples(fq_units: list[FastqUnit], fasta: Path, *, out_dir: Path, force: bool, **kwargs) list[Path]

Run the alignment pipeline and return a tuple of all XAM files from the pipeline.

seismicrna.align.write.calc_flags_sep_strands(f1r2_fwd: bool, paired: bool, bt2_mixed: bool)

Calculate flags for separating strands.

seismicrna.align.write.check_fqs_xams(alignments: dict[tuple[str, str], FastqUnit], out_dir: Path)

Return every FASTQ unit on which alignment must be run and every expected XAM file that already exists.

seismicrna.align.write.extract_reference(ref: str, header: str, *, xam_whole: Path, fasta: Path, sample: str, top: Path, min_reads: int, sep_strands: bool, n_procs: int = 1, **kwargs)

Extract one reference from a XAM file.

seismicrna.align.write.format_ref_reverse(ref: str, rev_label: str)

Name the reverse strand of a reference.

seismicrna.align.write.fq_pipeline(fq_inp: FastqUnit, fasta: Path, bowtie2_index: Path, *, out_dir: Path, tmp_dir: Path, keep_tmp: bool, fastp: bool, fastp_5: bool, fastp_3: bool, fastp_w: int, fastp_m: int, fastp_poly_g: str, fastp_poly_g_min_len: int, fastp_poly_x: bool, fastp_poly_x_min_len: int, fastp_adapter_trimming: bool, fastp_adapter_1: str, fastp_adapter_2: str, fastp_adapter_fasta: Path | None, fastp_detect_adapter_for_pe: bool, fastp_min_length: int, bt2_local: bool, bt2_discordant: bool, bt2_mixed: bool, bt2_dovetail: bool, bt2_contain: bool, bt2_score_min_e2e: str, bt2_score_min_loc: str, bt2_i: int, bt2_x: int, bt2_gbar: int, bt2_l: int, bt2_s: str, bt2_d: int, bt2_r: int, bt2_dpad: int, bt2_orient: str, bt2_un: bool, min_mapq: int, min_reads: int, sep_strands: bool, f1r2_fwd: bool, rev_label: str, n_procs: int = 1) list[Path]

Run all stages of the alignment pipeline for one FASTQ file or one pair of mated FASTQ files.

seismicrna.align.write.fqs_pipeline(fq_units: list[FastqUnit], main_fasta: Path, *, max_procs: int, out_dir: Path, tmp_dir: Path, keep_tmp: bool, **kwargs) list[Path]

Run all stages of alignment for one or more FASTQ files or pairs of mated FASTQ files.

seismicrna.align.write.list_alignments(fq_units: list[FastqUnit], refs: set[str])

List every expected alignment of a sample to a reference.

seismicrna.align.write.list_fqs_xams(alignments: dict[tuple[str, str], FastqUnit], out_dir: Path)

List every FASTQ to align and every extant XAM file.

seismicrna.align.write.merge_nondemult_fqs(fq_units: Iterable[FastqUnit])

For every FASTQ that is not demultiplexed, merge all the keys that map to the FASTQ into one key: (sample, None). Merging ensures that every non-demultiplexed FASTQ is aligned only once to the whole set of references, not once for every reference in the set. This function is essentially the inverse of figure_alignments.

seismicrna.align.write.separate_strands(xam_file: Path, fasta: Path, *, paired: bool | None, bt2_mixed: bool, rev_label: str, f1r2_fwd: bool, min_mapq: int, keep_tmp: bool, n_procs: int = 1, **kwargs)

Separate a XAM file into two XAM files of reads that aligned to the forward and reverse strands, respectively.

seismicrna.align.write.split_references(xam_whole: Path, *, fasta: Path, paired: bool | None, phred_arg: str, top: Path, keep_tmp: bool, bt2_local: bool, bt2_discordant: bool, bt2_mixed: bool, bt2_dovetail: bool, bt2_contain: bool, bt2_score_min_e2e: str, bt2_score_min_loc: str, bt2_i: int, bt2_x: int, bt2_gbar: int, bt2_l: int, bt2_s: str, bt2_d: int, bt2_r: int, bt2_dpad: int, bt2_orient: str, min_mapq: int, min_reads: int, sep_strands: bool, f1r2_fwd: bool, rev_label: str, n_procs: int = 1)

Split a XAM file into one file per reference.

seismicrna.align.write.write_tmp_ref_files(tmp_dir: Path, refset_path: Path, refs: set[str], n_procs: int = 1)

Write temporary FASTA files, each containing one reference that corresponds to a FASTQ file from demultiplexing.

Alignment XAM Generation Module


Alignment Score Parameters for Bowtie2

Consider this example: Ref = ACGT, Read = AG

Assume that we want to minimize the number of edits needed to convert the reference into the read sequence. The smallest number of edits is two, specifically these two deletions (/) from the reference: [A/G/] which gets a score of (2 * match - 2 * gap_open - 2 * gap_extend).

But there are two alternative alignments, each with 3 edits: [Ag//] and [A//g] (substitutions marked in lowercase). Each gets the score (match - substitution - gap_open - 2 * gap_extend).

In order to favor the simpler alignment with two edits, (2 * match - 2 * gap_open - 2 * gap_extend) must be greater than (match - substitution - gap_open - 2 * gap_extend). This inequality simplifies to (substitution > gap_open - match).

Thus, the substitution penalty and match bonus must be relatively large, and the gap open penalty small. We want to avoid introducing too many gaps, especially to prevent the introduction of an insertion and a deletion from scoring better than one substitution.

Consider this example: Ref = ATAT, Read = ACTT

The simplest alignment (the smallest number of mutations) is ActT, which gets a score of (2 * match - 2 * substitution). Another alignment with indels is A{C}T/T, where {C} means a C was inserted into the read and the / denotes an A deleted from the read. This alignment scores (3 * match - 2 * gap_open - 2 * gap_extend).

Thus, (2 * match - 2 * substitution) must be greater than (3 * match - 2 * gap_open - 2 * gap_extend), which simplifies to (2 * gap_open + 2 * gap_extend > match + 2 * substitution).

There are two easy solutions to these inequalities: - Bowtie v2.5 defaults: 6 > 5 - 2 and 2*5 + 2*3 > 2 + 2*6 - Set every score to 1: 1 > 1 - 1 and 2*1 + 2*1 > 1 + 2*1

seismicrna.align.xamops.bowtie2_build_cmd(fasta: Path, prefix: Path, *, n_procs: int = 1)

Build a Bowtie2 index of a FASTA file.

seismicrna.align.xamops.bowtie2_cmd(fq_inp: FastqUnit | None, sam_out: Path | None, *, paired: bool | None = None, phred_arg: str | None = None, index_pfx: Path, bt2_local: bool, bt2_discordant: bool, bt2_mixed: bool, bt2_dovetail: bool, bt2_contain: bool, bt2_score_min_e2e: str, bt2_score_min_loc: str, bt2_i: int, bt2_x: int, bt2_gbar: int, bt2_l: int, bt2_s: str, bt2_d: int, bt2_r: int, bt2_dpad: int, bt2_orient: str, fq_unal: Path | None = None, n_procs: int)
seismicrna.align.xamops.export_cmd(xam_in: Path, xam_out: Path, *, ref: str, header: str, ref_file: Path | None = None, n_procs: int = 1)

Select and export one reference to its own XAM file.

seismicrna.align.xamops.fastp_cmd(fq_inp: FastqUnit, fq_out: FastqUnit | None, *, fastp_html: Path, fastp_json: Path, fastp_5: bool, fastp_3: bool, fastp_w: int, fastp_m: int, fastp_poly_g: str, fastp_poly_g_min_len: int, fastp_poly_x: bool, fastp_poly_x_min_len: int, fastp_adapter_trimming: bool, fastp_adapter_1: str, fastp_adapter_2: str, fastp_adapter_fasta: Path | None, fastp_detect_adapter_for_pe: bool, fastp_min_length: int, n_procs: int)
seismicrna.align.xamops.flags_cmd(*args, **kwargs)

Filter a XAM file based on flags, then collate the output.

seismicrna.align.xamops.flags_cmds(xam_inp: Path, xam_out: Path | None, *, tmp_pfx: Path | None = None, flags_req: int | Iterable[int] = (), flags_exc: int | Iterable[int] = (), n_procs: int = 1)

Filter a XAM file based on flags, then collate the output.

seismicrna.align.xamops.get_bowtie2_index_paths(prefix: Path)

Return the Bowtie 2 index paths for a FASTA file.

seismicrna.align.xamops.parse_bowtie2(process: CompletedProcess)

Get the number of reads input and aligned.

seismicrna.align.xamops.realign_cmd(xam_inp: Path, xam_out: Path, *, paired: bool | None = None, tmp_pfx: Path | None = None, flags_req: int | Iterable[int] = (), flags_exc: int | Iterable[int] = (), min_mapq: int | None = None, n_procs: int = 1, **kwargs)

Re-align reads that are already in a XAM file.

seismicrna.align.xamops.xamgen_cmd(fq_inp: FastqUnit, bam_out: Path, *, fastp: bool, fastp_dir: Path, fastp_5: bool, fastp_3: bool, fastp_w: int, fastp_m: int, fastp_poly_g: str, fastp_poly_g_min_len: int, fastp_poly_x: bool, fastp_poly_x_min_len: int, fastp_adapter_trimming: bool, fastp_adapter_1: str, fastp_adapter_2: str, fastp_adapter_fasta: Path | None, fastp_detect_adapter_for_pe: bool, fastp_min_length: int, index_pfx: Path, bt2_local: bool, bt2_discordant: bool, bt2_mixed: bool, bt2_dovetail: bool, bt2_contain: bool, bt2_score_min_e2e: str, bt2_score_min_loc: str, bt2_i: int, bt2_x: int, bt2_gbar: int, bt2_l: int, bt2_s: str, bt2_d: int, bt2_r: int, bt2_dpad: int, bt2_orient: str, fq_unal: Path | None = None, min_mapq: int | None = None, n_procs: int = 1)

Wrap QC, alignment, and post-processing into one pipeline.