seismicrna.importmm package

Submodules

seismicrna.importmm.main.run(input_path: Iterable[str | Path] = Sentinel.UNSET, *, sample: str = 'sim-sample', out_dir: str | Path = './out', branch: str = '', min_reads: int = 1000, batch_size: int = 65536, insert3: bool = True, write_read_names: bool = False, relate_pos_table: bool = True, relate_read_table: bool = False, brotli_level: int = 10, num_cpus: int = 4, force: bool = False, tmp_pfx='./tmp', keep_tmp=False)

Import RNA Framework Mutation Map (MM) files as relate outputs.

Parameters:
  • sample (str) – Give this name to the simulated sample [keyword-only, default: 'sim-sample']

  • out_dir (str | pathlib.Path) – Write all output files to this directory [keyword-only, default: './out']

  • branch (str) – Create a new branch of the workflow with this name [keyword-only, default: '' (empty string)]

  • min_reads (int) – Discard alignment maps with fewer than this many reads [keyword-only, default: 1000]

  • batch_size (int) – Limit batches to at most this many reads [keyword-only, default: 65536]

  • insert3 (bool) – Mark each insertion on the base to its 3’ (True) or 5’ (False) side [keyword-only, default: True]

  • write_read_names (bool) – Write the name of each read in a second set of batches (necessary for the options --mask-read or --mask-read-file) [keyword-only, default: False]

  • relate_pos_table (bool) – Tabulate relationships per position for relate data [keyword-only, default: True]

  • relate_read_table (bool) – Tabulate relationships per read for relate data [keyword-only, default: False]

  • brotli_level (int) – Compress pickle files with this level of Brotli (0 to 11) [keyword-only, default: 10]

  • num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]

  • force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]

  • tmp_pfx – Write all temporary files to a directory with this prefix [keyword-only, default: './tmp']

  • keep_tmp – Keep temporary files after finishing [keyword-only, default: False]
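
As a rough illustration of how these options fit together, the sketch below wraps run() in a small helper. The helper name import_mm_files, the IMPORT_OPTIONS dictionary, the sample name and the file paths are hypothetical; only the keyword names and default values come from the signature above. What run() returns is not documented in this section, so the helper simply passes it through.

```python
from pathlib import Path

# Keyword options taken from the documented signature of run().
# The sample name and output directory are hypothetical placeholders.
IMPORT_OPTIONS = {
    "sample": "my-sample",
    "out_dir": Path("./out"),
    "min_reads": 1000,
    "batch_size": 65536,
    "write_read_names": True,  # needed for --mask-read / --mask-read-file
    "brotli_level": 10,
    "num_cpus": 4,
    "force": False,
}


def import_mm_files(mm_paths, **overrides):
    """Call seismicrna.importmm.main.run on the given MM files (sketch)."""
    # Imported lazily so this module loads even if seismicrna is absent.
    from seismicrna.importmm.main import run

    # Per-call overrides take precedence over the shared options.
    options = {**IMPORT_OPTIONS, **overrides}
    # input_path is documented as an iterable of str or Path.
    return run([Path(p) for p in mm_paths], **options)
```

With SEISMIC-RNA installed, a call such as import_mm_files(["example.mm"], sample="rep1") would import one file under a different sample name; example.mm is a placeholder path.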

seismicrna.importmm.mm.iter_mm_file(mm_path: Path) → Iterator[tuple[str, DNA, list[tuple[int, int, list[int]]]]]

Iterate over transcript blocks in an MM file.

Yields (ref_id, refseq, reads) for each transcript, where reads is a list of (start, end, mut_positions) tuples (all positions 0-based).
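
To make the yielded structure concrete, here is a minimal sketch that tallies mutations per position from blocks of that shape. The function count_mutations and the example data are hypothetical, and plain strings stand in for DNA objects; the sketch assumes only that refseq supports len(). Because this section does not say whether end is inclusive, the sketch does not use it.

```python
from collections import Counter


def count_mutations(blocks):
    """Tally reads and mutations per 0-based position for each transcript."""
    summary = {}
    for ref_id, refseq, reads in blocks:
        mut_counts = Counter()
        # Each read is (start, end, mut_positions); only the mutated
        # positions are needed for this tally.
        for _start, _end, mut_positions in reads:
            mut_counts.update(mut_positions)
        summary[ref_id] = {
            "length": len(refseq),
            "num_reads": len(reads),
            "mut_counts": dict(mut_counts),
        }
    return summary


# Hypothetical stand-in for what iter_mm_file might yield.
example_blocks = [
    ("tx1", "ACGTACGT", [(0, 5, [2, 4]), (1, 7, [4]), (3, 7, [])]),
]
result = count_mutations(example_blocks)
# result["tx1"]["mut_counts"] is {2: 1, 4: 2}
```

With SEISMIC-RNA installed, the same function should accept the iterator directly, for example count_mutations(iter_mm_file(Path("example.mm"))), where example.mm is a placeholder path.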

seismicrna.importmm.write.import_mm(mm_path: Path, *, sample: str, out_dir: Path, tmp_dir: Path, branch: str, min_reads: int, batch_size: int, insert3: bool, write_read_names: bool, relate_pos_table: bool, relate_read_table: bool, brotli_level: int, force: bool) → list[Path]

Convert all transcript blocks in one MM file into relate outputs.

Returns a list of output directories (one per reference that was successfully imported).