seismicrna.core.table package

Submodules

class seismicrna.core.table.base.AbundanceTable

Bases: Table, ABC

Table of abundances.

classmethod by_read()

Whether the table contains data for each read.

property data: Series

Table’s data.

classmethod index_depth()

Number of columns in the index.

classmethod path_segs()

Table’s path segments.

property proportions: Series

Proportion of each item.
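
To illustrate what "proportion of each item" could mean, the sketch below divides each abundance by the total of all abundances. It uses a plain dict as a stand-in for the pandas Series that the property returns. The function name proportions_of and the item labels are hypothetical and not part of seismicrna, and the handling of a zero total is an assumption.

```python
def proportions_of(abundances: dict[str, float]) -> dict[str, float]:
    """Divide each abundance by the total (hypothetical helper)."""
    total = sum(abundances.values())
    if total == 0:
        # Assumption: with nothing to normalize by, report NaN for every item.
        return {item: float("nan") for item in abundances}
    return {item: count / total for item, count in abundances.items()}


# Hypothetical abundances for two items.
example = proportions_of({"item 1": 30.0, "item 2": 10.0})
```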

class seismicrna.core.table.base.PositionTable

Bases: RelTypeTable, ABC

Table indexed by position.

MASK = 'pos-mask'

classmethod by_read()

Whether the table contains data for each read.

ci_count(confidence: float, **kwargs)

Confidence intervals of counts, under these simplifications:

  • Counts are independent of each other.

  • Counts follow binomial distributions.

  • Coverage counts are constant.

Parameters:
  • confidence (float) – Confidence level; must be in [0, 1).

  • **kwargs – Keyword arguments for fetch methods.

Returns:

Lower and upper bounds of the confidence interval.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
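
The three simplifications above can be sketched for a single position as follows. This is a minimal standard-library illustration, not the implementation in seismicrna: it assumes the success probability is estimated as count / coverage and that the interval is equal-tailed, and the names binom_quantile and count_interval are hypothetical.

```python
from math import comb


def binom_quantile(q: float, n: int, p: float) -> int:
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if cumulative >= q:
            return k
    # Guard against rounding error leaving the sum just below q.
    return n


def count_interval(count: int, coverage: int, confidence: float) -> tuple[int, int]:
    """Equal-tailed interval for one count, with coverage held constant."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in [0, 1)")
    if coverage <= 0 or not 0 <= count <= coverage:
        raise ValueError("need 0 <= count <= coverage and coverage > 0")
    p = count / coverage  # assumption: plug-in estimate of the probability
    tail = (1.0 - confidence) / 2.0
    return binom_quantile(tail, coverage, p), binom_quantile(1.0 - tail, coverage, p)


lower, upper = count_interval(count=20, coverage=100, confidence=0.95)
```

With confidence = 0 both bounds collapse to the median of the distribution, which is why the lower limit of the allowed range is inclusive.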

ci_ratio(confidence: float, **kwargs)

Confidence intervals of ratios, under these simplifications:

  • Ratios are independent of each other.

  • Ratios follow beta distributions.

  • Coverage counts are constant.

Parameters:
  • confidence (float) – Confidence level; must be in [0, 1).

  • **kwargs – Keyword arguments for fetch methods.

Returns:

Lower and upper bounds of the confidence interval.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
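
As a rough illustration of a beta-distributed ratio, the sketch below computes an equal-tailed interval for one position. The choice of Beta(count + 1, coverage - count + 1) is an assumption made for this example (it corresponds to a uniform prior), and the documentation does not state which parameters seismicrna uses. The helper names are hypothetical. For integer parameters, the beta CDF equals a binomial tail sum, so only the standard library is needed.

```python
from math import comb


def beta_cdf(x: float, a: int, b: int) -> float:
    """CDF of Beta(a, b) at x, for positive integers a and b.

    Uses the identity I_x(a, b) = P(Y >= a) for Y ~ Binomial(a + b - 1, x).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j) for j in range(a, n + 1))


def beta_quantile(q: float, a: int, b: int, iterations: int = 60) -> float:
    """Invert the CDF by bisection on [0, 1]."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if beta_cdf(mid, a, b) < q:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def ratio_interval(count: int, coverage: int, confidence: float) -> tuple[float, float]:
    """Equal-tailed interval for one ratio, with coverage held constant."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in [0, 1)")
    if not 0 <= count <= coverage:
        raise ValueError("need 0 <= count <= coverage")
    # Assumption: uniform prior, giving Beta(count + 1, coverage - count + 1).
    a, b = count + 1, coverage - count + 1
    tail = (1.0 - confidence) / 2.0
    return beta_quantile(tail, a, b), beta_quantile(1.0 - tail, a, b)


lower, upper = ratio_interval(count=20, coverage=100, confidence=0.95)
```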

property end3

property end5

classmethod index_depth()

Number of columns in the index.

iter_profiles(*, regions: Iterable[Region] | None = None, quantile: float = 0.0, rel: str = 'Mutated', k: int | None = None, clust: int | None = None)

Yield RNA mutational profiles from the table.

property range

property range_int

property region

Region covered by the table.

resample(fraction: float = 1.0, *, exclude_masked: bool = False, seed: int | None = None, max_seed: int = 4294967296)

Resample the reads and return a new DataFrame.

Parameters:
  • fraction (float = 1.) – Number of reads to resample, expressed as a fraction of the original number of reads. Must be ≥ 0; may be > 1.

  • exclude_masked (bool = False) – Exclude positions that have been masked.

  • seed (int | None = None) – Seed for the random number generator.

  • max_seed (int = 2 ** 32) – Maximum seed to pass to the next random number generator.
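
The roles of fraction, seed, and max_seed can be sketched with the standard library. This example assumes sampling with replacement (which is what makes fraction > 1 possible) and assumes that max_seed bounds a seed drawn for a later generator. It works on a list of read identifiers rather than a table, and the function resample_reads is hypothetical, not the method documented above.

```python
import random
from typing import Optional


def resample_reads(read_ids: list[int],
                   fraction: float = 1.0,
                   *,
                   seed: Optional[int] = None,
                   max_seed: int = 2 ** 32) -> tuple[list[int], int]:
    """Draw round(fraction * len(read_ids)) reads with replacement."""
    if fraction < 0.0:
        raise ValueError("fraction must be >= 0")
    rng = random.Random(seed)
    num_draws = round(fraction * len(read_ids))
    sample = rng.choices(read_ids, k=num_draws) if num_draws else []
    # Assumption: a seed in [0, max_seed) is drawn for the next generator.
    next_seed = rng.randrange(max_seed)
    return sample, next_seed


sample, next_seed = resample_reads(list(range(10)), 1.5, seed=0)
```

Passing the same seed reproduces the same sample, while seed=None gives a different sample on each call.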

property seq

class seismicrna.core.table.base.ReadTable

Bases: RelTypeTable, ABC

Table indexed by read.

classmethod by_read()

Whether the table contains data for each read.

classmethod index_depth()

Number of columns in the index.

property reads

class seismicrna.core.table.base.RelTypeTable

Bases: Table, ABC

Table with multiple types of relationships.

property data: DataFrame

Table’s data.

fetch_count(*, exclude_masked: bool = False, squeeze: bool = False, **kwargs) → Series | DataFrame

Fetch counts of one or more columns.

fetch_ratio(*, exclude_masked: bool = False, squeeze: bool = False, precision: int | None = None, quantile: float = 0.0, **kwargs) → Series | DataFrame

Fetch ratios of one or more columns.

classmethod header_rows() → list[int]

Row(s) of the file to use as the columns.

class seismicrna.core.table.base.Table

Bases: ABC

Table base class.

classmethod build_path(**path_fields)

Build the path of a table’s CSV file using the fields.

abstract classmethod by_read() → bool

Whether the table contains data for each read.

property data: DataFrame | Series

Table’s data.

classmethod default_path_fields()

Default values of the path fields.

classmethod ext()

Table’s file extension: either “.csv” or “.csv.gz”.

classmethod gzipped()

Whether the table’s file is compressed with gzip.
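
One plausible way for ext() to depend on gzipped() is sketched below, so that a subclass only needs to override gzipped(). The class names are hypothetical, and how seismicrna actually relates the two methods is an assumption.

```python
class ExampleCsvTable:
    """Hypothetical stand-in for a Table subclass."""

    @classmethod
    def gzipped(cls) -> bool:
        # Assumption for this sketch: tables are uncompressed by default.
        return False

    @classmethod
    def ext(cls) -> str:
        # The extension follows from whether the file is gzipped.
        return ".csv.gz" if cls.gzipped() else ".csv"


class ExampleGzippedTable(ExampleCsvTable):
    """Hypothetical subclass whose file is compressed."""

    @classmethod
    def gzipped(cls) -> bool:
        return True


extensions = (ExampleCsvTable.ext(), ExampleGzippedTable.ext())
```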

property header

Header for the table’s data.

classmethod header_depth()

abstract classmethod header_type() → type[Header]

Type of the header for the table.

classmethod index_cols() → list[int]

Column(s) of the file to use as the index.

abstract classmethod index_depth() → int

Number of columns in the index.
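
A minimal sketch of how index_cols() might be derived from index_depth(): the first index_depth() columns of the file, numbered from zero. A list of this form is what the index_col parameter of pandas.read_csv accepts. Whether seismicrna computes the list this way is an assumption, and the class names and depths are hypothetical.

```python
class ExampleOneLevelTable:
    """Hypothetical table whose index has one column."""

    @classmethod
    def index_depth(cls) -> int:
        return 1

    @classmethod
    def index_cols(cls) -> list[int]:
        # Assumption: the index occupies the leading columns of the file.
        return list(range(cls.index_depth()))


class ExampleTwoLevelTable(ExampleOneLevelTable):
    """Hypothetical table whose index has two columns."""

    @classmethod
    def index_depth(cls) -> int:
        return 2


one_level = ExampleOneLevelTable.index_cols()
two_level = ExampleTwoLevelTable.index_cols()
```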

abstract classmethod kind() → str

Kind of table.

property path

Path of the table’s CSV file (possibly gzipped).

abstract property path_fields: dict[str, Any]

Table’s path fields.

abstract classmethod path_segs() → tuple[Segment, ...]

Table’s path segments.

abstract property ref: str

Name of the table’s reference.

property refseq: DNA

Reference sequence.

abstract property reg: str

Name of the table’s region.

abstract property sample: str

Name of the table’s sample.

abstract property top: Path

Path of the table’s output directory.

seismicrna.core.table.base.all_patterns(subpattern: RelPattern | None = None)

Every RelPattern, keyed by its name.

seismicrna.core.table.base.get_pattern(rel: str)

Get a RelPattern from the name of its relationship.

seismicrna.core.table.base.get_rel_name(rel_code: str)

Get the name of a relationship from its code.

seismicrna.core.table.base.get_subpattern(rel: str, subpattern: RelPattern | None = None)

Get a RelPattern, optionally with masking.

class seismicrna.core.table.write.AbundanceTableWriter(tabulator: Tabulator)

Bases: TableWriter, AbundanceTable, ABC

property data

Table’s data.

class seismicrna.core.table.write.BatchTabulator(*, get_batch_count_all: Callable, num_batches: int, max_procs: int = 1, **kwargs)

Bases: Tabulator, ABC

Tabulator that accepts batches as the input data.

class seismicrna.core.table.write.CountTabulator(*, batch_counts: Iterable[tuple[Any, Any, Any, Any]], **kwargs)

Bases: Tabulator, ABC

Tabulator that accepts pre-counted data from batches.

class seismicrna.core.table.write.DatasetTabulator(*, dataset: MutsDataset, validate: bool = False, **kwargs)

Bases: BatchTabulator, ABC

Tabulator made from one dataset.

classmethod dataset_types()

Types of Dataset this Tabulator can process.

classmethod init_kws()

Attributes of the dataset to use as keyword arguments in super().__init__().

abstract classmethod load_function() → LoadFunction

LoadFunction for all Dataset types for this Tabulator.

class seismicrna.core.table.write.PositionTableWriter(tabulator: Tabulator)

Bases: TableWriter, PositionTable, ABC

property data

Table’s data.

class seismicrna.core.table.write.ReadTableWriter(tabulator: Tabulator)

Bases: TableWriter, ReadTable, ABC

property data

Table’s data.

class seismicrna.core.table.write.TableWriter(tabulator: Tabulator)

Bases: Table, ABC

Write a table to a file.

property columns

property ref

Name of the table’s reference.

property refseq

Reference sequence.

property reg

Name of the table’s region.

property sample

Name of the table’s sample.

property top

Path of the table’s output directory.

write(force: bool)

Write the table’s rounded data to the table’s CSV file.
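
The sketch below shows one way a force flag and rounding could interact when writing a CSV file. It assumes that an existing file is left untouched unless force is true and that floats are rounded to a fixed number of decimals. Neither behavior is confirmed by this reference, and write_rounded and its precision parameter are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def write_rounded(path: Path, rows, *, force: bool, precision: int = 3) -> bool:
    """Write rows to a CSV file; return whether the file was written."""
    if path.exists() and not force:
        # Assumption: without force, an existing file is not overwritten.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            # Round only the floating-point values; keep labels as they are.
            writer.writerow([round(value, precision)
                             if isinstance(value, float) else value
                             for value in row])
    return True


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "out" / "table.csv"
    rows = [("Position", "Mutated"), (1, 0.123456)]
    first = write_rounded(csv_path, rows, force=False)
    second = write_rounded(csv_path, rows, force=False)
    lines = csv_path.read_text().splitlines()
```

Here the first call writes the file and the second call returns False because the file already exists and force is false.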

class seismicrna.core.table.write.Tabulator(*, top: Path, sample: str, region: Region, count_ends: bool, count_pos: bool, count_read: bool, validate: bool = True)

Bases: ABC

Base class for tabulating data for one or more tables.

property counts_per_pos

Raw counts per position.

property counts_per_read

Raw counts per read.

property data_per_clust: Series | None

Series of per-cluster data (or None if no clusters).

property data_per_pos

DataFrame of per-position data.

property data_per_read

DataFrame of per-read data.

property end_counts

Raw counts for each pair of end coordinates.

generate_tables(*, pos: bool = True, read: bool = True, clust: bool = True)

Generate the tables from this data.

abstract classmethod get_null_value() → int | float

The null value for a count: either 0 or NaN.

property num_reads

Raw number of reads.

property pos_header

Header of the per-position data.

property read_header

Header of the per-read data.

property ref

Name of the reference.

abstract classmethod table_types() → list[type[TableWriter]]

Types of tables that this tabulator can write.

write_tables(*, force: bool = False, **kwargs)