seismicrna.core.table package

Submodules

class seismicrna.core.table.base.AbundanceTable

Bases: Table, ABC

Table of abundances.

property data: Series

Table’s data.

classmethod get_by_read()

Whether the table contains data for each read.

classmethod get_file_seg_type()

Type of the last segment in the path.

classmethod get_index_depth()

Number of columns in the index.

property proportions: Series

Proportion of each item.

class seismicrna.core.table.base.PositionTable

Bases: RelTypeTable, ABC

Table indexed by position.

MASK = 'pos-mask'

ci_count(confidence: float, **kwargs)

Confidence intervals of counts, under these simplifications:

  • Counts are independent of each other.

  • Counts follow binomial distributions.

  • Coverage counts are constant.

Parameters:
  • confidence (float) – Confidence level; must be in [0, 1).

  • **kwargs – Keyword arguments for fetch methods.

Returns:

Lower and upper bounds of the confidence interval.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
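The simplifications listed for ci_count can be illustrated with a small sketch for a single count. This is not SEISMIC-RNA's implementation: the helper names binom_quantile and count_interval are hypothetical, and the sketch assumes that each count is a binomial draw whose number of trials is the (constant, positive) coverage and whose success probability is count / coverage.

```python
from math import comb


def binom_quantile(n: int, p: float, q: float) -> int:
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p).

    Sums the probability mass directly, so it suits small n only.
    """
    cdf = 0.0
    for k in range(n + 1):
        cdf += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if cdf >= q:
            return k
    return n


def count_interval(count: int,
                   coverage: int,
                   confidence: float) -> tuple[int, int]:
    """Central interval for one count, treating coverage as constant."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError(f"confidence must be in [0, 1), got {confidence}")
    # Assumption: the success probability is the observed fraction.
    p = count / coverage
    # Split the excluded probability evenly between the two tails.
    tail = (1.0 - confidence) / 2.0
    return (binom_quantile(coverage, p, tail),
            binom_quantile(coverage, p, 1.0 - tail))


lower, upper = count_interval(count=10, coverage=100, confidence=0.95)
print(lower, upper)
```

Because each count is treated independently, the same calculation could be applied element by element to a whole table, which is consistent with the method returning one DataFrame of lower bounds and one of upper bounds.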

ci_ratio(confidence: float, **kwargs)

Confidence intervals of ratios, under these simplifications:

  • Ratios are independent of each other.

  • Ratios follow beta distributions.

  • Coverage counts are constant.

Parameters:
  • confidence (float) – Confidence level; must be in [0, 1).

  • **kwargs – Keyword arguments for fetch methods.

Returns:

Lower and upper bounds of the confidence interval.

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
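For ratios, a comparable sketch can approximate the interval by sampling from a beta distribution with the standard library. The function ratio_interval and its shape parameters are assumptions made for illustration (alpha = ratio × coverage, beta = (1 − ratio) × coverage); the source does not state how SEISMIC-RNA parameterizes the distribution. A library such as SciPy (scipy.stats.beta.ppf) would give exact quantiles instead of sampled ones.

```python
import random


def ratio_interval(ratio: float,
                   coverage: int,
                   confidence: float,
                   num_samples: int = 20000,
                   seed: int = 0) -> tuple[float, float]:
    """Approximate central interval of a ratio via beta sampling."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError(f"confidence must be in [0, 1), got {confidence}")
    # Assumption: shape parameters are the expected numbers of
    # "successes" and "failures" at constant coverage.
    alpha = ratio * coverage
    beta = (1.0 - ratio) * coverage
    if alpha <= 0.0 or beta <= 0.0:
        # Degenerate case: the beta distribution is undefined.
        return ratio, ratio
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta)
                     for _ in range(num_samples))
    # Read the empirical quantiles at both tails.
    tail = (1.0 - confidence) / 2.0
    lower = samples[int(tail * (num_samples - 1))]
    upper = samples[int((1.0 - tail) * (num_samples - 1))]
    return lower, upper


lower, upper = ratio_interval(ratio=0.1, coverage=1000, confidence=0.95)
print(round(lower, 3), round(upper, 3))
```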

property end3

property end5

classmethod get_by_read()

Whether the table contains data for each read.

classmethod get_file_seg_type()

Type of the last segment in the path.

classmethod get_index_depth()

Number of columns in the index.

iter_profiles(*, regions: Iterable[Region] | None = None, quantile: float = 0.0, rel: str = 'Mutated', k: int | None = None, clust: int | None = None)

Yield RNA mutational profiles from the table.

property range

property range_int

property region

Region covered by the table.

resample(fraction: float = 1.0, *, exclude_masked: bool = False, seed: int | None = None, max_seed: int = 4294967296)

Resample the reads and return a new DataFrame.

Parameters:
  • fraction (float = 1.) – Number of reads to resample, expressed as a fraction of the original number of reads. Must be ≥ 0; may be > 1.

  • exclude_masked (bool = False) – Exclude positions that have been masked.

  • seed (int | None = None) – Seed for the random number generator.

  • max_seed (int = 2 ** 32) – Maximum seed to pass to the next random number generator.
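The parameters of resample can be pictured with a read-level sketch. It rests on assumptions the source does not confirm: that reads are drawn with replacement (suggested by fraction being allowed to exceed 1) and that max_seed bounds a seed drawn for a subsequent generator. The function resample_reads is hypothetical, and it returns a list of reads rather than the DataFrame that the real method returns.

```python
import random
from typing import Optional


def resample_reads(reads: list[str],
                   fraction: float = 1.0,
                   *,
                   seed: Optional[int] = None,
                   max_seed: int = 2 ** 32) -> tuple[list[str], int]:
    """Draw reads with replacement; also return a seed for the next RNG."""
    if fraction < 0.0:
        raise ValueError(f"fraction must be >= 0, got {fraction}")
    rng = random.Random(seed)
    # The number of draws is a fraction of the original number of reads.
    num_draws = round(fraction * len(reads))
    sample = rng.choices(reads, k=num_draws) if reads else []
    # Assumption: the next generator is seeded from [0, max_seed).
    next_seed = rng.randrange(max_seed)
    return sample, next_seed


reads = [f"read{i}" for i in range(10)]
sample, next_seed = resample_reads(reads, 1.5, seed=42)
print(len(sample), next_seed)
```

Passing the same seed reproduces the same sample, while seed=None lets the generator seed itself from the operating system.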

property seq

class seismicrna.core.table.base.ReadTable

Bases: RelTypeTable, ABC

Table indexed by read.

classmethod get_by_read()

Whether the table contains data for each read.

classmethod get_file_seg_type()

Type of the last segment in the path.

classmethod get_index_depth()

Number of columns in the index.

property reads

class seismicrna.core.table.base.RelTypeTable

Bases: Table, ABC

Table with multiple types of relationships.

property data: DataFrame

Table’s data.

fetch_count(*, exclude_masked: bool = False, squeeze: bool = False, **kwargs) → Series | DataFrame

Fetch counts of one or more columns.

fetch_ratio(*, exclude_masked: bool = False, squeeze: bool = False, precision: int | None = None, quantile: float = 0.0, **kwargs) → Series | DataFrame

Fetch ratios of one or more columns.
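As a rough illustration of how a ratio relates to a count, the sketch below divides a count by a denominator and optionally rounds the result. The function name ratio, the choice of denominator, and the NaN result for a zero denominator are assumptions; the sketch does not cover the quantile parameter, whose effect is not described here.

```python
import math
from typing import Optional


def ratio(count: float,
          total: float,
          precision: Optional[int] = None) -> float:
    """Ratio of a count to a total, optionally rounded."""
    if total == 0:
        # Assumption: an undefined ratio is reported as NaN.
        return math.nan
    value = count / total
    # Round only when a precision is requested.
    return value if precision is None else round(value, precision)


print(ratio(5, 200))
print(ratio(1, 3, precision=3))
```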

classmethod get_header_rows() → list[int]

Row(s) of the file to use as the columns.

class seismicrna.core.table.base.Table

Bases: HasRefFilePath, ABC

Table base class.

property branches: dict[str, str]

Branches of the workflow.

property data: DataFrame | Series

Table’s data.

classmethod get_auto_path_fields()

Names and path fields that have automatic values.

abstractmethod classmethod get_by_read() → bool

Whether the table contains data for each read.
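The pattern of abstract classmethods that subclasses override can be sketched with the standard abc module. The classes below are hypothetical stand-ins, not the real Table hierarchy; the index depths they return and the way get_index_cols derives from get_index_depth are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class SketchTable(ABC):
    """Hypothetical stand-in for a table base class."""

    @classmethod
    @abstractmethod
    def get_by_read(cls) -> bool:
        """Whether the table contains data for each read."""

    @classmethod
    @abstractmethod
    def get_index_depth(cls) -> int:
        """Number of columns in the index."""

    @classmethod
    def get_index_cols(cls) -> list[int]:
        # Assumption: the index occupies the first columns of the file.
        return list(range(cls.get_index_depth()))


class SketchPositionTable(SketchTable):
    @classmethod
    def get_by_read(cls) -> bool:
        return False

    @classmethod
    def get_index_depth(cls) -> int:
        return 2  # hypothetical value


class SketchReadTable(SketchTable):
    @classmethod
    def get_by_read(cls) -> bool:
        return True

    @classmethod
    def get_index_depth(cls) -> int:
        return 1  # hypothetical value


print(SketchPositionTable.get_index_cols())
print(SketchReadTable.get_by_read())
```

Because the methods are classmethods, they can be queried on the class itself without loading any data, and the abstract base cannot be instantiated until every abstract method is overridden.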

classmethod get_ext()

File extension.

classmethod get_header_depth()

abstractmethod classmethod get_header_type() → type[Header]

Type of the header for the table.

classmethod get_index_cols() → list[int]

Column(s) of the file to use as the index.
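Lists of index columns and header rows map naturally onto the index_col and header arguments of pandas.read_csv. Whether SEISMIC-RNA passes them to that function is an assumption; the sketch below only shows how such lists behave when a table with a two-level index and two-level columns is written to CSV and read back. The level names and values are invented for the example.

```python
import io

import pandas as pd

# Build a small table with a two-level index and two-level columns.
index = pd.MultiIndex.from_tuples([(1, "A"), (2, "C")],
                                  names=["Position", "Base"])
columns = pd.MultiIndex.from_tuples([("c1", "Matched"), ("c1", "Mutated")],
                                    names=["Cluster", "Relationship"])
table = pd.DataFrame([[90, 10], [95, 5]], index=index, columns=columns)

# Write the table as CSV text, then read it back.
text = table.to_csv()
header_rows = [0, 1]  # rows of the file to use as the columns
index_cols = [0, 1]   # columns of the file to use as the index
loaded = pd.read_csv(io.StringIO(text),
                     header=header_rows,
                     index_col=index_cols)

print(loaded.index.nlevels, loaded.columns.nlevels)
```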

abstractmethod classmethod get_index_depth() → int

Number of columns in the index.

abstractmethod classmethod get_load_function() → LoadFunction

LoadFunction for all Dataset types for this Table.

property header

Header for the table’s data.

property path

Path of the table’s file.

property ref: str

Name of the table’s reference.

property refseq: DNA

Reference sequence.

property reg: str

Name of the table’s region.

property sample: str

Name of the table’s sample.

property top: Path

Path of the table’s output directory.

seismicrna.core.table.base.all_patterns(subpattern: RelPattern | None = None)

Every RelPattern, keyed by its name.

seismicrna.core.table.base.get_pattern(rel: str)

Get a RelPattern from the name of its relationship.

seismicrna.core.table.base.get_rel_name(rel_code: str)

Get the name of a relationship from its code.

seismicrna.core.table.base.get_subpattern(rel: str, subpattern: RelPattern | None = None)

Get a RelPattern, optionally with masking.

class seismicrna.core.table.load.PositionTableLoader(table_file: str | Path, **kwargs)

Bases: RelTypeTableLoader, PositionTable, ABC

Load data indexed by position.

class seismicrna.core.table.load.ReadTableLoader(table_file: str | Path, **kwargs)

Bases: RelTypeTableLoader, ReadTable, ABC

Load data indexed by read.

class seismicrna.core.table.load.RelTypeTableLoader(table_file: str | Path, **kwargs)

Bases: TableLoader, RelTypeTable, ABC

Load a table of relationship types.

property data: DataFrame

Table’s data.

class seismicrna.core.table.load.TableLoader(table_file: str | Path, **kwargs)

Bases: Table, ABC

Load a table from a file.

classmethod find_tables(paths: Iterable[str | Path])

Yield files of the tables within the given paths.

classmethod load_tables(paths: Iterable[str | Path], **kwargs)

Yield tables within the given paths.
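Finding table files within a set of paths can be sketched as a generator over files and directories. The function find_table_files and the ".csv" extension are assumptions for illustration; the real find_tables may match files by other criteria.

```python
from pathlib import Path
from typing import Iterable, Iterator, Union


def find_table_files(paths: Iterable[Union[str, Path]],
                     ext: str = ".csv") -> Iterator[Path]:
    """Yield each file with the given extension in or under the paths."""
    seen = set()
    for path in map(Path, paths):
        if path.is_dir():
            # Search directories recursively, in a stable order.
            candidates = sorted(path.rglob(f"*{ext}"))
        elif path.is_file() and path.name.endswith(ext):
            candidates = [path]
        else:
            candidates = []
        for file in candidates:
            # Yield each file once, even if several paths include it.
            if file.is_file() and file not in seen:
                seen.add(file)
                yield file
```

A loader in the style of load_tables could then construct one table object per yielded file, forwarding its keyword arguments to each constructor.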

class seismicrna.core.table.write.AbundanceTableWriter(tabulator: Tabulator)

Bases: TableWriter, AbundanceTable, ABC

property data

Table’s data.

class seismicrna.core.table.write.BatchTabulator(*, get_batch_count_all: Callable, num_batches: int, num_cpus: int = 1, **kwargs)

Bases: Tabulator, ABC

Tabulator that accepts batches as the input data.

class seismicrna.core.table.write.CountTabulator(*, batch_counts: Iterable[tuple[Any, Any, Any, Any]], **kwargs)

Bases: Tabulator, ABC

Tabulator that accepts pre-counted data from batches.

class seismicrna.core.table.write.DatasetTabulator(*, dataset: MutsDataset, validate: bool = False, **kwargs)

Bases: BatchTabulator, ABC

Tabulator made from one dataset.

classmethod get_dataset_types()

Types of Dataset this Tabulator can process.

classmethod init_kws()

Attributes of the dataset to use as keyword arguments in super().__init__().

class seismicrna.core.table.write.PositionTableWriter(tabulator: Tabulator)

Bases: TableWriter, PositionTable, ABC

property data

Table’s data.

class seismicrna.core.table.write.ReadTableWriter(tabulator: Tabulator)

Bases: TableWriter, ReadTable, ABC

property data

Table’s data.

class seismicrna.core.table.write.TableWriter(tabulator: Tabulator)

Bases: Table, ABC

Write a table to a file.

write(force: bool)

Write the table’s rounded data to the table’s CSV file.
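A minimal sketch of writing rounded data to a CSV file follows. It assumes that force controls whether an existing file is overwritten, which the source does not state; the function write_rounded, its precision parameter, and its return value are hypothetical.

```python
import csv
from pathlib import Path


def write_rounded(rows: list[list[float]],
                  header: list[str],
                  file: Path,
                  *,
                  force: bool = False,
                  precision: int = 3) -> bool:
    """Write rounded rows to a CSV file; return whether it was written."""
    # Assumption: an existing file is kept unless force is True.
    if file.exists() and not force:
        return False
    file.parent.mkdir(parents=True, exist_ok=True)
    with open(file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for row in rows:
            # Round each value before writing it.
            writer.writerow([round(value, precision) for value in row])
    return True
```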

class seismicrna.core.table.write.Tabulator(*, top: Path, branches: dict[str, str], sample: str, region: Region, count_ends: bool, count_pos: bool, count_read: bool, validate: bool = True)

Bases: ABC

Base class for tabulating data for one or more tables.

property counts_per_pos

Raw counts per position.

property counts_per_read

Raw counts per read.

property data_per_clust: Series | None

Series of per-cluster data (or None if no clusters).

property data_per_pos

DataFrame of per-position data.

property data_per_read

DataFrame of per-read data.

property end_counts

Raw counts for each pair of end coordinates.

generate_tables(*, pos: bool = True, read: bool = True, clust: bool = True)

Generate the tables from this data.

classmethod get_load_function()

LoadFunction for all Dataset types for this Tabulator.

abstractmethod classmethod get_null_value() → int | float

The null value for a count: either 0 or NaN.
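The practical difference between the two null values can be seen when gaps in the data are filled. How SEISMIC-RNA applies the null value is not described here, so the function fill_counts is a hypothetical illustration: with 0, a gap contributes nothing to a sum, while with NaN the gap propagates and marks the result as undefined.

```python
import math


def fill_counts(counts: dict[int, float],
                positions: list[int],
                null: float) -> list[float]:
    """List the count at each position, using null where none exists."""
    return [counts.get(pos, null) for pos in positions]


counts = {1: 4, 3: 7}
zero_filled = fill_counts(counts, [1, 2, 3], 0)
nan_filled = fill_counts(counts, [1, 2, 3], math.nan)

print(sum(zero_filled))              # position 2 counts as zero
print(math.isnan(sum(nan_filled)))   # position 2 makes the sum NaN
```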

property num_reads

Raw number of reads.

property pos_header

Header of the per-position data.

property read_header

Header of the per-read data.

property ref

Name of the reference.

abstractmethod classmethod table_types() → list[type[TableWriter]]

Types of tables that this tabulator can write.

write_tables(*, force: bool = False, **kwargs)