seismicrna.core.table package
Submodules
- class seismicrna.core.table.base.AbundanceTable
Table of abundances.
- property data: Series
Table’s data.
- classmethod get_by_read()
Whether the table contains data for each read.
- classmethod get_file_seg_type()
Type of the last segment in the path.
- classmethod get_index_depth()
Number of columns in the index.
- property proportions: Series
Proportion of each item.
- class seismicrna.core.table.base.PositionTable
Bases: RelTypeTable, ABC
Table indexed by position.
- MASK = 'pos-mask'
- ci_count(confidence: float, **kwargs)
Confidence intervals of counts, under these simplifications:
Counts are independent of each other.
Counts follow binomial distributions.
Coverage counts are constant.
- Parameters:
confidence (float) – Confidence level; must be in [0, 1).
**kwargs – Keyword arguments for fetch methods.
- Returns:
Lower and upper bounds of the confidence interval.
- Return type:
tuple[pandas.DataFrame, pandas.DataFrame]
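The binomial simplification above can be illustrated for a single position. This is a minimal standard-library sketch, not seismicrna's implementation: the helper names `binom_quantile` and `count_interval` are hypothetical, and it assumes the interval is the central (equal-tailed) one of a Binomial(coverage, count / coverage) distribution, with coverage held constant.

```python
from math import comb


def binom_quantile(q: float, n: int, p: float) -> int:
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p)."""
    total = 0.0
    for k in range(n + 1):
        # Add the probability mass at k to the running CDF.
        total += comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if total >= q:
            return k
    # Rounding can leave the CDF marginally below q; n is the maximum.
    return n


def count_interval(count: int, coverage: int,
                   confidence: float) -> tuple[int, int]:
    """Equal-tailed interval for a count (hypothetical helper).

    Assumes the count is binomial with a fixed number of trials
    (coverage) and success probability count / coverage.
    """
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in [0, 1)")
    p = count / coverage
    alpha = 1.0 - confidence
    lower = binom_quantile(alpha / 2.0, coverage, p)
    upper = binom_quantile(1.0 - alpha / 2.0, coverage, p)
    return lower, upper


print(count_interval(5, 10, 0.95))  # (2, 8)
```

Direct summation suits small coverage only; for realistic read depths, a numerical library's binomial quantile function would be the practical choice.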
- ci_ratio(confidence: float, **kwargs)
Confidence intervals of ratios, under these simplifications:
Ratios are independent of each other.
Ratios follow beta distributions.
Coverage counts are constant.
- Parameters:
confidence (float) – Confidence level; must be in [0, 1).
**kwargs – Keyword arguments for fetch methods.
- Returns:
Lower and upper bounds of the confidence interval.
- Return type:
tuple[pandas.DataFrame, pandas.DataFrame]
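The beta simplification above can be sketched in the same way for one position. The parameterization is an assumption made for illustration, not something this reference states: the sketch uses Beta(count + 1, coverage - count + 1), the posterior of a ratio under a uniform prior. The helper names are hypothetical. Because both parameters are integers, the beta CDF can be computed from a binomial sum using only the standard library.

```python
from math import comb


def beta_cdf(x: float, a: int, b: int) -> float:
    """CDF of Beta(a, b) for positive integers a and b.

    Uses the identity I_x(a, b) = P(Y >= a), Y ~ Binomial(a + b - 1, x).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1.0 - x) ** (n - j)
               for j in range(a, n + 1))


def beta_quantile(q: float, a: int, b: int, tol: float = 1e-12) -> float:
    """Invert the CDF by bisection; the CDF is monotonic on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ratio_interval(count: int, coverage: int,
                   confidence: float) -> tuple[float, float]:
    """Equal-tailed interval for a ratio (hypothetical helper).

    ASSUMPTION: the ratio follows Beta(count + 1, coverage - count + 1).
    """
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in [0, 1)")
    a = count + 1
    b = coverage - count + 1
    alpha = 1.0 - confidence
    return (beta_quantile(alpha / 2.0, a, b),
            beta_quantile(1.0 - alpha / 2.0, a, b))


lower, upper = ratio_interval(5, 10, 0.95)
print(lower, upper)
```

With 5 mutations out of 10 reads the distribution is symmetric about 0.5, so the two bounds should sum to 1 up to the bisection tolerance.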
- property end3
- property end5
- classmethod get_by_read()
Whether the table contains data for each read.
- classmethod get_file_seg_type()
Type of the last segment in the path.
- classmethod get_index_depth()
Number of columns in the index.
- iter_profiles(*, regions: Iterable[Region] | None = None, quantile: float = 0.0, rel: str = 'Mutated', k: int | None = None, clust: int | None = None)
Yield RNA mutational profiles from the table.
- property range
- property range_int
- property region
Region covered by the table.
- resample(fraction: float = 1.0, *, exclude_masked: bool = False, seed: int | None = None, max_seed: int = 4294967296)
Resample the reads and return a new DataFrame.
- Parameters:
fraction (float = 1.0) – Number of reads to resample, expressed as a fraction of the original number of reads. Must be ≥ 0; may be > 1.
exclude_masked (bool = False) – Exclude positions that have been masked.
seed (int | None = None) – Seed for the random number generator.
max_seed (int = 2 ** 32) – Maximum seed to pass to the next random number generator.
- property seq
- class seismicrna.core.table.base.ReadTable
Bases: RelTypeTable, ABC
Table indexed by read.
- classmethod get_by_read()
Whether the table contains data for each read.
- classmethod get_file_seg_type()
Type of the last segment in the path.
- classmethod get_index_depth()
Number of columns in the index.
- property reads
- class seismicrna.core.table.base.RelTypeTable
Table with multiple types of relationships.
- property data: DataFrame
Table’s data.
- fetch_count(*, exclude_masked: bool = False, squeeze: bool = False, **kwargs) → Series | DataFrame
Fetch counts of one or more columns.
- class seismicrna.core.table.base.Table
Bases: HasRefFilePath, ABC
Table base class.
- property data: DataFrame | Series
Table’s data.
- classmethod get_auto_path_fields()
Names and path fields that have automatic values.
- classmethod get_ext()
File extension.
- classmethod get_header_depth()
- abstractmethod classmethod get_load_function() → LoadFunction
LoadFunction for all Dataset types for this Table.
- property header
Header for the table’s data.
- property path
Path of the table’s file.
- seismicrna.core.table.base.all_patterns(subpattern: RelPattern | None = None)
Every RelPattern, keyed by its name.
- seismicrna.core.table.base.get_pattern(rel: str)
Get a RelPattern from the name of its relationship.
- seismicrna.core.table.base.get_rel_name(rel_code: str)
Get the name of a relationship from its code.
- seismicrna.core.table.base.get_subpattern(rel: str, subpattern: RelPattern | None = None)
Get a RelPattern, optionally with masking.
- class seismicrna.core.table.load.PositionTableLoader(table_file: str | Path, **kwargs)
Bases: RelTypeTableLoader, PositionTable, ABC
Load data indexed by position.
- class seismicrna.core.table.load.ReadTableLoader(table_file: str | Path, **kwargs)
Bases: RelTypeTableLoader, ReadTable, ABC
Load data indexed by read.
- class seismicrna.core.table.load.RelTypeTableLoader(table_file: str | Path, **kwargs)
Bases: TableLoader, RelTypeTable, ABC
Load a table of relationship types.
- property data: DataFrame
Table’s data.
- class seismicrna.core.table.load.TableLoader(table_file: str | Path, **kwargs)
Load a table from a file.
- class seismicrna.core.table.write.AbundanceTableWriter(tabulator: Tabulator)
Bases: TableWriter, AbundanceTable, ABC
- property data
Table’s data.
- class seismicrna.core.table.write.BatchTabulator(*, get_batch_count_all: Callable, num_batches: int, num_cpus: int = 1, **kwargs)
Tabulator that accepts batches as the input data.
- class seismicrna.core.table.write.CountTabulator(*, batch_counts: Iterable[tuple[Any, Any, Any, Any]], **kwargs)
Tabulator that accepts pre-counted data from batches.
- class seismicrna.core.table.write.DatasetTabulator(*, dataset: MutsDataset, validate: bool = False, **kwargs)
Bases: BatchTabulator, ABC
Tabulator made from one dataset.
- classmethod get_dataset_types()
Types of Dataset this Tabulator can process.
- classmethod init_kws()
Attributes of the dataset to use as keyword arguments in super().__init__().
- class seismicrna.core.table.write.PositionTableWriter(tabulator: Tabulator)
Bases: TableWriter, PositionTable, ABC
- property data
Table’s data.
- class seismicrna.core.table.write.ReadTableWriter(tabulator: Tabulator)
Bases: TableWriter, ReadTable, ABC
- property data
Table’s data.
- class seismicrna.core.table.write.Tabulator(*, top: Path, branches: dict[str, str], sample: str, region: Region, count_ends: bool, count_pos: bool, count_read: bool, validate: bool = True)
Bases: ABC
Base class for tabulating data for one or more tables.
- property counts_per_pos
Raw counts per position.
- property counts_per_read
Raw counts per read.
- property data_per_pos
DataFrame of per-position data.
- property data_per_read
DataFrame of per-read data.
- property end_counts
Raw counts for each pair of end coordinates.
- generate_tables(*, pos: bool = True, read: bool = True, clust: bool = True)
Generate the tables from this data.
- classmethod get_load_function()
LoadFunction for all Dataset types for this Tabulator.
- abstractmethod classmethod get_null_value() → int | float
The null value for a count: either 0 or NaN.
- property num_reads
Raw number of reads.
- property pos_header
Header of the per-position data.
- property read_header
Header of the per-read data.
- property ref
Name of the reference.
- abstractmethod classmethod table_types() → list[type[TableWriter]]
Types of tables that this tabulator can write.