seismicrna.core.rna package
Subpackages
- seismicrna.core.rna.tests package
- Submodules
TestFormatDbStructure
TestFormatDbStructure.test_deep_pairs()
TestFormatDbStructure.test_excess_pseudoknot()
TestFormatDbStructure.test_invalid_pair()
TestFormatDbStructure.test_invalid_pos3()
TestFormatDbStructure.test_invalid_pos5()
TestFormatDbStructure.test_multi_pseudoknot()
TestFormatDbStructure.test_no_pairs()
TestFormatDbStructure.test_one_pair()
TestFormatDbStructure.test_pseudoknot()
TestFormatDbStructure.test_repeat_pos3()
TestFormatDbStructure.test_repeat_pos5()
TestFormatDbStructure.test_shallow_pairs()
TestPairedMarks
TestParseDbStructure
TestParseDbStructure.test_dangling_closer()
TestParseDbStructure.test_dangling_opener()
TestParseDbStructure.test_deep_pairs()
TestParseDbStructure.test_multi_marks()
TestParseDbStructure.test_multi_pseudoknot()
TestParseDbStructure.test_no_pairs()
TestParseDbStructure.test_pseudoknot()
TestParseDbStructure.test_shallow_pairs()
TestConstants
TestDictToPairs
TestDictToTable
TestFindEnclosingPairs
TestPairsToDict
TestPairsToTable
TestTableToDict
TestTableToPairs
- Submodules
Submodules
- class seismicrna.core.rna.base.RNARegion(*, region: Region, **kwargs)
Bases:
object
Region of an RNA sequence.
- property end3
Position of the 3’ end of the region.
- property end5
Position of the 5’ end of the region.
- property init_args
Arguments needed to initialize a new instance.
- property ref
Name of the reference sequence.
- property reg
Name of the region.
- property seq
Sequence of the region as RNA.
- property seq_record
- seismicrna.core.rna.convert.run_ct_to_db(input_path: Iterable[str | Path], *, force: bool = False, max_procs: int = 4)
Convert connectivity table (CT) to dot-bracket (DB) files.
- seismicrna.core.rna.convert.run_db_to_ct(input_path: Iterable[str | Path], *, force: bool = False, max_procs: int = 4)
Convert dot-bracket (DB) to connectivity table (CT) files.
- seismicrna.core.rna.ct.parse_ct(ct_path: Path)
Yield the title, region, and base pairs for each structure in a connectivity table (CT) file.
- Parameters:
ct_path (Path) – Path of the CT file.
- Return type:
Generator[tuple[str, Region, list[tuple[int, int]]], Any, None]
- seismicrna.core.rna.db.format_db_structure(pairs: Iterable[tuple[int, int]], length: int, seq5: int = 1)
Create a dot-bracket string from a list of base pairs.
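As an illustration of the conversion described above, the following is a minimal sketch, not the library's implementation. The name `sketch_format_db` is hypothetical, and the sketch assumes the pairs are nested (no pseudoknots) and that positions are numbered starting from `seq5`.

```python
def sketch_format_db(pairs, length, seq5=1):
    """Build a dot-bracket string from nested base pairs (illustrative only)."""
    end3 = seq5 + length - 1
    marks = ["."] * length
    for pos5, pos3 in pairs:
        # Each pair must lie inside the sequence, with the 5' partner first.
        if not (seq5 <= pos5 < pos3 <= end3):
            raise ValueError(f"Invalid pair: {(pos5, pos3)}")
        i, j = pos5 - seq5, pos3 - seq5
        # A position may take part in at most one pair.
        if marks[i] != "." or marks[j] != ".":
            raise ValueError(f"Position paired more than once: {(pos5, pos3)}")
        marks[i] = "("
        marks[j] = ")"
    return "".join(marks)


print(sketch_format_db([(1, 9), (2, 8)], length=9))
```

Pseudoknotted structures need additional bracket types, which this sketch deliberately leaves out.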
- seismicrna.core.rna.db.parse_db(db_path: Path, seq5: int = 1)
Yield the title, region, and base pairs for each structure in a dot-bracket (DB) file.
- Parameters:
db_path (Path) – Path of the DB file.
seq5 (int = 1) – Number to give the 5’ position of the sequence.
- Return type:
Generator[tuple[str, Region, list[tuple[int, int]]], Any, None]
- seismicrna.core.rna.db.parse_db_strings(db_path: Path)
Return the sequence and structures from a dot-bracket file.
- seismicrna.core.rna.db.parse_db_structure(struct: str, seq5: int = 1)
Parse a dot-bracket structure into a list of base pairs.
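Parsing a dot-bracket string is commonly done with a stack of unmatched openers. The sketch below shows that idea under simplifying assumptions: it handles only `(`, `)` and `.` (no pseudoknot marks), and the name `sketch_parse_db` is hypothetical rather than part of the package.

```python
def sketch_parse_db(struct, seq5=1):
    """Parse a dot-bracket string into sorted base pairs (illustrative only)."""
    openers = []  # positions of "(" that have not been closed yet
    pairs = []
    for pos, mark in enumerate(struct, start=seq5):
        if mark == "(":
            openers.append(pos)
        elif mark == ")":
            if not openers:
                raise ValueError(f"Dangling closer at position {pos}")
            # The most recent unmatched opener pairs with this closer.
            pairs.append((openers.pop(), pos))
        elif mark != ".":
            raise ValueError(f"Unsupported mark {mark!r} at position {pos}")
    if openers:
        raise ValueError(f"Dangling opener at position {openers[-1]}")
    return sorted(pairs)


print(sketch_parse_db("((..))"))
```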
- seismicrna.core.rna.io.ct_to_db(ct_path: Path, db_path: Path | None = None, force: bool = False)
Write a dot-bracket (DB) file of structures in a connectivity table (CT) file.
- seismicrna.core.rna.io.db_to_ct(db_path: Path, ct_path: Path | None = None, force: bool = False)
Write a connectivity table (CT) file of structures in a dot-bracket (DB) file.
- seismicrna.core.rna.io.find_ct_region(ct_path: Path) Region
Region shared among all structures in a CT file.
- seismicrna.core.rna.io.from_ct(ct_path: Path)
Yield an instance of an RNAStructure for each structure in a connectivity table (CT) file.
- Parameters:
ct_path (Path) – Path of the CT file.
- Returns:
RNA secondary structures from the CT file.
- Return type:
Generator[RNAStructure, Any, None]
- seismicrna.core.rna.io.from_db(db_path: Path, seq5: int = 1)
Yield an instance of an RNAStructure for each structure in a dot-bracket (DB) file.
- Parameters:
db_path (Path) – Path of the DB file.
seq5 (int = 1) – Number to give the 5’ position of the sequence.
- Returns:
RNA secondary structures from the DB file.
- Return type:
Generator[RNAStructure, Any, None]
- seismicrna.core.rna.io.renumber_ct(ct_in: Path, ct_out: Path, seq5: int, force: bool = False)
Renumber the last column of a connectivity table (CT) file.
- Parameters:
ct_in (Path) – Path of the input CT file.
ct_out (Path) – Path of the output CT file.
seq5 (int) – Number to give the 5’ position in the renumbered CT file.
force (bool = False) – Overwrite the output CT file if it already exists.
- seismicrna.core.rna.io.to_ct(structures: Iterable[RNAStructure], ct_path: Path, force: bool = False)
Write a connectivity table (CT) file of RNA structures.
- Parameters:
structures (Iterable[RNAStructure]) – RNA structures to write to the CT file.
ct_path (Path) – Path of the CT file.
force (bool = False) – Overwrite the output CT file if it already exists.
- seismicrna.core.rna.io.to_db(structures: Iterable[RNAStructure], db_path: Path, force: bool = False)
Write a dot-bracket (DB) file of RNA structures.
- Parameters:
structures (Iterable[RNAStructure]) – RNA structures to write to the DB file.
db_path (Path) – Path of the DB file.
force (bool = False) – Overwrite the output DB file if it already exists.
- seismicrna.core.rna.pair.dict_to_pairs(pair_dict: dict[int, int])
Tuples of the 5’ and 3’ position in each pair.
- seismicrna.core.rna.pair.dict_to_table(pair_dict: dict[int, int], region: Region)
Series of every position in the region and the base to which it pairs, or 0 if it does not pair.
- seismicrna.core.rna.pair.find_enclosing_pairs(table: Series)
Find the base pair that encloses each position.
- seismicrna.core.rna.pair.find_root_pairs(pairs: Iterable[tuple[int, int]], assume_nested: bool = False)
Return all pairs that are not contained in any other pair.
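To make "not contained in any other pair" concrete, here is a minimal sketch. The name `sketch_find_root_pairs` is hypothetical, and the approach shown (sort by 5’ position, then compare against the roots found so far) is an assumption for illustration, not necessarily how the package does it.

```python
def sketch_find_root_pairs(pairs):
    """Return pairs that no other pair encloses (illustrative only)."""
    roots = []
    # Sorting by 5' position means any enclosing pair is visited first.
    for pos5, pos3 in sorted(pairs):
        enclosed = any(r5 < pos5 and pos3 < r3 for r5, r3 in roots)
        if not enclosed:
            roots.append((pos5, pos3))
    return roots


print(sketch_find_root_pairs([(2, 9), (1, 10), (13, 19), (12, 20)]))
```

Checking against the roots alone is enough, because any pair that encloses a given pair is either a root itself or is enclosed by one.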
- seismicrna.core.rna.pair.map_nested(pairs: Iterable[tuple[int, int]])
Map each pair to the pair in which it is nested.
- seismicrna.core.rna.pair.pairs_to_dict(pairs: Iterable[tuple[int, int]])
Return a dictionary that maps each position to the base to which it pairs and contains no key for unpaired positions.
- seismicrna.core.rna.pair.pairs_to_table(pairs: Iterable[tuple[int, int]], region: Region)
Series of every position in the region and the base to which it pairs, or 0 if it does not pair.
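The dictionary and table representations above can be sketched with plain Python containers. This is only an illustration: the function names are hypothetical, a `dict` stands in for the pandas Series, and the region is assumed to be given by its 5’ and 3’ end positions rather than by a `Region` object.

```python
def sketch_pairs_to_dict(pairs):
    """Map each paired position to its partner (illustrative only)."""
    pair_dict = {}
    for pos5, pos3 in pairs:
        for pos, partner in ((pos5, pos3), (pos3, pos5)):
            # setdefault returns the existing partner if one was recorded.
            if pair_dict.setdefault(pos, partner) != partner:
                raise ValueError(f"Position {pos} has more than one partner")
    return pair_dict


def sketch_pairs_to_table(pairs, end5, end3):
    """Map every position in [end5, end3] to its partner, or 0 if unpaired."""
    pair_dict = sketch_pairs_to_dict(pairs)
    return {pos: pair_dict.get(pos, 0) for pos in range(end5, end3 + 1)}


print(sketch_pairs_to_table([(1, 6), (2, 5)], end5=1, end3=6))
```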
- seismicrna.core.rna.pair.renumber_pairs(pairs: Iterable[tuple[int, int]], offset: int)
Renumber pairs by offsetting each number.
- Parameters:
pairs (Iterable[tuple[int, int]]) – Pairs to renumber.
offset (int) – Offset by which to change the numbering.
- Returns:
Renumbered pairs, in the same order as given.
- Return type:
Generator[tuple[int, int], Any, None]
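A minimal sketch of offset-based renumbering, written as a generator to match the documented return type. The name `sketch_renumber_pairs` is hypothetical, and any validation the package performs (for example, rejecting non-positive positions) is not shown.

```python
def sketch_renumber_pairs(pairs, offset):
    """Yield each pair with both positions shifted by offset (illustrative only)."""
    for pos5, pos3 in pairs:
        yield pos5 + offset, pos3 + offset


# Pairs come back in the order they were given.
print(list(sketch_renumber_pairs([(1, 6), (2, 5)], offset=10)))
```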
- seismicrna.core.rna.pair.table_to_dict(table: Series)
Dictionary of the 5’ and 3’ position in each pair.
- seismicrna.core.rna.pair.table_to_pairs(table: Series)
Tuples of the 5’ and 3’ position in each pair.
- class seismicrna.core.rna.profile.RNAProfile(*, sample: str, data_reg: str, data_name: str, data: Series, **kwargs)
Bases:
RNARegion
Mutational profile of an RNA.
- get_ct_file(top: Path)
Get the path to the connectivity table (CT) file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
Path of the file.
- Return type:
- get_db_file(top: Path)
Get the path to the dot-bracket (DB) file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
Path of the file.
- Return type:
- get_dms_file(top: Path)
Get the path to the DMS data file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
DMS data file.
- Return type:
- get_fasta(top: Path)
Get the path to the FASTA file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
Path of the file.
- Return type:
- get_varna_color_file(top: Path)
Get the path to the VARNA color file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
Path of the file.
- Return type:
- property init_args
Arguments needed to initialize a new instance.
- property profile
Name of the mutational profile.
- to_dms(top: Path)
Write the DMS reactivities to a DMS file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
File into which the DMS reactivities were written.
- Return type:
- to_fasta(top: Path)
Write the RNA sequence to a FASTA file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
File into which the RNA sequence was written.
- Return type:
- to_varna_color_file(top: Path)
Write the VARNA colors to a file.
- Parameters:
top (pathlib.Path) – Top-level directory.
- Returns:
File into which the VARNA colors were written.
- Return type:
- seismicrna.core.rna.roc.compute_auc(fpr: ndarray, tpr: ndarray)
Compute the area under the curve (AUC) of the receiver operating characteristic (ROC).
- Parameters:
fpr (numpy.ndarray) – False positive rate (FPR) of the ROC curve.
tpr (numpy.ndarray) – True positive rate (TPR) of the ROC curve.
- Returns:
AUC-ROC
- Return type:
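The area under an ROC curve is typically obtained by trapezoidal integration of TPR over FPR. The sketch below illustrates that calculation with plain lists in place of NumPy arrays. The name `sketch_auc` is hypothetical, and whether the package uses this exact rule is an assumption.

```python
def sketch_auc(fpr, tpr):
    """Integrate TPR over FPR with the trapezoidal rule (illustrative only)."""
    if len(fpr) != len(tpr):
        raise ValueError("fpr and tpr must have the same length")
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2
        area += width * mean_height
    return area


# A classifier no better than chance follows the diagonal.
print(sketch_auc([0.0, 1.0], [0.0, 1.0]))
```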
- seismicrna.core.rna.roc.compute_auc_roc(paired: Series, profile: Series)
Compute the receiver operating characteristic (ROC) and the area under the curve (AUC) to indicate how well mutation data agree with a structure.
- Parameters:
paired (pandas.Series) – Boolean series with one index per position, where each value is True if the base at the position is paired, otherwise False.
profile (pandas.Series) – Mutational profile with one index per position, where each value is the mutation rate at the position.
- Returns:
AUC-ROC
- Return type:
- seismicrna.core.rna.roc.compute_roc_curve(paired: Series, profile: Series)
Compute the receiver operating characteristic (ROC) curve to indicate how well mutation data agree with a structure.
- Parameters:
paired (pandas.Series) – Boolean series with one index per position, where each value is True if the base at the position is paired, otherwise False.
profile (pandas.Series) – Mutational profile with one index per position, where each value is the mutation rate at the position.
- Returns:
FPR and TPR axes, respectively, of the ROC curve.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
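One way to picture the ROC curve is to sweep a threshold over the mutation rates from highest to lowest. The sketch below does so with plain lists. It assumes, as a convention chosen for this example, that unpaired bases are the "positives" and that a higher mutation rate predicts an unpaired base. The name `sketch_roc_curve` is hypothetical and the package may differ in details such as tie handling.

```python
import math


def sketch_roc_curve(paired, profile):
    """Return (fpr, tpr) lists for a threshold sweep (illustrative only)."""
    # Drop positions without data (NaN mutation rates).
    data = [(rate, is_paired)
            for is_paired, rate in zip(paired, profile)
            if not math.isnan(rate)]
    n_paired = sum(1 for _, is_paired in data if is_paired)
    n_unpaired = len(data) - n_paired
    if n_paired == 0 or n_unpaired == 0:
        raise ValueError("Need at least one paired and one unpaired base")
    data.sort(key=lambda item: item[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    true_pos = false_pos = 0
    i = 0
    while i < len(data):
        threshold = data[i][0]
        # Positions with equal rates cross the threshold together.
        while i < len(data) and data[i][0] == threshold:
            if data[i][1]:
                false_pos += 1  # paired base called unpaired
            else:
                true_pos += 1   # unpaired base called unpaired
            i += 1
        fpr.append(false_pos / n_paired)
        tpr.append(true_pos / n_unpaired)
    return fpr, tpr


fpr, tpr = sketch_roc_curve([True, True, False, False],
                            [0.01, 0.02, 0.2, 0.3])
print(fpr, tpr)
```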
- seismicrna.core.rna.roc.compute_rolling_auc(paired: Series, profile: Series, size: int, min_data: int = 2)
Compute the area under the curve (AUC) of the receiver operating characteristic (ROC) at each position using a sliding window.
- Parameters:
paired (pandas.Series) – Boolean series with one index per position, where each value is True if the base at the position is paired, otherwise False.
profile (pandas.Series) – Mutational profile with one index per position, where each value is the mutation rate at the position.
size (int) – Size of the window.
min_data (int = 2) – Minimum number of data in a window to use it (otherwise NaN).
- Returns:
AUC-ROC at each position.
- Return type:
pandas.Series
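The sliding-window idea can be sketched as follows. Several details are assumptions made for this example: the window is centered on each position and truncated at the ends, `size` is odd, unpaired bases count as positives, and the AUC in each window is computed with the rank-based (Mann-Whitney) formulation, which equals the area under the ROC curve. The function names are hypothetical.

```python
import math


def _window_auc(paired, rates):
    """Probability that an unpaired base out-mutates a paired one."""
    unpaired = [r for p, r in zip(paired, rates) if not p and not math.isnan(r)]
    paired_rates = [r for p, r in zip(paired, rates) if p and not math.isnan(r)]
    if not unpaired or not paired_rates:
        return math.nan  # AUC is undefined without both classes
    # Count wins, with ties worth half a win.
    wins = sum((u > p) + 0.5 * (u == p) for u in unpaired for p in paired_rates)
    return wins / (len(unpaired) * len(paired_rates))


def sketch_rolling_auc(paired, profile, size, min_data=2):
    """AUC in a window centered on each position (illustrative only)."""
    half = size // 2
    n = len(profile)
    result = []
    for center in range(n):
        lo, hi = max(0, center - half), min(n, center + half + 1)
        rates = profile[lo:hi]
        n_data = sum(not math.isnan(r) for r in rates)
        if n_data < min_data:
            result.append(math.nan)  # too few data in this window
        else:
            result.append(_window_auc(paired[lo:hi], rates))
    return result


print(sketch_rolling_auc([True, False, True, False, True],
                         [0.01, 0.3, 0.02, 0.2, 0.01], size=3))
```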
- class seismicrna.core.rna.state.RNAState(*, title: str, pairs: Iterable[tuple[int, int]], **kwargs)
Bases:
RNAStructure, RNAProfile
RNA secondary structure with mutation rates.
- property auc
- classmethod from_struct_profile(struct: RNAStructure, profile: RNAProfile)
Make an RNAState from an RNAStructure and an RNAProfile.
- property roc
- class seismicrna.core.rna.struct.RNAStructure(*, title: str, pairs: Iterable[tuple[int, int]], **kwargs)
Bases:
RNARegion
Secondary structure of an RNA.
- property ct_data
Convert the connectivity table to a DataFrame.
- property ct_text
Connectivity table as text.
- property ct_title
Header line for the CT file.
- property db_structure
Dot-bracket string (structure only).
- property db_title
Header line for the DB file.
- property dict
- property init_args
Arguments needed to initialize a new instance.
- property is_paired
Series where each index is a position and each value is True if the corresponding base is paired, otherwise False.
- iter_root_modules()
- property pairs
Base pairs in the structure.
- property roots
- class seismicrna.core.rna.struct.Rna2dPart(*regions: RNARegion, **kwargs)
Bases:
object
Part of an RNA secondary structure.
- class seismicrna.core.rna.struct.Rna2dStem(side1: RNARegion, side2: RNARegion, **kwargs)
Bases:
Rna2dPart
An RNA stem (contiguous double helix).
- property region3
- property region5
- class seismicrna.core.rna.struct.Rna2dStemLoop(region: RNARegion, **kwargs)
Bases:
RnaJunction
An RNA loop at the end of a stem.
- property region