seismicrna package
Subpackages
- seismicrna.align package
- Subpackages
- Submodules
- seismicrna.cleanfa package
- seismicrna.cluster package
- Subpackages
- Submodules
- seismicrna.collate package
- seismicrna.core package
- Subpackages
- seismicrna.core.arg package
- seismicrna.core.batch package
- seismicrna.core.extern package
- seismicrna.core.io package
- seismicrna.core.mu package
- seismicrna.core.ngs package
- seismicrna.core.rel package
- seismicrna.core.rna package
- seismicrna.core.seq package
- seismicrna.core.table package
- seismicrna.core.tests package
- Submodules
- Subpackages
- seismicrna.demult package
- Submodules
- Submodules
- seismicrna.draw package
- Submodules
- Submodules
- seismicrna.ensembles package
- seismicrna.export package
- Submodules
- Submodules
- seismicrna.fold package
- Submodules
- Submodules
- seismicrna.graph package
- Submodules
- Submodules
- seismicrna.importmm package
- seismicrna.mask package
- Subpackages
- Submodules
- seismicrna.relate package
- Subpackages
- Submodules
- seismicrna.renumct package
- seismicrna.sim package
- Subpackages
- Submodules
- seismicrna.test package
- seismicrna.tests package
- Submodules
- Submodules
Submodules
- seismicrna.interface.dataset_from_report(report_path: str | Path, verify_times: bool = True) MutsDataset
Load a dataset from a report file.
- Parameters:
  - report_path (str | Path) – The path to a report JSON file from the relate, mask, or cluster steps.
  - verify_times (bool = True) – Ensure that the report file does not have a timestamp that is earlier than that of one of its constituents.
- Returns:
The type of MutsDataset returned depends on the report file.
- Return type:
RelateMutsDataset | MaskMutsDataset | ClusterMutsDataset
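The `verify_times` check described above can be pictured as a comparison of timestamps. The following is a minimal sketch of that idea only, not the library's implementation; the function name `report_is_up_to_date` and the example timestamps are hypothetical.

```python
from datetime import datetime


def report_is_up_to_date(report_time: datetime,
                         constituent_times: list[datetime]) -> bool:
    """Return True if the report is not older than any constituent.

    Hypothetical helper: it only illustrates the rule that a report
    must not carry a timestamp earlier than one of its constituents.
    """
    return all(report_time >= t for t in constituent_times)


# Made-up timestamps for two consecutive workflow steps.
relate_done = datetime(2024, 1, 1, 12, 0)
mask_done = datetime(2024, 1, 1, 12, 30)

# A mask report written after the relate step passes the check.
ok = report_is_up_to_date(mask_done, [relate_done])
# A report that predates one of its constituents fails the check.
stale = report_is_up_to_date(relate_done, [mask_done])
```

Passing `verify_times=False` would correspond to skipping a check of this kind.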
- seismicrna.interface.table_from_dataset(dataset: MutsDataset, table: str = 'pos') TableWriter
Tabulate a dataset to generate a TableWriter.
- Parameters:
  - dataset (RelateMutsDataset | MaskMutsDataset | ClusterMutsDataset) – A dataset from the Relate, Mask, or Cluster steps.
  - table (str = "pos") – The type of table to generate. Valid options include “pos” for a per-position table, “read” for a per-read table, and “abundance” for a cluster abundance table.
- Returns:
The type of TableWriter returned depends on the Dataset type.
- Return type:
TableWriter
- seismicrna.interface.table_from_report(report_path: str | Path, verify_times: bool = True, table: str = 'pos')
Load a dataset from a report file and tabulate it.
- Parameters:
  - report_path (str | Path) – Path to a report JSON file from the relate, mask, or cluster step.
  - verify_times (bool) – If True, ensure the report file does not have a timestamp earlier than any of its constituent files.
  - table (str) – Type of table to generate: “pos” for per-position, “read” for per-read, or “abundance” for cluster abundance.
- Returns:
The type of TableWriter returned depends on the report file.
- Return type:
TableWriter
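A call to `table_from_report` might be wrapped as in the sketch below, which validates the `table` argument against the three options listed above before touching the report. The wrapper `tabulate_report`, the helper `check_table_kind`, and the constant `VALID_TABLES` are hypothetical names introduced for illustration; only the `table_from_report` signature comes from this page.

```python
from pathlib import Path

# Table kinds named in the parameter description above.
VALID_TABLES = ("pos", "read", "abundance")


def check_table_kind(table: str) -> str:
    """Return the table kind unchanged, or raise if it is not listed."""
    if table not in VALID_TABLES:
        raise ValueError(
            f"table must be one of {VALID_TABLES}, but got {table!r}"
        )
    return table


def tabulate_report(report_path, table: str = "pos",
                    verify_times: bool = True):
    """Hypothetical wrapper around seismicrna.interface.table_from_report."""
    check_table_kind(table)
    # Imported lazily so the validation above works even where
    # SEISMIC-RNA is not installed.
    from seismicrna.interface import table_from_report
    return table_from_report(Path(report_path),
                             verify_times=verify_times,
                             table=table)
```

With SEISMIC-RNA installed, `tabulate_report("path/to/report.json", table="read")` would be expected to return a TableWriter for a per-read table; the path here is a placeholder.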
- seismicrna.join.join_regions(out_dir: Path, joined_region: str, sample: str, branches_flat: Iterable[str], ref: str, regs: Iterable[str], clustered: bool, *, clusts: dict[str, dict[int, dict[int, int]]], mask_pos_table: bool, mask_read_table: bool, cluster_pos_table: bool, cluster_abundance_table: bool, verify_times: bool, num_cpus: int, force: bool, tmp_pfx, keep_tmp)
Join one or more regions (horizontally).
- Parameters:
  - out_dir (pathlib.Path) – Output directory.
  - joined_region (str) – Name of the joined region.
  - branches_flat (Iterable[str]) – Branches of the datasets being joined.
  - sample (str) – Name of the sample.
  - ref (str) – Name of the reference.
  - regs (Iterable[str]) – Names of the regions being joined.
  - clustered (bool) – Whether the dataset is clustered.
  - tmp_dir (Path) – Temporary directory.
  - clusts (dict[str, dict[int, dict[int, int]]]) – For each region, for each number of clusters, the cluster from the original region to use as the cluster in the joined region (ignored if clustered is False).
  - mask_pos_table (bool) – Tabulate relationships per position for mask data.
  - mask_read_table (bool) – Tabulate relationships per read for mask data.
  - cluster_pos_table (bool) – Tabulate relationships per position for cluster data.
  - cluster_abundance_table (bool) – Tabulate number of reads per cluster for cluster data.
  - verify_times (bool) – Verify that report files from later steps have later timestamps.
  - num_cpus (int) – Number of processors to use.
  - force (bool) – Force the report to be written, even if it exists.
- Returns:
Path of the Join report file.
- Return type:
- seismicrna.join.joined_mask_report_exists(top: Path, sample: str, branches_flat: Iterable[str], ref: str, joined_region: str, regs: Iterable[str])
Return whether a mask report for the joined region exists.
- seismicrna.join.run(joined_region: str = Sentinel.UNSET, input_path: Iterable[str | Path] = Sentinel.UNSET, *, join_clusts: str | None = None, mask_pos_table: bool = True, mask_read_table: bool = True, cluster_pos_table: bool = True, cluster_abundance_table: bool = True, verify_times: bool = True, tmp_pfx: str | Path = './tmp', keep_tmp: bool = False, num_cpus: int = 4, force: bool = False) list[Path]
Merge regions (horizontally) from the Mask or Cluster step.
- Parameters:
  - join_clusts (str | None) – Specify which clusters to join using this CSV file [keyword-only, default: None]
  - mask_pos_table (bool) – Tabulate relationships per position for mask data [keyword-only, default: True]
  - mask_read_table (bool) – Tabulate relationships per read for mask data [keyword-only, default: True]
  - cluster_pos_table (bool) – Tabulate relationships per position for cluster data [keyword-only, default: True]
  - cluster_abundance_table (bool) – Tabulate number of reads per cluster for cluster data [keyword-only, default: True]
  - verify_times (bool) – Verify that report files from later steps have later timestamps [keyword-only, default: True]
  - tmp_pfx (str | pathlib.Path) – Write all temporary files to a directory with this prefix [keyword-only, default: ‘./tmp’]
  - keep_tmp (bool) – Keep temporary files after finishing [keyword-only, default: False]
  - num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]
  - force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]
- seismicrna.join.write_report(report_type: type[JoinReport], out_dir: Path, **kwargs)
Instantiate and save a JoinReport.
- Parameters:
  - report_type (type[JoinReport]) – The concrete JoinReport subclass to instantiate.
  - out_dir (Path) – Directory in which to save the report file.
- seismicrna.lists.find_pos(table: PositionTable, max_fmut_pos: float, complement: bool)
Find positions that pass a mutation-rate filter.
- Parameters:
  - table (PositionTable) – Per-position table from which mutation rates are fetched.
  - max_fmut_pos (float) – Maximum allowed mutation frequency; positions above this threshold are masked (or kept if complement is True).
  - complement (bool) – If True, invert the filter so that positions above the threshold are kept instead of masked.
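The effect of `complement` can be illustrated with a small stand-alone filter. This is a sketch under stated assumptions, not the body of `find_pos`: it takes a plain dict of mutation rates instead of a PositionTable, it assumes the comparison is strict (rates equal to the threshold are not masked), and the name `positions_to_mask` is hypothetical.

```python
def positions_to_mask(fmut_by_pos: dict[int, float],
                      max_fmut_pos: float,
                      complement: bool = False) -> list[int]:
    """List positions selected for masking by a mutation-rate threshold.

    Assumption: positions strictly above the threshold are masked;
    with complement=True, the positions at or below it are masked.
    """
    if complement:
        return sorted(pos for pos, fmut in fmut_by_pos.items()
                      if fmut <= max_fmut_pos)
    return sorted(pos for pos, fmut in fmut_by_pos.items()
                  if fmut > max_fmut_pos)


# Made-up mutation rates for three positions.
rates = {1: 0.01, 2: 0.50, 3: 0.10}
masked = positions_to_mask(rates, max_fmut_pos=0.25)
masked_inverted = positions_to_mask(rates, max_fmut_pos=0.25,
                                    complement=True)
```

Here `masked` contains only position 2, while `masked_inverted` contains positions 1 and 3.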
- seismicrna.lists.iter_tables(input_path: Iterable[str | Path], **kwargs)
Iterate through all types of List and all Tables from which each type of List can be created.
- seismicrna.lists.run(input_path: Iterable[str | Path] = Sentinel.UNSET, *, branch: str = '', min_ninfo_pos: int = 1000, max_fmut_pos: float = 1.0, force: bool = False, num_cpus: int = 4) list[Path]
List positions to mask.
- Parameters:
  - branch (str) – Create a new branch of the workflow with this name [keyword-only, default: ‘’]
  - min_ninfo_pos (int) – Mask positions with fewer than this many informative base calls [keyword-only, default: 1000]
  - max_fmut_pos (float) – Mask positions with more than this fraction of mutated base calls [keyword-only, default: 1.0]
  - force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]
  - num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]
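The two thresholds above can be combined as in the following sketch, which works on plain per-position counts rather than on SEISMIC-RNA tables. The function name `list_masked_positions` and the `(n_info, n_mut)` input layout are assumptions made for illustration.

```python
def list_masked_positions(counts: dict[int, tuple[int, int]],
                          min_ninfo_pos: int = 1000,
                          max_fmut_pos: float = 1.0) -> list[int]:
    """List positions failing either threshold.

    counts maps each position to (n_info, n_mut): the number of
    informative base calls and the number of mutated base calls
    (a hypothetical input layout).
    """
    masked = []
    for pos, (n_info, n_mut) in sorted(counts.items()):
        # Too few informative base calls at this position.
        too_few = n_info < min_ninfo_pos
        # Fraction of mutated calls exceeds the maximum (skip if no data).
        too_mutated = n_info > 0 and n_mut / n_info > max_fmut_pos
        if too_few or too_mutated:
            masked.append(pos)
    return masked


# Made-up counts for three positions.
counts = {10: (5000, 50), 11: (200, 2), 12: (4000, 2400)}
strict = list_masked_positions(counts, min_ninfo_pos=1000, max_fmut_pos=0.5)
default = list_masked_positions(counts)
```

With the stricter settings, position 11 is masked for low coverage and position 12 for its mutation fraction of 0.6; with the defaults, only position 11 is masked.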
- seismicrna.lists.write_list(table: PositionTableLoader, list_type: type[List], *, branch: str, min_ninfo_pos: int, max_fmut_pos: float, force: bool)
Write a List based on a Table.
- seismicrna.logo.compute_arc_points(center, radius, theta1, theta2, n=100)
Compute x/y coordinates along an arc.
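Sampling points along an arc generally looks like the sketch below. The units of `theta1` and `theta2` are not stated on this page, so this sketch assumes degrees and assumes that both endpoints are included; the name `arc_points` is hypothetical and the code is not taken from `seismicrna.logo`.

```python
import math


def arc_points(center, radius, theta1, theta2, n=100):
    """Return lists of x and y coordinates of n points along an arc.

    Assumptions: angles are in degrees, measured counterclockwise
    from the positive x-axis, and both endpoints are included.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    cx, cy = center
    step = (theta2 - theta1) / (n - 1)
    xs, ys = [], []
    for i in range(n):
        theta = math.radians(theta1 + i * step)
        xs.append(cx + radius * math.cos(theta))
        ys.append(cy + radius * math.sin(theta))
    return xs, ys


# Quarter circle of radius 1 around the origin, sampled at 3 points.
xs, ys = arc_points((0.0, 0.0), 1.0, 0.0, 90.0, n=3)
```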
- seismicrna.logo.draw_seismic_logo(report: bool = False, out_svg: str | Path | None = None, dpi: int = 300)
Draw the SEISMIC-RNA logo as an SVG.
- Parameters:
  - report (bool) – If True, render the variant of the logo used in reports (different background triangle color).
  - out_svg (str | Path | None) – Path to write the SVG file; if None the SVG is returned as a string instead of being written to disk.
  - dpi (int) – Resolution in dots per inch used when saving the figure.
- exception seismicrna.migrate.FindAndReplaceError
Bases: ValueError
- seismicrna.migrate.find_and_replace(file: Path, find: str, replace: str, count: int = 1)
Replace occurrences of a string in a plain-text or gzipped file.
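Handling plain-text and gzipped files with one code path can be sketched as follows. This is an illustration, not the code of `seismicrna.migrate`: it assumes that gzipped files are recognized by a `.gz` suffix, that `count` is the maximum number of replacements, and that a missing search string is an error. The name `find_and_replace_sketch` is hypothetical, and a plain ValueError stands in for FindAndReplaceError.

```python
import gzip
from pathlib import Path


def find_and_replace_sketch(file: Path, find: str, replace: str,
                            count: int = 1) -> int:
    """Replace up to count occurrences of find in a text or .gz file.

    Returns the number of replacements made. Assumes count >= 1 and
    that the file fits in memory.
    """
    # Choose the opener from the suffix (assumption: .gz means gzip).
    opener = gzip.open if file.suffix == ".gz" else open
    with opener(file, "rt") as f:
        text = f.read()
    n_found = text.count(find)
    if n_found == 0:
        # Stand-in for FindAndReplaceError, which subclasses ValueError.
        raise ValueError(f"{find!r} was not found in {file}")
    with opener(file, "wt") as f:
        f.write(text.replace(find, replace, count))
    return min(n_found, count)
```

Reading the whole file and rewriting it in place keeps the sketch short; a more careful version would write to a temporary file first and rename it.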
- seismicrna.migrate.run(input_path: Iterable[str | Path] = Sentinel.UNSET, *, out_dir: str | Path = './out', num_cpus: int = 4)
Migrate output directories from v0.23 to v0.24.
- Parameters:
  - out_dir (str | pathlib.Path) – Write all output files to this directory [keyword-only, default: ‘./out’]
  - num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]
- seismicrna.pool.pool_samples(out_dir: Path, pooled_sample: str, branches_flat: Iterable[str], ref: str, samples: Iterable[str], *, min_pearson: float, max_marcd: float, relate_pos_table: bool, relate_read_table: bool, verify_times: bool, num_cpus: int, force: bool, tmp_pfx, keep_tmp)
Pool one or more samples (vertically).
- Parameters:
  - out_dir (pathlib.Path) – Output directory.
  - pooled_sample (str) – Name of the pooled sample.
  - branches_flat (Iterable[str]) – Branches of the datasets being pooled.
  - ref (str) – Name of the reference.
  - samples (Iterable[str]) – Names of the samples in the pool.
  - tmp_dir (Path) – Temporary directory.
  - min_pearson (float) – Skip pooling if any pair of samples has Pearson r below this.
  - max_marcd (float) – Skip pooling if any pair of samples has MARCD above this.
  - relate_pos_table (bool) – Tabulate relationships per position for relate data.
  - relate_read_table (bool) – Tabulate relationships per read for relate data.
  - verify_times (bool) – Verify that report files from later steps have later timestamps.
  - num_cpus (int) – Number of processors to use.
  - force (bool) – Force the report to be written, even if it exists.
- Returns:
Path of the Pool report file.
- Return type:
- seismicrna.pool.run(pooled_sample: str = Sentinel.UNSET, input_path: Iterable[str | Path] = Sentinel.UNSET, *, relate_pos_table: bool = True, relate_read_table: bool = False, min_pearson: float = 0.0, max_marcd: float = 1.0, verify_times: bool = True, tmp_pfx: str | Path = './tmp', keep_tmp: bool = False, num_cpus: int = 4, force: bool = False) list[Path]
Merge samples (vertically) from the Relate step.
- Parameters:
  - relate_pos_table (bool) – Tabulate relationships per position for relate data [keyword-only, default: True]
  - relate_read_table (bool) – Tabulate relationships per read for relate data [keyword-only, default: False]
  - min_pearson (float) – Pool samples only if their Pearson correlation is at least this large [keyword-only, default: 0.0]
  - max_marcd (float) – Pool samples only if their mean arcsine distance is at most this [keyword-only, default: 1.0]
  - verify_times (bool) – Verify that report files from later steps have later timestamps [keyword-only, default: True]
  - tmp_pfx (str | pathlib.Path) – Write all temporary files to a directory with this prefix [keyword-only, default: ‘./tmp’]
  - keep_tmp (bool) – Keep temporary files after finishing [keyword-only, default: False]
  - num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]
  - force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]
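The `min_pearson` and `max_marcd` gates can be pictured with the sketch below, which compares every pair of mutation-rate profiles. It is not the library's code: the function names are hypothetical, and the arcsine distance is assumed here to be (2/π)·|arcsin(√x) − arcsin(√y)| averaged over positions, which may differ from the definition SEISMIC-RNA uses.

```python
import math
from itertools import combinations


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile has zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mean_arcsine_distance(x: list[float], y: list[float]) -> float:
    """Mean arcsine distance between two profiles of fractions in [0, 1].

    Assumed definition: (2/pi) * |asin(sqrt(a)) - asin(sqrt(b))|,
    averaged over positions.
    """
    dists = [abs(math.asin(math.sqrt(a)) - math.asin(math.sqrt(b)))
             for a, b in zip(x, y)]
    return (2.0 / math.pi) * sum(dists) / len(dists)


def can_pool(profiles: dict[str, list[float]],
             min_pearson: float = 0.0,
             max_marcd: float = 1.0) -> bool:
    """Return False if any pair of samples fails either threshold."""
    for (_, x), (_, y) in combinations(profiles.items(), 2):
        if pearson(x, y) < min_pearson:
            return False
        if mean_arcsine_distance(x, y) > max_marcd:
            return False
    return True


# Made-up mutation-rate profiles for two replicates and one outlier.
rep1 = [0.01, 0.20, 0.05, 0.30]
rep2 = [0.01, 0.20, 0.05, 0.30]
outlier = [0.30, 0.05, 0.20, 0.01]
```

Under these assumptions, `can_pool({"rep1": rep1, "rep2": rep2}, min_pearson=0.9, max_marcd=0.1)` is True, while adding `outlier` makes the check fail because its correlation with the replicates is negative.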
- seismicrna.table.get_dataset_flags(dataset: MutsDataset, relate_pos_table: bool, relate_read_table: bool, mask_pos_table: bool, mask_read_table: bool, cluster_pos_table: bool, cluster_abundance_table: bool)
Return the tabulator and table flags for a dataset.
- seismicrna.table.get_tabulator_type(dataset_type: type[Dataset], count: bool = False) type[DatasetTabulator] | type[CountTabulator]
Determine which class of Tabulator can process the dataset.
- seismicrna.table.load_all_datasets(input_path: Iterable[str | Path], **kwargs)
Load datasets from all steps of the workflow.
- seismicrna.table.run(input_path: Iterable[str | Path] = Sentinel.UNSET, *, relate_pos_table: bool = True, relate_read_table: bool = False, mask_pos_table: bool = True, mask_read_table: bool = True, cluster_pos_table: bool = True, cluster_abundance_table: bool = True, verify_times: bool = True, num_cpus: int = 4, force: bool = False) list[Path]
Tabulate counts of relationships per read and position.
- Parameters:
  - relate_pos_table (bool) – Tabulate relationships per position for relate data [keyword-only, default: True]
  - relate_read_table (bool) – Tabulate relationships per read for relate data [keyword-only, default: False]
  - mask_pos_table (bool) – Tabulate relationships per position for mask data [keyword-only, default: True]
  - mask_read_table (bool) – Tabulate relationships per read for mask data [keyword-only, default: True]
  - cluster_pos_table (bool) – Tabulate relationships per position for cluster data [keyword-only, default: True]
  - cluster_abundance_table (bool) – Tabulate number of reads per cluster for cluster data [keyword-only, default: True]
  - verify_times (bool) – Verify that report files from later steps have later timestamps [keyword-only, default: True]
  - num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]
  - force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]
- seismicrna.table.tabulate(dataset: MutsDataset, tabulator_type: type[DatasetTabulator], pos_table: bool, read_table: bool, clust_table: bool, force: bool, num_cpus: int)
Write tables for a dataset using the appropriate tabulator.
- Parameters:
  - dataset (MutsDataset) – Dataset to tabulate (from the relate, mask, or cluster step).
  - tabulator_type (type[DatasetTabulator]) – Tabulator class that can process this dataset type.
  - pos_table (bool) – If True, write a per-position table.
  - read_table (bool) – If True, write a per-read table.
  - clust_table (bool) – If True, write a cluster abundance table.
  - force (bool) – If True, overwrite existing table files.
  - num_cpus (int) – Number of CPUs to use for computation.
- seismicrna.wf.flatten(nested)
- seismicrna.wf.run(fasta: str | Path = Sentinel.UNSET, input_path: Iterable[str | Path] = Sentinel.UNSET, *, out_dir: str | Path = './out', tmp_pfx: str | Path = './tmp', keep_tmp: bool = False, brotli_level: int = 10, force: bool = False, num_cpus: int = 4, fastqz: Iterable[str | Path] = (), fastqy: Iterable[str | Path] = (), fastqx: Iterable[str | Path] = (), phred_enc: int = 33, demult: bool = False, barcode_start: int = 0, barcode_end: int = 0, read_pos: int = None, barcode: tuple[tuple[str, DNA, int]] = (), mismatch_tolerance: int = 0, index_tolerance: int = 0, allow_n: bool = False, dmfastqz: Iterable[str | Path] = (), dmfastqy: Iterable[str | Path] = (), dmfastqx: Iterable[str | Path] = (), fastp: bool = True, fastp_5: bool = False, fastp_3: bool = True, fastp_w: int = 6, fastp_m: int = 25, fastp_poly_g: str = 'auto', fastp_poly_g_min_len: int = 10, fastp_poly_x: bool = False, fastp_poly_x_min_len: int = 10, fastp_adapter_trimming: bool = True, fastp_adapter_1: str = '', fastp_adapter_2: str = '', fastp_adapter_fasta: str | None = None, fastp_detect_adapter_for_pe: bool = True, fastp_min_length: int = 9, bt2_local: bool = True, bt2_discordant: bool = False, bt2_mixed: bool = False, bt2_dovetail: bool = False, bt2_contain: bool = True, bt2_score_min_e2e: str = 'L,-1,-0.8', bt2_score_min_loc: str = 'L,1,0.8', bt2_i: int = 0, bt2_x: int = 600, bt2_gbar: int = 4, bt2_l: int = 20, bt2_s: str = 'L,1,0.1', bt2_d: int = 4, bt2_r: int = 2, bt2_dpad: int = 2, bt2_orient: str = 'fr', bt2_un: bool = True, min_mapq: int = 25, sep_strands: bool = False, f1r2_fwd: bool = False, rev_label: str = '-rev', min_phred: int = 25, min_reads: int = 1000, insert3: bool = True, ambindel: bool = True, overhangs: bool = True, clip_end5: int = 4, clip_end3: int = 4, batch_size: int = 65536, write_read_names: bool = False, relate_pos_table: bool = True, relate_read_table: bool = False, relate_cx: bool = True, mask_coords: Iterable[tuple[str, int, int]] = (), mask_primers: 
Iterable[tuple[str, DNA, DNA]] = (), primer_gap: int = 0, mask_regions_file: str | None = None, count_del: bool = True, count_ins: bool = True, no_mut: Iterable[str] = (), only_mut: Iterable[str] = (), probe: str = 'DMS', mask_a: bool | None = None, mask_c: bool | None = None, mask_g: bool | None = None, mask_u: bool | None = None, mask_polya: int = 5, mask_pos: Iterable[tuple[str, int]] = (), mask_pos_file: Iterable[str | Path] = (), mask_read: Iterable[str] = (), mask_read_file: Iterable[str | Path] = (), mask_discontig: bool = True, min_ncov_read: int = 1, min_fcov_read: float = 0.0, min_finfo_read: float = 0.95, max_fmut_read: float = 1.0, min_mut_gap: int = None, mut_collisions: str = 'auto', min_ninfo_pos: int = 1000, max_fmut_pos: float = 1.0, quick_unbias: bool = True, quick_unbias_thresh: float = 0.001, max_mask_iter: int = 0, mask_pos_table: bool = True, mask_read_table: bool = True, cluster: bool = False, min_clusters: int = 1, max_clusters: int = 0, min_em_runs: int = 6, max_em_runs: int = 30, jackpot: bool = True, jackpot_conf_level: float = 0.95, max_jackpot_quotient: float = 1.1, max_jackpot_sims: int = 12, jackpot_max_data: int = 268435456, min_em_iter: int = 10, max_em_iter: int = 500, em_thresh: float = 0.37, min_marcd_run: float = 0.016, max_pearson_run: float = 0.9, max_arcd_vs_ens_avg: float = 0.2, max_gini_run: float = 0.667, max_loglike_vs_best: float = 0.0, min_pearson_vs_best: float = 0.97, max_marcd_vs_best: float = 0.008, try_all_ks: bool = False, write_all_ks: bool = False, cluster_pos_table: bool = True, cluster_abundance_table: bool = True, verify_times: bool = True, fold: bool = False, fold_coords: Iterable[tuple[str, int, int]] = (), fold_primers: Iterable[tuple[str, DNA, DNA]] = (), fold_regions_file: str | None = None, fold_full: bool = True, fold_dry_run: bool = False, fold_backend: str = 'auto', fold_temp: float = 37.0, fold_energy_method: str = 'auto', deigan_slope: float = 1.8, deigan_intercept: float = -0.6, fold_quantile: 
float = 0.95, fold_constraint: str | None = None, fold_commands: str | None = None, eddy_prior_paired_file: str | None = None, eddy_prior_unpaired_file: str | None = None, fold_md: int = 0, fold_mfe: bool = False, fold_max: int = 20, fold_percent: float = 20.0, fold_isolated: bool = False, draw: bool = False, struct_num: Iterable[int] = (), color: bool = True, draw_svg: bool = True, draw_png: bool = False, update_rnartistcore: bool = False, export: bool = False, samples_meta: str = None, refs_meta: str = None, all_pos: bool = True, cgroup: str = 'k', graph_quantile: float = 0.0, hist_bins: int = 10, hist_margin: float = 0.1, struct_file: Iterable[str | Path] = (), terminal_pairs: bool = True, window: int = 45, winmin: int = 9, csv: bool = True, html: bool = True, svg: bool = False, pdf: bool = False, png: bool = False, graph_mprof: bool = True, graph_tmprof: bool = True, graph_ncov: bool = True, graph_mhist: bool = True, graph_abundance: bool = True, graph_giniroll: bool = False, graph_roc: bool = True, graph_aucroll: bool = False, graph_poscorr: bool = False, graph_mutdist: bool = False, mutdist_null: bool = True, collate: bool = True, name: str = 'collated', verbose_name: bool = False, include_svg: bool = True, include_graph: bool = True, group: str = 'sample', portable: bool = False, collate_out_dir: str | Path | None = None, seed: int | None = None)
Run the entire workflow.
- Parameters:
out_dir (str | Path) – Write all output files to this directory [keyword-only, default: './out']
tmp_pfx (str | Path) – Write all temporary files to a directory with this prefix [keyword-only, default: './tmp']
keep_tmp (bool) – Keep temporary files after finishing [keyword-only, default: False]
brotli_level (int) – Compress pickle files with this level of Brotli (0 - 11) [keyword-only, default: 10]
force (bool) – Force all tasks to run, overwriting any existing output files [keyword-only, default: False]
num_cpus (int) – Use up to this many CPUs simultaneously [keyword-only, default: 4]
fastqz (Iterable) – FASTQ file(s) of single-end reads [keyword-only, default: ()]
fastqy (Iterable) – FASTQ file(s) of paired-end reads with mates 1 and 2 interleaved [keyword-only, default: ()]
fastqx (Iterable) – FASTQ files of paired-end reads with mates 1 and 2 in separate files [keyword-only, default: ()]
phred_enc (int) – Specify the Phred score encoding of FASTQ and SAM/BAM/CRAM files [keyword-only, default: 33]
demult (bool) – Enable demultiplexing [keyword-only, default: False]
barcode_start (int) – Index of start of barcode [keyword-only, default: 0]
barcode_end (int) – Index of end of barcode [keyword-only, default: 0]
read_pos (int) – Expected position of the barcode in the read (1-indexed). Defaults to --barcode-start [keyword-only, default: None]
barcode (tuple) – A list of barcode name, barcode sequence, and barcode position (1-indexed relative to read start) to demultiplex [keyword-only, default: ()]
mismatch_tolerance (int) – Maximum number of mismatches allowed for a sequence to still count as a valid match to the pattern. Non-parallel computation increases at a factorial rate, so use caution above 2 mismatches. Does not apply to clipped sequences. [keyword-only, default: 0]
index_tolerance (int) – Maximum distance from the reference index at which the pattern may be found in a read [keyword-only, default: 0]
allow_n (bool) – Allow N as a valid mismatch when --mismatch-tolerance ≥ 1. Increases memory consumption. [keyword-only, default: False]
dmfastqz (Iterable) – Demultiplexed FASTQ files of single-end reads [keyword-only, default: ()]
dmfastqy (Iterable) – Demultiplexed FASTQ files of paired-end reads interleaved in one file [keyword-only, default: ()]
dmfastqx (Iterable) – Demultiplexed FASTQ files of mate 1 and mate 2 reads [keyword-only, default: ()]
fastp (bool) – Use fastp to QC, filter, and trim reads before alignment [keyword-only, default: True]
fastp_5 (bool) – Trim low-quality bases from the 5’ ends of reads [keyword-only, default: False]
fastp_3 (bool) – Trim low-quality bases from the 3’ ends of reads [keyword-only, default: True]
fastp_w (int) – Use this window size (nt) for --fastp-5 and --fastp-3 [keyword-only, default: 6]
fastp_m (int) – Use this mean quality threshold for --fastp-5 and --fastp-3 [keyword-only, default: 25]
fastp_poly_g (str) – Trim poly(G) tails (two-color sequencing artifacts) from the 3’ end [keyword-only, default: 'auto']
fastp_poly_g_min_len (int) – Minimum number of Gs to consider a poly(G) tail for --fastp-poly-g [keyword-only, default: 10]
fastp_poly_x (bool) – Trim poly(X) tails (i.e. of any nucleotide) from the 3’ end [keyword-only, default: False]
fastp_poly_x_min_len (int) – Minimum number of bases to consider a poly(X) tail for --fastp-poly-x [keyword-only, default: 10]
fastp_adapter_trimming (bool) – Trim adapter sequences from the 3’ ends of reads [keyword-only, default: True]
fastp_adapter_1 (str) – Trim this adapter sequence from the 3’ ends of read 1s [keyword-only, default: '']
fastp_adapter_2 (str) – Trim this adapter sequence from the 3’ ends of read 2s [keyword-only, default: '']
fastp_adapter_fasta (str | None) – Trim adapter sequences in this FASTA file from the 3’ ends of reads [keyword-only, default: None]
fastp_detect_adapter_for_pe (bool) – Automatically detect the adapter sequences for paired-end reads [keyword-only, default: True]
fastp_min_length (int) – Discard reads shorter than this length [keyword-only, default: 9]
bt2_local (bool) – Align reads in local mode rather than end-to-end mode [keyword-only, default: True]
bt2_discordant (bool) – Output paired-end reads whose mates align discordantly [keyword-only, default: False]
bt2_mixed (bool) – Attempt to align individual mates of pairs that fail to align [keyword-only, default: False]
bt2_dovetail (bool) – Consider dovetailed mate pairs to align concordantly [keyword-only, default: False]
bt2_contain (bool) – Consider nested mate pairs to align concordantly [keyword-only, default: True]
bt2_score_min_e2e (str) – Discard alignments that score below this threshold in end-to-end mode [keyword-only, default: 'L,-1,-0.8']
bt2_score_min_loc (str) – Discard alignments that score below this threshold in local mode [keyword-only, default: 'L,1,0.8']
bt2_i (int) – Discard paired-end alignments shorter than this many bases [keyword-only, default: 0]
bt2_x (int) – Discard paired-end alignments longer than this many bases [keyword-only, default: 600]
bt2_gbar (int) – Do not place gaps within this many bases from the end of a read [keyword-only, default: 4]
bt2_l (int) – Use this seed length for Bowtie2 [keyword-only, default: 20]
bt2_s (str) – Seed Bowtie2 alignments at this interval [keyword-only, default: 'L,1,0.1']
bt2_d (int) – Discard alignments if over this many consecutive seed extensions fail [keyword-only, default: 4]
bt2_r (int) – Re-seed reads with repetitive seeds up to this many times [keyword-only, default: 2]
bt2_dpad (int) – Pad the alignment matrix with this many bases (to allow gaps) [keyword-only, default: 2]
bt2_orient (str) – Require paired mates to have this orientation [keyword-only, default: 'fr']
bt2_un (bool) – Output unaligned reads to a FASTQ file [keyword-only, default: True]
min_mapq (int) – Discard reads with mapping qualities below this threshold [keyword-only, default: 25]
sep_strands (bool) – Separate each alignment map into forward- and reverse-strand reads [keyword-only, default: False]
f1r2_fwd (bool) – With --sep-strands, consider forward mate 1s and reverse mate 2s to be forward-stranded [keyword-only, default: False]
rev_label (str) – With --sep-strands, add this label to each reverse-strand reference [keyword-only, default: '-rev']
min_phred (int) – Mark base calls with Phred scores lower than this threshold as ambiguous [keyword-only, default: 25]
min_reads (int) – Discard alignment maps with fewer than this many reads [keyword-only, default: 1000]
insert3 (bool) – Mark each insertion on the base to its 3’ (True) or 5’ (False) side [keyword-only, default: True]
ambindel (bool) – Mark all ambiguous insertions and deletions (indels) [keyword-only, default: True]
overhangs (bool) – Retain the overhangs of paired-end mates that dovetail [keyword-only, default: True]
clip_end5 (int) – Clip this many bases from the 5’ end of each read [keyword-only, default: 4]
clip_end3 (int) – Clip this many bases from the 3’ end of each read [keyword-only, default: 4]
batch_size (int) – Limit batches to at most this many reads [keyword-only, default: 65536]
write_read_names (bool) – Write the name of each read in a second set of batches (necessary for the options --mask-read or --mask-read-file) [keyword-only, default: False]
relate_pos_table (bool) – Tabulate relationships per position for relate data [keyword-only, default: True]
relate_read_table (bool) – Tabulate relationships per read for relate data [keyword-only, default: False]
relate_cx (bool) – Use a fast (C extension module) version of the relate algorithm; the slow (Python) version is still available as a fallback if the C extension cannot be loaded, and for debugging/benchmarking [keyword-only, default: True]
mask_coords (Iterable) – Select a region of a reference given its 5’ and 3’ end coordinates [keyword-only, default: ()]
mask_primers (Iterable) – Select a region of a reference given its forward and reverse primers [keyword-only, default: ()]
primer_gap (int) – Leave a gap of this many bases between the primer and the region [keyword-only, default: 0]
mask_regions_file (str | None) – Select regions of references from coordinates/primers in a CSV file [keyword-only, default: None]
count_del (bool) – Count deletions as mutations [keyword-only, default: True]
count_ins (bool) – Count insertions as mutations [keyword-only, default: True]
no_mut (Iterable) – Do not count this type of mutation (overrides --count-del/ins) [keyword-only, default: ()]
only_mut (Iterable) – Count only this type of mutation (overrides other mutation settings) [keyword-only, default: ()]
probe (str) – Use default mask options for this chemical probe [keyword-only, default: 'DMS']
mask_a (bool | None) – Mask positions with base A [keyword-only, default: None]
mask_c (bool | None) – Mask positions with base C [keyword-only, default: None]
mask_g (bool | None) – Mask positions with base G [keyword-only, default: None]
mask_u (bool | None) – Mask positions with base U [keyword-only, default: None]
mask_polya (int) – Mask stretches of at least this many consecutive A bases (0 disables) [keyword-only, default: 5]
mask_pos (Iterable) – Mask this position in this reference [keyword-only, default: ()]
mask_pos_file (Iterable) – Mask positions in references from a file [keyword-only, default: ()]
mask_read (Iterable) – Mask the read with this name [keyword-only, default: ()]
mask_read_file (Iterable) – Mask the reads with names in this file [keyword-only, default: ()]
mask_discontig (bool) – Mask paired-end reads with discontiguous mates [keyword-only, default: True]
min_ncov_read (int) – Mask reads with fewer than this many bases covering the region [keyword-only, default: 1]
min_fcov_read (float) – Mask reads covering less than this fraction of the region [keyword-only, default: 0.0]
min_finfo_read (float) – Mask reads with less than this fraction of informative base calls [keyword-only, default: 0.95]
max_fmut_read (float) – Mask reads with more than this fraction of mutated base calls [keyword-only, default: 1.0]
min_mut_gap (int) – Mask reads with two mutations separated by fewer than this many bases [keyword-only, default: None]
mut_collisions (str) – If two mutations are closer than --min-mut-gap positions, MERGE the mutations, DROP the read, or AUTO-select based on the probe. [keyword-only, default: 'auto']
min_ninfo_pos (int) – Mask positions with fewer than this many informative base calls [keyword-only, default: 1000]
max_fmut_pos (float) – Mask positions with more than this fraction of mutated base calls [keyword-only, default: 1.0]
quick_unbias (bool) – Correct observer bias using a quick (typically linear time) heuristic [keyword-only, default: True]
quick_unbias_thresh (float) – Treat mutated fractions under this threshold as 0 with --quick-unbias [keyword-only, default: 0.001]
max_mask_iter (int) – Stop masking after this many iterations (0 for no limit) [keyword-only, default: 0]
mask_pos_table (bool) – Tabulate relationships per position for mask data [keyword-only, default: True]
mask_read_table (bool) – Tabulate relationships per read for mask data [keyword-only, default: True]
cluster (bool) – Cluster reads to find alternative structures [keyword-only, default: False]
min_clusters (int) – Start at this many clusters [keyword-only, default: 1]
max_clusters (int) – Stop at this many clusters (0 for no limit) [keyword-only, default: 0]
min_em_runs (int) – Run EM (successfully) at least this number of times for each K [keyword-only, default: 6]
max_em_runs (int) – Run EM (successfully or not) at most this number of times for each K [keyword-only, default: 30]
jackpot (bool) – Calculate the jackpotting quotient to find over-represented reads [keyword-only, default: True]
jackpot_conf_level (float) – Confidence level for the jackpotting quotient confidence interval [keyword-only, default: 0.95]
max_jackpot_quotient (float) – Remove runs whose jackpotting quotient exceeds this limit [keyword-only, default: 1.1]
max_jackpot_sims (int) – Maximum number of simulations to compute the jackpotting quotient [keyword-only, default: 12]
jackpot_max_data (int) – Skip calculating the jackpotting quotient if reads × positions exceeds this limit [keyword-only, default: 268435456]
min_em_iter (int) – Run EM for at least this many iterations [keyword-only, default: 10]
max_em_iter (int) – Run EM for at most this many iterations [keyword-only, default: 500]
em_thresh (float) – Stop EM when the log likelihood increases by less than this threshold [keyword-only, default: 0.37]
min_marcd_run (float) – Remove runs with two clusters that differ by less than this MARCD [keyword-only, default: 0.016]
max_pearson_run (float) – Remove runs with two clusters more similar than this correlation [keyword-only, default: 0.9]
max_arcd_vs_ens_avg (float) – Remove runs where a cluster differs by more than this ARCD from the ensemble average at any position [keyword-only, default: 0.2]
max_gini_run (float) – Remove runs where any cluster’s Gini coefficient exceeds this limit [keyword-only, default: 0.667]
max_loglike_vs_best (float) – Remove Ks with a log likelihood gap larger than this (0 for no limit) [keyword-only, default: 0.0]
min_pearson_vs_best (float) – Remove Ks where every run has less than this correlation vs. the best [keyword-only, default: 0.97]
max_marcd_vs_best (float) – Remove Ks where every run has more than this MARCD vs. the best [keyword-only, default: 0.008]
try_all_ks (bool) – Try all numbers of clusters (Ks), even after finding the best number [keyword-only, default: False]
write_all_ks (bool) – Write all numbers of clusters (Ks), rather than only the best number [keyword-only, default: False]
cluster_pos_table (bool) – Tabulate relationships per position for cluster data [keyword-only, default: True]
cluster_abundance_table (bool) – Tabulate number of reads per cluster for cluster data [keyword-only, default: True]
verify_times (bool) – Verify that report files from later steps have later timestamps [keyword-only, default: True]
fold (bool) – Predict the secondary structure using the reactivities [keyword-only, default: False]
fold_coords (Iterable) – Fold a region of a reference given its 5’ and 3’ end coordinates [keyword-only, default: ()]
fold_primers (Iterable) – Fold a region of a reference given its forward and reverse primers [keyword-only, default: ()]
fold_regions_file (str | None) – Fold regions of references from coordinates/primers in a CSV file [keyword-only, default: None]
fold_full (bool) – If no regions are specified, whether to default to the full region or to the table’s region [keyword-only, default: True]
fold_dry_run (bool) – Only generate the fold command and input files; do not run folding [keyword-only, default: False]
fold_backend (str) – Model RNA structures using Fold (RNAstructure), ShapeKnots (RNAstructure), or RNAfold (ViennaRNA); auto selects Fold for DMS and RNAFold for other probes [keyword-only, default: 'auto']
fold_temp (float) – Predict structures at this temperature (Celsius) [keyword-only, default: 37.0]
fold_energy_method (str) – Use this method to incorporate reactivities into folding energies. auto selects Cordero for DMS and Eddy for other probes; Eddy requires --fold-backend=RNAFold; Cordero requires --fold-backend=Fold or ShapeKnots [keyword-only, default: 'auto']
deigan_slope (float) – Slope (kcal/mol) for SHAPE reactivities using Deigan method; used only with --fold-energy-method=Deigan [keyword-only, default: 1.8]
deigan_intercept (float) – Intercept (kcal/mol) for SHAPE reactivities using Deigan method; used only with --fold-energy-method=Deigan [keyword-only, default: -0.6]
fold_quantile (float) – Normalize and winsorize reactivities to this quantile for folding [keyword-only, default: 0.95]
fold_constraint (str | None) – Force bases to be paired/unpaired from a file of constraints [keyword-only, default: None]
fold_commands (str | None) – Command file for RNAFold [keyword-only, default: None]
eddy_prior_paired_file (str | None) – File of per-position prior probabilities of being paired for the Eddy method (passed as --sp-data with --sp-strategy Pp); only used with --fold-energy-method=Eddy and --fold-backend=RNAFold [keyword-only, default: None]
eddy_prior_unpaired_file (str | None) – File of per-position prior probabilities of being unpaired for the Eddy method (passed as --sp-data with --sp-strategy Pu); only used with --fold-energy-method=Eddy and --fold-backend=RNAFold [keyword-only, default: None]
fold_md (int) – Limit base pair distances to this number of bases (0 for no limit) [keyword-only, default: 0]
fold_mfe (bool) – Predict only the minimum free energy (MFE) structure [keyword-only, default: False]
fold_max (int) – Output at most this many structures (overridden by --fold-mfe) [keyword-only, default: 20]
fold_percent (float) – Stop outputting structures when the % difference in energy exceeds this value (overridden by --fold-mfe) [keyword-only, default: 20.0]
fold_isolated (bool) – Allow isolated (non-stacked) base pairs when folding [keyword-only, default: False]
draw (bool) – Draw secondary structures with RNArtist. [keyword-only, default: False]
struct_num (Iterable) – Draw the specified structure (zero-indexed) or -1 for all structures. By default, draw the structure with the best AUROC. [keyword-only, default: ()]
color (bool) – Color bases by their reactivity [keyword-only, default: True]
draw_svg (bool) – Output each drawing in a Scalable Vector Graphics file [keyword-only, default: True]
draw_png (bool) – Output each drawing in a Portable Network Graphics file [keyword-only, default: False]
update_rnartistcore (bool) – Check for and install updates to RNArtistCore. [keyword-only, default: False]
export (bool) – Export each sample to SEISMICgraph (https://seismicrna.org) [keyword-only, default: False]
samples_meta (str) – Add sample metadata from this CSV file to exported results [keyword-only, default: None]
refs_meta (str) – Add reference metadata from this CSV file to exported results [keyword-only, default: None]
all_pos (bool) – Export all positions (not just unmasked positions) [keyword-only, default: True]
cgroup (str) – Put each Cluster in its own file, each K in its own file, or All clusters in one file [keyword-only, default: 'k']
graph_quantile (float) – Normalize and winsorize ratios to this quantile (0 disables) [keyword-only, default: 0.0]
hist_bins (int) – Number of bins in each histogram; must be ≥ 1 [keyword-only, default: 10]
hist_margin (float) – Autofill margins of at most this width in histograms of ratios [keyword-only, default: 0.1]
struct_file (Iterable) – Compare mutational profiles to the structure(s) in this CT file [keyword-only, default: ()]
terminal_pairs (bool) – Include terminal base pairs (at the ends of stems) in ROC curves [keyword-only, default: True]
window (int) – Use a sliding window of this many bases [keyword-only, default: 45]
winmin (int) – Mask sliding windows with fewer than this number of data [keyword-only, default: 9]
csv (bool) – Output the data for each graph in a Comma-Separated Values file [keyword-only, default: True]
html (bool) – Output each graph in an interactive HyperText Markup Language file [keyword-only, default: True]
svg (bool) – Output each graph in a Scalable Vector Graphics file [keyword-only, default: False]
pdf (bool) – Output each graph in a Portable Document Format file [keyword-only, default: False]
png (bool) – Output each graph in a Portable Network Graphics file [keyword-only, default: False]
graph_mprof (bool) – Graph mutational profiles [keyword-only, default: True]
graph_tmprof (bool) – Graph typed mutational profiles [keyword-only, default: True]
graph_ncov (bool) – Graph coverages per position [keyword-only, default: True]
graph_mhist (bool) – Graph histograms of mutations per read [keyword-only, default: True]
graph_abundance (bool) – Graph abundance of each cluster [keyword-only, default: True]
graph_giniroll (bool) – Graph rolling Gini coefficients [keyword-only, default: False]
graph_roc (bool) – Graph receiver operating characteristic (ROC) curves [keyword-only, default: True]
graph_aucroll (bool) – Graph rolling areas under ROC curves (AUC-ROC) [keyword-only, default: False]
graph_poscorr (bool) – Graph phi correlations between positions [keyword-only, default: False]
graph_mutdist (bool) – Graph distances between mutations [keyword-only, default: False]
mutdist_null (bool) – Include the null distribution of distances between mutations [keyword-only, default: True]
collate (bool) – Collate HTML graphs and SVG drawings into an HTML report file. [keyword-only, default: True]
name (str) – Prefix the HTML report with this name. [keyword-only, default: 'collated']
verbose_name (bool) – Add collated file information to report name. [keyword-only, default: False]
include_svg (bool) – Include RNA structure drawings from the draw module. [keyword-only, default: True]
include_graph (bool) – Include graphs from the graph module. [keyword-only, default: True]
group (str) – Group collated graphs by one of 'sample', 'graph', 'branches', 'region', or 'all'. [keyword-only, default: 'sample']
portable (bool) – Embed collated graphs into the output HTML file for portability at the expense of live updates and file size. [keyword-only, default: False]
collate_out_dir (str | Path | None) – Write collated report to this directory. By default, write to the lowest level directory common to all input graphs. [keyword-only, default: None]
seed (int | None) – Seed for the random number generator [keyword-only, default: None]
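Several Bowtie2-related options above (bt2_score_min_e2e, bt2_score_min_loc, bt2_s) take strings such as 'L,1,0.8'. Assuming these strings are handed to Bowtie 2 unchanged (the documentation here does not say so explicitly), they follow Bowtie 2's function notation: a letter for the function type (C constant, L linear, S square root, G natural log), then a constant term and a coefficient, evaluated on the read length. The sketch below evaluates such a string; the helper name `eval_bt2_function` is hypothetical and not part of SEISMIC-RNA or Bowtie 2.

```python
import math


def eval_bt2_function(spec: str, read_len: int) -> float:
    """Evaluate a Bowtie 2 style function string for a given read length.

    Hypothetical helper illustrating the 'TYPE,constant,coefficient'
    notation; it is not part of SEISMIC-RNA or Bowtie 2.
    """
    parts = spec.split(",")
    kind = parts[0].strip().upper()
    const = float(parts[1])
    # A constant function may omit the coefficient.
    coeff = float(parts[2]) if len(parts) > 2 else 0.0
    if kind == "C":
        return const
    if kind == "L":
        return const + coeff * read_len
    if kind == "S":
        return const + coeff * math.sqrt(read_len)
    if kind == "G":
        return const + coeff * math.log(read_len)
    raise ValueError(f"Unknown function type: {kind!r}")


read_len = 150
# Documented defaults from the parameter list above.
for name, spec in [("bt2_score_min_loc", "L,1,0.8"),
                   ("bt2_score_min_e2e", "L,-1,-0.8"),
                   ("bt2_s", "L,1,0.1")]:
    print(name, spec, eval_bt2_function(spec, read_len))
```

With the default local-mode threshold, the minimum score grows linearly with read length, so longer reads need proportionally higher alignment scores to be kept.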