API Reference
The arguments for the CLI and Python API are the same. The CLI is just a wrapper around the Python API.
CLI cmds
dreem cluster
dreem cluster [OPTIONS]
Options
- --mp-report <mp_report>
Path to the bit vector folder or list of paths to the bit vector folders.
- --out-dir <out_dir>
Where to output all finished files.
- --max-procs <max_procs>
Maximum number of simultaneous processes.
- --max-clusters <max_clusters>
Maximum number of clusters.
- -n, --num-runs <num_runs>
Number of times to run the clustering algorithm.
- --signal-thresh <signal_thresh>
Minimum mutation fraction to keep a base.
- --include-gu, --exclude-gu
Whether to include G and U bases in reads.
- --include-del, --exclude-del
Whether to include deletions in reads.
- --polya-max <polya_max>
Maximum length of poly(A) sequences to include.
- --min-iter <min_iter>
Minimum number of iterations before checking convergence of EM.
- --max-iter <max_iter>
Maximum number of iterations before stopping EM.
- --convergence-cutoff <convergence_cutoff>
Minimum difference between the log-likelihoods of two consecutive iterations to stop EM.
- --min-reads <min_reads>
Minimum number of reads to start clustering.
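The options --min-iter, --max-iter and --convergence-cutoff together describe a stopping rule for EM. The sketch below is one plausible reading of that rule, not the dreem implementation. It assumes that EM stops once the change in log-likelihood between two consecutive iterations falls below the cutoff, that this check begins at iteration min_iter, and that max_iter is a hard cap. The names em_iterations and toy_log_likelihoods are hypothetical, and the log-likelihood sequence is synthetic.

```python
from itertools import count


def toy_log_likelihoods():
    """Synthetic, steadily improving log-likelihoods: -1000/1, -1000/2, ...

    Stand-in for the value a real EM step would report (hypothetical data).
    """
    for k in count(1):
        yield -1000.0 / k


def em_iterations(log_likelihoods, min_iter=100, max_iter=500,
                  convergence_cutoff=0.5):
    """Return how many iterations run before the assumed stopping rule fires."""
    previous = None
    done = 0
    for done, current in enumerate(log_likelihoods, start=1):
        # Hard cap: never run more than max_iter iterations.
        reached_cap = done >= max_iter
        # Assumption: convergence is only checked from iteration min_iter on.
        may_check = done >= min_iter and previous is not None
        # Assumption: "converged" means the change fell below the cutoff.
        converged = may_check and abs(current - previous) < convergence_cutoff
        if reached_cap or converged:
            break
        previous = current
    return done


print(em_iterations(toy_log_likelihoods(), min_iter=10))               # 46
print(em_iterations(toy_log_likelihoods()))                            # 100
print(em_iterations(toy_log_likelihoods(), min_iter=10, max_iter=20))  # 20
```

With the default min_iter of 100, the toy sequence has already flattened out before the first check, so the loop stops at exactly 100 iterations. Lowering min_iter lets the cutoff decide instead.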
Python args
- dreem.cluster.run(mp_report: tuple[str] = (), *, out_dir: str = './output', max_procs: int = 2, max_clusters: int = 3, num_runs: int = 10, signal_thresh: float = 0.005, include_gu: bool = False, include_del: bool = False, polya_max: int = 4, min_iter: int = 100, max_iter: int = 500, convergence_cutoff: float = 0.5, min_reads: int = 1000)
Run the clustering module.
- Parameters
  - mp_report (tuple) – Path to the bit vector folder or list of paths to the bit vector folders. [positional or keyword, default: ()]
  - out_dir (str) – Where to output all finished files. [keyword-only, default: './output']
  - max_procs (int) – Maximum number of simultaneous processes. [keyword-only, default: 2]
  - max_clusters (int) – Maximum number of clusters. [keyword-only, default: 3]
  - num_runs (int) – Number of times to run the clustering algorithm. [keyword-only, default: 10]
  - signal_thresh (float) – Minimum mutation fraction to keep a base. [keyword-only, default: 0.005]
  - include_gu (bool) – Whether to include G and U bases in reads. [keyword-only, default: False]
  - include_del (bool) – Whether to include deletions in reads. [keyword-only, default: False]
  - polya_max (int) – Maximum length of poly(A) sequences to include. [keyword-only, default: 4]
  - min_iter (int) – Minimum number of iterations before checking convergence of EM. [keyword-only, default: 100]
  - max_iter (int) – Maximum number of iterations before stopping EM. [keyword-only, default: 500]
  - convergence_cutoff (float) – Minimum difference between the log-likelihoods of two consecutive iterations to stop EM. [keyword-only, default: 0.5]
  - min_reads (int) – Minimum number of reads to start clustering. [keyword-only, default: 1000]
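Because the CLI wraps the Python API, each keyword argument has a matching command-line option. The sketch below shows that naming correspondence: underscores become hyphens, and booleans become an --include-x / --exclude-x pair. It does not import dreem. The helper build_cluster_argv and the path out/sample_1 are hypothetical, and repeating --mp-report once per path is an assumption about how the CLI accepts several folders.

```python
import shlex


def build_cluster_argv(mp_report=(), **options):
    """Translate dreem.cluster.run-style keyword arguments into CLI tokens."""
    argv = ["dreem", "cluster"]
    # Assumption: the CLI takes one --mp-report option per path.
    for path in mp_report:
        argv += ["--mp-report", str(path)]
    for name, value in options.items():
        flag = name.replace("_", "-")  # max_clusters -> max-clusters
        if isinstance(value, bool):
            # Boolean arguments map to a flag pair, e.g.
            # include_gu=True -> --include-gu, include_gu=False -> --exclude-gu
            if not flag.startswith("include-"):
                raise ValueError(f"unexpected boolean option: {name}")
            suffix = flag[len("include-"):]
            argv.append(f"--include-{suffix}" if value else f"--exclude-{suffix}")
        else:
            argv += [f"--{flag}", str(value)]
    return argv


argv = build_cluster_argv(("out/sample_1",), max_clusters=2, num_runs=5,
                          include_gu=False)
print(shlex.join(argv))
# dreem cluster --mp-report out/sample_1 --max-clusters 2 --num-runs 5 --exclude-gu
```

Following the signature above, the corresponding Python call would be dreem.cluster.run(("out/sample_1",), max_clusters=2, num_runs=5, include_gu=False).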