Sample results: Export data from SEISMIC-RNA to SEISMIC-GRAPH
Sample results: Content format
Sample results files are in JSON format. Each file contains information about one sample in four nested layers:
data:image/s3,"s3://crabby-images/dbd6d/dbd6d7b5c5e2c9b4281e8ec3d260fabc170cc04f" alt="../../_images/sample.png"
Each layer contains data and metadata.
Metadata fields begin with #
, while data fields do not.
Sample layer
The top layer describes the entire sample.
Sample metadata
#sample
: Name of the sample.#...
: Additional fields from the sample metadata file (see Metadata for Samples).
Sample data
{ref}
: Reference layer, keyed by the name of the reference.
Reference layer
The next layer describes one reference to which you aligned the sample.
Reference metadata
#sequence
: DNA sequence of the reference.#num_aligned
: Number of reads that aligned to the reference.#...
: Additional fields from the reference metadata file (see Metadata for References).
Reference data
{reg}
: Region layer, keyed by the name of the region.
Region layer
The next layer describes one region of the reference sequence.
Region metadata
#section_start
: 5’ coordinate of the region.#section_end
: 3’ coordinate of the region.#positions
: List of positions in the region; will contain all positions if you use--all-pos
, otherwise only unmasked positions.
Region data
{profile}
: Profile layer.
Profile layer
The deepest layer describes one profile made from the region (a profile is a series of relationship data, from either the ensemble average or from one cluster).
Profile data (per position)
Most fields of the profile are lists of decimal numbers, each number
corresponding to one position in the region’s #positions
field.
cov
: Number of reads covering each position.info
: Number of informative reads at each position.sub_N
: Number of total substitutions at each position.sub_A
: Number of substitutions to A at each position.sub_C
: Number of substitutions to C at each position.sub_G
: Number of substitutions to G at each position.sub_T
: Number of substitutions to T at each position.del
: Number of deletions at each position.ins
: Number of insertions at each position.sub_rate
: Fraction of informative reads with substitutions at each position (sub_N
divided byinfo
).
Note
These fields will be computed only if you give a table of data per
position (ending in per-pos.csv
).
Profile data (per read)
sub_hist
: Histogram of substitutions per read: the first element is the number of reads with 0 substititions, the second the number of reads with 1 substitution, and so on.
Note
This field will be computed only if you give a table of data per
read (ending in per-read.csv.gz
).
Profile data (per cluster)
proportion
: Proportion of the profile in the ensemble (only for profiles of clusters, not ensemble averages).
Note
This field will be computed only if you give a table of data per
cluster (clust-freq.csv
).
Sample results file: Path format
Sample results file extensions
SEISMIC-RNA accepts the following extensions for sample results files:
.json
Sample results path parsing
Sample results files are output in the main output directory with the
name {sample}__webapp.json
, where {sample}
is the sample name.
Sample results file: Uses
Sample results as input file
Sample results are input files for the seismic-graph web app, which provides additional graphing utilities beyond those in SEISMIC-RNA.
Sample results as output file
seismic export
outputs a sample results file for each sample.