Sample results: Export data from SEISMIC-RNA to SEISMIC-GRAPH 

Sample results: Content format

Sample results files are in JSON format. Each file contains information about one sample in four nested layers: sample, reference, region, and profile.

Each layer contains data and metadata. Metadata fields begin with #, while data fields do not.
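The # convention above can be illustrated with a minimal sketch that separates one layer's metadata fields from its data fields. The function name split_layer and the example layer are hypothetical, written by hand for illustration; they are not part of SEISMIC-RNA.

```python
def split_layer(layer: dict) -> tuple[dict, dict]:
    """Separate metadata fields (keys starting with '#') from data fields."""
    metadata = {key: value for key, value in layer.items() if key.startswith("#")}
    data = {key: value for key, value in layer.items() if not key.startswith("#")}
    return metadata, data


# Hypothetical, hand-written sample layer (not real output).
sample_layer = {"#sample": "sample1", "ref1": {"#sequence": "GATTACA"}}
metadata, data = split_layer(sample_layer)
print(metadata)  # metadata fields of the sample layer
print(list(data))  # names of the references nested in the sample layer
```

The same function should work on any of the four layers, since all of them follow the same naming convention.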

Sample layer

The top layer describes the entire sample.

Sample metadata

#sample: Name of the sample.
#...: Additional fields from the sample metadata file (see Metadata for Samples).

Sample data

{ref}: Reference layer, keyed by the name of the reference.

Reference layer

The next layer describes one reference to which you aligned the sample.

Reference metadata

#sequence: DNA sequence of the reference.
#num_aligned: Number of reads that aligned to the reference.
#...: Additional fields from the reference metadata file (see Metadata for References).

Reference data

{reg}: Region layer, keyed by the name of the region.

Region layer

The next layer describes one region of the reference sequence.

Region metadata

#section_start: 5’ coordinate of the region.
#section_end: 3’ coordinate of the region.
#positions: List of positions in the region: all positions if you use --all-pos, otherwise only the unmasked positions.

Region data

{profile}: Profile layer.

Profile layer

The deepest layer describes one profile made from the region (a profile is a series of relationship data, from either the ensemble average or from one cluster).
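As a rough sketch of how the four layers nest, the following walks a loaded results dictionary from the sample layer down to every profile, skipping metadata fields at each level. The helper names (data_items, iter_profiles) and the example content, including the region name "full" and the profile name "average", are assumptions made for illustration only.

```python
from typing import Iterator


def data_items(layer: dict) -> Iterator[tuple[str, dict]]:
    """Yield only the data fields of a layer (keys that do not start with '#')."""
    for key, value in layer.items():
        if not key.startswith("#"):
            yield key, value


def iter_profiles(sample_layer: dict) -> Iterator[tuple[str, str, str, dict]]:
    """Yield (reference, region, profile name, profile layer) for every profile."""
    for ref, ref_layer in data_items(sample_layer):
        for reg, reg_layer in data_items(ref_layer):
            for profile, profile_layer in data_items(reg_layer):
                yield ref, reg, profile, profile_layer


# Hypothetical, hand-written example (names and numbers are invented).
example = {
    "#sample": "sample1",
    "ref1": {
        "#sequence": "GATTACA",
        "#num_aligned": 100,
        "full": {
            "#section_start": 1,
            "#section_end": 7,
            "#positions": [1, 4, 6],
            "average": {"cov": [100.0, 98.0, 97.0]},
        },
    },
}
profiles = list(iter_profiles(example))
for ref, reg, profile, profile_layer in profiles:
    print(ref, reg, profile, sorted(profile_layer))
```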

Profile data (per position)

Most fields of the profile are lists of decimal numbers, each number corresponding to one position in the region’s #positions field.

cov: Number of reads covering each position.
info: Number of informative reads at each position.
sub_N: Number of total substitutions at each position.
sub_A: Number of substitutions to A at each position.
sub_C: Number of substitutions to C at each position.
sub_G: Number of substitutions to G at each position.
sub_T: Number of substitutions to T at each position.
del: Number of deletions at each position.
ins: Number of insertions at each position.
sub_rate: Fraction of informative reads with substitutions at each position (sub_N divided by info).

Note

These fields will be computed only if you give a table of data per position (ending in per-pos.csv).
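To make the relationship between the per-position fields concrete, here is a small sketch that recomputes sub_rate as sub_N divided by info and pairs each value with its entry in #positions. The function names and numbers are hypothetical, and returning NaN where info is zero is an assumption of this sketch, not documented behavior of SEISMIC-RNA.

```python
import math


def compute_sub_rate(sub_n: list[float], info: list[float]) -> list[float]:
    """Divide sub_N by info position by position (NaN where info is zero)."""
    if len(sub_n) != len(info):
        raise ValueError("sub_N and info must have the same length")
    return [n / i if i > 0 else math.nan for n, i in zip(sub_n, info)]


def by_position(positions: list[int], values: list[float]) -> dict[int, float]:
    """Pair each per-position value with its position from #positions."""
    if len(positions) != len(values):
        raise ValueError("values must have one entry per position")
    return dict(zip(positions, values))


# Hypothetical numbers for a region whose #positions are 1, 4, and 6.
positions = [1, 4, 6]
sub_n = [2.0, 0.0, 5.0]
info = [100.0, 0.0, 50.0]
rates = by_position(positions, compute_sub_rate(sub_n, info))
print(rates)
```

The length checks reflect the statement above that each number corresponds to one position in #positions.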

Profile data (per read)

sub_hist: Histogram of substitutions per read: the first element is the number of reads with 0 substitutions, the second is the number of reads with 1 substitution, and so on.

Note

This field will be computed only if you give a table of data per read (ending in per-read.csv.gz).
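Because the index of each element of sub_hist is a substitution count, summary statistics can be derived from it directly. The sketch below, with a hypothetical function name and invented numbers, computes the total number of reads and the mean number of substitutions per read.

```python
import math


def summarize_sub_hist(sub_hist: list[float]) -> tuple[float, float]:
    """Return (total reads, mean substitutions per read) from a histogram."""
    total_reads = sum(sub_hist)
    # Element k counts the reads that have exactly k substitutions.
    total_subs = sum(n_subs * n_reads for n_subs, n_reads in enumerate(sub_hist))
    mean_subs = total_subs / total_reads if total_reads > 0 else math.nan
    return total_reads, mean_subs


# Hypothetical histogram: 50 reads with 0, 30 with 1, and 20 with 2 substitutions.
total_reads, mean_subs = summarize_sub_hist([50, 30, 20])
print(total_reads, mean_subs)
```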

Profile data (per cluster)

proportion: Proportion of the profile in the ensemble (only for profiles of clusters, not ensemble averages).

Note

This field will be computed only if you give a table of data per cluster (clust-freq.csv).

Sample results file: Path format

Sample results file extensions

SEISMIC-RNA accepts the following extensions for sample results files:

.json

Sample results path parsing

Sample results files are output in the main output directory with the name {sample}__webapp.json, where {sample} is the sample name.
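The naming rule above can be sketched as a pair of helpers: one that builds the expected path from an output directory and a sample name, and one that recovers the sample name from a path. Both function names are hypothetical and are not taken from the SEISMIC-RNA code base.

```python
from pathlib import Path

SUFFIX = "__webapp.json"  # per the {sample}__webapp.json naming rule


def sample_results_path(out_dir: str, sample: str) -> Path:
    """Build the path of the sample results file for one sample."""
    return Path(out_dir) / f"{sample}{SUFFIX}"


def sample_from_path(path) -> str:
    """Recover the sample name from a sample results file path."""
    name = Path(path).name
    if not name.endswith(SUFFIX):
        raise ValueError(f"not a sample results file: {name}")
    return name[: -len(SUFFIX)]


path = sample_results_path("out", "sample1")
print(path.name)
print(sample_from_path(path))
```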

Sample results file: Uses

Sample results as input file

Sample results are input files for the seismic-graph web app, which provides additional graphing utilities beyond those in SEISMIC-RNA.
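Since a sample results file is plain JSON, it can also be inspected outside the web app. The following minimal sketch uses only the Python standard library; it writes a tiny, hand-written stand-in file to a temporary directory so that it runs on its own, and the function name load_sample_results is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_sample_results(path) -> dict:
    """Read one sample results file into nested dictionaries."""
    with open(path, encoding="utf-8") as file:
        return json.load(file)


# Hypothetical, hand-written content standing in for a real results file.
example = {"#sample": "sample1", "ref1": {"#sequence": "GATTACA", "#num_aligned": 100}}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "sample1__webapp.json"
    path.write_text(json.dumps(example), encoding="utf-8")
    results = load_sample_results(path)

print(results["#sample"])
```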

Sample results as output file

seismic export outputs a sample results file for each sample.
