
RNA Data Analysis and Research
[Documentation]
Contents
a) Pairwise Structure Alignment
b) Constrained Structure Alignment
c) Multiple Structure Alignment
d) Database Search
e) Consensus Structure Prediction
f) Clustering
1. Data Format
There are two types of input data. The first type is the nested parenthesized notation representing an RNA secondary structure. Each structure takes three lines: a header line, a primary sequence line and a structure notation line. A sample structure looks like this:
>NM_003234:3394-3493 Homo sapiens transferrin receptor (p90, CD71) (TFRC), mRNA
GCTTTCTGTCCTTTTGGCACTGAGATATTTATTGTTTATTTATCAGTGACAGAGTTCACTATAAATGGTGTTTTTTTAATAGAATATAATTATCGGAAGC
((((((.((((....)).))...((((.........(((((((.(((((......))))))))))))(((((((......)))))))...))))))))))
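The structure notation line can be checked and decoded with a simple stack. The following is a minimal sketch, not the parser used by RSmatch; the function name `base_pairs` is an assumption made for illustration. It returns the paired positions (0-based) and raises an error when the parentheses do not balance.

```python
def base_pairs(structure):
    """Return sorted (i, j) index pairs for a dot-bracket string (0-based)."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Toy structure (not taken from the sample above): two nested base pairs.
print(base_pairs("((..))."))  # [(0, 5), (1, 4)]
```

Running the same check on a structure before submitting it is a cheap way to catch a truncated or mistyped notation line.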
The second type is the FASTA format for RNA sequences. For the sequence data, RSmatch2.0 will automatically invoke Vienna RNA v1.4 to fold the sequences into structures and then align these structures. A sample sequence in the FASTA format is like this:
>NM_003234:3394-3493 Homo sapiens transferrin receptor (p90, CD71) (TFRC), mRNA
GCTTTCTGTCCTTTTGGCACTGAGATATTTATTGTTTATTTATCAGTGACAGAGTTCACTATAAATGGTGTTTTTTTAATAGAATATAATTATCGGAAGC
Example:
The input can either be pasted into the text boxes or uploaded from a file.
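To prepare an input file offline, it can help to confirm which of the two data types each record belongs to. The sketch below is an illustration only, not RADAR's own input handling; the function name `read_records` and the dictionary layout are assumptions. It treats any line made up solely of dots and parentheses as a structure notation line and concatenates sequences that span several lines.

```python
def read_records(text):
    """Split input text into records: header, sequence, optional structure."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append({"header": line[1:], "sequence": "", "structure": None})
        elif not records:
            raise ValueError("input must start with a '>' header line")
        elif set(line) <= set("()."):
            # Only dots and parentheses: treat as the structure notation line.
            records[-1]["structure"] = (records[-1]["structure"] or "") + line
        else:
            records[-1]["sequence"] += line.upper()
    for rec in records:
        structure = rec["structure"]
        if structure is not None and len(structure) != len(rec["sequence"]):
            raise ValueError(f"length mismatch in record {rec['header']!r}")
    return records


# Toy input: one structure record followed by one sequence-only record.
sample = ">toy_structure\nGGGAAACCC\n(((...)))\n>toy_sequence\nGGGAAACCC\n"
for rec in read_records(sample):
    kind = "structure" if rec["structure"] is not None else "sequence only"
    print(rec["header"], "->", kind)
```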
2. Functions
RADAR has the following functions:
A) Pairwise Structure Alignment
This function aligns a query RNA sequence/structure against a set of RNA sequences/structures. The alignment can be either global or local based upon the user's requirements.
The input to this function can be either RNA structures or RNA sequences. The query structure is aligned with each structure in the set provided. The user can change several parameters, such as the gap penalty and the scoring matrix values.
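The difference between global and local alignment, and the effect of the gap penalty, can be illustrated on plain sequences. The sketch below is deliberately not RSmatch's structure alignment: it substitutes textbook sequence alignment (Needleman-Wunsch for global, Smith-Waterman for local), and the `match`, `mismatch` and `gap` values are hypothetical stand-ins for the scoring matrix and gap penalty parameters.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the best alignment score of a and b (global unless local=True)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for leading gaps; local alignment does not.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
            score[i][j] = cell
            best = max(best, cell)
    return best if local else score[-1][-1]


print(align_score("AAGCAUAA", "GCAU"))              # -4: four gaps are unavoidable
print(align_score("AAGCAUAA", "GCAU", local=True))  # 4: only the shared region counts
```

With these toy values, the global score is dragged down by the unmatched flanks, while the local score reflects only the shared region. This is why local alignment suits a short motif embedded in a longer molecule.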
The top hits are returned as the result. An example is shown below; it reports the score of each alignment and presents the alignment itself.
Output:

B) Constrained Structure Alignment
Constrained alignment is a powerful feature that allows RNA secondary structure alignment to be carried out dynamically based on prior knowledge about the RNA molecule or special requirements of the user. This function improves the performance of structure alignment by detecting structural similarity more accurately. The constrained region is annotated in the query with "*" as shown below.
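As an illustration of how a "*" annotation marks a region, the sketch below extracts the constrained positions from an annotation string. The layout is an assumption: it supposes, hypothetically, a separate annotation line of the same length as the query in which "*" marks constrained positions. The exact annotation layout that RADAR expects is not specified here, and the function name `constrained_regions` is invented for this example.

```python
def constrained_regions(annotation):
    """Return (start, end) index pairs, 0-based inclusive, for each run of '*'."""
    regions = []
    start = None  # start of the run currently being read, if any
    for pos, mark in enumerate(annotation):
        if mark == "*" and start is None:
            start = pos
        elif mark != "*" and start is not None:
            regions.append((start, pos - 1))
            start = None
    if start is not None:
        regions.append((start, len(annotation) - 1))
    return regions


# Hypothetical annotation line for a 9-nucleotide query.
print(constrained_regions("..***..**"))  # [(2, 4), (7, 8)]
```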
C) Multiple Structure Alignment
This function constructs a multiple local alignment for a given set of RNA structures by progressively expanding the alignment at each stage. It is useful when a small set of RNAs is functionally related by a shared motif, since the multiple local alignment can locate that motif.
As shown below, the input is a set of RNA structures/sequences. The result presents the multiple structure alignment obtained from this input, together with the start and end positions of the aligned region within each sequence.
Input:

Output:
D) Database Search
We have uploaded the consensus sequences and structures for all the non-coding RNA families reported in Release 8.0 of Rfam. This function searches a query RNA structure against this database: the query is aligned with each structure in the database and the best alignments are returned.
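The search step amounts to scoring the query against every database entry and keeping the best few. The sketch below shows only that ranking logic. The scoring function `toy_score` is a hypothetical stand-in (it merely counts identical symbols at the same position) and not the structure alignment score that RADAR computes; the family names and structures are made up.

```python
import heapq


def toy_score(query, target):
    """Hypothetical stand-in score: identical symbols at the same position."""
    return sum(1 for q, t in zip(query, target) if q == t)


def top_hits(query, database, score=toy_score, k=3):
    """Return the k best (score, name) pairs, highest score first."""
    scored = ((score(query, structure), name) for name, structure in database.items())
    return heapq.nlargest(k, scored)


# Made-up database of family name -> consensus structure.
database = {"fam1": "((..))", "fam2": "(....)", "fam3": "......"}
print(top_hits("((..))", database, k=2))  # [(6, 'fam1'), (4, 'fam2')]
```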
E) Consensus Structure Prediction
This function takes a group of RNA sequences as input and tries to find a structural motif common to them. RNA sequences known to perform a similar function are likely to share a good deal of structural similarity, so a detected motif is a useful clue to the function of the RNA molecule and a starting point for studying these molecules further.
The input is a set of RNA sequences as shown below.

The given RNA sequences are folded to obtain structures that fall within the given range of folding free energy.
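The energy filter can be pictured as follows. This is a sketch under stated assumptions: the candidate folds and their free energies (in kcal/mol) are made-up values, and the function name `within_energy_range` is hypothetical. In practice the candidates would come from the folding program.

```python
def within_energy_range(folds, min_energy, max_energy):
    """Keep (structure, energy) pairs with min_energy <= energy <= max_energy,
    ordered from the lowest (most stable) energy to the highest."""
    kept = [(s, e) for s, e in folds if min_energy <= e <= max_energy]
    return sorted(kept, key=lambda fold: fold[1])


# Made-up candidate folds for one 9-nucleotide sequence.
folds = [("((.....))", -1.1), ("(((...)))", -3.2), (".........", 0.0)]
print(within_energy_range(folds, -4.0, -1.0))
```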
F) Clustering
This method compares the given RNA structures against one another and outputs a similarity matrix of pairwise alignment scores. This matrix can be used for clustering the RNA structures. The output is as shown below.
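One way to use such a matrix is sketched below. Both parts are assumptions made for illustration: `toy_score` is a hypothetical stand-in for the pairwise alignment score, and single-linkage grouping at a fixed threshold is just one simple clustering choice, not a method prescribed by RADAR.

```python
def toy_score(a, b):
    """Hypothetical stand-in score: identical symbols at the same position."""
    return sum(1 for x, y in zip(a, b) if x == y)


def similarity_matrix(items, score=toy_score):
    """Build a symmetric matrix of pairwise scores."""
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = matrix[j][i] = score(items[i], items[j])
    return matrix


def single_linkage_clusters(matrix, threshold):
    """Group indices connected by a chain of scores >= threshold."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                parent[find(i)] = find(j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


structures = ["(((...)))", "((.....))", "........."]
matrix = similarity_matrix(structures)
print(matrix)                              # [[9, 7, 3], [7, 9, 5], [3, 5, 9]]
print(single_linkage_clusters(matrix, 6))  # [[0, 1], [2]]
```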
Output:

For any suggestions, comments or queries about this website, please contact jason.t.wang@njit.edu.