RADAR

RNA Data Analysis and Research



Contents

Introduction to the RSmatch Software

Download & Installation

Usage Instructions


Introduction

Many ribonucleic acids (RNAs), including non-coding RNAs and cis elements in mRNAs, play important roles in gene regulation. Some of their functions are attributable to the structures they adopt, which are also called RNA motifs. Like sequence elements, RNA structure elements can be identified by comparing RNAs that contain similar structures. The RSmatch package provides a lightweight approach to comparing RNA structures and thereby uncovering functional structure elements. Compared with other tools for RNA structure comparison, RSmatch is fast: its running time grows quadratically with the sizes of the two given structures.

RSmatch uses two scoring schemes: position-independent and position-dependent. The position-independent scheme uses two scoring matrices, one for single-stranded regions and the other for double-stranded regions, and is applied in pair-wise comparisons and database searches. The position-dependent scheme, also known as a profile, scores individual structure positions and is used by the multiple structure alignment and iterative database search functions. RSmatch provides both global and local alignment options, although the latter is more useful in most cases. In addition, RSmatch can take pattern-based structures as input. Please see the following publication for details:

Liu, J., Wang, J.T., Hu, J., and Tian, B. A method for aligning RNA secondary structures and its application to RNA motif detection. BMC Bioinformatics 2005, 6:89

In the current version (2.0), RSmatch provides the following functions:
(1) Pair-wise comparison & database search
(2) Multiple structure alignment with an extended mode to compute common structure (NEW)
(3) Iterative database search (NEW)

It also provides the following utilities:
(1) Constrained alignment of RNA structures (NEW). The conserved region can be specified in two ways: (a) using phylogenetic information to derive the information content at each position of the RNA structure; (b) user-defined specification of the conserved region.
(2) Slide folding

These functions are described in detail below:

  1. Pair-wise comparison and Database Search (pmatch):

This function finds the RNA structures in the database that locally or globally match a given query structure. It can also be used to detect motif occurrences in an RNA structure database when the query structure is a known motif with a defined pattern.

  2. Multiple structure alignment with an extended mode to compute common structure (mmatch):

RSmatch constructs a multiple alignment for a given set of RNA structures by progressively expanding the alignment one structure at a time. This function is useful when a small set of RNAs is functionally related by a shared motif. It has been enhanced to compute a common structure for a group of related RNA sequences.

  3. Iterative database search (imatch):

For iterative database search, RSmatch repeatedly searches the database using a position-specific scoring matrix and updates the matrix with the latest results. This function can be much more sensitive than a regular database search, but at the cost of computing time.

Please contact jason.t.wang@njit.edu for comments/suggestions/queries. 


Download & Installation

RADAR fast download version for UNIX/Windows

RADAR.jar

Then, download the following files and place them in the directory containing the RADAR.jar file:

1) codeTable.properties

2) scoreMat.homo

3) scoreMat.pattern

4) scoreMat.structure

Note: The above files do not depend on the Vienna RNA package, so this version requires RNA secondary structures as input.

The following document provides instructions for executing the jar file:

Manual_RADAR

 

RADAR complete version for UNIX/Windows:

For UNIX

Note: This version is for UNIX only, since the Vienna RNA package, which we use to fold the RNA sequences, is available only for UNIX platforms.

Bug reports

6/18/2007: Fixed a bug: the program now detects when it enters an infinite loop while aligning two structures and, if so, proceeds to align the next structure instead of terminating the process.

6/30/2007: Changed the constrained alignment formula that calculates the bonus to be given for binary 0/1 constrained alignment from:
2 - (length of conserved region)/(total length)
to:
1 + (length of conserved region)/(total length).
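
The two formulas can be compared with a small worked example. This is only a sketch of the arithmetic stated in the bug report; the function names are hypothetical, and how RSmatch applies the bonus to the alignment score is not described here.

```python
def constrained_bonus(conserved_length: int, total_length: int) -> float:
    """Bonus for binary 0/1 constrained alignment (formula since 6/30/2007)."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    if not 0 <= conserved_length <= total_length:
        raise ValueError("conserved_length must lie between 0 and total_length")
    return 1 + conserved_length / total_length


def old_constrained_bonus(conserved_length: int, total_length: int) -> float:
    """Earlier formula, shown only for comparison."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    return 2 - conserved_length / total_length


# A 25-nt conserved region in a 100-nt structure.
print(constrained_bonus(25, 100))      # new formula
print(old_constrained_bonus(25, 100))  # old formula
```

With the new formula the bonus grows as the conserved region gets longer, whereas with the old formula it shrank.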

For Windows

Version 2.0 of RSmatch has also been implemented as a web-based tool, RADAR, which is freely accessible at http://aria.njit.edu/biodata/rna/RSmatch/server.htm.

Installation instructions for UNIX version of RSmatch2.0:

[A]   Install RSmatch 2.0

  1. Download RSmatch2.0.
  2. Extract the tar file to your installation directory, e.g. /home/RSmatch, by typing tar xvf RSmatch2.0.tar
  3. A directory named "RSmatch2.0" will appear. Switch to it by typing cd RSmatch2.0
  4. It contains a directory named "release2.0". Switch to it by typing cd release2.0
  5. Type ./RSmatch2.0 to run the program.

If the input data are RNA sequences in the FASTA format, follow these instructions to install RSmatch2.0 and Vienna RNA package v1.4.

[B]   Install Vienna RNA v1.4 & RSmatch2.0

  1. Download Vienna RNA package v1.4 and put it under the /home/RNA directory.
  2. Unpack the Vienna RNA package by typing gunzip < ViennaRNA-1.4.tar.gz | tar xvf -
  3. A directory named "ViennaRNA-1.4" under /home/RNA will appear. Switch to it by typing cd ViennaRNA-1.4
  4. Install the Vienna software by typing make all ; make install
  5. Set up the environment variable "VIENNA_HOME". If your command shell is bash, add export VIENNA_HOME=/home/RNA/ViennaRNA-1.4 to your .bashrc file (with no spaces around the = sign). If you use csh, add setenv VIENNA_HOME /home/RNA/ViennaRNA-1.4 to your .cshrc file. Log out and log in again for the change to take effect.
  6. Install and run RSmatch2.0 by following the instructions in [A] above. RSmatch2.0 will automatically invoke Vienna RNA v1.4 to fold the input sequences into structures and then align the structures.
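
Before running RSmatch2.0 on FASTA input, it may help to confirm that VIENNA_HOME is visible in the current session. The following is a minimal sketch of such a check; the function name resolve_vienna_home is hypothetical and is not part of RSmatch.

```python
import os
from pathlib import Path


def resolve_vienna_home(environ=None) -> Path:
    """Return the directory named by VIENNA_HOME, or raise if it is unusable."""
    env = os.environ if environ is None else environ
    value = env.get("VIENNA_HOME", "").strip()
    if not value:
        raise RuntimeError("VIENNA_HOME is not set; see step 5 above")
    path = Path(value)
    if not path.is_dir():
        raise RuntimeError(f"VIENNA_HOME points to {value!r}, which is not a directory")
    return path


if __name__ == "__main__":
    try:
        print("Vienna RNA found at", resolve_vienna_home())
    except RuntimeError as err:
        print("Problem:", err)
```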

Installation instructions for Windows version of RSmatch2.0:

[A]   Install RSmatch 2.0

  1. Download RSmatch2.zip.
  2. Extract the zip file to your installation directory.
  3. A directory named "RSmatch2.0" will appear. Switch to it by typing cd RSmatch2.0
  4. It contains a directory named "release2.0". Switch to it by typing cd release2.0
  5. Type java RSmatch to run the program (Make sure that you execute this from inside the directory release2.0).

Note: The Java CLASSPATH variable must be set so that Java searches the current directory.

Older versions:

RSmatch1.2


Usage instructions

[A]   Input:

There are two types of input data. The first type is the nested parenthesized notation representing an RNA secondary structure. Each structure occupies three lines: a header line, a primary sequence line and a structure notation line. A sample structure looks like this:
>NM_003234:3394-3493    Homo sapiens transferrin receptor (p90, CD71) (TFRC), mRNA
GCTTTCTGTCCTTTTGGCACTGAGATATTTATTGTTTATTTATCAGTGACAGAGTTCACTATAAATGGTGTTTTTTTAATAGAATATAATTATCGGAAGC
((((((.((((....)).))...((((.........(((((((.(((((......))))))))))))(((((((......)))))))...))))))))))
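
A structure file in this three-line layout can be sanity-checked before it is passed to RSmatch. The sketch below is not part of RSmatch; the names StructureRecord, check_structure and parse_structures are hypothetical. It assumes that the structure line uses only "(", ")" and ".", and that it has the same length as the sequence line.

```python
from typing import List, NamedTuple


class StructureRecord(NamedTuple):
    header: str
    sequence: str
    structure: str


def check_structure(sequence: str, structure: str) -> None:
    """Raise ValueError if the notation is unbalanced or mismatched in length."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure lines differ in length")
    depth = 0
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unmatched ')' at position {pos}")
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if depth != 0:
        raise ValueError(f"{depth} unmatched '(' in structure")


def parse_structures(text: str) -> List[StructureRecord]:
    """Split text into (header, sequence, structure) records, three lines each."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected three lines per structure")
    records = []
    for i in range(0, len(lines), 3):
        header, sequence, structure = lines[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"header line must start with '>': {header!r}")
        check_structure(sequence, structure)
        records.append(StructureRecord(header[1:], sequence, structure))
    return records


demo = ">hairpin demo\nGGGAAACCC\n(((...)))\n"
for rec in parse_structures(demo):
    print(rec.header, len(rec.sequence))
```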

The second type is the FASTA format for RNA sequences. For sequence data, RSmatch2.0 will automatically invoke Vienna RNA v1.4 to fold the sequences into structures and then align the structures. A sample sequence in the FASTA format looks like this:

>NM_003234:3394-3493    Homo sapiens transferrin receptor (p90, CD71) (TFRC), mRNA
GCTTTCTGTCCTTTTGGCACTGAGATATTTATTGTTTATTTATCAGTGACAGAGTTCACTATAAATGGTGTTTTTTTAATAGAATATAATTATCGGAAGC

[B]   Specification of conserved region:

a.      Use of phylogenetic information

The idea is as follows. If a set of very closely related RNA sequences is available for the query sequence, compute a multiple sequence alignment of these sequences using any of the available tools. The resulting alignment is used to derive the information content at each position of the RNA sequence. The information content is a value between 0 and 1 that indicates the degree of conservation at that position. RSmatch 2.0 provides the utility “cal_cons” for this purpose.

The utility takes as input a multiple sequence alignment in the form of a single block and outputs conservation factors.

Example:

Input : NM_000146_for_cons
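
To illustrate what a per-position conservation factor between 0 and 1 might look like, the sketch below scores each alignment column by the fraction of sequences that share its most common residue. This is an assumption made for illustration only: the exact formula used by cal_cons is not given here, and the function name conservation_factors is hypothetical.

```python
from collections import Counter
from typing import List


def conservation_factors(alignment: List[str], gap: str = "-") -> List[float]:
    """Score each column of a single-block alignment with a value in [0, 1]."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    factors = []
    for col in range(width):
        # Gaps do not count as residues, but they still lower the fraction.
        residues = [row[col].upper() for row in alignment if row[col] != gap]
        if not residues:
            factors.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        factors.append(top_count / len(alignment))
    return factors


block = ["GCAU", "GCAA", "GC-U", "GCGU"]
print(conservation_factors(block))
```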

b.      User-defined specification of conserved region

This is a simple 0/1 conservation scheme. The user marks each position that should be treated as conserved with a “*” placed below it.

Example: tab4_query.str
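
The 0/1 scheme can be pictured as a mask over the structure positions. The sketch below assumes a marker line aligned column by column under the structure line, which is an assumption about the layout rather than a description of the example file; the function name mask_from_markers is hypothetical.

```python
from typing import List


def mask_from_markers(structure: str, markers: str) -> List[int]:
    """Return 1 for each position marked with '*' and 0 for every other position."""
    if len(markers) > len(structure):
        raise ValueError("marker line is longer than the structure line")
    # A short marker line is padded so that unmarked trailing positions become 0.
    padded = markers.ljust(len(structure))
    return [1 if ch == "*" else 0 for ch in padded]


structure = "(((...)))"
markers = "***   ***"
mask = mask_from_markers(structure, markers)
print(mask, "conserved length:", sum(mask))
```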

[C]   Output:

The output of RSmatch2.0 gives detailed alignment information. The Stockholm format is adopted to display the output of multiple structure alignment.
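
Because the multiple alignment is written in the Stockholm format, it can be read back with a few lines of code. The following is a minimal sketch of a reader for the general Stockholm layout (sequence lines, "#=GC" per-column annotation lines, and a closing "//"). Which annotation lines RSmatch2.0 actually writes is not stated here, so the SS_cons line in the demo is an assumption; the function name read_stockholm is hypothetical.

```python
from typing import Dict, Tuple


def read_stockholm(text: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (aligned sequences, per-column annotations) from Stockholm text."""
    sequences: Dict[str, str] = {}
    column_annotations: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line == "//":
            continue
        if line.startswith("#=GC"):
            _, feature, data = line.split(None, 2)
            column_annotations[feature] = column_annotations.get(feature, "") + data.strip()
        elif line.startswith("#"):
            continue  # format header and other markup lines are skipped
        else:
            name, data = line.split(None, 1)
            # Concatenate so that interleaved blocks are joined per sequence.
            sequences[name] = sequences.get(name, "") + data.strip()
    return sequences, column_annotations


demo = """# STOCKHOLM 1.0
seq1 GGGAAACCC
seq2 GGGA-ACCC
#=GC SS_cons (((...)))
//
"""
seqs, annotations = read_stockholm(demo)
print(seqs)
print(annotations)
```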

[D]   Options for UNIX version:

You can find the general syntax of the command by typing ./RSmatch2.0.

The general syntax is as follows:

RSmatch2.0 [options]
General options:
-p [ pmatch | imatch | mmatch]
   choose a program from:
	pmatch: 	pair-wise comparison & database search;
	imatch: 	iterative database search;
	mmatch: 	multiple RNA structure alignment with an option for finding common structure;
-u [slide | cal_cons]
   choose a utility from:
	slide: 		slide fold RNA sequences
	cal_cons: 	calculate conservation factors for the multiple alignment
-D <database> 		FASTA-formatted sequence database.
-d <database> 		secondary structure database. 
-g <penalty> 		gap penalty (default -6).
-o <output> 		file to receive output, default to 'result.out'.
-r <range> 		range of folding free energy (kcal/mol), used to select
   			alternative RNA structures; default is 0.
-S <ratio> 		sliding step length, expressed as a ratio of <W_length>; default is 0.5.
-W <W_length> 		sliding window size; default is 100 nt.
-z F 			To turn off the slide folding
Options for the utility 'cal_cons'
-M <multiple seq. alignment file> The file containing the multiple seq. alignment
Options for 'pmatch', 'imatch':
-n <topN> 		output top 'topN' hits.
-Q <query> 		query sequence in FASTA format.
-q <query> 		query secondary structure.
Options for 'pmatch' :
-s <score_matrix> 	file containing position independent score matrices;
			default is 'scoreMat.structure'.
-G <alignment type>
	T: 		global alignment
	F: 		local alignment
			default: F
-m <query type>
query type:
	0: 		real structure without IUB code
	1: 		pattern structure containing IUB code
			default: 0
-c <conservation factors file> file that contains the conservation factors
Options for 'mmatch':
 'mmatch' accepts a dataset of RNA structures except when the following option is selected:
-A <enable prediction of common structure>
	T: enables prediction of common structure (Input for this has to be a dataset of RNA sequences)
Options for 'imatch':
-R <repeat> 		number of iterations
Options for 'ecompare':
-F <factor> 		the window-size decreasing rate. A series of window sizes are generated for
			folding sequences. The default <factor> is the ratio of two contiguous window
			sizes.
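
The relationship between -W and -S can be illustrated with a short calculation: with the defaults (window 100 nt, ratio 0.5) the window advances 50 nt at a time. The sketch below lists the resulting windows as half-open (start, end) intervals. How RSmatch treats the end of a sequence is not documented here, so placing the last window flush with the sequence end is an assumption, and the function name sliding_windows is hypothetical.

```python
from typing import List, Tuple


def sliding_windows(seq_length: int, window: int = 100,
                    step_ratio: float = 0.5) -> List[Tuple[int, int]]:
    """List 0-based, half-open (start, end) windows covering the sequence."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 < step_ratio <= 1:
        raise ValueError("step_ratio must be in (0, 1]")
    step = max(1, int(window * step_ratio))
    if seq_length <= window:
        return [(0, seq_length)]
    windows = []
    start = 0
    while start + window < seq_length:
        windows.append((start, start + window))
        start += step
    # Assumption: the last window is placed flush with the end of the sequence.
    windows.append((seq_length - window, seq_length))
    return windows


print(sliding_windows(250))
```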

        Examples:
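
As one illustration, a pmatch run can be assembled from the options listed above. The sketch below only builds the argument list; the file names db.str, query.str and hits.out are hypothetical placeholders, and the helper name pmatch_command is not part of RSmatch.

```python
from typing import List, Optional


def pmatch_command(database: str, query: str, top_n: Optional[int] = None,
                   local: bool = True, gap_penalty: Optional[int] = None,
                   output: Optional[str] = None,
                   program: str = "./RSmatch2.0") -> List[str]:
    """Build an argument list for a pair-wise comparison / database search."""
    cmd = [program, "-p", "pmatch", "-d", database, "-q", query]
    cmd += ["-G", "F" if local else "T"]  # F: local alignment, T: global alignment
    if top_n is not None:
        cmd += ["-n", str(top_n)]
    if gap_penalty is not None:
        cmd += ["-g", str(gap_penalty)]
    if output is not None:
        cmd += ["-o", output]
    return cmd


cmd = pmatch_command("db.str", "query.str", top_n=10, output="hits.out")
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True) from inside the release2.0 directory. Options that are left out fall back to the defaults described above.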

[E]   Options for Windows version:

You can find the general syntax of the command by typing  java RSmatch.

The general syntax is as follows:

RSmatch2.0 [options]
General options:
-p [ pmatch | imatch | mmatch]
   choose a program from:
	pmatch: 	pair-wise comparison & database search;
	imatch: 	iterative database search;
	mmatch: 	multiple RNA structure alignment with an option for finding common structure;
-d <database> 		secondary structure database. 
-g <penalty> 		gap penalty (default -6).  
-o <output> 		file to receive output, default to 'result.txt'.
Options for 'pmatch', 'imatch':
-n <topN> 		output top 'topN' hits.
-q <query> 		query secondary structure.
Options for 'pmatch' :
-s <score_matrix> 	file containing position independent score matrices;
			default is 'scoreMat.structure'.
-G <alignment type>
	T: 		global alignment
	F: 		local alignment
			default: F
-m <query type>
query type:
	0: 		real structure without IUB code
	1: 		pattern structure containing IUB code
			default: 0
Options for 'imatch':
-R <repeat> 		number of iterations (default: 5)
Note: Windows version of RSmatch 2.0 does not accept RNA sequences as input.

        Examples:



For any suggestions, comments or queries about this website, please contact jason.t.wang@njit.edu.