Section: Software
Keywords : DNA sequences, distant repeats, approximate repeats, similarity regions, local alignment, sequence comparison, spaced seeds.
YASS
Since 2003, we have been developing YASS, a software tool for computing similarity regions in genomic sequences (local alignment). YASS is more sensitive than the commonly used BLAST program, owing to a new alignment detection strategy and the optional use of spaced and transition-constrained seeds (see Section 6.1.2 below).
YASS is available from:
- the INRIA software web page, http://www.inria.fr/valorisation/logiciels/vie.fr.html
- the project URL, http://www.loria.fr/projects/YASS/
YASS can also be queried through a web server at http://yass.loria.fr/interface.php .
During this year, we continued the development of the YASS software. The main improvement was the introduction of multiple seeds, which further improves the speed/sensitivity trade-off. For example, using two seeds of weight 10 results in a search that is both more sensitive and more selective than the previous version of YASS, which used a single seed of weight 9. In practice, this yields a 20% speed-up in execution time and a more complete set of results.
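The idea of searching with several seeds at once can be illustrated with a minimal Python sketch. It is not the YASS implementation: the seed notation ('#' for a required match, '@' for a match or a transition A<->G / C<->T, '-' for an ignored position), the weight convention (each '@' counted as one half) and all function names are assumptions made for this illustration, and the scan is deliberately naive where a real tool would index the sequences.

```python
# Illustrative sketch only; notation and names are assumptions, not YASS code.
# '#' = positions must match, '@' = match or transition, '-' = ignored position.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_weight(seed):
    """Weight of a seed, assuming '#' counts 1, '@' counts 1/2 and '-' counts 0."""
    return seed.count("#") + 0.5 * seed.count("@")


def seed_hits_at(seed, s, i, t, j):
    """Return True if the seed matches s starting at i against t starting at j."""
    if i + len(seed) > len(s) or j + len(seed) > len(t):
        return False
    for k, symbol in enumerate(seed):
        a, b = s[i + k], t[j + k]
        if symbol == "#" and a != b:
            return False
        if symbol == "@" and a != b and (a, b) not in TRANSITIONS:
            return False
    return True


def find_hits(seeds, s, t):
    """Return the sorted position pairs (i, j) hit by at least one of the seeds.

    Naive quadratic scan, kept simple for readability.
    """
    hits = set()
    for seed in seeds:
        for i in range(len(s)):
            for j in range(len(t)):
                if seed_hits_at(seed, s, i, t, j):
                    hits.add((i, j))
    return sorted(hits)
```

With s = "ACGT" and t = "AGGT", the contiguous seed "###" finds nothing, whereas the pair of seeds ["###", "#-#"] reports a hit at (0, 0): a second seed recovers a similarity that the first one misses, which is the intuition behind the sensitivity gain of multiple seeds.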
Apart from multiple seeds, a number of more technical modifications have been made:
- the computation of the λ and K values of the Gumbel law has been improved according to the algorithm of Karlin and Altschul (1990). These two values are key parameters in estimating the p-value and E-value of the similarities found by YASS.
- the use of cache memory has been optimized; new hash tables have been added to speed up the computation of seed groups.
- parallelization of the program via multithreading has been introduced; this improves efficiency on dual-processor architectures as well as on dual-core or hyperthreaded processors.
- the computation of alignment scores has been improved by using the IUPAC alphabet for DNA representation, together with appropriate scoring matrices.
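To make the role of the two Gumbel parameters concrete, here is a minimal Python sketch of how λ can be obtained and how λ and K enter the E-value and p-value. It assumes the standard Karlin-Altschul setting for ungapped local alignments: λ is the unique positive root of sum over (a, b) of p_a p_b exp(λ s(a, b)) = 1, the E-value is K m n exp(-λ S), and the p-value is 1 - exp(-E). The function names are hypothetical, K is taken as an input rather than computed (its computation is more involved), and this is not the code used in YASS.

```python
import math


def karlin_altschul_lambda(freqs, score, tol=1e-12):
    """Solve sum_{a,b} p_a * p_b * exp(lam * s(a, b)) = 1 for lam > 0 by bisection.

    freqs maps each letter to its background probability; score(a, b) returns
    the substitution score. Both are illustrative inputs.
    """
    pairs = [(freqs[a] * freqs[b], score(a, b)) for a in freqs for b in freqs]
    expected = sum(p * s for p, s in pairs)
    if expected >= 0 or not any(s > 0 for _, s in pairs):
        raise ValueError("need a negative expected score and a positive score")

    def f(lam):
        return sum(p * math.exp(lam * s) for p, s in pairs) - 1.0

    # f is convex, f(0) = 0 and f'(0) < 0, so f is negative just after 0 and
    # crosses zero exactly once for lam > 0. Grow hi until the root is bracketed.
    lo, hi = 0.0, 1.0
    while f(hi) < 0:
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def e_value(score_value, m, n, lam, K):
    """Expected number of chance alignments scoring at least score_value."""
    return K * m * n * math.exp(-lam * score_value)


def p_value(score_value, m, n, lam, K):
    """Probability of at least one such chance alignment: 1 - exp(-E)."""
    return -math.expm1(-e_value(score_value, m, n, lam, K))
```

As a sanity check, with uniform nucleotide frequencies and a +1/-1 match/mismatch scoring, the equation reduces to 0.25 x^2 - x + 0.75 = 0 with x = exp(λ), whose nontrivial root is x = 3, so λ = ln 3.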
The web server of YASS has been substantially improved:
- the interface for choosing/downloading the sequences has been improved,
- the multiple seeds option has been integrated,
- the format of the output pages has been improved,
- session management has been introduced, in order to avoid possible collisions between simultaneous executions.
A description of the YASS software and its web server appeared this year in Nucleic Acids Research [11].