EULER Software

EULER-input: Classify reads
EULER reads two input files: one contains the read sequences in FASTA format and the other contains the Phred quality values of these reads. The input read sequences should be vector-free. For convenience, the user can use the output file of Phred after vector screening as the input file. The Phred quality values are optional for EULER, though they help with trimming the input reads and often give better assembly results.
EULER-input classifies the reads into four categories: shotgun reads, finishing reads, super-long reads and ambiguous reads. Most EULER programs only process shotgun reads. If the length of a read exceeds some threshold (e.g. 1200 bp), EULER treats it separately: sometimes such a read is of bad quality, and sometimes it is a pre-assembled or known fragment that was intentionally inserted into the reads. The user may contact the administrator to explain the sources of these super-long reads; otherwise they will not be used in EULER-assembly. If a read contains too many ambiguous positions (N or X), it is considered an ambiguous read and will not be used in the assembly either.
The user may indicate that a read is a finishing read by using the suffix .f in its name (or another rule as specified in the file name.rul). Finishing reads are used in EULER-Connect to connect the pre-assembled contigs, but are not used in other parts of the procedure. The mate-pair information is extracted from the names of the reads (using rules specified in the file name.rul) and written to a separate file of mate-pairs, which is used in EULER-DB and EULER-SF.
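The classification rules above can be illustrated with a minimal Python sketch. This is not EULER's code: the function name classify_read, the 10% cutoff for "too many" ambiguous positions, and the order in which the checks are applied are all assumptions made for illustration. Only the 1200 bp example threshold, the N/X characters and the .f suffix come from the description above.

```python
def classify_read(name, seq, max_len=1200, max_ambiguous_frac=0.1,
                  finishing_suffix=".f"):
    """Assign a read to one of the four categories described above.

    max_ambiguous_frac and the order of the checks are illustrative
    assumptions; EULER's actual cutoffs are not given in this document.
    """
    # Reads longer than the threshold are set aside as super-long.
    if len(seq) > max_len:
        return "super-long"
    # Count ambiguous positions (N or X), case-insensitively.
    ambiguous = sum(1 for base in seq.upper() if base in "NX")
    if seq and ambiguous / len(seq) > max_ambiguous_frac:
        return "ambiguous"
    # Default naming rule: the suffix .f marks a finishing read.
    if name.endswith(finishing_suffix):
        return "finishing"
    return "shotgun"
```

A real run would read the names and sequences from the FASTA file and take the naming rule from name.rul rather than from a hard-coded suffix.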
EULER-Trim: Trim the reads
EULER-Trim trims the unreliable parts at the ends of the reads according to the Phred quality values. If the Phred quality values are not available, EULER-Trim can make its own estimate of the reliable region of all reads and cut both ends. Since the optimal error rate for running EULER is 0.02, the ends of the reads beyond the region (50, 550) usually need to be cut.
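The following Python sketch shows one way such trimming could work. It is not EULER-Trim's algorithm. It assumes the standard Phred relation between a quality value Q and an error probability p, p = 10^(-Q/10), treats 0.02 as a per-base error cutoff, and reads the region (50, 550) as 0-based, end-exclusive coordinates. The names trim_read and phred_to_error are hypothetical.

```python
import math

def phred_to_error(q):
    """Standard Phred relation: error probability for quality value q."""
    return 10 ** (-q / 10)

def trim_read(seq, quals=None, max_error=0.02, fallback=(50, 550)):
    """Trim unreliable ends of a read (illustrative sketch only).

    With quality values: keep the span from the first to the last base
    whose error probability is at most max_error.
    Without quality values: keep the fixed fallback region.
    """
    if quals is None:
        start, end = fallback
        return seq[start:end]
    # Quality value corresponding to max_error (about 17 for 0.02).
    min_q = -10 * math.log10(max_error)
    good = [i for i, q in enumerate(quals) if q >= min_q]
    if not good:
        return ""  # no reliable base at all
    return seq[good[0]:good[-1] + 1]
```

Only the ends are cut in this sketch; a low-quality base in the interior of the retained span is kept.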
EULER-EC: Error correction
EULER-EC corrects sequencing errors by examining the multiplicity of each k-mer in the reads. EULER-EC can usually correct about 95%-98% of the errors in the input reads, making them suitable for the EULER equivalent transformation algorithms. A file step.inp may be given to control how the error correction is done.
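To make the notion of k-mer multiplicity concrete, the Python sketch below counts how often each k-mer occurs in a set of reads and lists those that occur rarely. It only illustrates the counting step, not the correction procedure of EULER-EC. The function names and the min_multiplicity cutoff are assumptions, and reverse complements are ignored for simplicity.

```python
from collections import Counter

def kmer_multiplicities(reads, k):
    """Count the occurrences of every k-mer over all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def weak_kmers(reads, k, min_multiplicity=2):
    """Return the k-mers seen fewer than min_multiplicity times.

    Rare k-mers are candidates for containing a sequencing error;
    the cutoff used here is an illustrative assumption.
    """
    counts = kmer_multiplicities(reads, k)
    return {kmer for kmer, n in counts.items() if n < min_multiplicity}
```

For the reads ACGTA, ACGTA and ACGGA with k = 3, the k-mers CGG and GGA each occur once and are reported as weak, which points to the single differing base in the third read.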
EULER-Chimdet: Detect chimeric reads
A chimeric read is an experimental error in which fragments from two different parts of the genome are combined into a single read. EULER-Chimdet detects suspected chimeric and unreliable reads in the input read set and moves them into separate files of chimeric reads and unreliable reads. It also further trims the unreliable ends of the reads that are retained.
More precisely, an unreliable position is a position such that all the l-mers covering it only occur in this one read; an unreliable read is a read with fewer than 100 reliable positions; and a read is assumed to be a chimeric read when its reliable positions form two or more noncontiguous segments.
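These definitions translate fairly directly into code. The Python sketch below is an illustration, not EULER-Chimdet itself: the function names are hypothetical, reverse complements are ignored, and testing for an unreliable read before testing for a chimeric one is an assumption. The threshold of 100 reliable positions is the one stated above.

```python
from collections import defaultdict

def lmer_owners(reads, l):
    """Map each l-mer to the set of indices of the reads containing it."""
    owners = defaultdict(set)
    for idx, read in enumerate(reads):
        for i in range(len(read) - l + 1):
            owners[read[i:i + l]].add(idx)
    return owners

def reliable_flags(read, l, owners):
    """Mark a position reliable if some l-mer covering it occurs in
    more than one read (the negation of the definition above)."""
    flags = [False] * len(read)
    for i in range(len(read) - l + 1):
        if len(owners[read[i:i + l]]) > 1:
            for j in range(i, i + l):
                flags[j] = True
    return flags

def count_segments(flags):
    """Count the maximal runs of consecutive reliable positions."""
    count, prev = 0, False
    for flag in flags:
        if flag and not prev:
            count += 1
        prev = flag
    return count

def classify(read, l, owners, min_reliable=100):
    flags = reliable_flags(read, l, owners)
    if sum(flags) < min_reliable:
        return "unreliable"
    if count_segments(flags) >= 2:
        return "chimeric"
    return "ok"
```

The owners table must be built from the same read set that contains the read being classified, so that an l-mer occurring only in that read maps to a single index.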
EULER-ET: Equivalent transformation of the de Bruijn graph
EULER-ET performs the equivalent transformation of the de Bruijn graph constructed from the k-mers in the reads. This program produces the first assembly of the reads.
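For readers unfamiliar with the underlying structure, the Python sketch below builds a de Bruijn graph from reads in the usual way: every k-mer becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. It covers graph construction only and does not attempt the equivalent transformation itself. The function name and the adjacency-set representation are assumptions for illustration.

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph as {prefix: set of suffixes}.

    Vertices are (k-1)-mers; each distinct k-mer in the reads
    contributes one edge from its prefix to its suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph
```

With the reads ACGT and CGTA and k = 3, the graph is the single path AC, CG, GT, TA, which spells the sequence ACGTA.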
EULER-DB: Equivalent transformation with mate-pairs
EULER-DB further simplifies the graph by doing equivalent transformations with the superpaths constructed from mate-pairs. This leads to the second assembly of the reads.
EULER-TR: Tangle resolution
EULER-TR is an ongoing project and has not been integrated into the current EULER web server. It tries to resolve nearly perfect repeats by profile-based classification of the reads in the repeat region.
EULER-Consensus: Making consensus
EULER-Consensus chooses the most frequent nucleotide at each position in the assembled graph and outputs a more reliable sequence for each contig.
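A majority vote of this kind can be sketched in a few lines of Python. The sketch assumes the reads covering a contig have already been laid out as aligned strings of bases, with "-" marking positions a read does not cover. That input format and the function name consensus are assumptions, not the representation EULER-Consensus uses internally.

```python
from collections import Counter

def consensus(aligned_reads):
    """Return the most frequent base in each column of the alignment.

    "-" marks a position not covered by a read and is skipped;
    columns with no covering base produce no output character.
    """
    length = max(len(r) for r in aligned_reads)
    out = []
    for pos in range(length):
        column = [r[pos] for r in aligned_reads
                  if pos < len(r) and r[pos] != "-"]
        if column:
            out.append(Counter(column).most_common(1)[0][0])
    return "".join(out)
```

For the three aligned reads --GTA, ACGT- and ACTTA, the third column holds G, G and T, so the vote yields G and the consensus is ACGTA.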
EULER-Connect: Connect the contigs by full-length reads
EULER-Connect links the contigs using the original full-length reads (before trimming).
EULER-SF: Scaffolding the contigs
EULER-SF connects the assembled contigs into large scaffolds using the mate-pairs.
It is applied twice: once to the contigs produced by EULER-DB, and once to the contigs produced by EULER-Connect.
EULER-PCR: Designing PCR experiments
EULER-PCR helps design optimal large-scale PCR experiments to resolve repeats that cannot be resolved by the regular EULER assembly.
EULER-Compare: Comparing contigs
EULER-Compare compares contigs assembled by different assemblers from the same reads. See euler-compare for details.