(DE) +49 (0) 7531 81 60 68
(FR) +33 - 97 04 46 743
(GB) +44 (0) 207 691 4921
(SE) +46 (0) 8 655 3609
customerservice@gatc-biotech.com
Opening hours:
8 am - 6 pm CET (Mon-Fri)
FAQ InView De novo Genome (PDF, 410 KB)
White Paper:
A higher level of de novo genome assembly (PDF, 585 KB)
A hybrid approach for the automated finishing of bacterial genomes, Nature Biotechnology:
Hybrid error correction and de novo assembly of single-molecule sequencing reads, Nature Biotechnology:
This innovative hybrid sequencing approach provides a high-quality genome assembly in less than 4 weeks! Ultra-long PacBio RS reads are corrected with high-accuracy GS FLX reads, resulting in an assembly with very low contig numbers, thus remarkably improving the overall efficiency of the entire sequencing and assembly project.
Ideal for small organisms
Sequencing technologies: SMRT technology PacBio RS combined with Roche GS FLX
Starting from high-quality genomic DNA
Average genome coverage: 25x (of error-corrected PacBio RS data)
Larger and fewer contigs, fewer mis-assemblies, time- and cost-effective
ISO 17025 compliant service standards
Determining the complete genome sequence of an organism is a long and complex process. Whole genome assemblies using a single NGS library and technology may result in hundreds of contigs alternating with gaps of unknown size. Determining the correct orientation and order of these contigs requires further secondary analyses, additional sequencing and more intensive bioinformatics. This “finishing” involves substantial additional costs and can delay research for weeks or months. High-quality assemblies are therefore crucial for further downstream analyses, e.g. genome annotation.
InView™ De novo Genome is a hybrid sequencing approach leading to a high-quality genome assembly via error correction. Intensive tests showed that the lower accuracy of the PacBio RS need not be a barrier to creating high-quality assemblies. The ultra-long reads of PacBio RS can be assembled with a very low mismatch rate when they are corrected with high-quality GS FLX reads. This innovative approach reduces the time and costs needed for manual curation of the assembly. The overall efficiency of an entire sequencing and assembly project therefore improves considerably.
Key benefits for your project:
Unique combination of PacBio RS & GS FLX technology: Longest read length achievable and high-accuracy reads
Generation of ultra-long corrected reads
Fewer and larger contigs (higher N50)
Fewer mis-assemblies
Time- and cost-effective method
Facilitates the analysis of unknown genomes
Determine the complete genome sequence in less than 4 weeks!
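The N50 mentioned in the list above is a common summary of contig sizes: it is the length of the contig at which the running total of contig lengths, taken from the largest contig downwards, first reaches half of the total assembly length. The following is a minimal sketch of that calculation; the function name `n50` and the example contig lengths are hypothetical and are not taken from any GATC data set.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp).

    Contigs are sorted from largest to smallest and summed until the
    running total reaches at least half of the total assembly length.
    The length of the contig that crosses this threshold is the N50.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the midpoint.
        if running * 2 >= total:
            return length


# Hypothetical contig lengths, for illustration only.
fragmented = [2, 2, 2, 3, 3, 4, 8, 8]
contiguous = [12, 20]
print(n50(fragmented), n50(contiguous))
```

Both example assemblies contain 32 bases in total, but the one with fewer, larger contigs has the higher N50.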
InView™ De novo Genome applications combine streamlined workflows with certified quality standards to provide customers worldwide with focused yet affordable Next Generation Sequencing solutions.
This new application helps to understand the genetic basis of phenotypes and to discover new metabolic pathways. Its major advantage is that it yields fewer and larger contigs, which makes it easier to obtain the primary sequence of an organism and to carry out a detailed genetic analysis. The method is ideal for particularly small organisms.
Starting material: 15 µg of high-quality genomic DNA
Sequencing platform: GS FLX and PacBio RS
Average genome coverage: 25x (of error-corrected PacBio RS data)
Sequencing data delivered:
GS FLX: 20-30x average genome coverage (raw data)
PacBio RS: 30x average genome coverage (raw data)
Data delivery in common formats, available for download from the myGATC online account
Delivery time: 4 weeks
Automated processing and LIMS controlled workflow under ISO 17025 accreditation
Barcode labeled samples
Real-time online access to project status via myGATC (secure personal account in myGATC / RSS feed)
Convenient data delivery in myGATC account
The E. coli DH1 genome was chosen to compare the quality of de novo assemblies built from data from different platforms. After assembly and mapping, the complete E. coli DH1 genome (about 4.6 megabases; Mb) is covered by 183 contigs assembled from Illumina HiSeq reads or by 77 contigs assembled from Roche GS FLX reads.
Using contigs produced from error-corrected PacBio RS sequences, the complete genome was covered by only 7 contigs. The size of the PacBio RS contigs is also remarkable: the largest contig is about 2.1 Mb and covers almost half of the E. coli DH1 genome. In addition, only a very small region of the genome (about 0.1%) is not covered.
| | HiSeq | GS FLX | PacBio corrected reads |
|---|---|---|---|
| # Reads | 4,630,708 | 820,756 | 240,689 |
| # Bases | 467,701,508 | 430,259,032 | 508,406,033 |
| Coverage (raw data) | 101x | 93x | 110x |
| # Contigs | 183 | 77 | 7 |
| Largest contig [bp] | 174,818 | 309,952 | 2,099,292 |
| Ø Contig size [bp] | 24,713 | 59,060 | 660,594 |
| # Bases in contigs | 4,522,607 | 4,547,660 | 4,624,161 |
| Bases of reference not covered | 108,100 | 83,047 | 6,546 |
Table 1: Comparison of the assemblies
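The derived figures in Table 1 can be recomputed from its raw counts. The sketch below does this under one assumption that is not stated in the source: that the reference length equals the bases in contigs plus the reference bases not covered. With that assumption, all three columns give the same reference length, and the coverage, average contig size and uncovered fraction match the table. The names `ASSEMBLIES` and `summarize` are illustrative only.

```python
# Raw counts copied from Table 1 (E. coli DH1 comparison).
ASSEMBLIES = {
    "HiSeq": {
        "bases": 467_701_508, "contigs": 183,
        "bases_in_contigs": 4_522_607, "not_covered": 108_100,
    },
    "GS FLX": {
        "bases": 430_259_032, "contigs": 77,
        "bases_in_contigs": 4_547_660, "not_covered": 83_047,
    },
    "PacBio corrected": {
        "bases": 508_406_033, "contigs": 7,
        "bases_in_contigs": 4_624_161, "not_covered": 6_546,
    },
}


def summarize(row):
    """Recompute the derived columns of Table 1 for one platform."""
    # Assumption: reference length = covered bases + uncovered bases.
    reference_bp = row["bases_in_contigs"] + row["not_covered"]
    return {
        "reference_bp": reference_bp,
        # Coverage = sequenced bases divided by reference length.
        "coverage": row["bases"] / reference_bp,
        # Average contig size = bases in contigs divided by contig count.
        "mean_contig_bp": row["bases_in_contigs"] / row["contigs"],
        # Share of the reference that no contig covers, in percent.
        "uncovered_pct": 100 * row["not_covered"] / reference_bp,
    }


for platform, row in ASSEMBLIES.items():
    s = summarize(row)
    print(
        f"{platform}: {s['coverage']:.0f}x coverage, "
        f"mean contig {s['mean_contig_bp']:,.0f} bp, "
        f"{s['uncovered_pct']:.2f}% of reference not covered"
    )
```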
For further information please download our White Paper A higher level of de novo genome assembly.
1. Where do I get my results?
All raw data as well as the analysed and assembled data can be downloaded via your secure myGATC online account.
2. What coverage should I use?
Internal tests showed that a coverage of about 25-30x with corrected PacBio RS reads is sufficient to produce high-quality genome assemblies. Assembly results highly depend on the composition of the analysed genome and may vary between different organisms.
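As a rough guide to what this coverage means in sequence volume, coverage is the total number of sequenced bases divided by the genome size. The sketch below turns a target coverage into a required number of corrected bases; the function name `bases_required` is hypothetical, and the 4.6 Mb genome size is simply the approximate E. coli DH1 size quoted above.

```python
def bases_required(genome_size_bp, target_coverage):
    """Return the number of bases needed to reach a target average coverage.

    Uses the relation: coverage = total bases / genome size.
    """
    if genome_size_bp <= 0 or target_coverage <= 0:
        raise ValueError("genome size and coverage must be positive")
    return genome_size_bp * target_coverage


# Example: a genome of about 4.6 Mb at the 25-30x range mentioned above.
genome_size = 4_600_000
low = bases_required(genome_size, 25)
high = bases_required(genome_size, 30)
print(f"{low / 1e6:.0f} to {high / 1e6:.0f} Mb of corrected reads")
```

This is an average only; as noted above, the coverage actually needed depends on the composition of the genome.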
3. Which and how much starting material should I send?
For sequencing on the PacBio RS it is crucial that high-quality DNA is used as starting material. Too little starting material, or material that is degraded, contaminated or otherwise damaged, can result in low yield or failure of the sample preparation and can impair the quality and amount of sequencing results. For optimal results we require at least 15 µg of double-stranded, purified, high-molecular-weight, RNA-free DNA (concentration approx. 200 ng/µl; OD 260/280 ≥ 1.8; OD 260/230 ≥ 1.9).
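Before shipping, the stated requirements can be checked against your own measurements. The following is a minimal sketch, not a GATC tool: the function name `check_sample` and its parameters are hypothetical, and only the thresholds (15 µg, OD 260/280 of 1.8, OD 260/230 of 1.9) come from the text above. The approximate concentration of 200 ng/µl is not treated as a hard limit here.

```python
MIN_MASS_UG = 15.0      # minimum total DNA amount from the requirements above
MIN_OD_260_280 = 1.8    # minimum OD 260/280 ratio
MIN_OD_260_230 = 1.9    # minimum OD 260/230 ratio


def check_sample(conc_ng_per_ul, volume_ul, od_260_280, od_260_230):
    """Return a list of problems found; an empty list means all checks pass."""
    problems = []
    # Total mass in µg = concentration (ng/µl) * volume (µl) / 1000.
    mass_ug = conc_ng_per_ul * volume_ul / 1000
    if mass_ug < MIN_MASS_UG:
        problems.append(f"only {mass_ug:.1f} µg DNA, need at least {MIN_MASS_UG:.0f} µg")
    if od_260_280 < MIN_OD_260_280:
        problems.append(f"OD 260/280 is {od_260_280}, need at least {MIN_OD_260_280}")
    if od_260_230 < MIN_OD_260_230:
        problems.append(f"OD 260/230 is {od_260_230}, need at least {MIN_OD_260_230}")
    return problems


# Example: 75 µl at 200 ng/µl gives exactly 15 µg.
print(check_sample(200, 75, 1.85, 2.0))
# Example: too little DNA and a low OD 260/280 ratio.
print(check_sample(200, 50, 1.7, 2.0))
```

Such a check covers only amount and purity ratios; DNA integrity and RNA contamination still need to be assessed separately, for example on an agarose gel.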
4. Do you guarantee a certain output?
Sequence data are assembled de novo under consideration of all read information, using optimized programs and parameters. The number of contiguous sequences (contigs) that can be unambiguously assembled depends on the complexity (frequency, length and distribution of highly repetitive and duplicate regions) of the sequenced genome and also the quality of the provided starting material. Therefore we cannot guarantee a minimum number of contigs.
5. What kind of quality controls do you perform?
The quality and quantity of each incoming sample will be determined by appropriate methods (e.g. agarose gel analysis / Qubit® Fluorometer / NanoDrop / Agilent 2100 Bioanalyzer). Further quality controls are performed at various steps of the process.
6. Where should I send my samples?
Send your samples by post to the following address:
GATC Biotech AG
European Genome and Diagnostics Centre
Jakob-Stadler Platz 7
D-78467 Konstanz, Germany
