DE NOVO GENOME ANALYSIS
Overview – What is de novo genome sequencing?
De novo genome sequencing is used to resolve the primary genetic sequence of a specific prokaryotic or eukaryotic organism. Next-generation sequencing (NGS) is performed on uncharacterised genomes for which no prior knowledge of the nucleotide sequence exists and no reference sequence is available.
Applications – What are the advantages of de novo genome sequencing?
De novo genome sequencing is ideal for:
- Sequencing of uncharacterised prokaryotic genomes, like bacterial genomes
- Sequencing of unknown eukaryotic genomes, including plant and human genomes
- Sequencing of known genomes with significant variation
- Gap closure and finishing of complex genomes with relatively high amounts of similar or repetitive regions
- Analysis of structural variants and complex rearrangements, including copy number variations, inversions and translocations
- Simultaneous acquisition of epigenetic information and sequencing data
Workflow – de novo genome sequencing methods & technologies
The process of de novo genome sequencing involves sequencing small DNA fragments, assembling the resulting reads into longer contiguous sequences (contigs) and finally ordering the contigs to obtain the entire genome sequence.
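The assembly step described above can be illustrated with a toy greedy overlap-merge routine. This is only a minimal sketch, not the assembly algorithm used by GATC Biotech: the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical, and the code assumes error-free reads that all come from the same strand. Real assemblers must also handle sequencing errors, reverse complements and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, or 0 if that overlap is shorter than `min_len`."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge reads into contigs by repeatedly joining the pair of
    sequences with the largest suffix/prefix overlap."""
    # Drop exact duplicates (keeping order) and reads contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]

    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            # No remaining pair overlaps: what is left are the final contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads (toy data) collapse into a single contig.
reads = ["ATGCGTAC", "CGTACGTT", "CGTTAGC"]
print(greedy_assemble(reads))
```

When reads share no sufficiently long overlap, the routine returns them as separate contigs, which mirrors why gaps remain in an assembly until additional data links the contigs.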
Different de novo genome assembly methods are available. Often, a hybrid approach is used in which short reads sequenced at higher depth are used to error-correct longer reads from a second library. This de novo genome sequencing strategy therefore requires two libraries, two runs and two data sets.
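The idea behind hybrid error correction can be sketched as a per-position majority vote. The example below is an illustration under strong assumptions rather than any specific tool's method: the short reads are assumed to be already aligned to the long read and supplied as hypothetical `(offset, sequence)` pairs, only substitution errors are considered (no insertions or deletions), and the name `correct_long_read` and the `min_depth` threshold are invented for this sketch.

```python
from collections import Counter


def correct_long_read(
    long_read: str,
    aligned_short_reads: list[tuple[int, str]],
    min_depth: int = 3,
) -> str:
    """Replace each base of `long_read` with the majority base of the
    short reads covering it, if coverage at that position is deep enough."""
    # One vote counter per position of the long read.
    votes = [Counter() for _ in long_read]
    for offset, seq in aligned_short_reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    corrected = []
    for original, counter in zip(long_read, votes):
        depth = sum(counter.values())
        if depth >= min_depth:
            base, count = counter.most_common(1)[0]
            # Only overwrite when a strict majority of short reads agree.
            corrected.append(base if count * 2 > depth else original)
        else:
            # Too little short-read coverage: keep the long-read base.
            corrected.append(original)
    return "".join(corrected)


# Toy data: the long read carries one substitution error at position 4.
long_read = "ACGTTCGTAC"
short_reads = [(0, "ACGTAC"), (2, "GTACGT"), (4, "ACGTAC"), (3, "TACGT")]
print(correct_long_read(long_read, short_reads))
```

The `min_depth` threshold reflects why the short-read library is sequenced at higher depth: positions with too few covering reads are left unchanged rather than corrected on weak evidence.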
By contrast, GATC Biotech's analysis of PacBio reads uses a non-hybrid assembly algorithm, which can generate the longest contigs with a minimum number of misassemblies. The hierarchical genome assembly process (HGAP) (Fig. 1) takes advantage of multiple alignments of all reads to obtain an accurate de novo genome sequence in which even extended repetitive regions are successfully resolved. The PacBio RS II platform also provides the exclusive opportunity to gain additional epigenetic information within the same sequencing run.
Scientific expertise: de novo genome sequencing
GATC Biotech has been involved in several key de novo genome sequencing projects. In 1993, GATC Biotech participated in the sequencing of the first yeast chromosome. In 2006, GATC Biotech carried out sequencing for the Potato Genome Sequencing Consortium (PGSC).
GATC Biotech has now sequenced and assembled hundreds of genomes, perfecting protocols for prokaryotic and eukaryotic genomes and improving workflows for finishing genomes of more complex organisms. Our use of cutting-edge single-molecule real-time (SMRT) technology and proprietary genome assembly algorithms provides an accurate method for de novo genome analysis. GATC Biotech has sequenced thousands of genomes of bacteria, fungi, algae and higher eukaryotes. Please contact us to see how you can benefit from our capabilities.
INVIEW DE NOVO GENOME 2.0 was applied in the KLEBSICURE Consortium, a cooperation project led by Arsanis Biosciences GmbH (Vienna, Austria), with GATC Biotech AG, the Max Planck Institute for Infection Biology (Berlin, Germany) and the Ludwik Hirszfeld Institute of Immunology and Experimental Therapy (Wroclaw, Poland) as consortium members. The main objective of the project is the identification and characterisation of the pathogen Klebsiella pneumoniae, which causes severe infections. The unique combination of de novo sequencing and detection of base modifications was used to study the virulence of the pathogen, to guide the generation of monoclonal antibody therapeutics and to establish a test method for clinical diagnostics.
Find here a list of selected research articles supported by GATC Biotech's sequencing products, including articles on de novo genome sequencing.
Related products to de novo genome sequencing
Did you know that de novo genome sequencing can be accomplished not only quickly, but also cost-efficiently? Simply take advantage of our complete service package including expert library preparation, sequencing on the leading PacBio platform, professional BioIT analysis and a final comprehensive GATC Data Analysis report.
For cases where sequencing data for a specific organism already exist and direct comparison to a reference genome is possible, targeted sequencing or whole-genome resequencing might be the right service for you. Take advantage of our products to help you discover and validate single-base mutations, insertions, deletions and structural variants.
Further reading on de novo genome sequencing
Rhoads, A., Au, K.F. PacBio Sequencing and Its Applications. Genomics Proteomics Bioinformatics. doi: 10.1016/j.gpb.2015.08.002 (2015).
Baker, M. De novo genome assembly: what every biologist should know. Nature Methods 9, 333–337 (2012).