As a non-invasive test for cancer research and diagnostics, liquid biopsy has already gained considerable traction. The market is hot and business analysts anticipate further growth as liquid biopsy is increasingly adopted by healthcare providers. Clinical acceptance will likely be boosted by several advantages the tests have over traditional tissue biopsies, including lower total test cost, quicker turnaround times, the ability to capture tumour heterogeneity, the ability to monitor recurrence and the minimally invasive nature of the sampling.

Currently, liquid biopsies exploit three major biomarker classes. A report by Research and Markets identifies over 50 liquid biopsy tests presently offered on the market. Of these, 50% are based on the detection of cancer biomarkers in circulating tumour DNA (ctDNA), roughly 37% on the characterisation of circulating tumour cells (CTCs) and the remaining 13% on exosome analysis.

A Kalorama Information report shows that the genes most commonly analysed in cell-free DNA (cfDNA) include BRAF, EGFR, ESR1, KRAS, MET, PIK3CA, TP53, KIT and PDGFRA. The report points to a variety of clinical uses of liquid biopsies in oncology, including early detection, identification of mutations for targeted therapy, patient stratification, companion diagnostics, tracking of minimal residual disease, characterisation of molecular heterogeneity, monitoring of tumour dynamics and metastases, and cancer prognosis, among others.

Financially, liquid biopsies promise to be a lucrative investment. Research and Markets predicts that the global liquid biopsy market will reach nearly $4.5 billion by 2020, with the cancer application segment expected to make up $2.5 billion of it. Research and Markets also predicts that four major cancer types, prostate cancer, breast cancer, colorectal cancer and lung cancer, will be the main market drivers by 2030, accounting for over 70% of the total liquid biopsy market.

Convinced of the potential of liquid biopsy to transform patient care, GATC Biotech has established a unique service line for non-invasive analysis of cfDNA. GATCLIQUID offers three services for accurate tumour mutation profiling from blood. GATCLIQUID ONCOEXOME is a unique service for whole exome sequencing of cfDNA that provides an unbiased overview of all mutations in protein-coding regions. GATCLIQUID ONCOPANEL ALL-IN-ONE is a next-generation sequencing based cancer panel that offers targeted screening of key cancer drivers. GATCLIQUID ONCOTARGET enables ultra-sensitive monitoring of the most important tumour mutation in a given case. Together, the services aim to improve cancer research and diagnostics today and in the years to come.

References

Kalorama Information. (2017). Cell-free DNA (cfDNA): Market Size and Share Analysis (Report No. KLI15188961).

Research and Markets. (2016). Liquid Biopsy Research Tools, Services and Diagnostics: Global Markets (Report No. 3632954).

Research and Markets. (2015). Non-Invasive Cancer Diagnostics Market, 2015-2030 (Report No. 3454294).

Introduction

A recent publication by Sinha et al. from April 2017 stimulated a lively discussion about a phenomenon referred to as “index hopping”, “index swapping” or “barcode mis-assignment”. It occurs when multiplexed samples are sequenced on Illumina's HiSeq 3000/4000/X Ten systems using Exclusion-Amplification (ExAmp) chemistry. The authors observed that “up to 5-10% of sequencing reads are incorrectly assigned from a given sample to other samples in a multiplexed pool”. Illumina reacted with a white paper describing the impact and best practices for minimising barcode mis-assignment and reported “index hopping” rates of below 2% on patterned flow cells. “Index hopping” rates depended on the library preparation method, with the highest rates for PCR-free libraries and libraries contaminated with free adapters and primers. While the underlying mechanism remains elusive, the overall consensus from Illumina's white paper, as well as from bloggers at Enseqlopedia and the UC Davis Genome Centre, is that clean sequencing libraries are essential for sequencing on the HiSeq 3000/4000/X Ten. Moreover, they declared that “for the majority of applications ’index hopping’ between clean libraries will be minimal and will have minimal or no impact on the data analysis”.

Results

Since we run a large number of HiSeq 4000 projects, we took the matter very seriously and started digging into our data to assess the level of “index hopping” at GATC Biotech. From two recent HiSeq 4000 sequencing runs, 5 lanes with 5 to 9 libraries per lane were selected, comprising different library types: strand-specific RNA libraries from different organisms (2 lanes), exome-enriched DNA libraries (WES; 1 lane) and ChIP libraries (2 lanes).

The number of reads with unexpected dual index combinations not matching the combinations of the loaded libraries was retrieved from the file ‘DemuxSummaryF1L[1-8].txt’, which is generated automatically for each lane during demultiplexing. For each possible dual index combination (both those present in the pool and all combinations not present in the pool), the number of reads was divided by the total number of reads of the lane to obtain percent index representation values. The results of one lane with 6 ChIP libraries with unique i7 and i5 indices are shown in Figure 1, and those of another lane with 8 RNA libraries in Figure 2. The observed levels of “index hopping” were substantially lower than the ones reported in the Illumina white paper. The background read distribution appears to be random, as every possible combination of indices was detected. We could not observe a significant correlation between the library type and the level of “index hopping”. The three other analysed lanes containing RNA, ChIP and WES libraries showed similar levels of “mis-assignments” (data not shown). Analysing all “index hopping” events across the 5 lanes, a median value of 0.008% was determined (Figure 3).

Summing up all “index hopping” events per lane gave a cumulative median frequency of 0.27% per lane. Applying the same measure to the example data presented in Illumina's white paper (its Figure 3) yields a cumulative “index hopping” rate of 1.59%, nearly six times higher than what we observed at GATC Biotech.
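
For readers who want to run a similar check on their own data, the calculation boils down to a few lines of code. The following Python sketch uses made-up read counts; parsing of the DemuxSummary file itself is omitted because its layout can differ between software versions, so the per-combination counts are simply assumed to be available as a dictionary.

```python
# Minimal sketch: quantify "index hopping" from per-index-combination read counts.
# Counts are assumed to be available as a dict mapping (i7, i5) index pairs to
# read counts; all numbers below are made up for illustration.

from itertools import product

def index_hopping_report(counts, loaded_pairs):
    """Return percent representation per index pair and the cumulative hop rate."""
    total = sum(counts.values())
    percent = {pair: 100.0 * n / total for pair, n in counts.items()}
    # Unexpected combinations: dual indices that do not match any loaded library
    unexpected = {pair: pct for pair, pct in percent.items() if pair not in loaded_pairs}
    cumulative_hop_rate = sum(unexpected.values())  # per-lane rate, in percent
    return percent, unexpected, cumulative_hop_rate

# Example: 6 libraries with unique i7 and i5 indices (cf. Figure 1)
i7 = ["A1", "A2", "A3", "A4", "A5", "A6"]
i5 = ["B1", "B2", "B3", "B4", "B5", "B6"]
loaded = set(zip(i7, i5))
counts = {pair: 50_000_000 if pair in loaded else 4_000 for pair in product(i7, i5)}

_, unexpected, rate = index_hopping_report(counts, loaded)
print(f"cumulative 'index hopping' on this lane: {rate:.3f}%")
```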

Discussion

The data presented was derived from ongoing customer projects and was randomly selected. We assume that library preparation has the greatest influence on the level of “index hopping”. As our library preparation process results in clean, high-quality libraries (i.e. no detectable primer and adapter dimers), we consequently observe extremely low rates of “index hopping”. At GATC Biotech most steps of the library preparation are automated using liquid handlers and very strict purification steps are performed, which seem to mitigate this effect to nearly negligible levels (Figure 4).

With our workflow, a non-uniquely dual indexed library may contain on average 0.008% of reads originating from a library sharing one of its index sequences. This equals roughly 1 mis-assigned read per 12,500 correctly assigned reads. For example, if a non-uniquely dual indexed library was loaded at approximately 10% of total reads (e.g. 30 million read pairs) per lane and this library was affected by “index hopping” because another library on the lane shared one of its indices, then 0.08% of the reads (e.g. 24,000 read pairs) of the affected library would originate from the contaminating library.
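
Spelled out as a short calculation (the figures are the ones quoted above; the total lane output of 300 million read pairs is an assumption implied by loading the library at roughly 10% of the lane):

```python
# Carry-over estimate from the example above.
lane_read_pairs    = 300_000_000   # total lane output (assumed)
library_fraction   = 0.10          # affected library makes up ~10% of the lane
hop_rate           = 0.00008       # 0.008% of total lane reads per index combination

library_read_pairs = lane_read_pairs * library_fraction   # 30,000,000
misassigned_pairs  = lane_read_pairs * hop_rate            # 24,000
share_of_library   = misassigned_pairs / library_read_pairs

print(f"{misassigned_pairs:,.0f} read pairs ≈ {share_of_library:.2%} of the affected library")
```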

Does this level of read mis-assignment influence data interpretation?
For many study types, such as whole genome sequencing, whole exome sequencing and bisulphite sequencing, no influence is expected.

This includes re-sequencing projects aiming to detect minor allele frequencies down to 1%, for which a sequencing depth of 300x average coverage is usually recommended. This means that at least 3 unique reads with a specific mutation are needed in order to call a mutation. At 300x average coverage and an “index hopping” rate of 0.08%, there is a less than 30% chance that even a single mis-assigned read carrying the mutation is detected, which is well below the threshold of 3 mutated reads. Moreover, this will only be the case if the mutation was present at 100% in the “contaminating library”. If the mutation frequency is lower, the likelihood of carry-over is reduced even further. Therefore, rare mutation detection studies are very unlikely to be affected at GATC Biotech. If the “contaminating library” belongs to a different organism, most of the “index hopping” reads will not map, leaving the experimental data unaffected.
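
The “less than 30% chance” figure can be checked with a simple Poisson approximation, sketched below; it assumes mis-assigned reads fall independently on the position of interest and that the mutation is present at 100% in the contaminating library.

```python
import math

# Chance of seeing at least one mis-assigned mutant read at a given position.
coverage = 300      # average sequencing depth
hop_rate = 0.0008   # 0.08% of the library's reads are mis-assigned (see above)

expected_reads = coverage * hop_rate             # 0.24 mis-assigned reads on average
p_at_least_one = 1 - math.exp(-expected_reads)   # Poisson P(>= 1) ≈ 0.21, i.e. below 30%

print(f"P(>=1 mis-assigned mutant read) ≈ {p_at_least_one:.2f}")
```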

Another concern is RNA-Seq on the HiSeq 4000, where gene expression levels can vary substantially between sample types or treatments. The impact on an experiment, however, is in most cases very low. For example, if a cell line upregulated a certain transcript 100-fold upon treatment with a compound, e.g. from 10 FPKM to 1,000 FPKM, “index hopping” could increase the FPKM of the untreated control from 10 to about 11 FPKM (~0.1% of 1,000 FPKM). The resulting fold change is therefore not substantially different.
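
Put as numbers, using the figures from the example above:

```python
# Impact of "index hopping" on an RNA-Seq fold change (figures from the example above).
treated_fpkm = 1000.0
control_fpkm = 10.0
hop_fraction = 0.001   # ~0.1% of the treated library's signal leaks into the control

control_observed     = control_fpkm + treated_fpkm * hop_fraction   # 10 -> 11 FPKM
true_fold_change     = treated_fpkm / control_fpkm                  # 100x
observed_fold_change = treated_fpkm / control_observed              # ~91x

print(f"fold change: {true_fold_change:.0f}x true vs {observed_fold_change:.0f}x observed")
```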

Nevertheless, for single-cell RNA-Seq, where commonly up to 384 libraries are pooled on a single lane, it is recommended to use uniquely indexed libraries if very different cell types are analysed.

Overall, similar to the reports from Sinha et al. and Illumina, we observed “index hopping” on the HiSeq 4000, but at significantly lower levels. Our experience with GATC's proprietary library preparation protocols and high degree of automation shows that this effect can be reduced by preparing high-quality libraries and applying stringent purification and size selection steps.

In any case, we will continue to monitor “index hopping” on a regular basis to ensure only the highest quality standards are achieved for our customers. 

References

1. Sinha R et al. (2017). Index Switching Causes “Spreading-Of-Signal” Among Multiplexed Samples In Illumina HiSeq 4000 DNA Sequencing. bioRxiv. doi: 10.1101/125724.

2. Illumina (2017). Effects of Index Misassignment on Multiplexing and Downstream Analysis [white paper].

3. Enseqlopedia.com (2017). Update on @illumina index-swapping [Blog post].

4. Froenicke L. (2017). Update on Barcode Mis-Assignment Issue [Blog post]. 

Happy DNA Day!

24.04.2017 | Detlef Janssen

Today is none other than DNA Day! The special day is celebrated every year on April 25 to commemorate the first publication of DNA structure in 1953, as well as the completion of the Human Genome Project in 2003.

DNA Day was first marked on April 25, 2003 in the United States. Annual DNA Day celebrations have since been organised by the National Human Genome Research Institute. The purpose of the event is to offer students, teachers and the public an opportunity to learn about the latest advances in genomic research.

GATC Biotech is proud to offer expertise in the DNA sequencing field, ranging from Sanger sequencing to whole genome sequencing to targeted sequencing. But besides technical knowledge, we’ve also found out a thing or two that can get anyone excited about DNA. Read some DNA fun facts below:

1. Half-man, half-microorganism
Not quite, but humans harbour as many as 145 genes that have jumped from bacteria, viruses or other single-celled organisms through the process of horizontal gene transfer. Most of these genes play established roles in metabolism, immune responses and other biochemical processes.

2. No T-Rex resurrection

Scientists believe that DNA has a half-life of 521 years. This means that at a temperature of -5°C, every bond would be destroyed after a maximum of 6.8 million years. DNA would cease to be readable much earlier, at roughly 1.5 million years. Bad luck for T-rex, as the last dinosaurs died out about 65 million years ago.

3. Are you a pumpkin head?
Humans and pumpkins share about 75% common DNA. About 98% of our genetic make-up is identical to chimpanzees and human-to-human genetic variation is only 0.5% to 1%.

4. Get out of jail free card
DNA-based evidence has exonerated more than 300 wrongly convicted prisoners in the U.S. since 1989. Twenty of these prisoners have been on death row.

5. DNA goes sugar-free
Xeno nucleic acid (XNA) is a synthetic alternative to DNA. XNA is created by exchanging DNA's sugar group for any number of artificially produced molecules. Six of these XNAs already exist, including glycol nucleic acid (GNA), threose nucleic acid (TNA) and peptide nucleic acid (PNA).

6. To Pluto and back? You’ve got it in you! 
If the DNA in all cells of the human body was uncoiled, it would stretch 16 billion kilometres. Depending on the location in their orbits, the distance from Earth to Pluto varies between 4 and 7.5 billion kilometres.

7. DNA in the cell’s power generator
Human mitochondrial DNA (mtDNA) encodes only 37 genes. Of these, 13 code for proteins of the electron transport chain and the rest code for transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs). In mammals, mtDNA is usually inherited from the mother, as mitochondria in mammalian sperm are usually lost or destroyed in the process of fertilisation.

8. DNA and CSI
In forensics, DNA profiling is based on the polymerase chain reaction (PCR) and uses short tandem repeats (STRs) that are highly variable. DNA analysts in North America look at 13 specific DNA loci, whereas those based in the UK use a 17-loci system. The odds that two individuals have the same 13-loci DNA profile are about one in a billion.

9. An octoploid coffee bean
Humans are diploid organisms with two sets of 23 chromosomes, or 46 in total. Some Coffea arabica coffee plants are octoploid, with eight sets of 11 chromosomes, or 88 in total.

10. All in a day’s work
It takes about 8 hours for a mammalian cell to completely copy its DNA. Human DNA replicates at a rate of about 50 nucleotides per second, starting from tens of thousands of origins of replication. In contrast, E. coli DNA replicates at a rate of 1,000 nucleotides per second from a single origin of replication, and the process takes about 40 minutes.

11. Birds of a feather flock together
A controversial 2014 study of 2,000 Americans found that people tend to befriend those with similar DNA to their own. The authors analysed 500,000 markers from across the genome to conclude that friends share about 0.1% more DNA than strangers. This level of similarity is expected for fourth cousins.

12. Should Anne of Green Gables join X-Men?
Red hair, freckles and blue eyes are genetically considered mutations. Red hair appears in people with a recessive allele on chromosome 16, which produces an altered version of the MC1R protein. The MC1R gene is also often implicated in the presence of freckles. A specific mutation in the HERC2 gene, which affects the function of OCA2, is strongly linked to the appearance of blue eyes.

Analysis of single nucleotide polymorphisms (SNPs) and insertions and deletions (InDels) in the human exome is one of the most popular applications of next-generation sequencing (NGS). More and more clinical researchers are turning to exome sequencing to help in the diagnosis, prognosis, treatment and prevention of disorders caused by genomic abnormalities.  

Clinical samples can be quite challenging to sequence. This is especially true for starting material commonly used for tumour mutation profiling, such as formalin-fixed, paraffin-embedded (FFPE) tissue and blood samples from which cell-free DNA (cfDNA) is extracted. Below we offer insights into why FFPE-extracted DNA and cfDNA are difficult to sequence and how optimisation of certain steps can enable efficient exome sequencing of cfDNA and FFPE samples.

A blood sample from a cancer patient contains not only circulating tumour DNA (ctDNA) but also high levels of cfDNA from non-cancerous cells. Moreover, ctDNA levels tend to vary significantly between patients and depend on the cancer type and the health status of the patient. The short ctDNA fragment length of only about 160 bp complicates the analysis even further. Often, DNA isolation from plasma yields as little as 1 to 10 ng of DNA per ml of plasma. With variant allele fractions as low as 1% in early stages of the disease, highly sensitive methods are required to achieve accurate variant calling.
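
To put these numbers into perspective, here is a rough back-of-the-envelope calculation of how few mutant fragments such an input actually contains, assuming roughly 3.3 pg of DNA per haploid human genome copy (a commonly used approximation):

```python
# Back-of-the-envelope: mutant fragments in a low-input cfDNA sample.
# Assumes ~3.3 pg of DNA per haploid human genome copy.
PG_PER_HAPLOID_GENOME = 3.3

for input_ng in (1, 10):
    genome_copies = input_ng * 1_000 / PG_PER_HAPLOID_GENOME   # haploid genome equivalents
    mutant_copies = genome_copies * 0.01                        # 1% variant allele fraction
    print(f"{input_ng:>2} ng cfDNA ≈ {genome_copies:,.0f} genome copies "
          f"-> only ~{mutant_copies:.0f} mutant fragments at 1% VAF")
```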

Besides proper plasma preparation, the library preparation step is crucial for successful ctDNA exome sequencing. A library preparation method is needed that works with low-input DNA and tolerates a broad range of DNA input amounts. The library preparation steps must be performed with extreme care in order to maximise yield and quality of the genomic material, and only high-fidelity enzymes should be used during the procedure.

The fixation process and the storage conditions of FFPE samples can cause substantial DNA damage. Genomic DNA derived from FFPE tissue is often partially degraded or available only in very limited quantity. Damaged DNA promotes jumping between templates and induces DNA polymerase errors during any PCR steps. There is often extensive variability in the amount and types of damage in DNA extracted from FFPE material. Inter- and intra-strand crosslinks, as well as accumulated strand breaks, are common damage events seen in FFPE-derived DNA. FFPE material also shows higher rates of C>T deamination artefacts, as well as high levels of other base substitutions.

If you want to perform exome sequencing of DNA extracted from FFPE samples, make sure you measure the final DNA concentrations with both a NanoDrop spectrophotometer and a Qubit fluorometer. Establish quality thresholds and do not continue the experiment if the DNA quality falls below these levels. Perform exome sequencing with a high sequencing depth in order to achieve accurate variant calling. Ideally, samples should be run in duplicate and a minimum-coverage cut-off per sample should be established prior to downstream data analysis.
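
As an illustration of such a QC gate, the sketch below compares absorbance-based and dye-based concentration readings against simple cut-offs. The threshold values are placeholders for illustration only, not validated recommendations:

```python
# Minimal sketch of a pre-sequencing QC gate for FFPE-derived DNA.
# All thresholds are illustrative placeholders, not validated recommendations.

def passes_qc(nanodrop_ng_per_ul, qubit_ng_per_ul, min_qubit=2.0, max_ratio=3.0):
    """Compare absorbance-based (NanoDrop) and dye-based (Qubit) measurements.

    A large NanoDrop/Qubit discrepancy often points to degraded DNA or to
    contaminants (e.g. RNA, single-stranded fragments) that inflate absorbance values.
    """
    if qubit_ng_per_ul < min_qubit:
        return False, "Qubit concentration below threshold"
    ratio = nanodrop_ng_per_ul / qubit_ng_per_ul
    if ratio > max_ratio:
        return False, f"NanoDrop/Qubit ratio of {ratio:.1f} suggests poor-quality DNA"
    return True, "OK"

print(passes_qc(nanodrop_ng_per_ul=30.0, qubit_ng_per_ul=12.0))  # (True, 'OK')
print(passes_qc(nanodrop_ng_per_ul=45.0, qubit_ng_per_ul=1.2))   # fails: low Qubit reading
```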

It would be really unfortunate to exist for nearly 4 billion years without anybody noticing your presence. But that's exactly what happened to one lonely molecule called deoxyribonucleic acid (DNA). The molecule duplicated, mutated and evolved without anyone giving it any thought. Even complex multicellular Homo sapiens carried on for thousands of years completely ignorant of their DNA, although each of their trillions of cells carries two metres of the genetic material.

DNA had to wait until 1869 for its first physical encounter with a human. The lucky man was Friedrich Miescher, a Swiss physician who successfully isolated DNA in the form of chromatin from pus-soaked hospital bandages.

Scientists at that time were not fully convinced DNA was worth getting excited about. Most still believed proteins were the carriers of genetic information. The false notion began to change in the 1940s, and in 1952 the matter was finally laid to rest with an elegantly simple experiment by Alfred Hershey and Martha Chase, which demonstrated once and for all that DNA is the genetic material.

Just one year later, in 1953, Francis Crick and James Watson, with a crystallography hint from Rosalind Franklin, introduced the world to the double-helical structure of DNA. With the structure now in the bag, scientists began their search for DNA function. The answer came from Marshall Nirenberg, who in 1961 showed that different combinations of DNA bases code for specific amino acids, the building blocks of proteins.

With the realisation that DNA was the blueprint for life came the curiosity to “read” the plan it held within. A great molecule to start sequencing with was RNA, as these nucleic acids are single-stranded and often considerably shorter than DNA. Indeed, in 1965, Robert Holley and his co-workers became the first people to read the bases of a nucleic acid when they sequenced yeast transfer RNA (tRNA) using RNases with base specificity. In 1970, Ray Wu was the first to decipher a short stretch of DNA by using a technique called primer extension. Two years later, Walter Fiers read the first complete sequence of a gene, the one encoding the bacteriophage MS2 coat protein. One year later, Walter Gilbert and Allan Maxam developed a way of sequencing DNA which used chemicals to cut DNA at certain bases. In 1975, Frederick Sanger introduced his first alternative method of DNA sequencing, called the “plus and minus” technique. The approach used polyacrylamide gels to separate products of primed synthesis in order of increasing chain length. In 1977, Sanger modified Ray Wu's primer extension technique to develop the chain-terminator method, also known as dideoxy sequencing or simply Sanger sequencing as we know it. The technique went on to dominate the sequencing world for the next 30 years.

Sanger used his newly developed method to sequence the first ever genome in 1977. The genome of the bacteriophage phiX174 went on to become the most popular DNA positive control in labs around the world. A few years later, in 1982, researchers documented the first cancer-associated DNA point mutation: a single base change in the HRAS gene that could affect the onset of bladder cancer by altering the structure of its protein product.

Meanwhile, improvements to the Sanger sequencing method were constantly being made. In 1984, Fritz Pohl developed the first non-radioactive sequencing technology platform, GATC1500. In 1986, Leroy Hood, in collaboration with Applied Biosystems, developed the first semi-automated DNA sequencing machine, where sequencing data could be collected directly by a computer. The following year, Applied Biosystems launched the first automated DNA sequencing machine, selling at $300,000 apiece. Nearly 10 years later, ABI would become the first commercial provider to use capillary electrophoresis rather than a slab gel, establishing truly automated DNA sequencing.

Meanwhile, in 1990, the ambitious Human Genome Project began at an astronomical cost of $75 per DNA base. In 1995, Haemophilus influenzae became the first bacterium to have its genome sequenced, using the “shotgun” sequencing technique. The larger and more complex genome of the yeast Saccharomyces cerevisiae followed in 1996.

1996 was not just the year of the yeast; it was also the year when next-generation sequencing (NGS) first came to be. It was during this year that Mostafa Ronaghi introduced a new DNA reading technique called pyrosequencing, based on a sequencing-by-synthesis method. Two years later, Shankar Balasubramanian and David Klenerman founded Solexa, the precursor to Illumina, and combined efforts to develop a new sequencing-by-synthesis technique based on fluorescent dyes. 1998 was also the year the first animal genome was successfully sequenced, that of the microscopic worm Caenorhabditis elegans. One year later, an international collaboration managed to publish the first human chromosome sequence, introducing the scientific community to chromosome 22.

The beginning of the 21st century was certainly an exciting time for DNA. Genomics success stories were pouring in from every corner of the world. In 2000, Arabidopsis thaliana became the first plant and Drosophila melanogaster the first insect to have their respective genomes sequenced. The first year of the new millennium also saw the much awaited first draft of the human genome sequence, a combined effort attributed to project leaders Francis Collins from the U.S. National Institutes of Health and Craig Venter, founder of Celera. In 2001, the draft human genome sequence, based on samples from 12 anonymous volunteers, was officially published. In 2002, the complete genome sequence of the mouse followed, showing roughly 90% similarity to the human genome. In 2003, the human genome sequence of around 3 billion base pairs was finalised, although a few gaps still exist to this day.

The next-generation sequencers were not sitting idly by during the human genome sequencing craze. In 2005, Jonathan Rothberg and colleagues used pyrosequencing to develop the 454 system, the first next-generation sequencing platform to come on the market. Meanwhile, Solexa researchers used their own sequencing-by-synthesis technique to read the whole genome of the bacteriophage phiX174. In 2007, Illumina took over Solexa in a $600 million buy-out, going on to provide the most widely used next-generation sequencing technology in the world. The same year, a new competitor to 454 and Illumina was released in the form of the SOLiD system, which was based on sequencing by ligation. A few years later, in 2011, Life Technologies released another competing sequencer, the Ion Torrent, which used a form of sequencing-by-synthesis based on the detection of hydrogen ions released whenever new DNA is made.

Next-generation sequencing was becoming more and more accepted in the scientific community. In 2008, the International Cancer Genome Consortium was launched with the goal of using NGS to analyse thousands of tumour samples and profile cancer-related mutations. This was a tremendous year for cancer research, as scientists also managed to decode the whole DNA sequence of a cancer for the first time. To achieve this, they used NGS to read the genetic code of leukaemia cells isolated from a 50-year-old patient. Also in 2008, James Watson became the first person to have his whole genome read using NGS.

In 2009, third-generation sequencing technology first came into the spotlight with the release of the Helicos sequencer. This technique made use of single-molecule fluorescent sequencing to read DNA, but the technology quickly fell out of favour due to high error rates. The approach was more successful in the hands of Pacific Biosciences, which launched its first single-molecule real-time sequencing platform in 2011.

The latest sequencing technology to hit the mainstream was nanopore sequencing, where DNA is passed through a tiny nanopore in a membrane. The order of bases is then determined based on changes in the electrical current across the pore. Oxford Nanopore Technologies became the first company to commercialise this new form of sequencing in 2012.

DNA sequencing is now as popular as ever with scientists reading its composition and using the information for a countless number of applications. Now that good quality sequencing data is becoming cheaper and easier to generate, it is highly probable that the need for bioinformatics analysis will grow. A truly multidisciplinary approach will be needed in order to interpret and make use of the vast amount of generated data. Nowhere is this truer than in the field of personalised medicine. For newly emerging applications like liquid biopsies for non-invasive cancer detection, DNA analysis holds the great promise of personalised, effective and painless care. 

Odds are it won’t take another four billion years to get there.

There was no shortage of enthusiasm from the three bosses interviewed for this blog post. The ladies readily put their busy schedules aside to make time for a lively conversation in honour of International Women’s Day. If after the first sentence you were surprised to find out the three bosses were female, then you, like most of us, need to read on about what it takes for a woman to climb up the biotech ladder. 

Julia Bottlang – The Lab Powerhouse on beating the odds 

Mrs. Bottlang found her groove during the set-up of a Sanger sequencing laboratory in London. The lab technician found herself willing to make decisions rather than constantly waiting for directions. This initiative did not go unnoticed. When Mrs. Bottlang applied for the position of Head of Pre-Sequencing NextGen Sequencing Lab, she beat out an all-male applicant pool to land the job. To get to where she is, she believes young girls need to be encouraged to pursue their interests fully so that they can grow into confident women who are smart enough to recognise an opportunity and self-confident enough to take it.  

Silvana Mamone – The Data Queen on the importance of the next generation 

Mrs. Mamone is always eager to learn something new. She feels her enthusiasm for new knowledge is what led to a string of promotions and ultimately her current post as Head of the Data Analysis and Processing Team at GATC Biotech. Her newest position was probably the most challenging for her, not least because she had just become a mother. Neither she nor her partner considered becoming stay-at-home parents. So she sharpened her negotiating skills and landed flexible working hours, the opportunity to work from home when her child is sick and enough free days to cover all kindergarten holidays. She acknowledges she has reached a great work-life balance, in part thanks to her employer's full support of employees with families. Whenever Mrs. Mamone is not analysing sequencing data, she is busy instilling self-confidence in her daughter as part of the next generation of powerful women.

Kerstin Stangier – The Sequencing Mastermind on bringing a female perspective to a man's world 

It might be difficult for a daughter to follow in the footsteps of an '80s feminist with a job, a family and a political agenda. But becoming one of two female directors in a booming biotech company is a pretty good start. Dr. Stangier proudly credits her mother for giving her the strength to go on and excel in male-dominated fields. From earning a graduate degree in chemistry to becoming Director Production at GATC Biotech, Dr. Stangier is not only used to voicing her opinion in front of male crowds, but to actually having it heard. She believes that being a woman is advantageous in business and scientific discussions, as she can often offer a different perspective and suggest more creative solutions than her male counterparts. Although she is happy with the improvements she sees in the workplace for women, she acknowledges that top jobs in companies are still almost exclusively filled by men. Her advice for women, and even men, aiming for these top jobs is to have full self-confidence in their education, experience, abilities and judgement. In her opinion, winning lies in self-esteem rather than the gender card, and with enough practice, any employee can deal this winning card to themselves.

There are several important factors that you need to consider before doing next-generation sequencing (NGS) on an Illumina platform. A well-planned experiment maximises the chance of a successful outcome, regardless of whether you perform the sequencing yourself or outsource it to a service provider.

  • Starting material – The starting point of your project is the choice of a proper DNA/RNA extraction method for your organism of interest. The lysis and homogenisation steps of the protocol should be tailored to the specific material in order to maximise yield and quality. The protocol should be performed by experienced users to avoid degradation of the nucleic acid due to missteps or delays. For some materials, such as plants, the removal of inhibitors is essential. In all cases, the quality of your DNA/RNA should be assessed using capillary gel electrophoresis or similar methods.
  • Number of samples – The number of samples very much depends on the aim of your sequencing experiment. If you want to check for the presence of a gene or SNP in a bacterial strain, one sample might be enough to answer the question. However, if you want to analyse effects on groups of samples, replicates should be included where possible to allow proper statistical analysis. In general, biological replicates are preferred over technical ones. Also consider including controls to validate your results.
  • Number of reads – Your project will require a minimum number of sequencing reads in order to generate reliable data. If you are sequencing amplicons, small RNAs or re-sequencing small genomes, as few as 5 million reads can be sufficient. For re-sequencing projects, the genome size needs to be considered, as it is directly connected to the desired sequencing depth/coverage; a short calculation relating genome size, coverage and read number follows this list. For larger eukaryotic genomes more sequencing reads are needed than for small prokaryotic genomes. Over-sequencing not only costs more money and time, it also complicates downstream data analysis.
  • Sequencing depth – The sequencing coverage, or the average number of times a single base is read during a run, is also important. The more frequently a base is sequenced, the more reliable the base call is likely to be. This parameter is also highly dependent on the application. For example, to reliably identify germline mutations, 30x coverage is usually sufficient. However, 100x or more is needed to detect somatic mutations in tumour samples.
  • Sequencing mode – Illumina offers two distinct types of sequencing: the single-read and the paired-end mode. Single-read runs sequence each DNA fragment from one end only; how much of the fragment is covered depends on the fragment length and the read length. The single-read mode is fast, cheap and can be beneficial for some RNA-Seq and ChIP-Seq experiments. In paired-end mode the fragment is read first from one end and, in a second read, from the opposite end, so that two paired reads are generated for each fragment. Although this sequencing mode is more expensive, it generates more data and makes the mapping to the genomic reference more reliable. Therefore, it is the preferred choice for applications like SNP analysis and genome assembly.
  • Multiplexing – During library preparation each sample is labelled with a sample-specific molecular tag called a barcode. This allows multiple samples to be processed in the same sequencing reaction and then separated again during BioIT analysis. Besides lowering costs, multiplexing allows for randomisation and can help minimise sequencing bias. In a perfect world, the experimental design would involve pooling all controls and experimental samples together and sequencing these on the same lane. If this cannot be achieved, samples should be randomised so that both cases and controls are processed in each batch. For low-complexity samples, such as amplicons and bisulphite-treated DNA, pooling with high-complexity samples can increase sequencing quantity and quality.
  • BioIT analysis – As sequencing costs are lower than ever before, the current bottleneck of NGS tends to be the bioinformatics analysis. Thinking in advance about how you will analyse your data can help ensure you have included all necessary controls. You can perform the data analysis yourself using free or commercially available software, but these programs often involve a steep learning curve. Alternatively, the BioIT analysis can be outsourced to an experienced provider.
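
As mentioned above, the required read number follows directly from genome size and target coverage. The sketch below shows this back-of-the-envelope calculation, assuming 2 × 150 bp paired-end reads and ignoring duplicates, trimming losses and uneven coverage:

```python
# Rough read-number estimate: reads ≈ genome size × target coverage / bases per read pair.
# Assumes 2 x 150 bp paired-end reads; real projects need a safety margin for
# duplicates, trimming and uneven coverage.

def read_pairs_needed(genome_size_bp, target_coverage, read_length_bp=150):
    bases_per_pair = 2 * read_length_bp
    return genome_size_bp * target_coverage / bases_per_pair

print(f"{read_pairs_needed(5e6, 100):,.0f} read pairs for a 5 Mb genome at 100x")
print(f"{read_pairs_needed(3.2e9, 30):,.0f} read pairs for a human genome at 30x")
```
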
The visitor team on a tour through GATC's laboratories in Cologne (from left to right: Dr Ronny Uhlig, Ivan Schlembach, Simone Schmitz, Desi Askitosari from the Institute of Applied Microbiology at RWTH Aachen University).

On a day like any other in late 2016, Mr. Ivan Schlembach logged in to his myGATC account and downloaded his Sanger sequencing results from his newest watchbox, the folder where Sanger data from GATC Biotech is delivered. The watchbox looked like any other with all the usual Sanger sequencing results, but this was no ordinary watchbox. Mr. Schlembach was in for a big surprise. He had just accessed the company’s 2,000,000th watchbox! 

To mark the occasion, the lucky PhD student in the lab of Dr. Rosenbaum at the Institute of Applied Microbiology at RWTH Aachen University received a tour of GATC Biotech’s Sanger sequencing facilities, as well as a customer appreciation gift this February. 

To join the lab tour, Mr. Schlembach had to tear himself away from his important work at the Institute. There, he is developing a consolidated bioprocess for direct platform chemical production from cellulose using mixed culture fermentation. He hopes that by obtaining new synthetic biofuels, he could make a real positive impact by counteracting the challenges associated with fossil fuel use. 

Sanger sequencing is essential for Mr. Schlembach’s work. He uses the results on a monthly basis for validation of cloned gene fragments, checking of qPCR targets and verification of the identity of his fungal strains, including Aspergillus terreus and Trichoderma reesei. Altogether, he and his labmates go through nearly 600 Sanger sequencing barcodes per year. 

Now that he has pocketed the coveted two millionth watchbox, Mr. Schlembach can look forward to many more Sanger sequencing reactions that propel him forward to the successful completion of his PhD. During this time, he would love to see next-generation sequencing (NGS) becoming faster, cheaper and comparable to Sanger sequencing. With affordable NGS packages starting at 5 million base pairs already available, it could be that his hopes come true before the three millionth watchbox is delivered.

When Richard Lumb lost his father to mesothelioma in 2009, he kept on waging the war against cancer for the sake of all others who still stood a chance against the disease. He saw genomics as a key weapon in the battle, but like many he recognised that genomics progress was not being translated quickly enough into clinical practice.

Richard Lumb saw the large information gap between academia, industry and healthcare as the main obstacle to genomics integration in clinical care. To try to close this gap, he established Frontline Genomics, a media organisation whose main focus is on spreading open-access information across all channels in conventional and unconventional ways. 

One of Lumb’s methods for engaging anyone and everyone interested in genomics was the launch of Festival of Genomics events in Europe and the U.S. The events’ laid-back, yet still informative nature has increasingly caught the attention not only of notables like George Church, Craig Venter and Ting Wu, but also of healthcare providers, academia and industry members, as well as patients and their families.

GATC Biotech is pleased to support the Festival of Genomics event held today and tomorrow in London, UK. The company hopes to contribute to Lumb’s initiative by offering its genomic expertise to all interested parties. GATC Biotech is especially proud of its extensive cancer product portfolio featuring analysis of tumour-associated mutations from both tissue and blood (liquid biopsy). Armed with proven whole genome sequencing and exome sequencing capabilities, GATC Biotech is excited to find collaborators who are interested in translating genomic breakthroughs into meaningful patient progress today rather than tomorrow. 

For more information on how our diagnostics solutions can help improve disease management, visit GATC Biotech at stand 44.

The start of the New Year is a time for new beginnings, like re-launching our blog. It is also the perfect time to make new predictions about what might happen in 2017. While industry experts have tried to foretell everything from election results to stock market trends, we thought we would go for the hottest topics closest to our hearts: those in genomics.

Without further ado, here is our sneak peek at the DNA trends that will likely steal the genomics spotlight for the next 365 days:

The microbiome movement

Coming off yet another breakthrough year in 2016, microbiome analysis will most certainly take centre stage again in 2017. Besides a flurry of research activity, expect viruses to nudge themselves next to bacteria in the research spotlight. Microbiota junkies should have a huge craving satisfied as new results from the US Human Microbiome Project are scheduled for this year. 

Nothing more crisp than CRISPR

This year promises countless publications that use the gene editing method, as well as more clinical trials using CRISPR in people. In the coming months, we should also see a ruling on the CRISPR-Cas9 patent dispute between the University of California, Berkeley and the Broad Institute in Cambridge, Massachusetts. As our genome editing power grows, more ethical discussions and possible restrictions will certainly follow.

Zapping Zika

Plenty of research work on the Zika virus and the transmitter mosquito species should be completed in 2017 with results expected to settle some previously conflicting reports. The first results from human clinical trials testing Zika vaccines should also roll in. Another development to watch out for: We will see if millions of genetically modified mosquitoes are released in the Florida Keys this year. The mosquitoes are designed to mate with native species, producing offspring that cannot survive, hence reducing mosquito populations and possibly the spread of some viral diseases. 

Upsizing synthetic genomes

In 2010, scientists successfully transplanted an artificially constructed Mycoplasma mycoides genome into a closely related bacterial cell. Since then, scientists have been working on an even more ambitious goal: building a complete synthetic yeast genome with all 16 chromosomes. Results from the “Yeast 2.0” project are expected this year.

A surge in pharmacogenomics

Decreasing sequencing costs and increasing interest in individual treatment options will likely lead to more studies on how genetic variation contributes to drug response. More companies are expected to engage in metabolism-oriented gene testing to monitor treatment response for diseases like HIV and cancer. The pharmacogenomics trend follows a general increase of genomics use in clinical decision-making. 

Jumping ship from invasive to noninvasive cancer detection

The validation of liquid biopsy, a blood-based cancer screening test, should continue full steam ahead in 2017. Numerous studies with large patient cohorts are underway to help implement liquid biopsies into the clinic. The test is expected to revolutionise cancer management by providing a painless, affordable and quick method to help select personalised therapies based on the tumour genomes of individual patients.