Reducing deep sequencing errors to find residual cancer cells


A new computational tool designed in the lab of Xiaotu Ma, PhD, identifies errors in deep sequencing analysis, helping to detect residual cancers and monitor treatment progress in detail.

Our lab has been dedicated to mapping the genomic landscape of childhood cancers to understand what drives the tumors and to improve treatments. We also have accumulated significant genomic insights in pediatric patients with relapsed disease. We want to explore advanced genetic analysis to detect cancers long before symptoms appear because early detection could lead to even better cure rates. For example, being able to detect a small number of cancer cells in a large population of normal cells will help clinicians develop treatment strategies to knock out residual tumor cells that remain in patients who have been treated. Clinicians currently use ultra-deep sequencing to identify the cancer cells hiding amid the millions of cells in tumor biopsies.

Challenge of deep sequencing errors

The major challenge in ultra-deep sequencing has been to eliminate errors in the sequencing process. Errors can be introduced through handling the specimen, but there are multiple other sources. Extracting the DNA, amplifying it with enzymes for sequencing, analyzing the results computationally and even running the DNA sequencing machine itself can all introduce errors.

Sequencers are like any other machine: performance varies from one instrument to another, even between machines of the same make and model. All sequencers rely on biochemical reactions in their analysis, and variability in those reactions can also introduce errors. Until now, no one has figured out how to precisely identify and measure such machine errors.

The path to SequencErr

In 2019, we published a mathematical approach for analyzing and suppressing errors in ultra-deep sequencing data. While the method greatly improved accuracy, it didn’t address sequencing machine errors.

SequencErr is a computational tool we designed to measure sequencer errors. It makes it possible to standardize sequencing machines for ultra-deep sequencing applications. In turn, this will help clinicians detect low-level residual cancer cells in patients, information they can use to formulate better treatments. Sequencing centers can use the tool to assess which of their machines are best suited for ultra-deep sequencing and to identify the sequencing chemicals that give the most accurate results.

Sequencer manufacturers can use SequencErr to improve machine accuracy. SequencErr will also help industries that use biological processes in manufacturing, from drug companies to beer brewers, achieve better quality control.

We published a report on SequencErr in the journal Genome Biology and have made the tool available for free to the academic research community.

How SequencErr works: Matching DNA strands reveal machine error

The principle behind SequencErr is simple. It’s based on the double-stranded DNA molecule. The two strands are complementary and fit together like puzzle pieces on a string. The DNA sequencing process entails sequencing both strands of each double-stranded segment: the “forward” strand and its complementary “reverse” strand.
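
To make the idea concrete, here is a minimal sketch of how forward and reverse reads of the same DNA segment could be compared. It is not the SequencErr implementation. The function names (reverse_complement, strand_discrepancies, discrepancy_rate) are hypothetical, and the sketch assumes that both reads cover the same segment end to end and that a disagreement between them points to an error introduced during sequencing rather than a real variant.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not the published SequencErr method.

# Map each base to its complementary base (N = unknown base).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in its own direction."""
    return seq.translate(COMPLEMENT)[::-1]


def strand_discrepancies(forward_read: str, reverse_read: str) -> list:
    """List positions where the two reads of one segment disagree.

    Assumes (for simplicity) that both reads span the whole segment.
    Each result is (position, base_from_forward, base_from_reverse).
    """
    reverse_as_forward = reverse_complement(reverse_read)
    if len(forward_read) != len(reverse_as_forward):
        raise ValueError("reads must have equal length in this simplified sketch")
    return [
        (i, f, r)
        for i, (f, r) in enumerate(zip(forward_read, reverse_as_forward))
        if f != r and "N" not in (f, r)  # skip bases the machine could not call
    ]


def discrepancy_rate(read_pairs) -> float:
    """Fraction of compared bases that disagree across many read pairs."""
    compared = 0
    mismatched = 0
    for forward_read, reverse_read in read_pairs:
        mismatched += len(strand_discrepancies(forward_read, reverse_read))
        compared += len(forward_read)
    return mismatched / compared if compared else 0.0
```

For example, strand_discrepancies("ACGTAC", "GTACGT") finds no disagreement because the second read is the exact reverse complement of the first, while strand_discrepancies("ACGTAC", "GTACGA") flags position 0. Under the stated assumptions, a higher discrepancy_rate on one machine than on another would suggest that the first machine is less suited to ultra-deep sequencing.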

Deep sequencing for better cancer treatment

Today at St. Jude we routinely do DNA sequence analysis as part of patients’ cancer diagnoses. Now, we can look forward to using accurate ultra-deep sequencing to monitor treatment progress in unprecedented detail. Our clinicians will be able to see exactly how well the treatment is progressing and make adjustments to stop relapses in their earliest stages.


About the Author

Xiaotu Ma

Xiaotu Ma, PhD, is an assistant faculty member in the Computational Biology Department.