Sponsored content brought to you by
In cancer genomics, researchers are “really missing a lot of things by not looking for them in the right way,” says Shruti Iyer, a PhD candidate in the Genetics Program at Stony Brook University. In a recent talk in New York*, Iyer, who is pursuing her doctoral research in Dr. W. Richard McCombie’s lab at Cold Spring Harbor Laboratory, described how her lab is using targeted long-read nanopore sequencing to study previously undetected structural variants (SVs) with sensitivity and specificity of over 95%.
SVs are DNA variants spanning more than 50 base pairs (bp), including insertions, deletions, duplications and translocations. Because of their size, SVs can be very disruptive. Among their achievements, the McCombie group has identified several thousand variants in the HER-2-amplified breast cancer cell line SK-BR-3 using long-read DNA and RNA sequencing in addition to short reads. This study, Iyer says, “completely highlighted the deficiency of short reads,” prompting her to look at developing targeted long-read enrichment approaches using CRISPR-Cas9.
The CRISPR-Cas9 method of PCR-free, long-read target enrichment begins with dephosphorylation of the ends of the sample DNA, which prevents sequencing adapters from ligating to them. The CRISPR-Cas9 machinery (including the custom guide RNA) is then added. The Cas9 nuclease is guided to the sites flanking the target loci, where it induces double-stranded cuts in the DNA, excising the region. Because these newly cut ends are phosphorylated, sequencing adapters can be ligated to them, enabling sequencing of the target regions.
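The logic of these steps can be illustrated with a toy model. The sketch below is not the lab's protocol or software: the `Fragment` class, the function names and the coordinates are all hypothetical, and it assumes idealized, complete dephosphorylation and cutting. It is only meant to show why dephosphorylating first leaves the Cas9-excised region as the only fragment with two ligatable ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A DNA fragment in a toy model (hypothetical, for illustration only)."""
    start: int        # genomic start coordinate (bp)
    end: int          # genomic end coordinate (bp), exclusive
    left_phos: bool   # True if the left end carries a phosphate
    right_phos: bool  # True if the right end carries a phosphate


def dephosphorylate(fragments):
    """Remove phosphates from every existing end, blocking adapter ligation."""
    return [Fragment(f.start, f.end, False, False) for f in fragments]


def cas9_cut(fragments, cut_sites):
    """Split fragments at each cut site; every newly cut end is phosphorylated."""
    pieces = []
    for f in fragments:
        inside = sorted(s for s in cut_sites if f.start < s < f.end)
        bounds = [f.start, *inside, f.end]
        last = len(bounds) - 2
        for i in range(len(bounds) - 1):
            # Original outer ends keep their state; internal ends are new cuts.
            left = f.left_phos if i == 0 else True
            right = f.right_phos if i == last else True
            pieces.append(Fragment(bounds[i], bounds[i + 1], left, right))
    return pieces


def fully_excised(fragments):
    """Fragments whose two ends can both accept a sequencing adapter."""
    return [f for f in fragments if f.left_phos and f.right_phos]


# Hypothetical sample: two sheared fragments with phosphorylated ends.
sample = [Fragment(0, 1_000, True, True), Fragment(1_000, 500_000, True, True)]
# Hypothetical guide positions flanking a 50-kb target.
cuts = [100_000, 150_000]

library = cas9_cut(dephosphorylate(sample), cuts)
targets = fully_excised(library)
print([(f.start, f.end, f.end - f.start) for f in targets])
```

In this model, skipping `dephosphorylate` leaves every fragment, background included, with two ligatable ends, which is the situation the first step of the protocol is designed to avoid.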
The process has the advantage of dispensing with PCR and can be used to enrich for very large regions. In an early test, Iyer enriched the BRCA1 gene in the SK-BR-3 cell line. At the time, she says, groups using CRISPR-Cas9 enrichment hadn’t gone beyond targeting and sequencing regions of 5–10 kilobases (kb). Iyer wanted to push the process, aiming for 200 kb.
The results speak for themselves. Iyer’s team successfully captured and sequenced BRCA1 end-to-end in a single, ultra-long nanopore read of 198 kb. As far as she is aware, this stands as the record for the longest read generated with CRISPR-Cas9 enrichment. All of her group’s long-read experiments were carried out using Oxford Nanopore’s FLO-MIN106 flow cells on the GridION instrument.
Going Deep
Iyer was also interested in improving the depth of coverage in regions of interest. Pooled libraries helped but did not fully address the problem, so Iyer decided to focus on the “background” problem. Was the background DNA competing with, and inhibiting the sequencing of, the ultra-long fragments? To tackle this, she used the Circulomics Short Read Eliminator (SRE) Kit on the CRISPR-Cas9-enriched libraries, prior to preparation for sequencing. In an enriched sample of MCF 10A prepared using this method, one 142-kb read was observed, further bridging the target.
The team’s next step was to enrich multiple targets, some of which were below the length cut-off of the SRE Kit, meaning that they would be removed by the process. To enable the preservation of this enriched DNA whilst effectively removing the background DNA, Iyer developed ACME (Affinity-based Cas9-Mediated Enrichment). This method makes use of the histidine tag present on the Cas9 enzyme, which is used to purify the protein during its production, enabling the capture of Cas9-bound regions on Dynabeads. Iyer showed how libraries prepared using ACME performed better in terms of depth of coverage, but reads still did not fully span very large targets.
She then designed a cancer gene panel, targeting multiple genes of different sizes to test the upper limit of the enrichment method; the genes selected were those where SVs had been found in whole genome data. With more targets, more DNA was pulled out using the ACME process. Iyer displayed alignments showing end-to-end coverage of the 90-kb region targeting the BRCA2 gene in both cell lines, with depth of coverage much improved by ACME. For SK-BR-3, 99-fold enrichment, to a depth of coverage of 100x, was achieved with ACME.
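Depth of coverage and fold enrichment are related by simple arithmetic. As a minimal sketch, assuming fold enrichment is defined as mean on-target depth divided by mean depth over a background window (one common convention, not necessarily the one used in the talk), the calculation could look like this. The function names, the choice of background window and the toy read coordinates are hypothetical and are not data from the study.

```python
def mean_depth(read_intervals, region_start, region_end):
    """Mean depth over [region_start, region_end): aligned bases / region length."""
    region_len = region_end - region_start
    if region_len <= 0:
        raise ValueError("region must have positive length")
    aligned_bases = 0
    for start, end in read_intervals:
        # Count only the part of each read that overlaps the region.
        overlap = min(end, region_end) - max(start, region_start)
        if overlap > 0:
            aligned_bases += overlap
    return aligned_bases / region_len


def fold_enrichment(on_target_depth, background_depth):
    """Ratio of on-target depth to background depth (assumed definition)."""
    if background_depth <= 0:
        raise ValueError("background depth must be positive")
    return on_target_depth / background_depth


# Toy alignments as (start, end) pairs in bp; purely illustrative.
reads = [(900, 2_100), (1_000, 2_000), (1_500, 2_500), (0, 1_200)]

on_target = mean_depth(reads, 1_000, 2_000)     # hypothetical 1-kb target
background = mean_depth(reads, 2_000, 12_000)   # hypothetical flanking window
print(on_target, background, fold_enrichment(on_target, background))
```

A real analysis would normally compute the background over the whole off-target genome from an alignment file; the flanking window is used here only to keep the example self-contained.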
Analysis of the different target lengths—spanning tens of kilobases—determined that good end-to-end enrichment is seen with the method up to ~100 kb. As there isn’t always enough sample to enable multiple preps and pooling prior to sequencing, Iyer tested ACME using single-prep libraries, which showed improved depth of coverage over the non-ACME libraries.
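"End-to-end" here means that a single read covers a target from one boundary to the other. A minimal sketch of that check follows, assuming alignments are available as (start, end) coordinate pairs. The function name, the `tolerance` parameter and the example coordinates are hypothetical and only illustrate the idea.

```python
def spanning_reads(read_intervals, target_start, target_end, tolerance=0):
    """Return reads that cover the target from one end to the other.

    tolerance is the number of bp a read may fall short at each boundary
    and still be counted as spanning (an assumed, adjustable allowance).
    """
    return [
        (start, end)
        for start, end in read_intervals
        if start <= target_start + tolerance and end >= target_end - tolerance
    ]


# Hypothetical 90-kb target and toy alignments, in bp.
target = (10_000, 100_000)
reads = [
    (9_500, 100_200),   # covers the whole target
    (10_000, 60_000),   # stops partway through
    (50_000, 120_000),  # starts partway through
    (10_050, 99_980),   # falls a few bp short at both boundaries
]

strict = spanning_reads(reads, *target)
relaxed = spanning_reads(reads, *target, tolerance=100)
print(len(strict), len(relaxed))
```

Counting such reads per target, across targets of different lengths, is one simple way to see where end-to-end enrichment starts to drop off.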
Fixing Blind Spots
This advance is important, because while cancer genomics is rapidly advancing, SVs are still often “blind spots,” Iyer says. To date, many thousands of tumors have been sequenced via next-generation sequencing, enabling the discovery of different signatures and mutation rates across dozens of tumor types, while revealing insights into clonal structure and tumor evolution. Several hundred genes have been identified that can cause cancer if mutated. Those genetic alterations can vary from single point mutations to larger SVs, which can affect one or multiple genes.
Iyer’s team is now focusing its efforts on targeted, long-read sequencing to achieve the depth needed to identify rare variants. The use of targeting strategies allows higher-throughput sequencing of key targets, improving depth of coverage and allowing for the detection of rare alleles, as targeting avoids having to sequence an entire genome to generate sufficient depth of coverage. However, until about two years ago, Iyer says, there wasn’t an effective method of long-read target enrichment.
In the future, Iyer and her team plan to apply this long-read method to the second version of their cancer panel, encompassing BRCA1, ERBB2, APAF1 and other Catalogue of Somatic Mutations in Cancer (COSMIC) genes with evidence of SVs. She also intends to compare the performance of SV detection between targeted and whole genome strategies, among other projects.
*Nanopore Community Meeting, hosted by Oxford Nanopore Technologies; New York: December 5–6, 2019.
Watch Shruti’s full talk at nanoporetech.com/shrutiiyer