We deploy optimized cloud implementations of a selected suite of tools for long-read NGS data processing. You can access these tools individually through our platform, but we also build customized workflows.
These are tools for consensus calling and polishing, including a cloud HPC implementation of the Circular Consensus Sequencing (CCS) tool to produce Highly Accurate Single-Molecule Consensus Reads (HiFi Reads). A ~400Gb BAM file of raw subreads can be processed in 4 hours, and a ~600Gb file in 5 hours.
Genome mapping and assembly
We currently run minimap2 and NGMLR for mapping, and Canu and RaGOO for genome assembly. These tools are deployed on a cluster configuration of EC2 instances that minimizes computing cost and processing time.
A number of tools are available for calling single-nucleotide variants, small indels, and large structural variants. We currently run Assemblytics, Sniffles, and a suite of our own haplotype-calling tools built around our own algorithms, which can produce Sanger-equivalent accuracy from either CCS reads or raw uncorrected subreads. Which of the two input datasets is more convenient depends on the situation.
Metagenomics is challenging because of the coexistence of multiple genetic variants. PacBio's HiFi approach has helped considerably in deconvoluting the genetic composition of these mixtures, but challenges remain for low-frequency variants and the reconstruction of long haplotypes. Long-read metagenomics can be of great use for identifying new biosynthetic gene clusters, and we are actively working on new solutions.
The sequencing and data analysis costs of long-read approaches become virtually prohibitive at large scale. We have developed new approaches for highly accurate variant calling from low-coverage data. We are piloting this on a project seeking to uncover hidden correlates of disease in patients with different clinical conditions.
Our services go all the way from sample preparation to long-read sequencing and bioinformatics. We can help you with any of these three fundamental steps separately, or provide end-to-end solutions for specific applications. We are obsessive about sample preparation, and particularly skilled at HPC in the cloud.
Long-read DNA Sequencing
Description of all the possible types of sample preparation that we can offer
Description of all the possible types of DNA sequencing that we can offer
Description of all the possible types of bioinformatics solutions that we can offer