Blog - Sentieon

Sentieon 2026Q2 Update Datasheet

June 19, 2026

While traditional linear reference genomes introduce bias and variant-calling errors, pangenome approaches resolve these issues by utilizing graph-based population haplotypes. The recently launched Sentieon pangenome pipeline brings this advanced analysis to generic hardware with a highly efficient FASTQ-to-VCF workflow.

This quarter, we have significantly expanded the pipeline’s capabilities:

Broader Reference Support: The pipeline now natively supports CHM13 as reference in addition to GRCh38. Performance on NIST truth v5.0 has also been improved. We will also support GRCh37 in the future.
Improved Rare Variant Sensitivity: The updated pipeline utilizes population information generated directly from the full pangenome graph. This change enhances rare variant sensitivity while generalizing the pipeline for non-human pangenomes.
Ultima Solaris 2.0 Optimization: In close collaboration with the Ultima Genomics team, we updated our models to fully support their newly released Solaris 2.0 chemistry, achieving industry-leading variant-calling accuracy.
Runtime & Cost Efficiency: The pipeline processes a 30x WGS dataset from FASTQ to comprehensive variant calls (SNP/Indel/SV/CNV) in just 82 minutes. Running on a 64-thread AWS c8i.16xlarge instance, the on-demand compute cost is approx. $4.10.
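
The cost figure above is simple arithmetic on runtime and hourly price. The sketch below back-calculates the hourly rate implied by the quoted 82 minutes and approx. $4.10. The function names are illustrative, and the derived rate is an inference from those two numbers rather than a quoted AWS price, which varies by region and over time.

```python
def on_demand_cost(runtime_minutes: float, hourly_rate_usd: float) -> float:
    """Return the on-demand compute cost in USD for a single run."""
    if runtime_minutes < 0 or hourly_rate_usd < 0:
        raise ValueError("runtime and rate must be non-negative")
    return runtime_minutes / 60.0 * hourly_rate_usd


def implied_hourly_rate(runtime_minutes: float, cost_usd: float) -> float:
    """Back-calculate the hourly rate implied by a runtime and a total cost."""
    if runtime_minutes <= 0:
        raise ValueError("runtime must be positive")
    return cost_usd / (runtime_minutes / 60.0)


# Figures quoted in the datasheet: 82 minutes, approx. $4.10 per sample.
rate = implied_hourly_rate(82, 4.10)
print(f"implied hourly rate: ${rate:.2f}/h")

# Assumption for illustration: the same rate applied to a 100-sample batch.
print(f"100 samples at that rate: ${100 * on_demand_cost(82, rate):.2f}")
```

Substituting the current on-demand price for your region into `on_demand_cost` gives a per-sample estimate for your own account.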

Section 1: Elevating Germline SNP/Indel Accuracy Across Standard v4.2.1 Benchmarks

Sentieon DNAscope Pangenome brings down the WGS error count to just 5,605—representing a 3.8x reduction compared to the linear DNAscope pipeline, while outperforming DRAGEN v4.5.
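
To put the 3.8x figure in absolute terms, the sketch below back-calculates the range of linear-pipeline error counts consistent with it. It assumes the reported fold reduction is rounded to one decimal place. The function names are illustrative, and the exact baseline count is the one plotted in Figure 1, not the estimate produced here.

```python
from typing import Tuple


def fold_reduction(baseline_errors: float, new_errors: float) -> float:
    """Return how many times fewer errors the new pipeline makes."""
    if new_errors <= 0:
        raise ValueError("new_errors must be positive")
    return baseline_errors / new_errors


def implied_baseline_range(
    new_errors: float, reported_fold: float, decimals: int = 1
) -> Tuple[float, float]:
    """Range of baseline error counts that would round to `reported_fold`."""
    half_step = 0.5 * 10 ** (-decimals)
    return (
        new_errors * (reported_fold - half_step),
        new_errors * (reported_fold + half_step),
    )


# Figures quoted above: 5,605 errors, a 3.8x reduction.
low, high = implied_baseline_range(5605, 3.8)
print(f"implied linear DNAscope errors: roughly {low:,.0f} to {high:,.0f}")
```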

Figure 1. HG002 35x WGS (Illumina platform) error counts evaluated against NIST v4.2.1 truth data. DNAseq (Sentieon’s GATK reimplementation) and DNAscope both utilize the linear reference genome (Sentieon pipeline version 202503.03). DRAGEN performance figures are obtained from the v4.5 release update.

Section 2: Updated Model Training to Incorporate the GIAB v5.0 benchmark and the CHM13 Reference

The new model is trained using both the GIAB v4.2.1 and the newly released GIAB v5.0 (Q100-T2T) benchmarks. The v5.0 benchmark introduces additional training data on difficult repeat sequences that are uniquely included in its truth VCF, improving the performance of our pangenome model on these difficult regions. Additionally, integrating the CHM13 reference into our training has broadened the pipeline’s capabilities, showing that the workflow can adapt to multiple reference genomes.

Figure 2. HG002 35x WGS (Illumina platform) error counts evaluated against NIST v5.0q benchmark with GRCh38 and CHM13 references. DNAseq (Sentieon’s GATK reimplementation) and DNAscope both utilize the linear reference genome (Sentieon pipeline version 202503.03). DRAGEN performance figures are obtained from the v4.5 release update.

Section 3: Maximizing Detection Sensitivity for Rare Variants

The DNAscope Pangenome pipeline, leveraging the HPRC pangenome graph, maintains high sensitivity for rare variants that are absent from the pangenome. We benchmarked the recall of rare variants (defined as sites excluded from the HPRC graph but cataloged in either dbSNP or gnomAD) across DNAscope Pangenome, DNAseq, and linear DNAscope using the v4.2.1 benchmark (HG003). The results confirm that DNAscope Pangenome delivers high recall across all variant sites, even those not included in the pangenome graph.

Figure 3. Detection Sensitivity of Rare Variants on 35x Illumina WGS (HG003). Recall is measured using GIAB v4.2.1 variant sites that are absent from the HPRCv2 pangenome and present in either gnomAD or dbSNP.
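
The rare-variant definition used in this section can be expressed as set operations. The following is a minimal sketch, assuming each variant is reduced to an exact (chromosome, position, ref, alt) key; the function names and the toy data are hypothetical. Real benchmarking typically relies on haplotype-aware comparison tools such as hap.py or RTG vcfeval rather than exact key matching, so this illustrates the subset logic only, not the method used to produce Figure 3.

```python
from typing import Set, Tuple

# A variant keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def rare_truth_sites(
    truth: Set[Variant],
    pangenome: Set[Variant],
    dbsnp: Set[Variant],
    gnomad: Set[Variant],
) -> Set[Variant]:
    """Truth sites absent from the pangenome but cataloged in dbSNP or gnomAD."""
    cataloged = dbsnp | gnomad
    return (truth - pangenome) & cataloged


def recall(called: Set[Variant], truth_subset: Set[Variant]) -> float:
    """Fraction of the truth subset that the caller recovered."""
    if not truth_subset:
        raise ValueError("truth subset is empty")
    return len(called & truth_subset) / len(truth_subset)


# Toy data, invented for illustration only.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 300, "G", "A"),
    ("chr2", 400, "T", "C"),
}
pangenome = {("chr1", 100, "A", "G")}
dbsnp = {("chr1", 200, "C", "T"), ("chr2", 300, "G", "A")}
gnomad = {("chr2", 300, "G", "A")}
called = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")}

rare = rare_truth_sites(truth, pangenome, dbsnp, gnomad)
print(f"rare truth sites: {len(rare)}, recall: {recall(called, rare):.2f}")
```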

Section 4: Delivering Industry-Leading Accuracy for Ultima Genomics WGS Analysis

The Sentieon pipeline supports most mainstream short- and long-read sequencing platforms. Our updated Ultima pipeline, retrained specifically for the Solaris 2.0 chemistry, outperformed DeepVariant’s accuracy on the same dataset released at AGBT 2026, establishing what is currently considered the industry-leading benchmark.

Figure 4. HG002 WGS error counts from the Ultima 2026 AGBT reference dataset using the GIAB v4.2.1 and v5.0q benchmarks. Sentieon pipelines utilize version 202503.03; DeepVariant accuracy was derived directly from the released 2026 AGBT VCF files.

Section 5: Comprehensive Structural and Copy Number Variant (SV/CNV) Detection

In addition to small variants, the DNAscope Pangenome pipeline excels at detecting structural variants (SVs) and copy number variants (CNVs). Benefiting from improved read alignment around breakpoints, the recall of SVs and small-sized CNVs shows a dramatic increase compared to traditional linear genome analysis.

Figure 5. Comparative accuracy of benchmarked pipelines across NIST v5.0q. DNAscope Pangenome and DNAscope LongRead achieved the highest F1-scores, driven mostly by their higher recall in these challenging benchmarks. Long-read accuracies (ONT) are included for reference.

Figure 6. Comparative accuracy of benchmarked pipelines across the whole-genome CNV benchmark. CNVscope Pangenome significantly outperforms CNVnator; the improvement is largest for copy number gains and small events, effectively pushing the reliable detection limit down to 500 bp.
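
The F1-scores cited for Figure 5 combine precision and recall, which is why a recall gain at similar precision lifts F1. The sketch below computes the three metrics from true-positive, false-positive, and false-negative counts. The counts are made up for illustration and are not taken from any benchmark in this datasheet.

```python
from typing import NamedTuple


class Accuracy(NamedTuple):
    precision: float
    recall: float
    f1: float


def accuracy_from_counts(tp: int, fp: int, fn: int) -> Accuracy:
    """Compute precision, recall, and F1 from benchmark counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Accuracy(precision, recall, f1)


# Two hypothetical callers with the same precision but different recall.
low_recall = accuracy_from_counts(tp=600, fp=100, fn=400)
high_recall = accuracy_from_counts(tp=900, fp=150, fn=100)

for name, acc in (("low recall", low_recall), ("high recall", high_recall)):
    print(f"{name}: P={acc.precision:.3f} R={acc.recall:.3f} F1={acc.f1:.3f}")
```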

Sentieon DNAscope vs. GATK: Best Variant Calling Alternatives for…

March 5, 2026

Key Takeaways

GATK remains a trusted standard, but at scale it exposes efficiency limits: As sequencing volumes grow, traditional GATK-based variant calling pipelines can lead to longer runtimes, higher compute costs, and operational bottlenecks, prompting labs to evaluate more scalable alternatives without compromising analytical confidence.
Sentieon DNAscope balances clinical-grade accuracy with practical performance: By combining machine-learning–based variant calling, deterministic results, and PrecisionFDA-validated consistency, DNAscope delivers accuracy better than DeepVariant in benchmarked evaluations while running efficiently on standard CPU infrastructure.
Software optimization can rival hardware acceleration for high-throughput labs: With accelerated alignment (BWA-MEM compatible), fast short- and structural-variant calling, and seamless CLI full-pipeline integration, Sentieon offers runtime comparable to FPGA- and GPU-based solutions for many production-scale WGS workloads, while maintaining flexibility, predictable costs, and easier adoption.

The Standard Bottleneck: Why Labs are Moving Beyond GATK Best Practices

For many years, GATK has been the reference framework for variant calling in genomics.

It is well documented, widely validated, and deeply embedded in research and clinical workflows. The GATK Best Practices pipeline has helped establish consistency across laboratories and has played a major role in advancing secondary analysis standards.

As sequencing capacity has grown, however, many labs are finding that what once worked well at smaller scales becomes harder to sustain in high-throughput environments.

When processing large volumes of whole genome sequencing (WGS) or whole exome sequencing (WES) data, GATK pipelines can introduce operational friction:

Long turnaround times per sample
High CPU and memory usage
Limited parallel efficiency in some pipeline stages
Increased infrastructure and cloud computing costs

For example, running a standard 30× WGS sample through a traditional GATK-based variant calling pipeline can take one to two days on conventional CPU infrastructure. At the population scale or in clinical settings with tight reporting timelines, this latency can affect throughput, cost control, and service levels.

Importantly, this shift is not about questioning GATK’s scientific foundation. Many labs still trust GATK for accuracy. The challenge is practical: how to maintain that level of confidence while meeting modern expectations for speed, consistency, and scalability.

This is where newer alternatives, designed specifically for production-scale genomics, are gaining attention.

Sentieon DNAscope: Clinical-Grade Accuracy on Standard CPU Infrastructure

Sentieon DNAscope was developed to address the performance limitations of traditional pipelines while preserving the rigor expected in clinical and regulated environments.

Rather than layering optimizations on top of existing tools, Sentieon rebuilt key algorithms with a focus on computational efficiency and reproducibility. DNAscope incorporates machine-learning–based variant calling, enabling it to achieve higher accuracy than DeepVariant without relying on GPU acceleration.

Several characteristics distinguish DNAscope in practice:

Proven performance in PrecisionFDA challenges, including accuracy benchmarks and the Consistency Challenge for variant calling.
Deterministic results, meaning repeated runs on the same data produce identical outputs
Compatibility with existing GATK workflows and formats
Deployment on the standard x86 and ARM CPUs commonly used in labs today

The PrecisionFDA Consistency Challenge is particularly relevant for clinical users. Consistency across runs is essential for compliance, validation, and long-term confidence in results. DNAscope’s ability to deliver stable outputs across repeated analyses addresses a common concern with pipelines that rely on stochastic processes or downsampling.
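
One way a lab might verify run-to-run determinism on its own data is to compare checksums of the VCFs produced by two runs on identical input. The following is a minimal sketch, not a Sentieon utility. It assumes that `##` meta-information lines may carry run-specific details (dates, command lines) and should be excluded from the comparison; the function names are hypothetical.

```python
import gzip
import hashlib
import sys


def vcf_body_digest(path: str) -> str:
    """SHA-256 of a VCF, skipping '##' meta lines (assumed run-specific)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    digest = hashlib.sha256()
    with opener(path, "rb") as handle:
        for line in handle:
            # Keep the '#CHROM' column header and all variant records.
            if not line.startswith(b"##"):
                digest.update(line)
    return digest.hexdigest()


def runs_are_identical(vcf_a: str, vcf_b: str) -> bool:
    """True when two VCFs have byte-identical column headers and records."""
    return vcf_body_digest(vcf_a) == vcf_body_digest(vcf_b)


if __name__ == "__main__" and len(sys.argv) == 3:
    same = runs_are_identical(sys.argv[1], sys.argv[2])
    print("identical" if same else "DIFFERENT")
```

A byte-level comparison is deliberately strict: any difference in record order, genotype, or annotation value counts as a mismatch, which is the property a consistency check is meant to test.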

From a variant quality perspective, in benchmarked datasets, DNAscope’s ML models help reduce false positives in complex genomic regions such as:

Low-complexity sequences
Homopolymer stretches
Segmental duplications

These regions often account for a significant portion of downstream manual review. Improving signal quality upstream can reduce analyst workload and shorten reporting timelines.

All of this is achieved on standard CPU hardware, lowering the barrier to adoption for labs that prefer not to invest in specialized compute accelerators.

Benchmarking Speed: Sentieon Compared With Illumina Dragen and NVIDIA Parabricks

Performance comparisons are often where labs focus their evaluation, particularly when throughput and cost efficiency are top priorities.

Today, three common approaches are used to accelerate secondary analysis:

FPGA-based acceleration (Illumina Dragen)

Illumina Dragen uses FPGA hardware to deliver high-speed alignment and variant calling. In many cases, 30-35x WGS analysis can be completed in under two hours.

Trade-offs to consider include:

High upfront hardware costs
Vendor-specific infrastructure
Less flexibility outside the Dragen ecosystem

GPU-based acceleration (NVIDIA Parabricks)

Parabricks leverages GPUs to accelerate alignment and variant calling, often achieving performance comparable to FPGA systems.

Key considerations include:

Dependence on high-end GPUs
Higher infrastructure and operational costs
Additional complexity in managing GPU workloads

CPU-optimized acceleration (Sentieon DNAscope)

Sentieon takes a different approach by focusing on software optimization rather than specialized hardware.

Across independent benchmarks and real-world deployments, Sentieon commonly demonstrates:

Alignment roughly 3–5× faster than standard BWA-MEM implementations
Variant calling often ~10–20× faster than traditional GATK-based pipelines in benchmarked workflows
End-to-end WGS runtimes comparable to Dragen and Parabricks

In practice, a 30× WGS sample can often be processed in around one hour on a modern 64-core CPU server, depending on the pipeline configuration.

For many labs, this shifts the cost equation. Instead of investing in dedicated FPGA or GPU systems, teams can:

Reuse existing CPU infrastructure
Scale horizontally with predictable costs
Avoid hardware-specific lock-in

From a return-on-investment perspective, Sentieon allows performance optimization through software, which is often easier to budget and scale than specialized hardware deployments.
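
As a back-of-the-envelope illustration of how per-sample runtime translates into capacity, the sketch below converts a runtime into genomes per server per week. It assumes samples run back to back, one at a time per server, and ignores queueing, data transfer, and failures; the `utilization` parameter and the function name are illustrative, and the runtimes are the approximate figures quoted in this article.

```python
def genomes_per_week(
    runtime_hours: float, servers: int = 1, utilization: float = 1.0
) -> float:
    """Estimate weekly 30x WGS throughput for a pool of identical servers."""
    if runtime_hours <= 0:
        raise ValueError("runtime_hours must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours_per_week = 7 * 24
    return servers * utilization * hours_per_week / runtime_hours


# Approximate per-sample runtimes quoted in the text.
for label, hours in (
    ("~1 hour per sample", 1.0),
    ("1 day per sample", 24.0),
    ("2 days per sample", 48.0),
):
    print(f"{label}: {genomes_per_week(hours):.1f} genomes/server/week")
```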

From BWA-MEM to Accelerated Mapping: Improving Efficiency at the Front End

Alignment remains one of the most resource-intensive steps in any sequencing workflow. BWA-MEM has long been the standard aligner due to its balance of speed and accuracy, and it remains widely trusted across the industry.

However, BWA-MEM was developed when sequencing throughput and core counts were much lower than they are today.

Sentieon addresses this by offering an accelerated implementation of BWA-MEM that preserves algorithmic behavior while improving execution efficiency.

Key points of trust for labs:

Output is algorithmically equivalent to standard BWA-MEM
No changes to seeding, scoring, or alignment logic
Compatible with existing downstream tools and validations

Performance gains come from implementation-level improvements, including:

Better multi-threading across high-core CPUs
More efficient memory access patterns
Reduced I/O overhead

In real workflows, this often results in 3–5× faster alignment, depending on the hardware configuration. For labs processing dozens or hundreds of genomes per week, faster alignment significantly shortens overall pipeline runtime and improves resource utilization.

Because outputs remain consistent with BWA-MEM, adoption does not require re-establishing scientific trust in a new aligner—an important factor for regulated or clinically validated pipelines.

Seamless Integration: Command-Line Compatibility

Operational friction is a common barrier when evaluating new tools. Many labs have invested heavily in workflow automation, monitoring, and staff training around existing pipelines.

Sentieon reduces this friction by supporting GATK-style syntax and offering a Dragen-like command-line interface for common workflows.

For teams currently using Dragen, this compatibility enables a smoother transition:

Many existing scripts require only minimal changes
Output formats remain consistent for downstream analysis
QC and logging workflows can be preserved

For labs migrating from GATK, Sentieon accepts familiar parameters and integrates cleanly with workflow managers such as Nextflow, Snakemake, and WDL.

The result is faster evaluation and adoption, with lower engineering overhead. Teams can focus on performance and accuracy outcomes rather than re-engineering their pipelines.

Top GATK Alternatives for Production-Scale Genomics (2026 Comparison)

Feature	Sentieon DNAscope	GATK	Dragen	Parabricks
Accuracy	PrecisionFDA-validated, Best in industry	Widely trusted standard	High, proprietary	High, DeepVariant-based
Speed (30× WGS)	~1 hour on CPU	24–48 hours	~90 minutes	~2 hours
Hardware Requirements	Standard CPUs	Standard CPUs	FPGA servers	GPU servers
Infrastructure Cost	Moderate	Low	High	High
Ease of Integration	GATK & Dragen compatible	Familiar ecosystem	Vendor-specific	GPU expertise required
Scalability	Linear CPU scaling	Limited by runtime	Hardware-bound	GPU availability
Clinical Readiness	Strong adoption in regulated labs	Broad clinical history	Common in Illumina labs	Growing adoption, dependent on validation context

How labs typically decide:

Sentieon is often chosen by labs seeking a balance of accuracy, speed, and infrastructure flexibility.
GATK remains useful in research environments that prioritize open-source tools and established standards.
Dragen appeals to Illumina-centric labs with dedicated hardware budgets.
Parabricks fits organizations already operating GPU-heavy compute environments.
