Validation

Every pipeline, benchmarked against ground truth.

For each omics pipeline we run a published reference dataset end-to-end and report what was recovered versus what was expected. This is the technical evidence behind the case studies — same code paths, public data, verifiable results.

For real client deliverables in plain English, see /case-studies.

UMAP of 27,000 mouse heart cells coloured by Leiden cluster
scRNA-seq

scRNA-seq pipeline — mouse heart TAC time course

GSE308859 — 4 timepoints (Sham, TAC 2w/4w/6w), ~27,200 cells, 10x Chromium

PCA plot showing clean separation of dexamethasone-treated vs DMSO-control samples
RNA-seq

Bulk RNA-seq pipeline — dexamethasone perturbation

Public RNA-seq dataset, dexamethasone vs DMSO in A549 cells (canonical glucocorticoid response benchmark)

DiffBind PCA showing condition separation across melanoma ATAC-seq samples
ATAC-seq

ATAC-seq pipeline — melanoma cell-line accessibility

ATAC-seq across melanoma cell lines, ~20 samples

Protein identification rate across the 10 paired tumour-vs-benign samples
Proteomics

Proteomics pipeline — paired tumour vs benign tissue

LFQ-Analyst tumour-vs-benign liver-tissue example, 10 patient-paired Benign/Malignant samples, LFQ-DDA MaxQuant

Heatmap of top regulated phosphosites between sleep-deprived and baseline synaptosomes
Phosphoproteomics

Phosphoproteomics pipeline — sleep-deprivation synaptosomes

PXD010697 — Brüning 2019, sleep-deprivation synaptosomes, 48 LFQ-DDA samples across 6 timepoints

Variant tier distribution across the GIAB NA12878 chromosome 22 truth set
WGS

WGS interpretation pipeline — GIAB NA12878 chromosome 22

Genome-in-a-Bottle (GIAB) v4.2.1 NA12878 truth set, chromosome 22 (49,964 PASS variants)

How to read these

Each validation run takes a publicly available dataset whose expected biological signal is already published, processes it through the production OmicsDesk pipeline unmodified, and reports what was recovered against what was expected. The pipeline that recovers these canonical signals on public data is the same one that will run on your data: the code path is identical.
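As a rough illustration of what "recovered versus expected" means, the comparison can be sketched as a set operation. This is a minimal sketch, assuming the published signal can be written as a set of identifiers (gene names, variant keys, phosphosites). The names `RecoveryReport` and `compare_to_expected` are hypothetical and are not part of the OmicsDesk pipeline, and the identifiers below are placeholders, not real results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryReport:
    """Hypothetical summary of one validation run (illustrative only)."""
    recovered: frozenset   # expected signals the pipeline found
    missed: frozenset      # expected signals the pipeline did not find
    unexpected: frozenset  # pipeline calls absent from the expected set

    @property
    def recall(self) -> float:
        # Fraction of the expected signals that were recovered.
        expected_total = len(self.recovered) + len(self.missed)
        return len(self.recovered) / expected_total if expected_total else 0.0


def compare_to_expected(expected, observed) -> RecoveryReport:
    """Compare pipeline output against a published reference set."""
    expected_set = frozenset(expected)
    observed_set = frozenset(observed)
    return RecoveryReport(
        recovered=expected_set & observed_set,
        missed=expected_set - observed_set,
        unexpected=observed_set - expected_set,
    )


# Placeholder identifiers standing in for published signals.
report = compare_to_expected(
    expected=["signal_a", "signal_b", "signal_c", "signal_d"],
    observed=["signal_a", "signal_b", "signal_d", "signal_x"],
)
print(sorted(report.recovered))  # ['signal_a', 'signal_b', 'signal_d']
print(sorted(report.missed))     # ['signal_c']
print(f"recall = {report.recall:.2f}")  # recall = 0.75
```

Reporting the missed and unexpected sets alongside recall keeps the result verifiable: a reader can check each item against the public dataset rather than trusting a single score.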