The intelligence layer
for your
single-cell omics data

CyteType deploys specialized AI reviewers to annotate each cluster with ontology-mapped labels, marker-level evidence, functional state resolution, and confidence-scored quality control.

Request a Demo

Marker Evidence 12 supporting · 2 missing · 1 unexpected

Functional State Activated lipid-secretory

CL:0000235 Ontology-anchored

Reviewer Synthesis Dual contractile and ECM-remodeling programs supported by ACTA2 + COL1A1

Annotation intelligence at production scale

100,000+ clusters annotated

99.99% completion rate

Days, not weeks, to audit-ready

Every cluster passes through specialized AI reviewers that trace each annotation to marker-level evidence, ontology-mapped labels, and confidence-scored quality control, giving your team defensible calls at production scale.

Where CyteType delivers

When annotation speed, consistency, and traceability break down, programs stall. CyteType surfaces the biological problem first, then provides an evidence-backed path to a defensible call.

Consortium-scale cell atlas annotation

Site-to-site labeling drift makes consortium atlases hard to align and harder to defend. Ontology-mapped calls with explicit evidence and confidence scoring enforce one reviewable language across studies.

Ontology-anchored annotation

Each cluster is mapped to a Cell Ontology term with confidence and label match scores, plus a direct CL reference so the definition is explicit and reviewable.

Salivary gland acinar cell (serous/mucous mixed)

Salivary gland acinar cell (serous/mucous mixed) — actively secreting immunomodulatory state with homogeneous population co-expressing CRISP3, MUC7, and AQP5

Cell State Actively secreting and immunomodulatory

Confidence High

Ontology CL:4052067 (seromucous acinar cell of salivary gland)

User Label Salivary/Submucosal_glands Match: 70%

Coarse lineage map

Clusters are grouped at a coarse resolution into major lineages with confidence cues, so you can orient quickly before diving into subtypes.

7 Inflammatory Dendritic Cell

13 Macrophage

14 Activated Mast Cell

18 Neutrophil

3 Activated B Cell

19 Plasma Cell

23 CD8+ Cytotoxic T Cell

8 Endothelial Cell

9 Lung Fibroblast / Lymphatic Endothelial Cell

Confidence and heterogeneity QC

Badges summarize certainty and heterogeneity, with narrative reasoning describing what is solid versus mixed in each cluster.

Quality Control Confidence: High

Confidence

Strong expert consensus with compelling evidence. All five reviewers independently confirmed the annotation based on canonical salivary acinar markers (CRISP3, PIP, MUC7, PRR4) with exceptionally high expression (>74%) and massive fold-changes. The computational classifier supports this with 90% cross-validated accuracy.

Heterogeneity

Homogeneous population of salivary gland acinar cells with mixed serous/mucous phenotype. All five assessments converge on the same identity, with disagreements limited to activation state. Co-expression of CRISP3 and KRT7 in 71.7% of cells confirms they mark the same population rather than distinct subpopulations.

Cell classifier

90.4% accuracy Entropy 0.37

Salivary gland acinar cell (serous/muco...

90.4%

Esophageal basal keratinocyte (early dif...

4.6%

Suprabasal esophageal keratinocyte

2.0%

Th17-polarized memory CD4+ T cell

1.0%

Activated Regulatory T cell (Treg)

0.8%

Gastric SPEM cell (TFF2-high, MUC5AC...

0.4%
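For readers who want to sanity-check an entropy figure like the one in this panel, here is a minimal sketch of Shannon entropy over a classifier's class shares. The page does not say which logarithm base or normalisation CyteType uses, nor whether the six listed classes are the full distribution, so the natural-log default and the renormalisation step are assumptions, and the result is not expected to reproduce the 0.37 shown.

```python
import math

def shannon_entropy(probs, base=math.e):
    """Shannon entropy of a probability vector; zero-probability classes add nothing."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Assumption: renormalise so truncated or rounded lists still sum to 1.
    normalised = [p / total for p in probs]
    return -sum(p * math.log(p, base) for p in normalised if p > 0)

# Class shares as displayed in the classifier panel (they sum to 99.2%).
displayed = [0.904, 0.046, 0.020, 0.010, 0.008, 0.004]

entropy_nats = shannon_entropy(displayed)
max_entropy = math.log(len(displayed))  # entropy of a uniform six-class split
print(f"entropy: {entropy_nats:.3f} nats (maximum for 6 classes: {max_entropy:.3f})")
```

Low entropy relative to the maximum means the classifier's votes concentrate on one label, which is the pattern the panel shows.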

Multi-expert synthesis

Specialized reviewers evaluate each call and surface strengths, weaknesses, and plausible alternatives before you sign off.

Reviewer Panel 5 panelists

Reviewer 1 Salivary Gland Cell Biologist

The annotation as salivary gland acinar cell (serous/mucous mixed) is well-supported by canonical markers, though elevated ductal keratins warrant consideration for potential ductal contamination.

Reviewer 2 Epithelial Cell Developmental Biologist

Mixed serous/mucous acinar differentiation is confirmed, but elevated ELF5/SOX9/KRT7 suggests a luminal progenitor state rather than fully mature terminally differentiated acinar cells.

Reviewer 3 Mucosal Immunology Specialist

The annotation is strongly supported — this is an actively secreting salivary acinar cell with mixed serous/mucous phenotype and robust antimicrobial immunomodulatory function.

Reviewer 4 Oral/Upper GI Pathology Expert

KRT7 co-expression likely reflects shared epithelial programs rather than ductal contamination, given near-absence of canonical ductal marker SLC4A1.

Reviewer 5 Computational Biology & Pathway Specialist

Membrane trafficking and glycosylation pathways strongly support salivary gland acinar identity; pathway enrichment is consistent with high metabolic demands of secretion.

Decision traceability

See the full candidate funnel, why each was accepted or rejected, and the quantitative evidence behind it, including Log2FC and tissue context.

Cell Type Candidates Investigated Refined 4

Lung Myofibroblast

Refined from 'Myofibroblast' with tissue context. All cells are from lung tissue (82% adenocarcinoma), with SPARC, POSTN, and FBLN1 enriched in lung stromal fibroblasts. High ECM remodeling gene expression confirms a lung-specific activated fibroblast phenotype.

Myofibroblast

Refined from 'Fibroblast'. Strong canonical markers ACTA2 (Log2FC=3.62), TAGLN, THY1, and COL1A1, but lung adenocarcinoma context supports a more specific lung myofibroblast identity.

Pericyte

Strong expression of pericyte markers PDGFRB (Log2FC=5.08) and THY1 (Log2FC=4.08), but these are shared with myofibroblasts. MGP and MFGE8 absence argues against a pure pericyte identity.

Fibroblast

Retained as baseline due to core ECM genes BGN (Log2FC=3.83), COL6A2, DCN, and LUM, but the absence of lineage-specific transcription factors supports a general fibroblast identity insufficient for this cluster.

Initial candidates 10 candidates

Fibroblast Endothelial Cell Macrophage Smooth Muscle Cell Pericyte Epithelial Cell Myofibroblast Stromal Cell Mesenchymal Cell Dendritic Cell

These are the initial cell type candidates generated before refinement and evaluation.
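To put the Log2FC values quoted in the candidate funnel on a linear scale, the sketch below converts them to fold-changes. The page does not describe how CyteType computes Log2FC, so `log2_fold_change` (a ratio of mean expression with a small pseudocount) is an illustrative assumption, not the product's method.

```python
import math

def fold_change(log2fc):
    """Convert a log2 fold-change into a linear fold-change."""
    return 2.0 ** log2fc

def log2_fold_change(mean_in_cluster, mean_elsewhere, pseudocount=1e-9):
    """Log2 ratio of mean expression in a cluster versus all other cells.

    Assumption: a tiny pseudocount guards against division by zero.
    """
    return math.log2((mean_in_cluster + pseudocount) / (mean_elsewhere + pseudocount))

# Log2FC values quoted in the candidate funnel.
quoted = {"ACTA2": 3.62, "PDGFRB": 5.08, "THY1": 4.08, "BGN": 3.83}
for gene, lfc in quoted.items():
    print(f"{gene}: Log2FC={lfc} is about {fold_change(lfc):.1f}-fold")
```

Each unit of Log2FC doubles the ratio, so a difference of one between two markers means twice the enrichment.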

Audit-ready export

Export the full annotation table with cluster IDs, CL terms, states, confidence scores, and label match percentages.

Annotation Export 12 clusters Download CSV

Cluster	Cell Type	CL Term	State	Confidence	Match
0	Macrophage	CL:0000583	Activated, immunosuppressive	High	100%
1	Lung Myofibroblast	CL:0000186	Activated, ECM-remodeling	High	94%
2	CD8+ T Cell	CL:0000625	Cytotoxic, exhausted	High	97%
3	Epithelial Cell	CL:0000066	Proliferating, basal-like	Medium	82%
4	Endothelial Cell	CL:0000115	Tip-like, angiogenic	High	91%
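As a rough illustration of working with an export like this downstream, the sketch below parses the preview rows with Python's standard `csv` module and flags clusters for manual review. The tab delimiter mirrors the preview as it appears here (the downloaded CSV may use commas), and the `needs_review` helper with its 90% threshold is hypothetical.

```python
import csv
import io

# Rows copied from the export preview (tab-separated as displayed).
EXPORT_TSV = (
    "Cluster\tCell Type\tCL Term\tState\tConfidence\tMatch\n"
    "0\tMacrophage\tCL:0000583\tActivated, immunosuppressive\tHigh\t100%\n"
    "1\tLung Myofibroblast\tCL:0000186\tActivated, ECM-remodeling\tHigh\t94%\n"
    "2\tCD8+ T Cell\tCL:0000625\tCytotoxic, exhausted\tHigh\t97%\n"
    "3\tEpithelial Cell\tCL:0000066\tProliferating, basal-like\tMedium\t82%\n"
    "4\tEndothelial Cell\tCL:0000115\tTip-like, angiogenic\tHigh\t91%\n"
)

def load_annotations(text, delimiter="\t"):
    """Parse the export into dicts, converting 'Match' from '94%' to 94.0."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        row["Cluster"] = int(row["Cluster"])
        row["Match"] = float(row["Match"].rstrip("%"))
        rows.append(row)
    return rows

def needs_review(rows, min_match=90.0):
    """Hypothetical rule: flag clusters below High confidence or below the match threshold."""
    return [r for r in rows if r["Confidence"] != "High" or r["Match"] < min_match]

annotations = load_annotations(EXPORT_TSV)
for r in needs_review(annotations):
    print(r["Cluster"], r["Cell Type"], r["CL Term"], r["Confidence"], r["Match"])
```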

Validated benchmarks

Multi-agent AI · Full expression profiles · Evidence-grounded reasoning

Up to 388% higher annotation accuracy

16 LLMs tested across model families

Up to 300% improvement over existing methods

Figure: annotation score across methods. CyteType (GPT-5) and CyteType across LLMs are plotted against SingleR, CellTypist, and GPTCellType (GPT-5).

CyteType configured with different LLMs

Claude Sonnet 4 (C)

GPT-5 (C)

Gemini 2.5 Pro (C)

GPT-4.1 (C)

Kimi K2 (O)

GLM 4.5 (O)

LLaMA 4 Maverick (O)

DeepSeek R1 (O)

Magistral Medium 2506 (O)

Grok 4 (C)

Qwen3 235B A22B Thinking (O)

GPT-OSS 120B (O)

Gemini 2.5 Flash (C)

Qwen3 235B A22B (O)

Minimax M1 (O)

Qwen3 30B A3B Thinking (O)

(O) = Open weight LLM (C) = Closed weight LLM


Performance improves by up to 300% over existing methods, more than ten times the typical 10–20% gains seen across the field. Even open-weight models like DeepSeek R1 and Qwen3 reach 95% of peak performance. The breakthrough is in structured reasoning, not prompting at scale: it moves single-cell annotation from guesswork to interpretable, evidence-based classification.

Read the benchmarking study on bioRxiv

How CyteType compares

Pharma teams need more than accuracy from an annotation tool. They need evidence they can defend, terminology that stays consistent across partners, and infrastructure that passes security review.

Evaluation Criteria	CyteType	SingleR	scType	CellTypist	scANVI	Azimuth
Can We Deploy This?
On-premise / air-gapped deployment - InfoSec and data governance clearance	✔ Local model hosting; AWS Bedrock supported	✔ Runs locally in R	Partial: R scripts local; web tool is external	✔ Runs locally in Python	✔ Runs locally; requires GPU	Partial: Docker available; web app is cloud-hosted
Pipeline integration (Python + R) - Drop-in to existing Scanpy/Seurat workflows	Both: pip install (AnnData) + CyteTypeR (Seurat)	R only: Bioconductor package	R only: R scripts; no native Python	Python only: PyPI package	Primarily Python: R via reticulate bridge	Primarily R: Accepts h5ad via web upload
GPU infrastructure required - IT procurement and compute cost implications	No	No	No	No	Effectively yes: Documented limitation of the method	No
Commercial licensing - Legal and procurement clarity	Explicit commercial license: Open-source for academia; commercial terms for industry	Open source: GPL-3; copyleft implications	Open source: GPL v3; copyleft implications	Open source: MIT-style	Open source: BSD 3-Clause	Open source: GPL-3; copyleft implications
Reproducibility - Regulatory and QC requirement: same input, same output	✔ Deterministic	✔ Deterministic	✔ Deterministic	✔ Deterministic	Partial: Stochastic training; seed-dependent	✔ Deterministic
Does It Reduce Risk in Our Decision-Making?
Reference data dependency - Operational overhead of curating and maintaining reference datasets per project	Not required: Operates from expression data; accepts marker genes to guide annotation	Required: Accepts any custom labelled dataset	Not required: Built-in marker DB; custom markers via XLSX	Required: User-trained custom models supported	Required: Accepts any labelled dataset for training	Required: Limited to HuBMAP curated references only
Evidence trail per annotation - Defend calls in target review, regulatory, and cross-functional settings	✔ Full reasoning chain: markers, literature, reviewer assessment	-	-	-	-	-
Cell Ontology standardisation - Consistent terminology across projects, sites, and CRO partners	✔ Automatic CL ID assignment per annotation	-	-	Partial: CL IDs in encyclopedia; not in annotation output by default	-	-
Disease context handling - TME, inflammatory tissue, and disease-state datasets	✔ Adapts reasoning to disease vs. healthy contexts natively	Partial: Only if reference contains disease labels	Partial: SNV calling distinguishes malignant vs. healthy	Partial: Only if model trained on disease data	Partial: Only if reference contains disease labels	- Training data explicitly excludes cancer
Functional state resolution - Distinguish actionable cell states from coarse labels	✔ Activation, exhaustion, polarization with marker-level support	- Constrained by reference label granularity	- Constrained by database categories	Partial: Granularity depends on model used	- Constrained by training labels	Partial: Multi-level hierarchy (L1/L2) available
Cluster-level interrogation - Query the biology behind each annotation call directly	✔ Chat interface per cluster: ask biological questions, explore evidence	-	-	-	-	-

Benchmarking context: In published benchmarking across 977 clusters and 20 datasets (bioRxiv, Nov 2025), CyteType demonstrated 101% improvement over SingleR, 268% over CellTypist, and 388% over GPTCellType. Performance validated across PBMC, bone marrow, tumour microenvironment, and cross-species datasets. Independent peer review pending.
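To make these percentages concrete, the sketch below shows the arithmetic linking a percentage improvement to a score multiplier. The page does not define how "improvement" is calculated, so treating it as the relative difference between two annotation scores is an assumption, and both function names are illustrative.

```python
def percent_improvement(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, in percent."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (new_score - baseline_score) / baseline_score * 100.0

def score_ratio(improvement_pct):
    """Multiplier implied by a percentage improvement (100% means 2.0x)."""
    return 1.0 + improvement_pct / 100.0

# Improvements quoted in the benchmarking note.
for method, pct in {"SingleR": 101, "CellTypist": 268, "GPTCellType": 388}.items():
    print(f"{pct}% over {method} implies a {score_ratio(pct):.2f}x score")
```

Under that reading, a 101% improvement means roughly double the baseline score, and 388% means a little under five times the baseline.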

Full technical comparison covering 30+ features (architecture details, supported modalities, visualization exports, admin/security, and benchmarking breakdown) available on request.

Sources: All features verified against official documentation, GitHub repositories, and peer-reviewed publications as of March 2026.

Built to hold up in the real world

LLM-driven annotation fails without reliability, privacy, and scale. CyteType is built for those constraints.

Defensible labels

Ontology IDs, evidence trails, and reviewer rationale on every call.

Production LLM stack

Hundreds of calls per cluster with retries and health-aware fallbacks, built to finish at scale.

Enterprise ready

Cloud pilots now; on-prem for pharma-run LLMs, zero retention, no training use, isolated storage.

Fits your stack

Scanpy, Seurat, and AnnData supported via the CyteType Python and R packages.

Python package R package

Benchmarked

Tested against CellTypist, SingleR, and GPTCellType across four datasets and sixteen LLMs.
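The "retries and fallbacks" idea mentioned under Production LLM stack can be pictured with a generic sketch: retry each provider with exponential backoff, then fall back to the next one in order. This is not CyteType's implementation. Every name here (`call_with_fallback`, the provider callables, the delay values) is hypothetical, and health-aware routing is left out for brevity.

```python
import time

class AllProvidersFailed(Exception):
    """Raised when every provider has exhausted its retries."""

def call_with_fallback(providers, prompt, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Try each provider in order, retrying with exponential backoff before falling back.

    `providers` is a list of (name, callable) pairs; each callable takes the
    prompt and returns a response, or raises an exception on failure.
    """
    errors = []
    for name, call in providers:
        for attempt in range(max_retries):
            try:
                return name, call(prompt)
            except Exception as exc:  # deliberately broad in this sketch
                errors.append((name, attempt, repr(exc)))
                if attempt < max_retries - 1:
                    sleep(base_delay * 2 ** attempt)
    raise AllProvidersFailed(errors)

# Hypothetical providers: the first always fails, the second succeeds.
def flaky_primary(prompt):
    raise TimeoutError("primary model timed out")

def steady_backup(prompt):
    return f"annotation for: {prompt}"

used, response = call_with_fallback(
    [("primary", flaky_primary), ("backup", steady_backup)],
    "cluster 7 marker summary",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(used, "->", response)
```

Injecting `sleep` keeps the backoff testable without real delays. A production system would also track provider health so that it skips providers known to be failing.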

"I was dreading annotating 12 tumour samples with 40+ clusters each. CyteType got us to solid initial labels in three days instead of the three weeks I'd planned for. The Cell Ontology mapping was particularly valuable during revisions when reviewers asked why we hadn't used standardised terminology. The evidence documentation also helped when my PI questioned some of the macrophage subtype calls."

Postdoc, Immunology Lab — Johannes Gutenberg-Universität, Mainz

"The recurring problem I faced was getting inconsistent annotations from different team members, then being stuck trying to decide between them without clear criteria. CyteType's multi-reviewer system and pathway enrichment let me evaluate the biology myself. It's not perfect for rare populations, but it gets us most of the way there much faster than our previous workflow."

Principal Investigator — Karolinska Institutet

"This was my first time analyzing scRNA-seq independently and the computational aspects were challenging. The cluster chat feature was really helpful - I could ask why certain plasma cells showed unexpected markers and get explanations grounded in the actual expression data. The speed was remarkable, what I'd budgeted three weeks for took an afternoon. My supervisor was impressed with how quickly I could defend the annotation choices."

PhD Student — Università degli Studi di Salerno

"We support approximately 60 projects per year across different tissue types. Building and maintaining reference datasets for each context was becoming unsustainable. CyteType eliminates that requirement entirely, and the accuracy is comparable to our carefully curated references. The main advantage for us is consistency - every project now starts from the same standardised baseline. We still validate everything, but the time savings are significant."

Senior Computational Biologist, Core Facility — Institut Pasteur

"The functional state resolution was the standout feature for me. Instead of simply labeling cells as macrophages, we got detailed breakdowns like 'M2-polarized macrophages with active lipid metabolism and ECM remodeling', with specific gene sets supporting each functional program. This granularity completely changed our validation plan - we knew exactly which biological processes to target in follow-up perturbation studies."

Researcher, Computational Oncology — Memorial Sloan Kettering Cancer Center

"I've evaluated several automated annotation tools. CyteType's strength is the reasoning transparency - you can examine exactly why each label was chosen and what the sources of uncertainty are. My one concern is potential bias from the LLM's training data. If the model has encountered papers on a specific tissue type, is it discovering patterns de novo or recalling literature? We independently verify key populations, but it's worth being aware of. The heterogeneity scores have been useful for identifying clusters that need subclustering, and for first-pass annotation it performs well."

Bioinformatician — Helmholtz Zentrum München

"Our patient samples contained many transitional microglial states that were difficult to classify. What proved most valuable was how CyteType decomposes cell states into discrete functional programmes - clearly separating which genes support activation versus antigen presentation versus proliferation. This gave us clear hypotheses about what these cells were actually doing in the disease context, rather than just arguing over terminology."

Postdoc — Ludwig-Maximilians-Universität München

"Annotation is necessary but time-consuming work that doesn't directly generate publications, yet incorrect annotations can derail a manuscript. CyteType handles the majority of straightforward clusters, allowing my team to focus on biologically interesting populations. The CytetypeR package integrated smoothly with our existing Seurat pipeline - just a few lines of code. No infrastructure changes required. For differentiated cell types, it's proven quite reliable."

Principal Investigator, Developmental Biology — University College London

"The multi-reviewer assessment caught something we'd initially missed. One reviewer flagged unexpectedly high HLA-DR expression in what we'd labeled as NK cells, which prompted us to reclassify them as activated CD8+ T cells. The literature citations were also valuable when responding to reviewer questions about our monocyte subtype assignments."

Researcher, Infectious Disease — University of Innsbruck

"As someone still learning the field, the detailed marker-level explanations were really educational. Rather than just accepting a cell type label, I could see which specific genes supported each call and understand their biological relevance. This made me much more confident discussing the data in journal clubs and lab meetings."

PhD Student — Medical School of Hannover

"The functional state annotations go well beyond basic cell type labels. For our tumour microenvironment study, seeing the specific metabolic and signalling programmes active in each immune population helped us understand the biology mechanistically. This directly informed better experimental design."

Bioinformatician — University of Glasgow

"We advise multiple research groups, and maintaining annotation consistency across projects has been challenging. Building and maintaining reference atlases for every tissue type was consuming significant resources. CyteType performs comparably to our carefully curated references without requiring that upfront investment. The Cell Ontology mapping makes meta-analyses much more feasible."

Senior Computational Biologist — University of Cambridge

"In translational research, you need annotations that can withstand scrutiny. The complete evidence trail - markers, citations, reviewer assessments - provides documentation we haven't had access to before. We've used it for immune monitoring studies, and the clinical team appreciated being able to trace the reasoning behind each classification. It still requires domain expertise to interpret appropriately, but it provides a solid starting point."

Clinician-Scientist — Mass General Brigham

"We study disease processes across multiple tissue types, and maintaining cross-tissue annotation consistency has been difficult. CyteType's context-aware approach has helped us compare related cell populations in different organs more systematically. The tool doesn't replace expert biological knowledge - we've corrected some edge cases - but it dramatically accelerates the process and improves reproducibility across our lab."

Principal Investigator — University of Florida

See CyteType on your data

Leave your details and our team will arrange a session where you can bring a dataset and walk away with a full annotation report.
