Physics World Model

RNA-seq Cell-Type Classification (PWDR) L1-520

Computational Biology · Single-cell / bulk RNA-seq transcript quantification with marker-gene-panel cell-type categorical readout · δ=5 · advanced · L_DAG = 8.3 · 📋 Stub: not mineable

Unclaimed Principle — open for contribution

This Principle is declared in the catalog but has no reference solver, no pinned dataset, and is not registered on-chain. There is no reward pool. Submitting a cert against this Principle today will record the cert for reproducibility but pay zero PWM.

To claim it as a Bounty #7 contribution: open a PR adding (1) a reference solver, (2) ≥1 dataset pinned to IPFS, (3) updates to the L3 manifest with dataset CIDs. After verifier-agent triple-review, the founders' 3-of-5 multisig signs PWMRegistry.register() and the Principle becomes mineable.

Forward model E

RNA-seq Cell-Type Classification (PWDR) wraps an RNA-seq alignment and transcript-quantification core with canonical marker-gene-panel rules. Stage 1 (analytical, sibling to L1-413): align or pseudoalign reads to the transcriptome (HISAT2, STAR, kallisto, Salmon); estimate per-transcript abundance via EM (RSEM, kallisto) or Bayesian inference (Salmon variational); normalize to counts-per-million or apply sctransform Pearson residuals. Stage 2 (deterministic threshold): per-cell argmax over marker-panel scores, following the CellMarker / PanglaoDB / Tabula Sapiens taxonomies. Difficulty tier: delta = 5. Mismatch parameters: dropout_rate, batch_effect, doublet_contamination, ambient_rna_contamination, marker_panel_coverage_uncertainty, taxonomy_disagreement.
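The normalization and Stage 2 readout described above can be sketched in a few lines. This is a minimal illustration, not a reference solver: the function names (`cpm`, `classify_cells`), the mean-of-log1p scoring rule, and the three-gene toy panel are assumptions made for the example and are not taken from CellMarker, PanglaoDB, or Tabula Sapiens.

```python
import numpy as np

def cpm(counts):
    """Scale each cell (row) so its counts sum to one million."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave empty cells as all zeros instead of dividing by zero
    return counts / totals * 1e6

def classify_cells(counts, genes, marker_panel):
    """Label each cell by argmax over per-type marker scores.

    Score (assumed for this sketch) = mean log1p(CPM) over the type's marker genes.
    """
    log_expr = np.log1p(cpm(counts))
    gene_index = {g: i for i, g in enumerate(genes)}
    cell_types = list(marker_panel)
    scores = np.zeros((log_expr.shape[0], len(cell_types)))
    for j, cell_type in enumerate(cell_types):
        cols = [gene_index[g] for g in marker_panel[cell_type] if g in gene_index]
        if cols:  # a type with no covered markers keeps a score of zero
            scores[:, j] = log_expr[:, cols].mean(axis=1)
    labels = [cell_types[k] for k in scores.argmax(axis=1)]
    return labels, scores

# Toy data: 3 cells x 4 genes (hypothetical counts).
genes = ["CD3E", "CD19", "LYZ", "ACTB"]
panel = {"T cell": ["CD3E"], "B cell": ["CD19"], "Monocyte": ["LYZ"]}
counts = np.array([[30, 0, 1, 100],
                   [0, 25, 2, 90],
                   [1, 0, 40, 120]])
labels, scores = classify_cells(counts, genes, panel)
print(labels)  # ['T cell', 'B cell', 'Monocyte']
```

Markers missing from the gene list are silently skipped here, which is one simple way marker_panel_coverage_uncertainty could enter: a type whose markers are all absent can never win the argmax.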

L-DAG

L.poly_a_capture -> L.reverse_transcription -> L.pcr_amplification -> L.sequencing -> L.transcriptome_alignment -> L.transcript_quantification -> L.normalization -> L.marker_panel_classifier -> int.cell

Well-posedness W

Existence: true
Uniqueness: conditional
Stability: conditional
κ: 200

Existence is guaranteed within the Omega bounds. Uniqueness is conditional on adequate sequencing depth (typically >30k reads per cell for 3' chemistry) and adequate marker-panel coverage. Stability is conditional: dropout_rate dominates for low-expressing markers, batch_effect dominates across samples, and doublet_contamination dominates at high cell densities. Joint Hadamard well-posedness for the coupled RNA-seq + marker-panel-classifier forward model is established by Trapnell 2014 (foundational scRNA-seq), Macosko 2015 (Drop-seq), Stuart-Butler 2019 (Seurat v3 integration), Tabula Sapiens Consortium 2022, Zhang 2019 (CellMarker), and Franzen 2019 (PanglaoDB).
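The claim that dropout_rate dominates for low-expressing markers can be illustrated with a small simulation. The sketch below rests on stated assumptions: Poisson counts, dropout modeled as independent Bernoulli zeroing of the true marker, a single competing off-target marker at background level, and ties counted as errors. The function name and default values are hypothetical.

```python
import numpy as np

def misclassification_rate(marker_mean, dropout_rate, n_cells=5000,
                           background_mean=0.5, seed=0):
    """Fraction of cells whose true marker fails to beat an off-target marker."""
    rng = np.random.default_rng(seed)
    true_marker = rng.poisson(marker_mean, n_cells)
    # Assumed dropout model: each cell loses its marker signal with probability dropout_rate.
    kept = rng.random(n_cells) >= dropout_rate
    observed_marker = true_marker * kept
    off_target = rng.poisson(background_mean, n_cells)
    # Two-way argmax: the call is wrong unless the true marker strictly wins.
    return float(np.mean(observed_marker <= off_target))

for marker_mean in (2, 20):
    rates = [misclassification_rate(marker_mean, d) for d in (0.0, 0.3, 0.6)]
    print(marker_mean, [round(r, 3) for r in rates])
```

Under these assumptions the highly expressed marker errs at roughly the dropout rate itself, while the low-expressing marker carries an additional error floor even at zero dropout, because its counts often tie with or fall below background.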

Solvability C

Solver class: linear-operator + statistical [Salmon EM transcript quantification + marker-panel argmax] | nonlinear [Seurat / Scanpy clustering then label-transfer] | linear-operator + deep neural [scVI variational, scANVI, scPhere]
Convergence rate q: 1
Complexity: O(N_reads * log(N_genes)) for alignment; O(N_cells * N_genes) for normalization + marker scoring; total is alignment-dominated
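As a rough picture of the statistical solver class, the following is a toy EM loop for transcript abundance under the standard read-assignment model, where a read compatible with transcript t is generated with probability proportional to theta_t / effective_length_t. It is a simplified sketch, not Salmon's or RSEM's implementation; `em_abundance` and its arguments are hypothetical names, and every read is assumed to be compatible with at least one transcript.

```python
import numpy as np

def em_abundance(compat, eff_len, n_iter=200, tol=1e-10):
    """Estimate per-transcript read fractions theta by EM.

    compat:  (n_reads, n_transcripts) 0/1 matrix of read-transcript compatibility.
    eff_len: effective length of each transcript.
    """
    compat = np.asarray(compat, dtype=float)
    eff_len = np.asarray(eff_len, dtype=float)
    n_reads, n_tx = compat.shape
    theta = np.full(n_tx, 1.0 / n_tx)  # uniform start
    for _ in range(n_iter):
        # E-step: split each read across its compatible transcripts.
        weights = compat * (theta / eff_len)
        weights /= weights.sum(axis=1, keepdims=True)
        # M-step: new read fraction = expected share of reads.
        new_theta = weights.sum(axis=0) / n_reads
        converged = np.abs(new_theta - theta).max() < tol
        theta = new_theta
        if converged:
            break
    return theta

# Toy input: 6 reads unique to transcript A, 2 unique to B, 4 ambiguous.
compat = np.array([[1, 0]] * 6 + [[0, 1]] * 2 + [[1, 1]] * 4)
theta = em_abundance(compat, eff_len=[1000.0, 1000.0])
print(theta)  # approximately [0.75, 0.25]
```

In this toy case the update for transcript A reduces to a_next = 0.5 + a / 3, so the error shrinks by a constant factor of 1/3 per iteration. That is linear convergence, which matches q = 1 if q is read as the order of convergence.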

Specs (0)

No L2 specs registered yet for this principle.