RNA-seq Cell-Type Classification (PWDR) L1-520
Unclaimed Principle — open for contribution
This Principle is declared in the catalog but has no reference solver, no pinned dataset, and is not registered on-chain. There is no reward pool. Submitting a cert against this Principle today will record the cert for reproducibility but pay zero PWM.
To claim it as a Bounty #7 contribution: open a PR adding (1) a reference solver, (2) ≥1 dataset pinned to IPFS, (3) updates to the L3 manifest with dataset CIDs. After verifier-agent triple-review, the founders' 3-of-5 multisig signs PWMRegistry.register() and the Principle becomes mineable.
Forward model E
RNA-seq Cell-Type Classification (PWDR) wraps the RNA-seq alignment and transcript-quantification core with canonical marker-gene-panel rules. Stage 1 (analytical, sibling to L1-413): align or pseudoalign reads to a reference genome or transcriptome (HISAT2, STAR, kallisto, Salmon); estimate per-transcript abundance via EM (RSEM, kallisto) or variational Bayesian inference (Salmon); normalize to counts-per-million or apply sctransform Pearson residuals. Stage 2 (deterministic threshold): per-cell argmax over marker-panel scores, following the CellMarker / PanglaoDB / Tabula Sapiens taxonomies. Difficulty tier delta = 5. Mismatch parameters: dropout_rate, batch_effect, doublet_contamination, ambient_rna_contamination, marker_panel_coverage_uncertainty, taxonomy_disagreement.
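The normalization and Stage 2 argmax described above can be sketched in a few lines. This is a minimal illustration, not a reference solver: the function names, the mean-of-log1p(CPM) scoring rule, the tie-break, and the toy two-panel dictionary are all assumptions made for the example and are not taken from CellMarker, PanglaoDB, or Tabula Sapiens.

```python
import math

def cpm(counts):
    """Scale one cell's raw gene counts to counts-per-million."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: 1e6 * c / total for gene, c in counts.items()}

def marker_scores(counts, panels):
    """Score each cell type as the mean log1p(CPM) over its marker genes.

    Assumed scoring rule for illustration; markers missing from the
    cell's count table are treated as zero expression.
    """
    norm = cpm(counts)
    scores = {}
    for cell_type, markers in panels.items():
        values = [math.log1p(norm.get(g, 0.0)) for g in markers]
        scores[cell_type] = sum(values) / len(values) if values else 0.0
    return scores

def classify(counts, panels):
    """Deterministic argmax over panel scores.

    Ties go to the alphabetically first label so repeated runs agree.
    """
    scores = marker_scores(counts, panels)
    return min(scores, key=lambda t: (-scores[t], t))

# Hypothetical toy panel and toy cell, for illustration only.
PANELS = {"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]}
cell = {"CD3D": 12, "CD3E": 7, "CD79A": 0, "MS4A1": 1, "ACTB": 480}
label = classify(cell, PANELS)
```

A real solver would take the Stage 1 abundance estimates as input instead of a hand-written dictionary, and would need an explicit rule for cells whose best score is too low to call.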
L-DAG
Well-posedness W
- Existence: true
- Uniqueness: conditional
- Stability: conditional
- κ: 200
Existence is guaranteed within the Omega bounds. Uniqueness is conditional on adequate sequencing depth (typically >30k reads per cell for 3' chemistry) and adequate marker-panel coverage. Stability is conditional: dropout_rate dominates for lowly expressed markers, batch_effect dominates across samples, and doublet_contamination dominates at high cell densities. Joint Hadamard well-posedness for the coupled RNA-seq + marker-panel-classifier forward model is established by Trapnell 2014 (foundational scRNA-seq), Macosko 2015 (Drop-seq), Stuart-Butler 2019 (Seurat v3 integration), Tabula Sapiens Consortium 2022, Zhang 2019 (CellMarker), and Franzen 2019 (PanglaoDB).
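The two uniqueness conditions can be turned into a per-cell pre-check. The sketch below makes several assumptions: the summed count table stands in for reads per cell, "marker-panel coverage" is read as the fraction of a panel's markers detected in the cell, and the 0.5 coverage threshold is a hypothetical value chosen for the example. Only the 30k depth figure comes from the text.

```python
MIN_READS_PER_CELL = 30_000  # from the text: typical for 3' chemistry
MIN_PANEL_COVERAGE = 0.5     # hypothetical threshold, not from the source

def panel_coverage(counts, markers):
    """Fraction of a panel's markers with a nonzero count in this cell."""
    if not markers:
        return 0.0
    detected = sum(1 for g in markers if counts.get(g, 0) > 0)
    return detected / len(markers)

def uniqueness_flags(counts, panels,
                     min_reads=MIN_READS_PER_CELL,
                     min_coverage=MIN_PANEL_COVERAGE):
    """Report whether a cell meets the assumed depth and coverage conditions."""
    depth = sum(counts.values())  # proxy for reads per cell (assumption)
    best = max((panel_coverage(counts, m) for m in panels.values()),
               default=0.0)
    return {"depth_ok": depth > min_reads, "coverage_ok": best >= min_coverage}

# Hypothetical toy data, for illustration only.
panels = {"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]}
deep_cell = {"CD3D": 20_000, "CD3E": 15_000, "ACTB": 5_000}
shallow_cell = {"CD3D": 5}
deep_flags = uniqueness_flags(deep_cell, panels)
shallow_flags = uniqueness_flags(shallow_cell, panels)
```

Cells that fail either flag are the ones where the argmax label is most likely to be ambiguous, so a solver might report them separately instead of forcing a call.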
Solvability C
- Solver class: linear-operator + statistical [Salmon EM transcript quantification + marker-panel argmax] | nonlinear [Seurat / Scanpy clustering then label-transfer] | linear-operator + deep neural [scVI variational, scANVI, scPhere]
- Convergence rate q: 1
- Complexity: O(N_reads * log(N_genes)) for alignment; O(N_cells * N_genes) for normalization + marker scoring; total is alignment-dominated
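To see why the total is alignment-dominated, the two complexity terms can be compared for a concrete experiment size. This is a rough order-of-magnitude sketch: the cell count, reads per cell, and gene count are hypothetical, the logarithm is taken base 2 by assumption, and constant factors (which big-O notation hides) are ignored.

```python
import math

def stage_costs(n_cells, reads_per_cell, n_genes):
    """Return the two complexity terms as raw operation counts.

    Constants are ignored, so only the ratio is meaningful.
    """
    n_reads = n_cells * reads_per_cell
    alignment = n_reads * math.log2(n_genes)  # O(N_reads * log(N_genes))
    scoring = n_cells * n_genes               # O(N_cells * N_genes)
    return alignment, scoring

# Hypothetical experiment size, for illustration only.
alignment, scoring = stage_costs(n_cells=10_000,
                                 reads_per_cell=50_000,
                                 n_genes=20_000)
ratio = alignment / scoring
```

Because N_reads is N_cells times the reads per cell, the ratio reduces to reads_per_cell * log2(N_genes) / N_genes, so the alignment term stays larger whenever sequencing depth per cell is comparable to or greater than the number of genes.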