Plain language
What this result means
This matters because learned sequence designers are often evaluated on deterministic proxy objectives before wet-lab validation. If a simple deterministic baseline Pareto-dominates a learned method on that method's own proxy, the learned method is leaving objective value on the table. The result applies only to those proxy objectives and makes no claim beyond them.
- The comparison uses translation correctness, minimum free energy (MFE) or ensemble free energy (EFE), codon adaptation index (CAI), uridine fraction, and forbidden motifs as deterministic scores.
- The strongest caveat is part of the result itself: better proxy values do not imply better expression in cells.
- LinearDesign is treated as an honest wall: it is exact on its own MFE/CAI objective and is not dominated.
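Two of the scores listed above are simple enough to sketch directly. The following is a minimal illustration, not the report's scoring code: the function names, the handling of `T` as `U`, and the example motifs are assumptions made for this example.

```python
def uridine_fraction(rna: str) -> float:
    """Fraction of positions that are U (a DNA-style T is counted as U)."""
    seq = rna.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return seq.count("U") / len(seq)


def forbidden_motif_hits(rna: str, motifs: list[str]) -> dict[str, int]:
    """Count occurrences of each motif, including overlapping ones."""
    seq = rna.upper().replace("T", "U")
    hits = {}
    for motif in motifs:
        m = motif.upper().replace("T", "U")
        # Scan every start position so overlapping matches are counted.
        hits[motif] = sum(
            1 for i in range(len(seq) - len(m) + 1) if seq.startswith(m, i)
        )
    return hits


# Hypothetical inputs chosen for illustration only.
example = "AUGGCUUUUUAA"
print(uridine_fraction(example))
print(forbidden_motif_hits(example, ["UUUU", "GGGG"]))
```

Both functions are pure and deterministic, which is what allows a score to be recomputed later from the raw sequence alone.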
Visual notes
How to read the result
Result table
A deterministic baseline dominates CodonRL on 54/54 proteins and GEMORNA on 35/54 under the stated proxy objective.
| Compared tool | Tool sequences or role | Numaro result | Outcome | Note |
|---|---|---|---|---|
| CodonRL | released soup55 | 54/54 dominated | all usable proteins | MFE, CAI, uridine |
| GEMORNA | repo sequences | 35/54 dominated | proxy improvement | 4-axis objective |
| EnsembleDesign | EFE specialist | 1 dominated | mostly ties | honest wall |
| LinearDesign | exact MFE/CAI | 0 dominated | not beaten | exact wall |
| Verifier | recorded dominations | 90/90 pass | confirmed | translation and scores rechecked |
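
A Pareto-domination claim of the kind counted in the table can be checked mechanically: one design dominates another if it is no worse on every axis and strictly better on at least one. The sketch below is illustrative only. The axis names, the direction assumed for each axis (lower MFE, higher CAI, lower uridine fraction), and the example numbers are assumptions, not values from the report.

```python
from typing import Mapping

# Assumed optimization direction per axis (an assumption for this sketch).
DIRECTIONS = {"mfe": "min", "cai": "max", "uridine": "min"}


def dominates(
    a: Mapping[str, float],
    b: Mapping[str, float],
    directions: Mapping[str, str] = DIRECTIONS,
    tol: float = 0.0,
) -> bool:
    """True if `a` is no worse than `b` on every axis and better on at least one."""
    strictly_better = False
    for axis, sense in directions.items():
        diff = a[axis] - b[axis]
        if sense == "max":
            diff = -diff  # flip sign so a negative diff always means "a is better"
        if diff > tol:
            return False  # a is worse on this axis, so it cannot dominate
        if diff < -tol:
            strictly_better = True
    return strictly_better


# Invented scores for illustration; not taken from the report.
baseline = {"mfe": -310.2, "cai": 0.91, "uridine": 0.18}
learned = {"mfe": -301.5, "cai": 0.91, "uridine": 0.21}
print(dominates(baseline, learned))
print(dominates(learned, baseline))
```

A tie on every axis is not a domination under this definition, which is consistent with rows that report ties separately from dominations.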
Method
How it was found
The campaign combined deterministic seeds from exact or simple objectives with a ViennaRNA-in-the-loop local polish, accepting only candidates that re-scored as true Pareto improvements.
- Reproduced the scoring conventions used by the compared tools.
- Generated seeds from maximum-CAI, minimum-uridine, and LinearDesign-guided sequences.
- Polished candidates while rechecking translation and all objective values.
- Separated proxy-objective statements from biological-expression claims.
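The accept-only-if-Pareto-better polish described above can be sketched as a synonymous-codon local search. This is a minimal sketch under stated assumptions, not the campaign's code: `toy_scores` is a hypothetical placeholder for the ViennaRNA and CAI rescoring, the `polish` and `dominates` names are invented here, and all axes are assumed to be lower-is-better.

```python
import random
from itertools import product

# Standard genetic code (NCBI translation table 1), RNA alphabet.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(rna):
    """Translate an RNA string whose length is a multiple of 3."""
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3))


def toy_scores(rna):
    """Placeholder objective, lower is better: (U fraction, negative GC fraction)."""
    n = len(rna)
    return (rna.count("U") / n, -(rna.count("G") + rna.count("C")) / n)


def dominates(a, b):
    """Pareto domination for tuples where lower is better on every axis."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def polish(rna, score, steps=200, seed=0):
    """Try random synonymous swaps; keep one only if it re-scores as a Pareto improvement."""
    rng = random.Random(seed)
    protein = translate(rna)
    best, best_score = rna, score(rna)
    for _ in range(steps):
        i = rng.randrange(len(rna) // 3)
        current = best[3 * i:3 * i + 3]
        options = [c for c in SYNONYMS[CODON_TO_AA[current]] if c != current]
        if not options:
            continue  # Met and Trp have a single codon
        candidate = best[:3 * i] + rng.choice(options) + best[3 * i + 3:]
        cand_score = score(candidate)
        # Recheck translation and all scores before accepting.
        if translate(candidate) == protein and dominates(cand_score, best_score):
            best, best_score = candidate, cand_score
    return best


start = "AUGUUUUUAUCUUAA"  # hypothetical seed encoding M-F-L-S-stop
polished = polish(start, toy_scores)
print(polished, toy_scores(start), toy_scores(polished))
```

Because a candidate is kept only when it dominates the current best, the polished sequence can never be worse than its seed on any axis of the supplied objective.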
Verification
How it was checked
verify.py checks translation to the target protein, recomputes all deterministic scores, and rechecks each Pareto-domination claim from raw FASTA sequences.
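The translation step of such a check can be sketched as follows. This is not the contents of verify.py: the function names, the FASTA handling, and the choice to accept a single trailing stop codon are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), RNA alphabet.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def read_fasta(text):
    """Parse FASTA text into {name: sequence}, normalizing T to U."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line.upper().replace("T", "U")
    return records


def translate(rna):
    if len(rna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3))


def encodes(rna, protein):
    """True if rna translates to protein, allowing one trailing stop codon."""
    aa = translate(rna)
    if aa.endswith("*"):
        aa = aa[:-1]
    # An internal stop leaves a '*' in aa, so it cannot equal the target protein.
    return aa == protein


# Hypothetical records for illustration only.
fasta = ">designA\nATGGCT\nTAA\n>designB\nAUGUAAGCU\n"
for name, seq in read_fasta(fasta).items():
    print(name, encodes(seq, "MA"))
```

Working from the raw FASTA text rather than from cached scores means a recorded claim can be rejected if the sequence itself no longer encodes the target protein.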
Scope
What is not being claimed
This is a computational proxy result only. It is not an in-cell expression claim, not a safety claim, and not a claim that lower MFE is always biologically better.
Citation
How to cite
Numaro Autoresearch Team. "Learned mRNA coding-sequence designers are Pareto-suboptimal on their own objective." Numaro Research Report NUMARO-2026-014, 2026.
@techreport{numaro2026MrnaCodonStructure,
title = {Learned mRNA coding-sequence designers are Pareto-suboptimal on their own objective},
author = {Numaro Autoresearch Team},
institution = {Numaro},
number = {NUMARO-2026-014},
year = {2026},
url = {https://numaro.tech/research/mrna-codon-structure-design-2026/}
}