Plain language

What this result means

This matters because learned sequence designers are often evaluated on deterministic proxy objectives before wet-lab validation. If a simple deterministic baseline dominates a learned method on its own proxy, the learned method is leaving objective value on the table. The result stays inside that proxy boundary: it compares computed scores only and says nothing about wet-lab outcomes.

  • The comparison uses translation correctness, minimum free energy (MFE), ensemble free energy (EFE), codon adaptation index (CAI), uridine fraction, and forbidden-motif counts as deterministic scores.
  • The result carries its own strongest caveat: better proxy values do not imply better expression in cells.
  • LinearDesign is treated as an honest wall. It is exact on its own MFE/CAI objective, so the baseline does not dominate it.
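
Three of the sequence-only scores listed above can be computed directly from a coding sequence. The following is a minimal sketch, not the scoring code used for the result: the function names, the overlapping-match convention for motifs, and the `weights` table (per-codon relative adaptiveness values in (0, 1]) are assumptions for illustration. MFE and EFE need a folding engine and are left out, as is the translation check.

```python
import math
from typing import Dict, Iterable


def _as_rna(cds: str) -> str:
    """Normalize a coding sequence to upper-case RNA letters."""
    return cds.upper().replace("T", "U")


def uridine_fraction(cds: str) -> float:
    """Fraction of nucleotides that are U."""
    rna = _as_rna(cds)
    return rna.count("U") / len(rna)


def motif_hits(cds: str, motifs: Iterable[str]) -> int:
    """Total number of forbidden-motif occurrences, counting overlaps.

    Counting overlapping matches is an assumption; a tool may instead
    count non-overlapping matches or only report presence/absence.
    """
    rna = _as_rna(cds)
    hits = 0
    for motif in motifs:
        m = _as_rna(motif)
        hits += sum(rna.startswith(m, i) for i in range(len(rna) - len(m) + 1))
    return hits


def cai(cds: str, weights: Dict[str, float]) -> float:
    """Codon adaptation index: geometric mean of per-codon weights.

    Codons missing from `weights` are skipped (commonly Met, Trp, and
    stop codons). All supplied weights must be positive.
    """
    rna = _as_rna(cds)
    logs = [
        math.log(weights[rna[i:i + 3]])
        for i in range(0, len(rna) - 2, 3)
        if rna[i:i + 3] in weights
    ]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))
```

Because each function is a pure function of the sequence, re-running it on the same input always gives the same score, which is the sense in which these proxies are deterministic.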

Visual notes

How to read the result

Figure: Bar chart of mRNA design Pareto domination counts for CodonRL, GEMORNA, EnsembleDesign, and LinearDesign.
Domination counts. CodonRL and GEMORNA are dominated on their stated proxy axes; LinearDesign is not, because it is exact on its own MFE/CAI objective.
Figure: Bar chart of average proxy-objective improvements on dominated GEMORNA proteins.
Average movement. On dominated GEMORNA rows, the proxy movement is not one-dimensional: MFE, CAI, uridine, and motifs all move in the allowed direction on average.

Result table

A deterministic baseline dominates CodonRL on 54/54 proteins and GEMORNA on 35/54 under the stated proxy objective.

Cell | Baseline | Numaro | Delta | Note
CodonRL | released soup55 | 54/54 dominated | all usable proteins | MFE, CAI, uridine
GEMORNA | repo sequences | 35/54 dominated | proxy improvement | 4-axis objective
EnsembleDesign | EFE specialist | 1 dominated | mostly ties | honest wall
LinearDesign | exact MFE/CAI | 0 dominated | not beaten | exact wall
Verifier | recorded dominations | 90/90 pass | confirmed | translation and scores rechecked
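
The domination counts in this table rest on a Pareto comparison: a candidate dominates a baseline when it is no worse on every axis and strictly better on at least one. A minimal sketch of that test follows, assuming a four-axis objective in which lower MFE, higher CAI, lower uridine fraction, and fewer forbidden motifs count as improvements. The `Scores` class, the `dominates` function, and those directions are illustrative assumptions, not the report's code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Scores:
    """Proxy scores for one coding sequence (hypothetical container)."""
    mfe: float      # kcal/mol; lower assumed better
    cai: float      # higher assumed better
    uridine: float  # fraction of U; lower assumed better
    motifs: int     # forbidden-motif count; lower assumed better


def _as_minimization(s: Scores) -> Tuple[float, ...]:
    """Flip CAI so that every axis is 'smaller is better'."""
    return (s.mfe, -s.cai, s.uridine, float(s.motifs))


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is no worse than `b` everywhere and better somewhere."""
    xa, xb = _as_minimization(a), _as_minimization(b)
    no_worse = all(x <= y for x, y in zip(xa, xb))
    strictly_better = any(x < y for x, y in zip(xa, xb))
    return no_worse and strictly_better


candidate = Scores(mfe=-30.0, cai=0.90, uridine=0.20, motifs=0)
baseline = Scores(mfe=-25.0, cai=0.80, uridine=0.25, motifs=1)
tradeoff = Scores(mfe=-35.0, cai=0.70, uridine=0.20, motifs=0)

print(dominates(candidate, baseline))  # candidate wins on all four axes
print(dominates(candidate, tradeoff))  # trade-off: neither dominates
```

Under this definition a tie on every axis is not a domination, and neither is a trade-off in which one sequence wins on MFE while the other wins on CAI.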

Method

How it was found

The campaign combined deterministic seeds from exact or simple objectives with a ViennaRNA-in-the-loop local polish, accepting only candidates that re-scored as true Pareto improvements.

  • Reproduced the scoring conventions used by the compared tools.
  • Generated seeds from maximum-CAI, minimum-uridine, and LinearDesign-guided sequences.
  • Polished candidates while rechecking translation and all objective values.
  • Separated proxy-objective statements from biological-expression claims.
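
The seed-and-polish steps above can be sketched as a greedy local search over synonymous codon swaps that accepts a change only when the re-scored sequence Pareto-dominates the current one. This is an illustration under stated assumptions, not the campaign's implementation: `min_uridine_seed`, `polish`, and the injected `score` callable are hypothetical names, `score` must return a tuple in which every entry is "smaller is better", and the maximum-CAI and LinearDesign-guided seeds are omitted. In a real run `score` could include an MFE term from a folding engine such as ViennaRNA.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)
}
SYNONYMS: Dict[str, List[str]] = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)

ScoreFn = Callable[[str], Tuple[float, ...]]


def min_uridine_seed(protein: str) -> str:
    """Pick, per residue, the synonymous codon with the fewest U's."""
    return "".join(
        min(SYNONYMS[aa], key=lambda c: c.count("U")) for aa in protein
    )


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Pareto test on minimization tuples."""
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b)
    )


def polish(cds: str, score: ScoreFn, max_passes: int = 10) -> Tuple[str, Tuple[float, ...]]:
    """Greedy synonymous-swap search; accepts only Pareto improvements.

    Swaps are synonymous, so the encoded protein is unchanged by
    construction.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    best = score("".join(codons))
    for _ in range(max_passes):
        improved = False
        for i, codon in enumerate(codons):
            for alt in SYNONYMS[CODON_TABLE[codon]]:
                if alt == codon:
                    continue
                trial = codons[:i] + [alt] + codons[i + 1:]
                trial_score = score("".join(trial))
                if dominates(trial_score, best):
                    codons, best, improved = trial, trial_score, True
                    break  # move on to the next position
        if not improved:
            break
    return "".join(codons), best
```

With a toy single-axis score such as `lambda s: (s.count("U"),)`, `polish("AUGUUUUUA", ...)` walks the sequence toward fewer uridines without changing the encoded protein. Requiring strict domination means sideways moves are rejected, so the search can stall on ties; that is a property of this sketch, not a statement about the campaign.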

Verification

How it was checked

verify.py checks translation to the target protein, recomputes all deterministic scores, and rechecks each Pareto-domination claim from raw FASTA sequences.
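
The kind of recheck described here can be sketched as follows. This is not the contents of verify.py: `parse_fasta`, `translate`, `check_claim`, and the injected `score` callable (returning a tuple in which smaller is better on every axis) are hypothetical names, and the handling of a single optional terminal stop codon is an assumption.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)
}


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record id: RNA sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper().replace("T", "U")
    return records


def translate(cds: str) -> str:
    """Translate an RNA coding sequence; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


def encodes(cds: str, protein: str) -> bool:
    """True if `cds` translates to `protein`, ignoring one terminal stop."""
    if len(cds) % 3:
        return False
    aa = translate(cds)
    if aa.endswith("*"):
        aa = aa[:-1]
    return aa == protein


def check_claim(
    protein: str,
    baseline_cds: str,
    candidate_cds: str,
    score: Callable[[str], Tuple[float, ...]],
) -> List[str]:
    """Return failure messages; an empty list means the claim passes."""
    failures = [
        f"{label} does not translate to target"
        for label, cds in (("baseline", baseline_cds), ("candidate", candidate_cds))
        if not encodes(cds, protein)
    ]
    if not failures:
        a, b = score(candidate_cds), score(baseline_cds)
        no_worse = all(x <= y for x, y in zip(a, b))
        better = any(x < y for x, y in zip(a, b))
        if not (no_worse and better):
            failures.append("candidate does not dominate baseline")
    return failures
```

Recomputing scores from the raw sequences, rather than trusting stored values, means a claim fails if either the translation or any recomputed score disagrees with what was recorded.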

Scope

What is not being claimed

This is a computational proxy result only. It is not an in-cell expression claim, not a safety claim, and not a claim that lower MFE is always biologically better.

References

Baseline sources

Citation

How to cite

Numaro Autoresearch Team. "Learned mRNA coding-sequence designers are Pareto-suboptimal on their own objective." Numaro Research Report NUMARO-2026-014, 2026.

@techreport{numaro2026MrnaCodonStructure,
  title = {Learned mRNA coding-sequence designers are Pareto-suboptimal on their own objective},
  author = {Numaro Autoresearch Team},
  institution = {Numaro},
  number = {NUMARO-2026-014},
  year = {2026},
  url = {https://numaro.tech/research/mrna-codon-structure-design-2026/}
}