paper · methodology · reproducibility

The loop is closed.
The gap is characterized.

dydact's operator detects confabulation, prescribes a fix, measures whether the fix worked, and reports honestly when the result is partial. Below: the paper's headline findings, reproducible from any academic API key.

0.03
decade error on held-out Nb (v0.3)
43%
MAE reduction v0.1 → v0.3
3.0×
pressure-axis span learned (v0.3, Pb-vs-CaH6)
132
calibrated corpus entries (v0.3)
finding 1

Structural generalization, confirmed.

Held Nb out of training entirely. The operator predicted Nb@ambient at 9.82K (measured 9.25K, 0.03-decade error) with physically-correct monotonic pressure suppression across a 1.3-decade Tc range. Learned from Pb-under-pressure and MgB2-under-pressure curves; generalized to the new compound.

Paper-2 scope concern closed: hold-out predictions do not collapse to hydride-regime values even though H3S sits in the adjacent geometry. The BCS-cluster anchors the prediction gradient, not the hydride basin.

  0.0001 GPa  →  9.82K   // near-measured Nb@ambient
  1.0    GPa  →  5.65K
  10.0   GPa  →  1.44K
  50.0   GPa  →  0.66K
  100.0  GPa  →  0.53K
  200.0  GPa  →  0.45K

Reproduce with cookbook recipe 03 →

finding 2

Qualitative generalization on FeSe, scale compressed.

All FeSe training entries excluded (including 6 pressure-augmented points). Operator predicted pressure-induced enhancement → peak → decay with correct peak magnitude (35K predicted vs 37K measured) but wrong peak position (50 GPa predicted vs 6 GPa measured).

What this shows: the operator learned iron-pnictide pressure behavior from BaFe2As2 alone and transferred it to the held-out FeSe. Qualitative transfer works; quantitative fit is off. An honest result — we report both dimensions.

finding 3

Mode-I → mode-II confabulation transition.

v0.1 sat in mode-I (pressure signal indistinguishable from crystal-axis noise). v0.3 shifted to mode-II (strong pressure response, matched by crystal-axis sensitivity). Reed's confabulation-spectrum framework catches both. The 3.0× ratio threshold stays the arbitration; the span-diagnostic distinguishes the regime.

Mode-I

Weak pressure signal, indistinguishable from noise. Prediction is directional at best. API confidence: mode-I.

Mode-II

Real pressure response but matched by crystal-axis sensitivity. Prediction is worth cross-checking experimentally. API confidence: mode-II.

Clean

Pressure signal dominates noise. Prediction is safe to act on. API confidence: clean.

Stale

Calibration invalidated — pipeline fingerprint changed. API refuses to rate until re-run. stale.

why the paper ends here and the api begins

Clean outcome-B not reached. Prescribed next steps in the paper.

The v0.3 closed loop is partial by intent — the paper publishes the diagnosis, the fix, and the honest characterization of what's still open. Two paths to clean outcome-B:

More corpus

5–10× augmentation with hundreds of compounds × multiple pressures each. Arxiv + PubMed + USPTO ingestion in progress.

Architectural modification

Gate pressure sensitivity separately from crystal perturbation. Active research.

Invariance constraint

Pressure-invariance training constraint. Candidate for v0.4.

The API lets you run the current v0.3 operator directly — read the confidence verdict and weight your experimental follow-up accordingly. When v0.4 lands and clears the threshold, your existing integration upgrades automatically (or pin to v0.3 explicitly with ?version=v0.3 if reproducibility matters).

citation

How to cite.

Working citation (final BibTeX updates when the paper clears review):

@misc{dydact2026,
  title         = {Closed-loop Structural Intelligence: Diagnosis, Prescription,
                   and Honest Confabulation Reporting for Material Property Prediction},
  author        = {Ukpeh, Francis and collaborators},
  year          = {2026},
  institution   = {dydact},
  url           = {https://github.com/dydact/dydact}
}

If you use the API for published work, email access@dydact.io with the preprint/citation — we track paper count as the primary academic-tier metric.