A system that protects itself.
And gets better at it.
Lyapunov-proven convergence. 200 rounds of adversarial evolution. False negative rate reduced by 60%.
Lyapunov-Proven Convergence
FNR down 60% over 200 self-evolution rounds. Convergence mathematically proven.
Most systems change over time. MAREF's evolution engine converges. Lyapunov stability analysis proves the system monotonically approaches a safer state — the error rate does not oscillate; it descends toward a provable minimum.
Provably converging.
Lyapunov stability guarantees the system gets safer over time, not just different. The destination is mathematically fixed.
200 rounds. 60% better.
Red-blue adversarial evolution. We attacked it 200 times. It learned every time. FNR dropped by 60%.
Trust that earns itself.
Five-factor Trust Engine v2. Reputation recalibrates with every interaction. Attempts to game the score are detected.
Evolution with a mathematical destination.
Lyapunov stability analysis proves the governance engine converges toward a safer state over time. The false negative rate does not oscillate — it descends monotonically toward a provable minimum. This is not "empirically better." It is mathematically guaranteed.
Lyapunov function: V(x) = xᵀPx, V̇(x) ≤ -α‖x‖² → exponential convergence
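For readers who want the intermediate step, here is a short sketch of why the decay bound implies exponential convergence. It assumes P is symmetric positive definite and that x is the deviation from the equilibrium; the symbols λ_min(P) and λ_max(P) (the smallest and largest eigenvalues of P) are introduced here and do not appear in the original statement.

```latex
\begin{gather*}
\lambda_{\min}(P)\,\|x\|^{2} \;\le\; V(x) = x^{\top} P x \;\le\; \lambda_{\max}(P)\,\|x\|^{2} \\[4pt]
\dot V(x) \;\le\; -\alpha\,\|x\|^{2} \;\le\; -\frac{\alpha}{\lambda_{\max}(P)}\,V(x)
\quad\Longrightarrow\quad
V\bigl(x(t)\bigr) \;\le\; V\bigl(x(0)\bigr)\,e^{-\alpha t/\lambda_{\max}(P)} \\[4pt]
\|x(t)\| \;\le\; \sqrt{\frac{\lambda_{\max}(P)}{\lambda_{\min}(P)}}\;\|x(0)\|\,e^{-\alpha t/(2\lambda_{\max}(P))}
\end{gather*}
```

The first line bounds V by the squared norm, the second turns the decay condition into a differential inequality in V alone, and the third converts the bound on V back into a bound on the state.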
We attacked it 200 times. It thanked us.
Red-blue adversarial evolution pits attack agents against defense agents in 5-stage rounds. Attack intensity escalated from 2.47 to 18.98, about 7.7x the starting level. The false negative rate dropped 60% from baseline. Every attack made the system stronger.
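As a rough illustration of the arithmetic behind these figures, the sketch below computes a false negative rate and its relative change. It assumes the 60% figure is a relative reduction against the baseline, which the text does not state explicitly. The helper names and the attack counts are hypothetical and are not part of the maref package.

```python
def false_negative_rate(false_negatives: int, true_positives: int) -> float:
    """FNR = FN / (FN + TP): the share of real attacks that were missed."""
    total_attacks = false_negatives + true_positives
    if total_attacks == 0:
        raise ValueError("FNR is undefined when there are no actual attacks")
    return false_negatives / total_attacks


def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before * 100.0


# Hypothetical counts, chosen only to illustrate a 60% relative reduction.
baseline_fnr = false_negative_rate(false_negatives=25, true_positives=75)
final_fnr = false_negative_rate(false_negatives=10, true_positives=90)
fnr_delta = relative_change_pct(baseline_fnr, final_fnr)

# Attack intensity figures quoted in the text.
intensity_ratio = 18.98 / 2.47

print(f"FNR delta: {fnr_delta:.1f}%")
print(f"Intensity ratio: {intensity_ratio:.1f}x")
```

Under this reading, a negative delta means fewer missed attacks, so a baseline FNR of 0.25 falling to 0.10 is reported as -60%.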
from maref import RedBlueEvolution

evolution = RedBlueEvolution(
    rounds=200,
    attack_intensity=(
        "escalate",   # 2.47 → 18.98
        "adapt",      # learns from prior round
        "diversify",  # tries new attack surfaces
    ),
)
result = evolution.run()
print(f"FNR delta: {result.fnr_delta}%")  # -60%

Trust is earned. And recalibrated. Per interaction.
Trust Engine v2 weights five factors per interaction: recency, consistency, alignment, outcome quality, and adversarial resistance. Goodhart anti-gaming detection prevents agents from optimizing for the trust metric instead of actual trustworthiness.
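One way to picture per-interaction recalibration is a weighted score blended into a running trust value. The sketch below is an assumption, not the Trust Engine v2 implementation: the five factor names come from the text, while the weights, the 0 to 1 scale, the update rate, and the function names are hypothetical. It does not attempt the anti-gaming detection.

```python
from typing import Mapping

# Factor names are from the text; the weights are hypothetical and sum to 1.
WEIGHTS = {
    "recency": 0.15,
    "consistency": 0.20,
    "alignment": 0.25,
    "outcome_quality": 0.25,
    "adversarial_resistance": 0.15,
}


def interaction_score(factors: Mapping[str, float]) -> float:
    """Weighted average of the five factors, each expected in [0, 1]."""
    if set(factors) != set(WEIGHTS):
        raise ValueError("expected exactly the five trust factors")
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


def recalibrate(trust: float, factors: Mapping[str, float], rate: float = 0.2) -> float:
    """Move the running trust value toward the latest interaction score."""
    return (1.0 - rate) * trust + rate * interaction_score(factors)


trust = 0.5  # hypothetical neutral starting point
good_interaction = {name: 0.9 for name in WEIGHTS}
for _ in range(3):
    trust = recalibrate(trust, good_interaction)

print(f"trust after 3 good interactions: {trust:.4f}")
```

With an exponential moving average like this, a single interaction can only shift trust by a bounded amount, so reputation has to be built up over repeated good behavior.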