Iterative Refinement Neural Operators are Learned Fixed-Point Solvers

A Principled Approach to Spectral Bias Mitigation

1 Dartmouth College 2 The Chinese University of Hong Kong, Shenzhen 3 Lawrence Berkeley National Lab
ICML 2026  ·  Spotlight

TL;DR

IRNO treats operator prediction not as a one-shot map, but as an iterative refinement process.

Iterative Refinement Neural Operator (IRNO) augments a pretrained neural operator with a shared-weight refinement module that iteratively corrects residual errors through a fixed-point-style update. This enables the model to recover fine-scale structures that are often smoothed out by single-pass neural operators.

Plug-and-play

Wraps around a pretrained neural operator with no retraining or architectural modification of the base model.

Fixed-point interpretation

Provides a contraction-style refinement view under local assumptions, yielding a principled bound on the approximation error.

Targets spectral bias

A progressive spectral loss increasingly emphasizes high-frequency recovery across refinement steps.

Up to 56% error reduction

Achieves up to 56.05% VRMSE improvement on turbulent flow benchmarks and remains stable beyond the training iteration count.

Interactive Demos

Step through refinement iterations to see how IRNO progressively recovers fine-scale details. The blue marker indicates the training cutoff \(K\).

ERA5 16× Super-Resolution (45×90 → 720×1440). Base: FNO. Training cutoff \(K=6\). Panels: IRNO prediction (starting from Base FNO) and Ground Truth. Steps shown: \(k = 0, 2, 4, 6, 8, 10, 12\).
ERA5 8× Super-Resolution (90×180 → 720×1440). Base: FNO. Training cutoff \(K=6\). Panels: IRNO prediction (starting from Base FNO) and Ground Truth. Steps shown: \(k = 0, 2, 4, 6, 8, 10, 12\).
TR-2D Turbulent Radiative Layer (384×128). Base: TFNO. Training cutoff \(K=4\). Panels: IRNO prediction (starting from Base TFNO) and Ground Truth. Steps shown: \(k = 0, 1, \dots, 8\).

Abstract

Neural operators serve as fast, data-driven surrogates for scientific modeling but typically rely on a monolithic, single-pass inference procedure that struggles to resolve high-frequency details, a limitation known as spectral bias. We introduce the Iterative Refinement Neural Operator (IRNO), which augments pre-trained operators with a learned refinement module iteratively applied via fixed-point iteration. IRNO decomposes the prediction into a coarse initialization followed by successive residual corrections, paralleling classical numerical solvers. Under mild assumptions, we establish contraction of the induced operator, ensuring convergence to a unique fixed point. To explicitly target high-frequency errors, we propose a progressive spectral loss that adaptively increases penalty on high-frequency components over refinement steps during training.

Across physical systems, IRNO consistently lowers error, with up to 56.05% improvement on turbulent flow. On Active Matter, spectral analysis shows that the normalized error ratios, measured relative to the base operator, fall to 27.72–36.10% in the low-, 5.07–6.68% in the mid-, and 1.48–2.04% in the high-frequency bands, and that they remain stable beyond the trained iteration count.

Pipeline

Standard neural operators produce a single-pass estimate (top). IRNO reformulates inference as a dynamic process, iteratively correcting the residual with a shared-weight refinement operator (bottom).
Standard neural operators
  • Predict the solution in a single forward pass
  • Monolithic inference can smooth out fine-scale structures
  • Often exhibits strong spectral bias, especially in high-frequency regimes
IRNO
  • A frozen base operator first provides a coarse initialization, \(h_0 = T_{\text{base}}(x)\)
  • A shared-weight refinement operator \(\Phi_\theta\) is then applied iteratively
  • At each step, it takes the original input \(x\) together with the current estimate \(h_k\)
  • It predicts a residual correction and updates the state as \(h_{k+1} = h_k + \alpha \cdot \Phi_{\theta}(x,\, h_k)\)
  • A progressive spectral loss increasingly emphasizes high-frequency recovery across refinement steps
  • The base operator requires no retraining or architectural modification
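The update rule in the list above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `refine`, `toy_base`, and `toy_refinement` are hypothetical names, fields are flattened to plain lists of floats, and the toy refinement is a hand-written stand-in for a trained \(\Phi_\theta\).

```python
from typing import Callable, List

Vector = List[float]


def refine(
    x: Vector,
    base_operator: Callable[[Vector], Vector],
    refinement: Callable[[Vector, Vector], Vector],
    alpha: float,
    num_steps: int,
) -> List[Vector]:
    """Return the trajectory [h_0, h_1, ..., h_num_steps] of the iteration."""
    h = base_operator(x)  # h_0 = T_base(x); the base operator stays frozen
    trajectory = [h]
    for _ in range(num_steps):
        # The same refinement callable (shared weights) is reused at every step
        # and sees both the original input x and the current estimate h_k.
        correction = refinement(x, h)
        h = [h_i + alpha * c_i for h_i, c_i in zip(h, correction)]
        trajectory.append(h)
    return trajectory


# Toy stand-ins (assumptions for illustration): the target is y = 2x,
# the base operator only recovers x, and the refinement returns y - h.
def toy_base(x: Vector) -> Vector:
    return [1.0 * v for v in x]


def toy_refinement(x: Vector, h: Vector) -> Vector:
    return [2.0 * xv - hv for xv, hv in zip(x, h)]


x = [1.0, -0.5, 3.0]
steps = refine(x, toy_base, toy_refinement, alpha=0.5, num_steps=8)
errors = [max(abs(2.0 * xv - hv) for xv, hv in zip(x, h)) for h in steps]
```

In this toy setting each step halves the maximum error, so `errors` shrinks geometrically. A learned refinement operator would not be this well behaved in general.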

Theory

Under local assumptions, IRNO's update rule is shown to be a contraction mapping. Even when the learned operator carries non-zero bias at the true solution, convergence is still guaranteed as the iteration settles to a unique fixed point whose error floor scales linearly with that bias.

Corollary 3.3  —  Convergence with Bias

If \(b = \Phi_\theta(x, y) \neq 0\), where \(y\) denotes the true solution, and \(\|b\|\) is sufficiently small, the iteration converges linearly to a unique fixed point \(h^*\). The limiting error \(e^* = h^* - y\) satisfies

\[ \|e^*\| \;\leq\; \frac{\alpha\,\|b\|}{1 - q} \;+\; \mathcal{O}(\|b\|^2) \]

where \(q < 1\) is the contraction factor and \(\alpha\) is the step size. Minimizing \(\|b\|\) via the fixed-point regularizer \(\mathcal{L}_{\text{fp}}\) directly lowers the error floor.
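The form of this bound can be motivated by a short linearization argument. The following is a heuristic sketch, not the paper's proof: the Jacobian \(J\), the error \(e_k = h_k - y\), and the identification of \(q\) with \(\|I + \alpha J\|\) are assumptions introduced here for illustration.

```latex
\begin{align*}
e_{k+1} &= e_k + \alpha\,\Phi_\theta(x,\, y + e_k)
  && \text{(update rule, with } e_k = h_k - y\text{)} \\
\Phi_\theta(x,\, y + e) &= b + J e + \mathcal{O}(\|e\|^2)
  && \text{(expansion at the true solution, } J = \partial_h \Phi_\theta(x, y)\text{)} \\
\|e_{k+1}\| &\leq q\,\|e_k\| + \alpha\,\|b\| + \mathcal{O}(\|e_k\|^2)
  && \text{(assuming } \|I + \alpha J\| \leq q < 1\text{)} \\
\|e_k\| &\lesssim q^k\,\|e_0\| + \alpha\,\|b\|\,\frac{1 - q^k}{1 - q}
  \;\xrightarrow[k \to \infty]{}\; \frac{\alpha\,\|b\|}{1 - q}
  && \text{(unrolling, to first order)}
\end{align*}
```

Since the limiting error is itself of order \(\|b\|\), the neglected second-order remainder contributes the \(\mathcal{O}(\|b\|^2)\) term in the corollary.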

The scatter plots below validate Corollary 3.3 empirically. Across both benchmarks, the minimum attainable error correlates strongly and linearly with the bias magnitude \(\|\Phi_\theta(x, y)\|\), with Pearson \(r > 0.93\) in both cases.

Bias vs. error floor, Active Matter (FNO): Pearson \(r = 0.933\), \(p \ll 10^{-10}\).
Bias vs. error floor, TR-2D (TFNO): Pearson \(r = 0.949\), \(p \ll 10^{-10}\).
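To make the linear relation between bias and error floor concrete, here is a small scalar toy model. Everything in it is synthetic and assumed for illustration: the refinement is taken to be \(\Phi(e) = -c\,e + b\), the per-sample rates `rates` and biases `biases` are random draws, and `limiting_error` and `pearson_r` are hypothetical helper names. It does not use or reproduce the paper's data.

```python
import math
import random


def limiting_error(bias: float, rate: float, alpha: float = 0.5,
                   num_steps: int = 200) -> float:
    """Iterate e_{k+1} = e_k + alpha * (-rate * e_k + bias) from e_0 = 1."""
    e = 1.0
    for _ in range(num_steps):
        e = e + alpha * (-rate * e + bias)
    return abs(e)


def pearson_r(xs, ys) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return cov / (std_x * std_y)


rng = random.Random(0)  # fixed seed so the synthetic draws are repeatable
biases = [rng.uniform(0.01, 0.2) for _ in range(50)]
rates = [rng.uniform(0.8, 1.2) for _ in range(50)]  # sample-to-sample variation
floors = [limiting_error(b, c) for b, c in zip(biases, rates)]
r = pearson_r(biases, floors)
```

In this toy model the contraction factor is \(q = 1 - \alpha c\), so the floor equals \(\alpha b / (1 - q) = b / c\): doubling the bias doubles the floor, and the spread in `rates` is what keeps `r` below 1.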

Results

  • 56.05% VRMSE reduction: Turbulent flow (TR-2D)
  • 80.46% VRMSE reduction: Active Matter
  • 34.09% RFNE reduction: ERA5 16× SR
  • 21.3% L² error reduction: Irregular mesh (CE-Gauss)

Iterative error reduction across physical systems. FNO evaluated at \(K=6\), TFNO and WDSR at \(K=4\). Metrics are VRMSE for TR-2D and Active Matter; ACC and RFNE for ERA5.

Dataset         Metric    Base Model   Initial   IRNO (Ours)   Improvement
TR-2D           VRMSE ↓   FNO          0.2394    0.1309        45.32%
TR-2D           VRMSE ↓   TFNO         0.2371    0.1042        56.05%
Active Matter   VRMSE ↓   FNO          0.1017    0.0501        50.73%
Active Matter   VRMSE ↓   TFNO         0.1981    0.0387        80.46%
ERA5            ACC ↑     FNO          0.7523    0.892         18.59%
ERA5            ACC ↑     WDSR         0.9091    0.9104        0.143%
ERA5            RFNE ↓    FNO          0.3247    0.214         34.09%
ERA5            RFNE ↓    WDSR         0.2119    0.1953        7.83%
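The Improvement column appears to be a relative change with respect to the Initial value, with the sign flipped for higher-is-better metrics. The page does not state the formula, so the sketch below is an assumed reading that happens to reproduce the entries checked here. `relative_improvement` is a hypothetical helper name.

```python
def relative_improvement(initial: float, refined: float,
                         higher_is_better: bool = False) -> float:
    """Percentage improvement of `refined` over `initial`."""
    change = refined - initial if higher_is_better else initial - refined
    return 100.0 * change / initial


# Values copied from the table above.
tr2d_tfno = relative_improvement(0.2371, 0.1042)        # VRMSE, lower is better
active_tfno = relative_improvement(0.1981, 0.0387)      # VRMSE, lower is better
era5_rfne_fno = relative_improvement(0.3247, 0.214)     # RFNE, lower is better
era5_acc_wdsr = relative_improvement(0.9091, 0.9104, higher_is_better=True)
```

Because the table values are rounded to four decimals, recomputing some other rows this way can differ from the reported percentage in the last digit.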

Spectral Dynamics

Across both panels (Active Matter, FNO), mid-to-high frequency error decreases monotonically with refinement, with the largest relative reductions near the Nyquist limit.

Dataset-level median normalized spectral error ratios \(\tilde{E}^{(k)}(\omega)\) across the test set, with shaded interquartile ranges (25–75%). IRNO exhibits consistent attenuation of mid-to-high frequency error with increasing refinement steps, with stable behavior near the Nyquist limit \(\omega = 128\).
Instance-level spectral MSE trajectories for a representative test sample across \(k \in [0, 12]\) refinement steps. Each curve shows the radial-spectral error of the base operator and successive IRNO refinements, illustrating monotonic reduction of spectral error and stable behavior beyond the training cutoff.
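A normalized spectral error ratio of this kind can be sketched as the per-frequency error energy of a refined prediction divided by that of the base prediction. The version below is a simplified 1-D illustration under stated assumptions: the exact definition of \(\tilde{E}^{(k)}(\omega)\) and the radial binning used for 2-D fields are not given on this page, the signals are synthetic, and `error_spectrum` and `normalized_error_ratio` are hypothetical names.

```python
import cmath
import math


def error_spectrum(prediction, target):
    """Squared DFT magnitude of the error at frequencies 0 .. n/2."""
    n = len(target)
    err = [p - t for p, t in zip(prediction, target)]
    spectrum = []
    for w in range(n // 2 + 1):
        coeff = sum(e * cmath.exp(-2j * math.pi * w * i / n)
                    for i, e in enumerate(err))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def normalized_error_ratio(refined, base, target, eps=1e-12):
    """Per-frequency error energy of `refined` relative to `base`."""
    base_spec = error_spectrum(base, target)
    refined_spec = error_spectrum(refined, target)
    # eps guards against division by zero where the base error has no energy.
    return [r / (b + eps) for r, b in zip(refined_spec, base_spec)]


# Synthetic example: the base prediction misses the frequency-8 mode entirely,
# while the refined prediction recovers most of it.
n = 32
grid = [2.0 * math.pi * i / n for i in range(n)]
target = [math.sin(t) + 0.5 * math.sin(8 * t) for t in grid]
base = [0.9 * math.sin(t) for t in grid]
refined = [0.95 * math.sin(t) + 0.4 * math.sin(8 * t) for t in grid]
ratio = normalized_error_ratio(refined, base, target)
```

Here `ratio[1]` is about 0.25 and `ratio[8]` about 0.04, so the larger relative reduction sits at the higher frequency, which is the qualitative pattern the figures describe.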

Transferability

IRNO trained on one base operator transfers to refine predictions from a different base operator without retraining, confirming that the learned update dynamics are not specific to a single architecture.

Subscripts denote the operator IRNO was trained with. For example, \(\text{IRNO}_{\text{TFNO}}\) was trained with TFNO as the base and is evaluated here on FNO predictions.

Dataset         Metric    Base + \(\text{IRNO}_{\text{train}}\)      Initial   IRNO (Ours)   Improvement
TR-2D           VRMSE ↓   FNO + \(\text{IRNO}_{\text{TFNO}}\)        0.2396    0.0994        58.53%
TR-2D           VRMSE ↓   TFNO + \(\text{IRNO}_{\text{FNO}}\)        0.2366    0.1345        43.15%
Active Matter   VRMSE ↓   FNO + \(\text{IRNO}_{\text{TFNO}}\)        0.1004    0.0445        55.66%
Active Matter   VRMSE ↓   TFNO + \(\text{IRNO}_{\text{FNO}}\)        0.1955    0.1127        42.36%
ERA5            ACC ↑     FNO + \(\text{IRNO}_{\text{WDSR}}\)        0.7523    0.8022        6.22%
ERA5            ACC ↑     WDSR + \(\text{IRNO}_{\text{FNO}}\)        0.9091    0.9219        1.39%
ERA5            RFNE ↓    FNO + \(\text{IRNO}_{\text{WDSR}}\)        0.3247    0.2823        13.06%
ERA5            RFNE ↓    WDSR + \(\text{IRNO}_{\text{FNO}}\)        0.2119    0.1935        8.68%

Convergence Behavior

Error decreases monotonically across refinement steps and remains stable beyond the training cutoff \(K\) (dashed line), confirming the contraction dynamics predicted by theory.

Panels (metric vs. refinement step): TR-2D (VRMSE ↓), Active Matter (VRMSE ↓), ERA5 (RFNE ↓), ERA5 (ACC ↑).

Accuracy vs Compute

IRNO achieves a favorable accuracy-compute trade-off. Each additional refinement step moves along the Pareto frontier, allowing compute to be traded for accuracy at inference time.

ACC vs FLOPs (ERA5). IRNO at each refinement step \(k\) is shown as a separate point.
RFNE vs FLOPs (ERA5). Lower-right is better.
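Because the same refinement module is applied once per step, inference cost grows linearly in the number of steps on top of the base operator's cost. The sketch below shows how a step count could be chosen for a given budget. The function names are hypothetical and the FLOP figures in the example are made-up placeholders, not measurements from the paper.

```python
def total_flops(base_flops: float, refine_flops: float, num_steps: int) -> float:
    """Cost of one base pass plus `num_steps` passes of the shared module."""
    return base_flops + num_steps * refine_flops


def max_steps_within_budget(base_flops: float, refine_flops: float,
                            budget: float) -> int:
    """Largest number of refinement steps whose total cost fits the budget."""
    if budget < base_flops:
        raise ValueError("budget does not cover the base operator alone")
    return int((budget - base_flops) // refine_flops)


# Placeholder costs in arbitrary units (assumptions for illustration only).
cost_at_six = total_flops(base_flops=10.0, refine_flops=2.0, num_steps=6)
affordable = max_steps_within_budget(base_flops=10.0, refine_flops=2.0, budget=23.0)
```

With these placeholder costs, six steps cost 22 units, and a budget of 23 units allows six steps but not seven.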

BibTeX

@inproceedings{liu2026irno,
  title     = {Iterative Refinement Neural Operators are Learned Fixed-Point Solvers:
               A Principled Approach to Spectral Bias Mitigation},
  author    = {Liu, Xiaotian and Shang, Shuyuan and Wang, Xiaopeng and Ren, Pu and Yang, Yaoqing},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning},
  year      = {2026}
}