From Real Arithmetic to Reality

Verified Certified Robustness for Neural Networks - Part V

This post is the fifth in a series on the topic of Verified Certified Robustness for Neural Networks. These posts accompany and explain our recent papers, A Formally Verified Robustness Certifier for Neural Networks (which appeared at CAV 2025) and the more recent Lipschitz-Based Robustness Certification Under Floating-Point Execution, as well as the broader research agenda that those papers initiate.

The series so far comprises the following posts:

In the previous post, I showed how—perhaps surprisingly—it is possible to break a formally verified robustness certifier. The key idea was to exploit the gap between how neural networks behave according to the formal specification against which we verified the certifier, and how they actually behave in practice.

In this post, I want to look more closely at that gap.

In particular:

  • where it actually shows up, mathematically,
  • how we can model it precisely, and
  • what that means for certified robustness.

Where the Gap Actually Lives

In the third post, we defined a function NN(n,v) that captures how a neural network n behaves on input vector v.

That definition lives entirely in the world of real arithmetic. Its model for what a dense neural network layer with weight matrix W does when applied to an input vector v says that an exact matrix-vector multiplication is performed to produce an intermediate result, say z := Wv, such that for each index i of z we have: $$ z_i = \sum_{j=1}^n w_{ij}v_j $$
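To make this concrete, here is a small sketch of that idealised semantics in Python, using exact rational arithmetic (`fractions.Fraction`) so that no rounding ever occurs. The function name `dense_exact` is mine, chosen for exposition, not taken from our implementation:

```python
from fractions import Fraction

def dense_exact(W, v):
    """Exact matrix-vector product z = Wv in real (rational) arithmetic.

    W is a list of rows, v a vector. Converting each value to Fraction
    makes every product and sum exact, mirroring the idealised semantics
    z_i = sum_j w_ij * v_j with no rounding whatsoever.
    """
    return [sum(Fraction(w) * Fraction(x) for w, x in zip(row, v))
            for row in W]

# A tiny 2x2 layer applied to a 2-vector, computed exactly.
z = dense_exact([[0.5, 0.25], [1.0, -1.0]], [2, 4])
```

Because every sum and product here is exact, all of the usual algebraic laws hold for this model, which is precisely what makes it so convenient to reason about.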

This definition is incredibly convenient. For example, all of the standard laws of real arithmetic—associativity, commutativity, distributivity, and so on—apply to it.

But, as we saw in the previous post, it is not how neural networks actually execute.

In reality, neural networks are implemented using floating-point arithmetic. Each individual operation of the neural network, including every addition and multiplication performed to multiply a dense layer’s weight matrix W with its incoming input vector v, introduces rounding error.


When “Small” Errors Become Large Effects

A natural reaction at this point is to say: floating-point arithmetic only introduces small errors. Surely those errors are too small to matter for certified robustness?

Unfortunately, we’ve known for years that this intuition does not hold.

In the previous post, we saw concrete examples showing that rounding effects can produce counterexamples to verified certified robustness at semantically meaningful perturbation bounds. At lower precisions such as float16 and bfloat16, counterexamples reach ε ≥ 0.1. And for adversarially constructed models at standard float32 precision, counterexamples reach ε up to 0.47—well within the range used for MNIST robustness evaluation.

To understand why this can happen, it helps to look more closely at what our certifier is actually checking.

In the third post, we defined robustness in terms of the ArgMax of the network’s output y: an input is robust if nearby inputs do not change which output component is largest, i.e., do not change ArgMax(y).

A convenient way to reason about this is to look at the difference between the largest output component and each of the others. Let i* be the index of the largest output: i* := ArgMax(y), and let j be any other index. We define the margin between class i* and class j as the following difference:

$$ m_{j,i*}(x) := y_{i*} - y_j $$

If this margin is positive, then class i* is preferred over class j. Robustness requires that this remains true for all nearby inputs within ε of the input x.

Recall from the earlier posts in this series that we perform robustness certification using Lipschitz constants, which measure how sensitive the neural network’s output is to changes in its input. Specifically, this kind of global Lipschitz-based certification checks robustness by performing a simple check at the original input x: $$ m_{j,i*}(x) > L_{j,i*} \cdot \varepsilon. $$ Here, Lj,i* is a Lipschitz constant that bounds how much the margin mj,i*(x) can change for a given change in the input x. If this inequality holds for all other indices j, then the output is certified robust at ε.

Crucially, it is not the size of the margin mj,i*(x), in isolation, that matters. What matters is how far it is from the threshold Lj,i* · ε. Writing $$ \Delta := m_{j,i*}(x) - L_{j,i*} \cdot \varepsilon, $$ certification succeeds precisely when Δ > 0 for all j.
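As an illustration, the whole check can be sketched in a few lines of Python. The function name `certify_margin` and its argument conventions are hypothetical, chosen for exposition:

```python
def certify_margin(y, L, eps):
    """Global-Lipschitz robustness check at a single input.

    y: the network's output vector at input x.
    L: L[j] is the Lipschitz constant of the margin between the top
       class and class j.
    eps: the perturbation radius.
    """
    # i* is the index of the largest output component, i.e. ArgMax(y).
    i_star = max(range(len(y)), key=lambda i: y[i])
    # Certified iff Δ = m_{j,i*}(x) - L_j * eps > 0 for every j != i*.
    return all(y[i_star] - y[j] > L[j] * eps
               for j in range(len(y)) if j != i_star)
```

For example, an output of [2.0, 0.0] with Lipschitz constant 1.0 on the margin is certified at ε = 0.5 (margin 2.0 exceeds threshold 0.5), while an output of [1.0, 0.9] is not (margin 0.1 falls below it).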

There is no reason for Δ to be large. In many cases it can be arbitrarily small. When that happens, even very small floating-point perturbations to the computed margin can be enough to make Δ ≤ 0 for the network as it actually executes. If the Lipschitz constants are tight, this can mean that the input x is non-robust under floating-point execution (because there could exist a nearby point x’ within ε of x that the executed model classifies differently) even though x is certified robust at ε under the real-arithmetic analysis.
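A toy illustration of how a tiny rounding effect can flip such a comparison: the two expressions below are mathematically identical, yet in float64 they produce different results, easily enough to change the outcome of a threshold check like Δ > 0. (This is a deliberately extreme example for exposition, not the exact mechanism from our counterexamples.)

```python
# Two mathematically equal ways of computing the same quantity round
# differently in float64. Adding 1.0 to 1e16 loses it entirely: 1.0 is
# exactly half an ulp of 1e16, so round-to-nearest-even discards it.
# Cancelling the large terms first preserves it.
vals = [1e16, 1.0, -1e16]
m1 = (vals[0] + vals[1]) + vals[2]   # the 1.0 is absorbed and lost
m2 = (vals[0] + vals[2]) + vals[1]   # the large terms cancel first

assert m1 == 0.0   # rounding destroyed the contribution of 1.0
assert m2 == 1.0   # a different evaluation order preserves it
```

If the certification threshold happened to sit anywhere between these two values, one evaluation order would pass the check and the other would fail it.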

In this sense, floating-point rounding error need not be large to break certification.


Modelling Floating-Point Error

One way to think about floating-point arithmetic is as an approximation to real arithmetic.

While this is true, it’s important to recognise that floating-point defines a different execution semantics. The natural question, then, is:

Can we model that semantics in a way that is precise enough to reason about?

The answer is yes—at least to a useful approximation.

People have been studying how to do this for over 60 years, going back to the late 1950s. Around 1960 a method that would come to be known as backward error analysis was developed for reasoning about floating-point computations.

Let’s suppose we have a mathematical operation, like addition, multiplication, or even something more complicated. We use the symbol ⊙ to represent this operation. $$ z = x \odot y $$ Let’s use a ^ symbol to represent computations performed in floating-point arithmetic. An error analysis for ⊙ might reason about it like this: $$ \hat{z} = (x \odot y)(1 + \delta) + \eta $$

The way to read this is that the floating-point computation of ⊙ behaves like a perturbed version of the standard mathematical computation, in which two kinds of error might be introduced: relative error, δ, and absolute error, η. Importantly, the magnitudes of both kinds of error are bounded by fixed quantities, which is what allows us to reason about how much the floating-point computation might differ from the exact one.
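We can check this model empirically. The sketch below compares float64 multiplication against exact rational arithmetic, confirming that the relative error δ stays within the unit roundoff u = 2⁻⁵³ for float64 under round-to-nearest (the absolute term η only comes into play near underflow). The helper name `rel_err_mul` is mine:

```python
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff u for float64, round-to-nearest

def rel_err_mul(x, y):
    """Relative error of the float64 product against the exact product."""
    exact = Fraction(x) * Fraction(y)   # exact rational product
    computed = Fraction(x * y)          # float64 product, read back exactly
    return abs(computed - exact) / abs(exact)

# The standard rounding model predicts |δ| ≤ u for a single multiplication
# (assuming no underflow or overflow occurs).
for a, b in [(0.1, 0.3), (1.0 / 3.0, 7.0), (1e-7, 3.14159)]:
    assert rel_err_mul(a, b) <= U
```

The same kind of check works for addition, subtraction, and division, which is what makes the model so broadly applicable.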

This kind of error analysis is very well studied for important floating-point algorithms, not least including the linear algebra operations at the heart of neural networks. Clever people have even formalised these results in interactive theorem provers.

Using this rich body of theory, we can track how floating-point rounding errors propagate through a neural network.


From Operations to Networks

Using these floating-point error models, we can reason about the difference between:

  • the real-valued network NN, and
  • the floating-point network that is actually executed.

Roughly speaking, we can show that the floating-point network behaves like the real network, but with additional perturbations introduced at each layer.

These perturbations accumulate across layers in a way that can be bounded. In particular, the computation at each layer can magnify the rounding errors already introduced by preceding layers, as well as introducing fresh rounding error. Luckily, we can bound both the amount of magnification as well as the amount of fresh error that each layer introduces. This gives us a way to reason about how much rounding error is present in the neural network’s outputs around a particular input point x.
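That layer-by-layer recurrence can be sketched as a simple fold. This is a deliberate simplification of the actual bounds in our paper: `gain` stands for some per-layer magnification bound (for instance, a bound on the weight matrix's operator norm), and `fresh` for the fresh rounding error that layer introduces:

```python
def propagate_error_bound(layer_gains, fresh_errors, e_in=0.0):
    """Worst-case bound on accumulated output error after all layers.

    Each layer magnifies the incoming error by at most its gain and
    then adds its own fresh rounding error:
        e_out = gain * e_in + fresh
    """
    e = e_in
    for gain, fresh in zip(layer_gains, fresh_errors):
        e = gain * e + fresh
    return e
```

For example, two layers with gains 2.0 and 3.0 and fresh error 0.1 each turn zero input error into 3.0 · (2.0 · 0 + 0.1) + 0.1 = 0.4. The multiplicative structure is why the bound grows quickly with depth.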

This lets us deduce conclusions about the robustness of the floating-point network from the robustness properties of the real-arithmetic network.


What This Means for Certification

When we account for floating-point effects we need to modify the robustness certification condition that we saw above: $$ m_{j,i*}(x) > L_{j,i*} \cdot \varepsilon. $$

The theory we develop in our paper allows us to bound how much the margin mj,i*(x’) might be reduced by floating-point rounding, for all x’ within ε of x. Let’s call this bound E. This gives us a degraded certification condition to check that x is robust at ε under floating-point execution: $$ m_{j,i*}(x) > L_{j,i*} \cdot \varepsilon + E. $$
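In code, the degraded condition is a one-line change to the real-arithmetic check. As before the names here are hypothetical, and E would come from the paper's error analysis rather than being supplied by hand:

```python
def certify_fp_aware(y, L, eps, E):
    """Floating-point-aware Lipschitz check.

    The margin to every other class must clear the Lipschitz threshold
    plus the bound E on how much rounding can shrink the margin
    anywhere in the eps-ball around the input.
    """
    i_star = max(range(len(y)), key=lambda i: y[i])
    return all(y[i_star] - y[j] > L[j] * eps + E
               for j in range(len(y)) if j != i_star)
```

An input that passed the real-arithmetic check can thus fail the floating-point-aware one: with output [2.0, 0.0], Lipschitz constant 1.0 and ε = 0.5, certification succeeds for a small E such as 0.1 but fails once E reaches 2.0.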

To make this work, we also need to check that the floating-point computations inside the neural network never overflow. Our paper also develops certification checks for the absence of overflow. Broadly speaking, the analysis operates layer-by-layer forward through the neural network: at each layer we first check for the absence of overflow, which then allows us to bound the rounding error at that layer, which in turn allows us to check for the absence of overflow at the subsequent layer, and so on.
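One way to picture such an overflow check is a forward interval analysis: propagate elementwise input bounds through each layer and verify that no intermediate value can exceed the target format's largest finite value. The sketch below is illustrative and much cruder than the analysis in the paper; it assumes a ReLU network with weights given as lists of rows and checks against the float32 range:

```python
F32_MAX = 3.4028234663852886e38  # largest finite float32 value

def check_no_overflow(layers, lo, hi):
    """Layer-by-layer interval propagation with an overflow check.

    layers: list of (W, b) pairs; lo/hi: elementwise input bounds.
    Raises OverflowError if any layer's output interval could exceed
    the float32 range; otherwise returns the final output bounds.
    """
    for W, b in layers:
        new_lo, new_hi = [], []
        for row, bias in zip(W, b):
            # Pick the bound that minimises (resp. maximises) each term.
            l = bias + sum(w * (lo[j] if w >= 0 else hi[j])
                           for j, w in enumerate(row))
            h = bias + sum(w * (hi[j] if w >= 0 else lo[j])
                           for j, w in enumerate(row))
            if abs(l) > F32_MAX or abs(h) > F32_MAX:
                raise OverflowError("layer output may overflow float32")
            new_lo.append(max(0.0, l))  # ReLU
            new_hi.append(max(0.0, h))
        lo, hi = new_lo, new_hi
    return lo, hi
```

This mirrors the forward structure described above: each layer's bounds are established before moving on, so the overflow check at one layer licenses the error analysis at the next.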


Pre-Deployment Hybrid Certification

The main theory that we develop is a worst-case analysis: it works out how big E might be if the maximum rounding error is introduced everywhere.

In this sense, our theory is pessimistic. In fact, for low-precision floating-point formats like float16, our theory is so pessimistic that it is unable to certify robustness for standard models such as those trained on MNIST or CIFAR-10: according to the theory, the rounding errors E can be so large in these formats that none of the model inputs are certified robust (even though many of them are certainly robust in reality).

Even for standard formats like float32, however, our analysis is still more pessimistic than ideal. We therefore developed a trick to reduce the conservatism in E.

To understand this trick, note that the error term E := Ectr + Eball is composed of two parts:

  1. Ectr: the rounding error that the neural network introduces when evaluated at the input point x we are trying to certify robust, and
  2. Eball: the hypothetical rounding error that the neural network might introduce were it run instead on a nearby point x’ within ε of x.

We can drastically reduce the first component of this error by evaluating the neural network at x in high precision (e.g. at float64). The answer produced here is very close to the exact answer (were the network run with real arithmetic). In fact we can even account for the tiny discrepancy between the high-precision execution and exact arithmetic by using the theory we already developed to bound rounding errors. Even though the theory is pessimistic, the pessimism at high precision formats like float64 is incredibly small (because these formats only ever introduce incredibly tiny rounding errors). We call the resulting certification method a hybrid certification approach, because it uses both static and runtime information to certify robustness.
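To see why the high-precision execution makes a good reference, we can simulate low-precision execution in Python by rounding every intermediate result to float32 (via the stdlib `struct` module) and comparing against the same computation in float64, which for a computation of this size is close enough to exact to stand in for it. All names here are illustrative, not from our implementation:

```python
import struct

def to_f32(x):
    """Round a float64 value to the nearest float32."""
    return struct.unpack('f', struct.pack('f', x))[0]

def dot_f32(w, v):
    # Dot product with every intermediate rounded to float32,
    # simulating low-precision deployment of a dense layer's row.
    acc = 0.0
    for wi, vi in zip(w, v):
        acc = to_f32(acc + to_f32(to_f32(wi) * to_f32(vi)))
    return acc

def dot_f64(w, v):
    # The same computation in float64, serving as the high-precision
    # (near-exact) reference at the centre point x.
    acc = 0.0
    for wi, vi in zip(w, v):
        acc += wi * vi
    return acc

w = [0.1] * 1000
v = [0.3] * 1000
gap = abs(dot_f64(w, v) - dot_f32(w, v))  # observed rounding error at x
```

The observed `gap` is the kind of quantity that the hybrid approach pins down at the centre point x, replacing the much larger worst-case Ectr with something measured from an actual high-precision run.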

You might wonder why we would want to certify a model’s output by essentially comparing that output to a high-precision execution of that model. After all, if we can run the model at high precision, why not do that in the first place and just certify the high-precision output as robust?

The reason is that the high-precision model might not be available at inference time: it might be too expensive to execute, or incompatible with the deployment hardware.

However, if we have the ability to run the model at high-precision before it is deployed, then we can use this hybrid certification method to bound the model’s robustness over an input distribution ahead of time. This is a common use case for methods for verifying or certifying robustness that are too expensive to run at inference time.

Our paper shows that this hybrid certification approach can efficiently check a model’s robustness over its standard test set (e.g., the 10,000 points in the MNIST test set) while accounting for floating-point rounding. We compare against ERAN, to our knowledge the only publicly available pre-deployment robustness verifier that claims soundness under floating-point neural network execution. Despite certifying a stronger property than ERAN, our method certifies a comparable proportion of correctly-classified test inputs as robust on MNIST, substantially more on Fashion MNIST, and considerably fewer on CIFAR-10, where the larger input dimension drives greater error accumulation.

Unfortunately, while greatly reducing the pessimism at standard formats like float32, this hybrid certification approach is still too pessimistic to allow useful robustness certification at low-precision formats like float16. The reason is that, while the high-precision execution does a good job at drastically bounding the amount of rounding error at the input point x, it doesn’t tell us much about the rounding error we would see at nearby points x’ within ε of x. Similarly, for the adversarially biased model from the previous post, the bias-induced rounding errors are so large that our certification yields 0% of test inputs certified robust at float32. We leave further hybrid certification improvements for future work.


The Irony of Soundness

The counterexamples from the previous post show that floating-point rounding becomes a genuine threat to certification soundness in two scenarios: at low-precision formats like float16 and bfloat16, and under adversarial model construction at standard float32 precision. It is somewhat ironic that in both of these scenarios, while our theory remains sound—correctly rejecting all counterexamples—certification becomes vacuous: the error bounds are so large that no inputs are certified robust, even when using hybrid certification.


Looking Ahead

The floating-point-aware certifier we describe in our paper is implemented in Python but is not itself formally verified. This means it could, in principle, contain implementation-level bugs of the kind we found in Leino et al.’s certifier. The theory behind it, however, is formalised in the Rocq theorem prover.

A natural next step would be to close this gap: extend the formal specification of our verified real-arithmetic certifier to cover the floating-point-aware theory introduced in this post, and verify a certifier implementation against that stronger specification. That would give us a certifier whose guarantees apply directly to neural networks as they actually execute.

The key takeaway from this post is simple:

To reason soundly about neural networks in practice, we need to reason about the semantics under which they actually execute.

Understanding that gap is the first step towards closing it properly. In this post we have seen where the gap lives, how it can be modelled using well-studied floating-point error analysis, and what it means for certified robustness. The theory we developed provides the foundation that a formally verified floating-point-aware certifier would need to build on.

If you’re interested in this line of work, please get in touch.

In the meantime, thanks for reading.