This post is the third in a series on the topic of Verified Certified Robustness for Neural Networks. These posts accompany and explain our recent paper A Formally Verified Robustness Certifier for Neural Networks, which will appear at CAV 2025, and the broader research agenda that the paper initiates.
The series so far comprises the following posts:
- Part I: Verified or Certified Robustness? Why don’t we have both?
- Part II: When “Verified Robust Accuracy” Isn’t, Actually, Verified
- Part III: Formally Verified Certified Robustness (this post)
In the first post I explained the scalability challenges faced by much prior work on verified robustness for neural networks, and how approaches instead based on certified robustness might provide a way to circumvent those challenges.
In the second post I explained what can go wrong with certified robustness implementations, in the form of exploitable bugs we found in a state-of-the-art certifier by Leino et al.
In this post, I explain how formal verification can help overcome those problems, leading to a provably sound method for efficiently verifying the robustness of neural network outputs.
Before going any further, it is worth making something very clear: even though our formally verified robustness certifier is heavily inspired by Leino et al., we were unaware of the flaws in their certifier implementation until after we had completed the (initial) verification of our certifier.
In other words: formal verification helped us avoid bugs we didn’t even know we needed to avoid.
But how?
In this post, I’ll explain that. Towards the end, I’ll spell out how verification ensures that the specific faults we identified in Leino et al.’s certifier are absent in ours.
What is Formal Verification?
That question has had many books written about it. For the purposes of this blog post, however, formal verification is the process of demonstrating the correctness of a program through the application of logical deduction.
Doing so requires first stating in logic what it means for your program to be correct. That logical statement is called the program’s formal specification. One then uses logic to prove that the program’s behaviours are all consistent with the formal specification.
Writing the formal specification for a program is not necessarily straightforward. Indeed, it remains perhaps the major barrier to the wide application of formal methods. However, happily, robustness certifiers happen to be one of the kinds of programs for which we can write concise and clean formal specifications.
So, then, a formal specification is a logical statement of what it means for your program to be correct.
I’ve used the words “logic”, “logical deduction” and “logical statement” quite loosely. There are many kinds of logic and depending on the kind of program you are dealing with some are better suited than others. For imperative programs, Hoare logic has been the logic of choice for the past 60 years.
In Hoare logic one specifies how a program should behave via logical statements, namely preconditions and postconditions. A program’s precondition states what the program assumes to be true before it executes, while its postcondition states what should be true after the program has finished executing, under the assumption that the precondition was true when the program began.
Here is a tiny example, taken from our verified robustness certifier:
```dafny
method SqrtUpperBound(x: real) returns (r: real)
  requires x >= 0.0
  ensures r >= Sqrt(x)
```
Here we are specifying a method `SqrtUpperBound` that takes a real number `x` as its input, and returns a real number `r` as its result. The `requires` keyword is used to indicate the method’s precondition; the `ensures` keyword indicates its postcondition. So this specification says that so long as `SqrtUpperBound` is given a non-negative input `x`, it will return an answer that is at least as large as `Sqrt(x)`.
Of course, this has meaning only once we know what `Sqrt` means. In fact, `Sqrt` is not an executable function. Instead it is given meaning by the following logical axioms:
$$ \forall x : \mathtt{real}.\ x \ge 0 \Longrightarrow \mathtt{Sqrt(}x\mathtt{)} \ge 0 $$
$$ \forall x : \mathtt{real}.\ x \ge 0 \Longrightarrow \mathtt{Sqrt(}x\mathtt{)} \cdot \mathtt{Sqrt(}x\mathtt{)} = x $$
(The first axiom is needed to prevent things like -3 being a suitable answer to `Sqrt(9)`.)
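In Dafny, axioms like these can be captured by declaring `Sqrt` as a body-less ghost function whose postconditions play the role of the axioms. The following is a minimal sketch of that encoding (our certifier’s actual definition may differ in its details):

```dafny
// A body-less ghost function: Dafny treats its postconditions as
// axioms about Sqrt, without requiring an executable definition.
ghost function Sqrt(x: real): (r: real)
  requires x >= 0.0
  ensures r >= 0.0 && r * r == x
```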
Putting all of this together, we can deduce that `SqrtUpperBound` is specified to return a result that is indeed an upper bound for the square root of its input.
This post doesn’t have room to dig into how `SqrtUpperBound` might be implemented, nor how one might prove that its implementation does indeed satisfy the specification above.
In the case of our verified robustness certifier, we implement it in the Dafny programming language, which is designed to support formal verification. Dafny also includes many facilities for proving that programs meet their specifications, which are beyond the scope of this post.
(If you’re familiar with Hoare logic, however, and you’re curious about how we implemented and verified `SqrtUpperBound`, check out Section 7 of our paper.)
Specifying Certified Robustness
To specify what it means for our verified robustness certifier to be correct, we need to do two things:
1. write down in logic what it means for a neural network to be robust for a particular output that it produces, and then
2. specify that whenever our robustness certifier says that a particular output is robust, then Condition 1 actually holds.
Formalising Condition 1 is akin to giving meaning to `Sqrt` above via the two axioms; while doing step 2 is akin to writing the pre- and postcondition specification for `SqrtUpperBound`.
Fortunately, the kinds of neural networks we consider in our work can be described in mathematical logic relatively straightforwardly. Roughly, each neural network layer can be represented as a matrix, and application of the neural network as repeated matrix multiplication.
We therefore define vectors, matrices, and matrix multiplication in logic. Vectors are sequences of real numbers, matrices are sequences of equal-length vectors, and matrix multiplication we define via recursive pure functions. A neural network we represent in logic as a sequence of matrices, one per layer. From this we define a pure function `NN(n,v)` that takes a neural network `n` (represented as a sequence of matrices) and an input vector `v` as its arguments and returns the output that would be produced by running `n` on `v`. We also define the Euclidean distance between two vectors `u` and `v` via a pure function `Distance(u,v)`, which incidentally makes use of the `Sqrt` definition we saw above. We also define straightforward functions like `ArgMax(v)`, which returns the index of the maximum component of its vector argument `v`.
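To give a flavour of what these definitions look like, here is a simplified sketch in Dafny, reusing the `Sqrt` sketch above. The helper functions `Minus` and `SumSq` here are illustrative; the definitions in our certifier differ in their details:

```dafny
type Vector = seq<real>

// Componentwise difference of two equal-length vectors.
ghost function Minus(u: Vector, v: Vector): Vector
  requires |u| == |v|
  ensures |Minus(u, v)| == |u|
{
  if |u| == 0 then [] else [u[0] - v[0]] + Minus(u[1..], v[1..])
}

// Sum of the squares of a vector's components.
ghost function SumSq(v: Vector): real
  ensures SumSq(v) >= 0.0
{
  if |v| == 0 then 0.0 else v[0] * v[0] + SumSq(v[1..])
}

// Euclidean distance between two equal-length vectors, via Sqrt.
ghost function Distance(u: Vector, v: Vector): real
  requires |u| == |v|
{
  Sqrt(SumSq(Minus(u, v)))
}
```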
With all of this machinery we then define in Dafny a predicate `Robust(v,v',e,n)` that states what it means for a neural network `n` to be robust to perturbations bounded by distance `e` at output `v'` produced by input vector `v`.
```dafny
ghost predicate Robust(v: Vector, v': Vector, e: real, n: NeuralNetwork)
  requires IsInput(v, n)
  requires NN(n, v) == v'
{
  forall u: Vector | |v| == |u| && Distance(v, u) <= e ::
    ArgMax(v') == ArgMax(NN(n, u))
}
```
`IsInput(v,n)` captures the assumption that vector `v` is the right input shape for neural network `n`, while `|v| == |u|` simply enforces that we only care about (all) inputs `u` that have the same size as input `v`.

Notice how this definition corresponds almost exactly to the textbook definition of robustness (which we defined in the first post in this series):
$$\forall\textbf{u}:||\textbf{v}-\textbf{u}||\le\varepsilon\implies\mathit{ArgMax}(N(\textbf{v}))=\mathit{ArgMax}(N(\textbf{u}))$$
The only difference is that the formal specification has to explicitly limit the vectors `u` it considers to those that have the same size as the input `v`, which is left implicit in the textbook definition. The textbook definition also treats the neural network as a function N, rather than having an explicit representation of it `n` that then needs to be converted to an executable function via `NN`.
The core of our robustness certifier is the `CertifyMargin(v',e,L)` method, which takes as its arguments an output vector `v'`, perturbation bound `e`, and pre-computed Lipschitz constants `L`. Its specification says that whenever it returns `true`, then `Robust(v,v',e,n)` must hold for all inputs `v` and neural networks `n` for which `L` are Lipschitz bounds and for which `v' == NN(n, v)`. We specify this via its postcondition.
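A sketch of the shape of that postcondition is below. The predicate `AreLipschitzBounds` is a hypothetical stand-in for the corresponding definition in our certifier, and the method body is elided; see the paper for the real specification:

```dafny
// Sketch only: returning true guarantees robustness for every
// compatible input v and neural network n that produce output v'.
method CertifyMargin(v': Vector, e: real, L: seq<real>) returns (b: bool)
  ensures b ==> forall v: Vector, n: NeuralNetwork |
    IsInput(v, n) && NN(n, v) == v' && AreLipschitzBounds(n, L) ::
    Robust(v, v', e, n)
```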
Verifying the Dafny Implementation
The implementation of our certifier we verify in Dafny. Much of the verification is done by defining intermediate properties, such as what it means for purported Lipschitz bounds `L` to be valid Lipschitz bounds for a neural network `n`, and then proving intermediate conditions, such as that the part of our certifier that computes Lipschitz bounds does indeed compute valid Lipschitz bounds.
Doing so requires writing and proving many intermediate lemmas, whose details are beyond the scope of this post. Roughly speaking, however, Dafny proofs are akin to programs, and proving can be seen as a very precise (if dry) form of programming.
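To give a taste of that style, here is a toy lemma (illustrative only; it does not appear in our certifier) about the `SumSq` function sketched earlier. The proof is a recursive "program" whose recursive call plays the role of the induction hypothesis:

```dafny
// SumSq distributes over sequence concatenation; proof by induction on u.
lemma SumSqAppend(u: Vector, v: Vector)
  ensures SumSq(u + v) == SumSq(u) + SumSq(v)
{
  if |u| == 0 {
    assert u + v == v;                  // base case: u is empty
  } else {
    assert (u + v)[1..] == u[1..] + v;  // align the tails
    SumSqAppend(u[1..], v);             // induction hypothesis
  }
}
```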
One key advantage of verifying our certifier’s implementation is that once we finished the initial proof, we were able to experiment with performance optimisations without fear that we would inadvertently introduce bugs into our certifier: so long as we fixed the proofs after implementing each optimisation, we could be assured that the certifier was still sound. This allowed us to implement a custom normalisation procedure within the core of our Lipschitz bounds computation procedure, which resulted in transforming the running time of the part of our certifier that computes Lipschitz bounds from exponential to linear.
This is an example of the observation—well known amongst formal methods practitioners but all too often overlooked by outsiders—that the application of formal verification can often help unlock performance optimisations in verified code.
How does this help to avoid the kinds of bugs we found in Leino et al.’s implementation?
We didn’t carry out our verification to avoid the bugs we found in Leino et al.’s certifier. Indeed, the point of verification is not to avoid specific bugs but instead to avoid whole classes of bugs. That said, because our certifier is verified it does indeed avoid the problems we discovered in Leino et al.’s certifier.
If you haven’t read my previous post that discusses those bugs in detail, do check it out. However, below I briefly summarise each problem and then explain how our verification avoids it.
Issue 1: Power Iteration Does Not Guarantee Soundness
The first issue we found in Leino et al.’s implementation was their use of the power method for computing Lipschitz bounds: namely, that it doesn’t actually guarantee to produce a safe upper bound. Had we chosen to implement the power method, we would not have been able to prove the specification for the Lipschitz bounds computation procedure in our certifier, whose postcondition requires that it does indeed compute safe upper bounds.
Issue 2: A Subtle Normalisation Bug
The second issue was a subtle bug in the normalisation procedure of Leino et al.’s power iteration implementation. We avoid this bug trivially by avoiding power iteration. But, perhaps less glibly, it is worth noting that we also implement multiple normalisation procedures in our certifier. Because each is formally specified to normalise correctly (in the sense that normalisation does not destroy the Lipschitz upper bounds that are computed) and is verified to adhere to its specification, we gain assurance that our normalisation implementations do not introduce unsoundness into our certifier.
As an example, consider the aptly named (at least for non-American English speakers) `Normalise` method, whose postcondition explicitly states how it affects the Lipschitz bounds. The same applies to the `Truncate` method, whose postcondition does likewise.
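The sketch below conveys the general idea, using the `SumSq` and `Sqrt` sketches from earlier and an illustrative name (our actual `Normalise` specification differs): the postcondition pins down exactly how outputs relate to inputs, so that downstream proofs about Lipschitz bounds can account for normalisation.

```dafny
// Illustrative sketch of a normalisation contract: each output
// component, rescaled by the vector's Euclidean norm, recovers the
// corresponding input component exactly.
method NormaliseSketch(v: Vector) returns (r: Vector)
  requires SumSq(v) > 0.0
  ensures |r| == |v|
  ensures forall i | 0 <= i < |v| :: r[i] * Sqrt(SumSq(v)) == v[i]
```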
Issue 3: Floating-Point Imprecision
The third issue we documented arises from floating point imprecision: specifically, due to floating point rounding, Leino et al.’s certifier can mistakenly compute Lipschitz constants of 0 for neural networks with very tiny weights.
We avoid this problem by implementing our certifier using Dafny’s real numbers. In practice, these are compiled to arbitrary precision rational numbers.
This is a sound design choice for a certifier, like ours, that is designed to be used only at inference time (when soundness is most crucial). For certifiers meant to be used only at training time, a floating point implementation is a better design choice, since it will be many orders of magnitude more efficient (and certification soundness is arguably not nearly as important during model training as it is during model evaluation and application).
Issue 4: A Subtle Certification Routine Bug
The final issue we discussed in Leino et al.’s certifier was a subtle bug that caused outputs like the 0 vector to always be certified as robust. Our verification avoids this issue trivially, because we specify that our certifier should say that an output is robust only when it is actually robust.
This can be seen in the main method of our certifier, which contains an assert statement that directly precedes the print statements that are used to print out certification results. This assertion is not checked at runtime but is instead proved once-and-for-all when our certifier is verified by Dafny. The assertion says that our certifier will print out that a certification was true only if the certified output really was robust.
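A hypothetical rendering of that pattern, reusing the `CertifyMargin` sketch from above (the output strings here are illustrative): the assertion follows from `CertifyMargin`’s postcondition and is discharged by Dafny at verification time, not at runtime.

```dafny
// Sketch of the pattern inside the certifier's main method: a result
// is printed only under a statically proved guarantee of robustness.
var certified := CertifyMargin(v', e, L);
if certified {
  assert forall v: Vector, n: NeuralNetwork |
    IsInput(v, n) && NN(n, v) == v' && AreLipschitzBounds(n, L) ::
    Robust(v, v', e, n);
  print "CERTIFIED\n";
} else {
  print "NOT CERTIFIED\n";
}
```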
So Can We Trust Our Certifier 100%?
The short answer is “no”. A longer answer is that our certifier is far more trustworthy than unverified implementations; however, no formal verification is perfect. In particular, when one writes a formal specification, one must always abstract away from messy realities of the real world. If an adversary is able to exploit one of the details you chose to abstract away from, then they may be able to invalidate your formal verification.
A prime example is the `NN` function I mentioned earlier, which formally specifies how a neural network is applied to an input vector. We specify this using real arithmetic. However, actual neural networks will in practice be implemented using floating point arithmetic. We have chosen therefore to abstract away from the messy realities of floating point arithmetic here, in favour of the platonic ideal of real arithmetic instead.
What does this mean in reality? It means that our certifier computes sound Lipschitz bounds, and thus sound certification results, for neural networks that are implemented using real arithmetic (i.e., for very few deployed neural networks). Whether those bounds and results also hold when those same neural networks are implemented using floating point arithmetic, and whether the difference matters and is exploitable, is an open question.
This issue aside, it is also worth noting that the usual caveats apply when talking about guarantees and formal methods.
Summary
Verification is a powerful tool for ensuring program correctness. It was especially useful to us in creating a sound robustness certifier. In particular, sound robustness certification is a property that is amenable to formal specification. While I haven’t said much about our implementation, it is worth noting that formal verification was possible here only because methods for performing sound robustness certification are also relatively straightforward to implement and to formally verify.
If you’re interested in further details, check out our paper.
However, this is only the beginning. While useful, our certifier is best seen as a very promising proof-of-concept. There remains much work to be done to extend our ideas to more powerful kinds of neural networks. Not to mention closing the specification gap between real and floating point arithmetic that I mentioned.
In the meantime, thanks for reading. Please get in touch if you’re curious about any aspect of this work.