vault backup: 2025-06-30 15:22:11

This commit is contained in:
Dane Sabo 2025-06-30 15:22:11 -04:00
parent 76bf9c8c78
commit 736018164c
2 changed files with 211 additions and 6 deletions


@@ -7,7 +7,7 @@ authors:
citekey: "urbanReviewFormalMethods2021"
publish_date: 2021-04-21
last_import: 2025-05-12
last_import: 2025-06-30
---
# Indexing Information
@@ -28,46 +28,252 @@ Published: 2021-04
# Annotations
## Notes
![[Notes on Papers/A Review of Formal Methods applied to Machine Learning]]
![[Paper Notes/A Review of Formal Methods applied to Machine Learning.md]]
## Highlights From Zotero
>[!tip] Brilliant
> Recent advances in artificial intelligence and machine learning and the availability of vast amounts of data allow us to develop computer software that efficiently and autonomously perform complex tasks that are difficult or even impossible to design using traditional explicit programming (e.g., image classification, speech recognition, etc.). This makes machine-learned software particularly attractive even in safety-critical applications, as it enables performing a whole world of new functions that could not be envisioned before
> 2025-05-06 5:03 pm
>
>[!warning] Dubious
> Safety-critical applications require an extremely high level of insurance that the software systems behave correctly and reliably.
> 2025-05-06 5:04 pm
>
>[!quote] Other Highlight
> For instance, they are used at an industrial level in avionics [146], where the development processes of aircraft software systems have very stringent assurance and verification requirements mandated by international standards (i.e., DO-178C).
> 2025-05-12 11:25 am
>
>[!highlight] Highlight
> In contrast, research in formal methods for the development and assurance of machine learning systems remains today extremely limited. It is however rapidly gaining interest and popularity, mostly driven by the growing needs of the automotive and avionics industries.
> 2025-05-06 5:07 pm
>
>[!highlight] Highlight
> Specifically, in the following, we give an introduction to formal methods (Section 2) with a particular focus on static analysis by abstract interpretation [37], as it offers a unifying theory for reasoning about disparate formal verification approaches (Section 2.3).
> 2025-05-06 5:07 pm
>
>[!done] Important
> Formal methods are an array of techniques that employ logic and mathematics in order to provide rigorous guarantees about the correctness of computer software.
> 2025-06-23 2:23 pm
>
>[!done] Important
> Developing and checking by hand the very large proofs that are required to verify real-sized applications in realistic languages would be intractable.
> 2025-06-23 2:24 pm
>
>[!tip] Brilliant
> However, a requirement of all formal methods is that they should be sound, that is, any correctness property established by the tool is indeed true of the actual executions of the program.
> 2025-06-23 2:38 pm
>
> *A good definition of soundness*
>[!highlight] Highlight
> Deductive Verification. Deductive methods stem directly from the work of Floyd, Hoare, and Naur [68, 83, 121]. The user provides a program and a formal specification expressed in a logic language. The specification is then propagated through the program source code, using symbolic execution as proposed by Burstall [23] or weakest preconditions as proposed by Dijkstra [51].
> 2025-06-23 2:53 pm
>
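A quick toy sketch (my own illustration, not from the paper) of the weakest-precondition propagation described above, with predicates represented as Python functions over a state dict:

```python
def assign(var, expr):
    """Predicate transformer for `var := expr`: wp(Q) = Q[expr/var]."""
    return lambda post: (lambda s: post({**s, var: expr(s)}))

def seq(*stmts):
    """wp(S1; S2, Q) = wp(S1, wp(S2, Q)): propagate Q backwards through the program."""
    def transformer(post):
        for stmt in reversed(stmts):
            post = stmt(post)
        return post
    return transformer

# Program: x := x + 1; y := x * 2, with postcondition y > 0.
prog = seq(assign("x", lambda s: s["x"] + 1),
           assign("y", lambda s: s["x"] * 2))
pre = prog(lambda s: s["y"] > 0)   # semantically: (x + 1) * 2 > 0

print(pre({"x": 0}))    # True: from x = 0 the postcondition is guaranteed
print(pre({"x": -1}))   # False: from x = -1 it is not
```

Real deductive verifiers manipulate logical formulas symbolically rather than evaluating them per state, but the backwards propagation is the same idea.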
>[!done] Important
> One benefit of deductive verification is its ability to handle complex specifications using decidable logical theories (reasoning about, e.g., integers, floating-point numbers, pointers, arrays, recursive data-structures) and to perform modular verification by breaking down the specification into function contracts with preconditions and postconditions.
> 2025-06-23 2:53 pm
>
>[!highlight] Highlight
> Proof Assistants. Interactive proof assistants are general-purpose tools able to produce reliable proofs of theorems, with application to mathematics, logic, and computer science. Their strength lies in the use of very expressive logic and the ability to check every proof with a small trusted core, reducing the possibilities of errors. Although they can offer some degree of automation (e.g., through tactics), they employ undecidable logics and essentially rely on a very tight interaction between the programmer and the tool, which can prove time consuming.
> 2025-06-23 2:56 pm
>
>[!highlight] Highlight
> and other liveness properties), and the ability to check concurrent models. Model
> 2025-06-23 2:35 pm
>
>[!tip] Brilliant
> Crafting a model that is both faithful and efficiently checked is a difficult task. Even then, the soundness and completeness properties only hold with respect to the model, not the original system.
> 2025-06-24 10:06 am
>
>[!done] Important
> Bounded model checking [14] (e.g., the CBMC tool [35]) limits the exploration to execution traces of fixed length. The drawback of this method is that it loses soundness and completeness with respect to the full program; while it can uncover errors occurring at the beginning of the execution of the program, it cannot prove the absence of errors.
> 2025-06-24 10:07 am
>
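A minimal sketch of the bounded-exploration idea (my own toy version, nothing like CBMC's actual SAT encoding): explore traces up to depth k, and note that a "no counterexample" answer proves nothing beyond the bound.

```python
def bmc(init, step, bad, k):
    """Return a trace reaching a bad state within k steps, else None (not a proof!)."""
    frontier = [[s] for s in init]
    for _ in range(k + 1):
        new = []
        for trace in frontier:
            if bad(trace[-1]):
                return trace
            for nxt in step(trace[-1]):
                new.append(trace + [nxt])
        frontier = new
    return None

# Toy system: x starts at 0 and increments; "bad" means x == 10.
print(bmc([0], lambda x: [x + 1], lambda x: x == 10, k=5))   # None: bound too small, error missed
print(bmc([0], lambda x: [x + 1], lambda x: x == 10, k=12))  # trace [0, 1, ..., 10]: error found at depth 10
```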
>[!done] Important
> The benefit of static analysis by abstract interpretation is full automation, parameterized precision, and the ability to verify large programs in real languages at the source level (C, Java, etc.) or binary level. The analysis is however inherently incomplete, and can fail to verify a desired property due to over-approximations. Thus, it can output false alarms, that must be checked by other means.
> 2025-06-24 10:08 am
>
>[!done] Important
> As a critical aircraft component, avionic software is subject to certification by certification authorities and must obey the DO-178 international standard, which notably describes the verification process that such software must follow. Traditional methods consist in massive test campaigns and intellectual reviews, which have difficulties scaling up and maintaining a reasonable cost (which already accounts for more than half of the cost of overall software development). A shift occurred in the 2010s with the new DO-178C revision (and its DO-333 supplement) which opens the door to certification by formal methods. One key aspect of the standard is the emphasis on employing sound formal methods. Another key aspect is that any verification technique (formal or not) employed at the source level cannot be considered valid at the binary level unless the compilation process is also certified, which is important in practice as some tools can only reason at the source level.
> 2025-06-24 10:11 am
>
>[!highlight] Highlight
> This state semantics forgets about the history of computation (i.e., which state appears before which state), and so, cannot be used to verify temporal properties.
> 2025-06-24 10:18 am
>
>[!warning] Dubious
> With this abstraction, we can no longer prove that X never takes the value 4.
> 2025-06-24 10:19 am
>
> *Huh?*
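My toy reconstruction of the paper's point: if X only ever takes values in {1, 3, 5}, the best interval over-approximating that set is [1, 5], which also contains 4, so the interval analysis raises a false alarm.

```python
def alpha(values):
    """Abstraction: the best interval over-approximating a set of concrete values."""
    return (min(values), max(values))

def may_equal(interval, v):
    """In the interval domain, v can only be ruled out if it lies outside the bounds."""
    lo, hi = interval
    return lo <= v <= hi

concrete = {1, 3, 5}           # X never actually takes the value 4...
abstract = alpha(concrete)     # ...but its interval abstraction is (1, 5)
print(may_equal(abstract, 4))  # True: the analysis cannot prove X != 4
```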
>[!warning] Dubious
> As machine learning enters critical industries, it would be interesting to combine verification techniques for traditional software and for artificial intelligence using abstract interpretation as a common theory, and develop an analyzer adapted to this new generation of critical software.
> 2025-06-25 12:44 pm
>
> *This posits that artificial intelligence and software are the same thing, kind of.*
>[!done] Important
> The training phase is highly non-deterministic, which is in contrast with predictability and traceability requirements of traditional safety-critical software development processes
> 2025-06-25 12:45 pm
>
>[!highlight] Highlight
> Most formal verification methods proposed so far in the literature apply to trained machine learning models (Section 3.1) while only few are dedicated to the data preparation and model training phases (Section 3.2).
> 2025-06-25 12:46 pm
>
>[!fail] This ain't right
> they generally require several hours for neural networks with hundreds or thousands of neurons)
> 2025-06-25 12:51 pm
>
> *Relative to what???*
>[!highlight] Highlight
> Note, however, that soundness is not typically guaranteed with respect to floating-point arithmetic, but only with respect to computations on reals that ignore rounding errors and may thus differ from the actual computations [122, 105].
> 2025-06-25 12:51 pm
>
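The classic two-line demonstration of why reasoning over the reals can silently diverge from the machine's floating-point behavior:

```python
# Real-number reasoning says 0.1 + 0.2 == 0.3; IEEE-754 doubles disagree.
print(0.1 + 0.2 == 0.3)   # False
print(0.1 + 0.2)          # 0.30000000000000004
```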
>[!done] Important
> SMT-based formal methods reduce the safety verification problem to a constraint satisfiability problem. Specifically, they encode a given neural network and (the negation of a) desired safety property as a set of constraints. If the constraints are satisfiable, the corresponding solution represents a counterexample to the safety property. Otherwise, the constraints are unsatisfiable, no counterexample can exist and thus the safety property holds.
> 2025-06-25 12:52 pm
>
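Instead of an SMT solver, here is a hand-rolled case split over ReLU phases for a 1-input net (my own illustration, not any tool's actual algorithm): each active/inactive assignment yields linear constraints on x, and within each feasible region the output is linear, so violations can be checked exactly at the endpoints.

```python
from itertools import product

def verify(w, b, v, c, lo, hi, ymax):
    """For y(x) = sum_i v_i*relu(w_i*x + b_i) + c, return x with y(x) > ymax, else None."""
    for phases in product([False, True], repeat=len(w)):
        rlo, rhi = lo, hi
        for wi, bi, active in zip(w, b, phases):
            # active: wi*x + bi >= 0 ; inactive: wi*x + bi <= 0
            if wi == 0:
                if (bi >= 0) != active and bi != 0:
                    rlo, rhi = 1.0, 0.0  # constant neuron contradicts this phase
                continue
            t = -bi / wi
            if (wi > 0) == active:
                rlo = max(rlo, t)
            else:
                rhi = min(rhi, t)
        if rlo > rhi:
            continue  # this phase combination is infeasible
        def y(x):  # linear on the region: only active neurons contribute
            return sum(vi * (wi * x + bi)
                       for wi, bi, vi, a in zip(w, b, v, phases) if a) + c
        for x in (rlo, rhi):  # linear max is attained at an endpoint
            if y(x) > ymax:
                return x
    return None

# Net: h1 = relu(x - 0.5), h2 = relu(-x + 0.25), y = 2*h1 + h2, on x in [0, 1].
print(verify(w=[1, -1], b=[-0.5, 0.25], v=[2, 1], c=0, lo=0, hi=1, ymax=1.0))  # None: y <= 1.0 holds
print(verify(w=[1, -1], b=[-0.5, 0.25], v=[2, 1], c=0, lo=0, hi=1, ymax=0.9))  # 1: y(1) = 1.0 > 0.9
```

The exponential number of phase combinations is exactly what makes the problem NP-complete; real tools prune aggressively instead of enumerating.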
>[!tip] Brilliant
> The approach is evaluated on a small vehicle collision avoidance use case and on a neural network with over a thousand neurons trained for digit recognition on the MNIST dataset [101].
> 2025-06-25 1:00 pm
>
>[!tip] Brilliant
> The approach is applied to the verification of the ACAS Xu neural networks [90], developed as early prototype for the next-generation airborne collision avoidance system for unmanned aircraft. The neural networks take sensor data as input (i.e., speed and present course of the current aircraft and of any nearby intruder) and issue appropriate navigation advisories. They consist of six hidden layers with 300 neurons each.
> 2025-06-25 1:02 pm
>
>[!done] Important
> The experimental evaluation shows that Marabou (in divide-and-conquer mode running on four cores) generally outperforms both Planet [60] and Reluplex and is able to verify all ACAS Xu benchmarks within a one hour timeout per benchmark.
> 2025-06-25 1:03 pm
>
>[!highlight] Highlight
> Other Complete Formal Methods. Ruan et al. [134] presented an approach based on global optimization for verifying Lipschitz continuous neural networks [148], such as neural networks with piecewise linear layers (i.e., fully-connected, max- and average-pooling, and convolutional layers) and ReLU as well as sigmoid and tanh activations. Specifically, given inputs to the neural networks and a Lipschitz continuous function over its outputs (e.g., a function determining the confidence of the neural network in classifying an input), their approach computes lower and upper bounds on the function values. The approach is implemented in a tool named DeepGO (https://github.com/trustAI/DeepGO). Compared to Reluplex [92] and Sherlock [58], DeepGO on average is 36 times faster than Sherlock and almost 100 times faster than Reluplex. Moreover, DeepGO is shown to be able to scale to state-of-the-art neural networks with millions of neurons trained on the MNIST dataset [101].
> 2025-06-25 5:53 pm
>
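The core inequality behind Lipschitz-based bounding, sketched on a toy 1-input ReLU net (my own example, not DeepGO's algorithm): since ReLU is 1-Lipschitz, L = sum_i |v_i * w_i| upper-bounds the network's Lipschitz constant, so |f(x) - f(x0)| <= L * |x - x0|.

```python
def f(x, w, b, v, c):
    """A 1-input, 1-hidden-layer ReLU net: f(x) = sum_i v_i*relu(w_i*x + b_i) + c."""
    return sum(vi * max(wi * x + bi, 0.0) for wi, bi, vi in zip(w, b, v)) + c

def lipschitz_bound(w, v):
    """Loose but sound Lipschitz constant: ReLU is 1-Lipschitz, so chain the weights."""
    return sum(abs(vi * wi) for wi, vi in zip(w, v))

w, b, v, c = [1.0, -1.0], [-0.5, 0.25], [2.0, 1.0], 0.0
x0, eps = 0.5, 0.1
L = lipschitz_bound(w, v)           # 3.0
hi = f(x0, w, b, v, c) + L * eps    # certified upper bound on f over [0.4, 0.6]
# Sanity check at a few sample points (the bound holds everywhere by the inequality):
assert all(f(x0 + d, w, b, v, c) <= hi for d in (-eps, -0.05, 0.0, 0.05, eps))
```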
>[!highlight] Highlight
> On the ACAS Xu neural networks, Neurify is on average 20 times faster than ReluVal and 5000 times faster than Reluplex. The experimental evaluation also shows that Neurify is able to verify safety properties of neural networks with over ten thousand hidden neurons such as the self-driving car convolutional neural network Dave [15].
> 2025-06-25 5:56 pm
>
>[!highlight] Highlight
> Incomplete formal verification methods are generally able to scale to large neural network architectures (they generally require at most a few minutes for neural networks with thousands of neurons) and are sound (often also with respect to floating-point arithmetic [140, 141, 105]) but suffer from false positives.
> 2025-06-30 10:26 am
>
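A minimal interval bound propagation (IBP) sketch of why these incomplete analyses are fast, sound, and prone to false positives (my own toy example): propagating intervals layer by layer loses the correlation between neurons.

```python
def affine_bounds(lo, hi, w, b):
    """Interval bounds of w*x + b over x in [lo, hi] (scalar weights for illustration)."""
    a, c = w * lo + b, w * hi + b
    return min(a, c), max(a, c)

def relu_bounds(lo, hi):
    return max(lo, 0.0), max(hi, 0.0)

# y = relu(x) - relu(x - 1) is always in [0, 1], but IBP on x in [-2, 2] forgets
# that both terms depend on the *same* x:
l1 = relu_bounds(*affine_bounds(-2, 2, 1.0, 0.0))    # relu(x)     -> [0, 2]
l2 = relu_bounds(*affine_bounds(-2, 2, 1.0, -1.0))   # relu(x - 1) -> [0, 1]
y_lo, y_hi = l1[0] - l2[1], l1[1] - l2[0]            # interval subtraction
print((y_lo, y_hi))   # (-1.0, 2.0): sound, but too wide to prove y <= 1
```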
>[!done] Important
> The experimental evaluation on 90 of the ACAS Xu benchmarks shows that Marabou [93] combined with their abstraction framework outperforms Marabou alone. Within a 20 hours timeout, it is able to verify 58 benchmarks, while Marabou alone can only verify 35 benchmarks. The abstraction-enhanced Marabou is also orders of magnitude faster than Marabou alone.
> 2025-06-30 10:44 am
>
>[!done] Important
> Support vector machines [44] partition their input space in regions using hyperplanes that separate the training inputs according to their labels. The hyperplanes are chosen (i.e., learned during training) to maximize their distance (called margin) to their closest training inputs (called support vectors).
> 2025-06-30 1:59 pm
>
>[!tip] Brilliant
> When the training inputs are not linearly separable in the input space, the input space is implicitly projected, using so-called kernel functions, into a much higher-dimensional space in which the training inputs become linearly separable. This allows a support vector machine to also learn non-linear decision boundaries.
> 2025-06-30 1:59 pm
>
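A toy illustration of the lifting idea (my own example, using an explicit feature map rather than an implicit kernel): points at x = -1 and x = 1 (class +1) vs x = 0 (class -1) are not separable by any threshold on the line, but lifting with x -> (x, x^2) makes them linearly separable.

```python
def phi(x):
    """Explicit feature map lifting 1-D inputs into 2-D."""
    return (x, x * x)

data = [(-1.0, +1), (0.0, -1), (1.0, +1)]

# No single threshold on the line separates the classes; thresholds between and
# beyond the data points cover every labeling a threshold can produce:
seps = [all((x > t) == (y > 0) for x, y in data) or
        all((x < t) == (y > 0) for x, y in data)
        for t in (-2.0, -0.5, 0.5, 2.0)]
print(any(seps))   # False: not linearly separable in the input space

# After lifting, the hyperplane x^2 = 0.5 separates them perfectly:
print(all((phi(x)[1] > 0.5) == (y > 0) for x, y in data))   # True
```

Kernel SVMs never compute phi explicitly; they only evaluate inner products in the lifted space, which is what keeps very high-dimensional lifts tractable.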
>[!done] Important
> Robust training aims at minimizing the worst-case loss for each given input, that is, the maximum loss (i.e., model error) over all possible perturbations of a given training input. However, since calculating the exact worst-case loss can be computationally costly in practice, robust training approaches typically minimize an estimate of the worst-case loss.
> 2025-06-30 3:14 pm
>
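For a *linear* model the inner maximization is exact, which makes a nice sanity-check sketch (my own example): under an l-infinity perturbation of size eps, the adversary can lower the margin y*f(x) by exactly eps * ||w||_1.

```python
def margin(w, b, x, y):
    """Signed margin y * (w.x + b); positive means correctly classified."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def worst_case_margin(w, b, x, y, eps):
    """Exact minimum margin over all perturbations ||delta||_inf <= eps."""
    return margin(w, b, x, y) - eps * sum(abs(wi) for wi in w)

def hinge(m):
    return max(0.0, 1.0 - m)

w, b = [2.0, -1.0], 0.0
x, y, eps = [1.0, 0.0], +1, 0.5
print(hinge(margin(w, b, x, y)))                   # 0.0: clean loss is zero
print(hinge(worst_case_margin(w, b, x, y, eps)))   # 0.5: worst-case (robust) loss
```

Robust training would minimize the second quantity; for non-linear models the exact inner max is intractable, hence the estimates mentioned in the quote.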
>[!highlight] Highlight
> Another interesting possibility is to develop approaches to determine constraints on the training process that enforce a desired property on the trained model. For instance, one could enforce a specific behavior for a given input range of interest, or a particular structure of the trained model that facilitates its subsequent verification [171] (or its abstraction for verification [125, 145, 62]).
> 2025-05-06 5:10 pm
>
>[!highlight] Highlight
> More generally, verification methods should allow verifying the behavior of trained models under all circumstances and not just over the expected safe input space. This is particularly useful in cases in which the trained model is part of a larger machine learning-based system [53] and is thus more susceptible to receive unexpected inputs, e.g., from malfunctioning sensors
> 2025-05-06 5:11 pm
>
>[!highlight] Highlight
> In this case, it would be useful to design verification methods able to relate the behavior of the inexact model to the optimal expected behavior, e.g., by determining the worst-case approximation of the trained model.
> 2025-05-06 5:12 pm
>
>[!highlight] Highlight
> Finally, there is a need for further approaches that support diverse model structures, such as recurrent neural networks [95, 175], and that can verify model implementations by taking floating-point rounding errors into account [140, 141, 105].
> 2025-05-06 5:13 pm
>
## Follow-Ups
>[!example]
>For instance, they are used at an industrial level in avionics [146], where the development processes of aircraft software systems have very stringent assurance and verification requirements mandated by international standards (i.e., DO-178C).
>- [x] #Follow-Up ✅ 2025-05-12
>The CompCert certified compiler [103] is used to compile the C code into binary.
>- [ ] #Follow-Up
>[!example]
>Kurd and Kelly [98] were among the first to propose a characterization of verification goals for neural networks used in safety-critical applications.
>- [ ] #Follow-Up
>[!example]
>Note that the presence of these non-linear ReLU constraints makes the problem NP-complete [92]
>- [ ] What does 'NP Complete' mean? #Follow-Up
>[!example]
>MILP-based Formal Methods.
>- [ ] What the fuck is MILP #Follow-Up
>[!example]
>Their framework supports the abstract domains of intervals [36], zonotopes [73], and polyhedra [38], as well as their bounded powersets.
>- [ ] My brain hurty. #Follow-Up
>[!example]
>Finally, a couple of theoretical results on the abstract interpretation of neural networks can be found in the literature. Baader et al. [5] show that for any continuous function f there exists a feed-forward fully-connected neural network with ReLU activations whose abstract interpretation using the domains of intervals [36] from an input region B is an arbitrarily close approximation of f on B.
>- [ ] #Follow-Up
>[!example]
>[4] A. Atki, A. Brahmi, D. Delmas, H. Essoussi, T. Marie, and F. Randimbivololona. Formalise to Automate: Deployment of a Safe and Cost-Efficient Process for Avionics Software. In ERTS 2018
>- [ ] #Follow-Up
>[!example]
>[60] R. Ehlers. Formal Verification of Piece-Wise Linear Feed-Forward Neural Networks. In ATVA 2017.
>- [ ] #Follow-Up
>[!example]
>[107] B. Livshits, M. Sridharan, Y. Smaragdakis, O. Lhoták, J. Nelson Amaral, B.-Y. E. Chang, S. Z. Guyer, U. P. Khedker, A. Møller, and D. Vardoulakis. In Defense of Soundiness: A Manifesto. 2015.
>- [ ] #Follow-Up


@@ -118,8 +118,7 @@ Published: {{ date | format("YYYY-MM") }}
{% endif -%}
> {{ a.annotatedText }}
> {{ a.date | format("YYYY-MM-DD h:mm a")}}
{%- if a.comment %}
>{% if a.comment %}
> *{{ a.comment }}*
{% endif -%}
{% endif %}