From 03b523017051e21bda68d0d1f9a443f8daaffd60 Mon Sep 17 00:00:00 2001 From: Dane Sabo Date: Thu, 15 May 2025 14:17:51 -0400 Subject: [PATCH] vault backup: 2025-05-15 14:17:51 --- ...lver for Verifying Deep Neural Networks.md | 38 ++++++++++++++++++- ...fe Reinforcement Learning via Shielding.md | 21 +++++++++- .../NNFM Ontology.canvas | 17 +++++---- ...lver for Verifying Deep Neural Networks.md | 29 ++++++++++++++ ...fe Reinforcement Learning via Shielding.md | 29 ++++++++++++++ 5 files changed, 122 insertions(+), 12 deletions(-) create mode 100644 Paper Notes/Reluplex- An Efficient SMT Solver for Verifying Deep Neural Networks.md create mode 100644 Paper Notes/Safe Reinforcement Learning via Shielding.md diff --git a/Literature Notes/Reluplex - An Efficient SMT Solver for Verifying Deep Neural Networks.md b/Literature Notes/Reluplex - An Efficient SMT Solver for Verifying Deep Neural Networks.md index 42f8ae0c..b17d7d3b 100644 --- a/Literature Notes/Reluplex - An Efficient SMT Solver for Verifying Deep Neural Networks.md +++ b/Literature Notes/Reluplex - An Efficient SMT Solver for Verifying Deep Neural Networks.md @@ -20,7 +20,7 @@ publish_date: 2017-01-01 publisher: "Springer International Publishing" location: "Cham" pages: 97-117 -last_import: 2025-05-12 +last_import: 2025-05-15 --- # Indexing Information @@ -33,7 +33,8 @@ Published: 2017-01 #Airborne-Collision-Avoidance-System, #Deep-Neural-Networks-DNNs, #Rectified-Linear-Unit-ReLU, #ReLU-Function, #Satisfiability-Modulo-Theories-SMT -#ToRead +#InFirstPass + >[!Abstract] @@ -46,5 +47,38 @@ Published: 2017-01 ## Highlights From Zotero + + +>[!done] Important +> The technique is basedon the simplex method, extended to handle the non-convex RectifiedLinear Unit (ReLU ) activation function, which is a crucial ingredientin many modern neural networks. The verification procedure tacklesneural networks as a whole, without making any simplifying assump-tions. We evaluated our technique on a prototype deep neural networkimplementation of the next-generation airborne collision avoidance sys-tem for unmanned aircraft (ACAS Xu). Results show that our techniquecan successfully prove properties of networks that are an order of mag-nitude larger than the largest networks verified using existing methods. +> 2025-05-13 4:42 pm + + + +>[!done] Important +> Past efforts at verifying properties of DNNs with ReLUs have had to make significant simplifying assumptions [3,10]—for instance, by considering only small input regions in which all ReLUs are fixed at either the activeor inactive state [3], hence making the problem convex but at the cost of beingable to verify only an approximation of the desired property +> 2025-05-13 4:45 pm + +>[!tip] Brilliant +> We propose a novel, scalable, and efficient algorithm for verifying propertiesof DNNs with ReLUs. We address the issue of the activation functions headon, by extending the simplex algorithm—a standard algorithm for solving LP instances—to support ReLU constraints. This is achieved by leveraging the piece-wise linear nature of ReLUs and attempting to gradually satisfy the constraintsthat they impose as the algorithm searches for a feasible solution. We call the algorithm Reluplex, for “ReLU with Simplex”. +> 2025-05-13 4:46 pm + +>[!highlight] Highlight +> The problem’s NP-completeness means that we must expect the worst-case performance of the algorithm to be poor. However, as is often the case with SAT and SMT solvers, the performance in practice can be quite reasonable; +> 2025-05-13 4:47 pm + +>[!highlight] Highlight +> Our contributions can be summarized as follows. We (i) present Reluplex,an SMT solver for a theory of linear real arithmetic with ReLU constraints; (ii) show how DNNs and properties of interest can be encoded as inputs to Reluplex; (iii) discuss several implementation details that are crucial to performance and scalability, such as the use of floating-point arithmetic, bound derivation for ReLU variables, and conflict analysis; and (iv) conduct a thorough evaluationon the DNN implementation of the prototype ACAS Xu system, demonstratingthe ability of Reluplex to scale to DNNs that are an order of magnitude largerthan those that can be analyzed using existing techniques. +> 2025-05-13 4:48 pm + + ## Follow-Ups +>[!example] + >are trained over a finiteset of inputs and outputs and are expected to generalize, i.e. to behave correctly for previously-unseen inputs. However, it has been observed that DNNs can react in unexpected and incorrect ways to even slight perturbations of their inputs [34]. + >- [ ] #Follow-Up + +>[!example] + >DNN verification is experi-mentally beyond the reach of general-purpose tools such as linear programming(LP) solvers or existing satisfiability modulo theories (SMT) solvers [3,10,31], and thus far, dedicated tools have only been able to handle very small networks (e.g. a single hidden layer with only 10 to 20 hidden nodes [30,31]). + >- [ ] #Follow-Up + diff --git a/Literature Notes/Safe Reinforcement Learning via Shielding.md b/Literature Notes/Safe Reinforcement Learning via Shielding.md index dec31b83..79f69857 100644 --- a/Literature Notes/Safe Reinforcement Learning via Shielding.md +++ b/Literature Notes/Safe Reinforcement Learning via Shielding.md @@ -18,7 +18,7 @@ publish_date: 2018-04-29 journal: "Proceedings of the AAAI Conference on Artificial Intelligence" volume: 32 issue: 1 -last_import: 2025-05-12 +last_import: 2025-05-15 --- # Indexing Information @@ -29,7 +29,8 @@ Published: 2018-04 #Formal-Methods -#ToRead +#InFirstPass + >[!Abstract] @@ -42,5 +43,21 @@ Published: 2018-04 ## Highlights From Zotero + + + +>[!highlight] Highlight +> . Increasing use of learning-based controllers inphysical systems in the proximity of humans strengthens theconcern of whether these systems will operate safely. +> 2025-05-13 4:28 pm + +>[!tip] Brilliant +> In this paper, we introduce shielded learning, a frame-work that allows applying machine learning to control sys-tems in a way that the correctness of the system’s executionagainst a given specification is assured during the learningand controller execution phases, regardless of how fast the learning process converges. The shield monitors the actionsselected by the learning agent and corrects them if and onlyif the chosen action is unsafe. +> 2025-05-13 4:29 pm + +>[!done] Important +> Last but not least, the shieldingframework is compatible with mechanisms such as functionapproximation, employed by learning algorithms in order to improve their scalability. +> 2025-05-13 4:31 pm + + ## Follow-Ups diff --git a/Neural Network Formal Methods/NNFM Ontology.canvas b/Neural Network Formal Methods/NNFM Ontology.canvas index a9f05daf..6d0fbefb 100644 --- a/Neural Network Formal Methods/NNFM Ontology.canvas +++ b/Neural Network Formal Methods/NNFM Ontology.canvas @@ -16,19 +16,20 @@ {"id":"42d4be73c13f91cb","type":"text","text":"1st Pass","x":-1635,"y":-615,"width":140,"height":60,"color":"3"}, {"id":"a2800119b02cdda0","type":"text","text":"2nd Pass","x":-1467,"y":-615,"width":145,"height":60,"color":"4"}, {"id":"1c13a0968706a8b0","type":"text","text":"Brilliant / Important","x":-1808,"y":-695,"width":243,"height":60,"color":"5"}, - {"id":"53a95b3f2101d32c","x":-720,"y":1260,"width":480,"height":300,"color":"3","type":"file","file":"Literature Notes/The Past, Present and Future of Cyber-Physical Systems - A Focus on Models.md"}, - {"id":"ac067a0584359781","x":-1515,"y":670,"width":297,"height":80,"type":"text","text":"## Cyber-Physical Systems"}, + {"id":"53a95b3f2101d32c","type":"file","file":"Literature Notes/The Past, Present and Future of Cyber-Physical Systems - A Focus on Models.md","x":-720,"y":1260,"width":480,"height":300,"color":"3"}, + {"id":"ac067a0584359781","type":"text","text":"## Cyber-Physical Systems","x":-1515,"y":670,"width":297,"height":80}, {"id":"bbe4a58a5af748eb","type":"text","text":"OLDER","x":-1000,"y":1600,"width":250,"height":60}, {"id":"b081d5b64b515d56","type":"text","text":"2017","x":-1000,"y":890,"width":250,"height":60}, {"id":"2e49adbdc97ce1c2","type":"text","text":"2015","x":-1000,"y":1260,"width":250,"height":60}, {"id":"6b62b3a19b06fc0a","type":"file","file":"Literature Notes/A Review of Formal Methods applied to Machine Learning.md","x":-200,"y":240,"width":520,"height":370,"color":"3"}, {"id":"bac68ab8fa2c1c9b","type":"file","file":"Literature Notes/Formal verification of neural network controlled autonomous systems.md","x":-720,"y":530,"width":485,"height":360,"color":"3"}, - {"id":"7ad1df279923e7f4","x":-720,"y":1660,"width":480,"height":400,"color":"1","type":"file","file":"Literature Notes/A systematic classification of neural-network-based control.md"}, - {"id":"b16d77e25162b164","x":-200,"y":1660,"width":400,"height":400,"color":"1","type":"file","file":"Literature Notes/Nonlinear Identification and Control.md"}, - {"id":"dae189dd0beeb46d","x":-720,"y":100,"width":480,"height":360,"color":"1","type":"file","file":"Literature Notes/Neural network-based flight control systems - Present and future.md"}, - {"id":"8af915dccc8e1757","x":240,"y":1660,"width":400,"height":400,"color":"1","type":"file","file":"Literature Notes/Differential neural networks for robust nonlinear control - identification, state estimation and trajectory tracking.md"}, - {"id":"034773fb85df43e9","x":-200,"y":-183,"width":520,"height":363,"color":"1","type":"file","file":"Literature Notes/Safe Reinforcement Learning via Shielding under Partial Observability.md"}, - {"id":"1b0680305c662125","x":-200,"y":670,"width":520,"height":330,"color":"1","type":"file","file":"Literature Notes/Safe Reinforcement Learning via Shielding.md"} + {"id":"7ad1df279923e7f4","type":"file","file":"Literature Notes/A systematic classification of neural-network-based control.md","x":-720,"y":1660,"width":480,"height":400,"color":"1"}, + {"id":"b16d77e25162b164","type":"file","file":"Literature Notes/Nonlinear Identification and Control.md","x":-200,"y":1660,"width":400,"height":400,"color":"1"}, + {"id":"dae189dd0beeb46d","type":"file","file":"Literature Notes/Neural network-based flight control systems - Present and future.md","x":-720,"y":100,"width":480,"height":360,"color":"1"}, + {"id":"8af915dccc8e1757","type":"file","file":"Literature Notes/Differential neural networks for robust nonlinear control - identification, state estimation and trajectory tracking.md","x":240,"y":1660,"width":400,"height":400,"color":"1"}, + {"id":"034773fb85df43e9","type":"file","file":"Literature Notes/Safe Reinforcement Learning via Shielding under Partial Observability.md","x":-200,"y":-183,"width":520,"height":363,"color":"1"}, + {"id":"1b0680305c662125","type":"file","file":"Literature Notes/Safe Reinforcement Learning via Shielding.md","x":-200,"y":670,"width":520,"height":330,"color":"3"}, + {"id":"0655d596e2c1bd88","x":-1242,"y":100,"width":400,"height":400,"color":"3","type":"file","file":"Literature Notes/Reluplex - An Efficient SMT Solver for Verifying Deep Neural Networks.md"} ], "edges":[ {"id":"766a32fa431c6398","fromNode":"bcc9c2b5dc701338","fromSide":"bottom","toNode":"bac68ab8fa2c1c9b","toSide":"top","color":"2"}, diff --git a/Paper Notes/Reluplex- An Efficient SMT Solver for Verifying Deep Neural Networks.md b/Paper Notes/Reluplex- An Efficient SMT Solver for Verifying Deep Neural Networks.md new file mode 100644 index 00000000..6d163721 --- /dev/null +++ b/Paper Notes/Reluplex- An Efficient SMT Solver for Verifying Deep Neural Networks.md @@ -0,0 +1,29 @@ +# First Pass +**Category:** +Method + +**Context:** +Expanding SMT solvers to work with ReLU functions. They then applied this technology to a prototype DNN for a ACAS Xu (aircraft collision avoidance system for Autonomous Aircraft) system + +**Correctness:** +Seems good to me. Need to do deeper dive, can't really say correct or not at this point. + +**Contributions:** +Reluplex: a SMT algorithim that incorporates functionality for ReLU functions commonly found in DNN. This drastically impacts the types of networks that can be verified in scale. + +**Clarity:** +Well written and clear. + +# Second Pass +**What is the main thrust?** + +**What is the supporting evidence?** + +**What are the key findings?** + +# Third Pass +**Recreation Notes:** + +**Hidden Findings:** + +**Weak Points? Strong Points?** diff --git a/Paper Notes/Safe Reinforcement Learning via Shielding.md b/Paper Notes/Safe Reinforcement Learning via Shielding.md new file mode 100644 index 00000000..28490a39 --- /dev/null +++ b/Paper Notes/Safe Reinforcement Learning via Shielding.md @@ -0,0 +1,29 @@ +# First Pass +**Category:** +Method + +**Context:** +Making RL agents safe by 'shielding' - baking in specification checking into training by providing feedback to the network when an output (even while the system is live) produces an unsafe control + +**Correctness:** +Cited 1000+ times. I think it's correct. + +**Contributions:** +Shielding, a way to use RL in high assurance systems. + +**Clarity:** +Well written and easy to understand. + +# Second Pass +**What is the main thrust?** + +**What is the supporting evidence?** + +**What are the key findings?** + +# Third Pass +**Recreation Notes:** + +**Hidden Findings:** + +**Weak Points? Strong Points?**