Obsidian/Zettelkasten/Permanent Notes/Literature Notes/to_process/Safe Reinforcement Learning via Shielding-Note.md
Dane Sabo 8824324149 Auto sync: 2025-08-26 11:21:28 (90 files changed)
R  "Zettelkasten/Literature Notes/.archive/CH4System_Representation_S2020pdf2254.md" -> ".archive/Literature Notes/.archive/CH4System_Representation_S2020pdf2254.md"

R  "Zettelkasten/Literature Notes/.archive/IntroductionDiffusionModels2022.md" -> ".archive/Literature Notes/.archive/IntroductionDiffusionModels2022.md"

R  "Zettelkasten/Literature Notes/.archive/Kry10TechnicalOverview.md" -> ".archive/Literature Notes/.archive/Kry10TechnicalOverview.md"

R  "Zettelkasten/Literature Notes/.archive/ME2046_Sampled_Data_Analysis_Reading_Chapter_2pdf2254ME.md" -> ".archive/Literature Notes/.archive/ME2046_Sampled_Data_Analysis_Reading_Chapter_2pdf2254ME.md"

R  "Zettelkasten/Literature Notes/.archive/ME2046_The_z_transform_Chapter_3pdf2254ME.md" -> ".archive/Literature Notes/.archive/ME2046_The_z_transform_Chapter_3pdf2254ME.md"

R  "Zettelkasten/Literature Notes/.archive/My Library.bib" -> ".archive/Literature Notes/.archive/My Library.bib"

R  "Zettelkasten/Literature Notes/.archive/aModeladoNucleoAnalisis2023.md" -> ".archive/Literature Notes/.archive/aModeladoNucleoAnalisis2023.md"

R  "Zettelkasten/Literature Notes/.archive/atsumiModifiedBodePlots2012.md" -> ".archive/Literature Notes/.archive/atsumiModifiedBodePlots2012.md"
2025-08-26 11:21:28 -04:00

30 lines
639 B
Markdown

# First Pass
**Category:**
Method
**Context:**
Making RL agents safe by 'shielding' - baking in specification checking into training by providing feedback to the network when an output (even while the system is live) produces an unsafe control
**Correctness:**
Cited 1000+ times. I think it's correct.
**Contributions:**
Shielding, a way to use RL in high assurance systems.
**Clarity:**
Well written and easy to understand.
# Second Pass
**What is the main thrust?**
**What is the supporting evidence?**
**What are the key findings?**
# Third Pass
**Recreation Notes:**
**Hidden Findings:**
**Weak Points? Strong Points?**