vault backup: 2025-05-02 18:18:58

Dane Sabo 2025-05-02 18:18:58 -04:00
parent 7d310f1895
commit ef3cf9c967

@ -20,38 +20,73 @@ No one was injured, but the incident incurred high costs: replacing the CNC-mach
## So, how could rigorous digital engineering solve this problem?
The fundamental problem with this testing setup was not an implementation
problem. The controller, as it was programmed, performed the fatigue test
exactly as it was asked. The real breakdown occurred at the design stage,
where no one had explored all possible operating states or planned for
violated assumptions.
Here's a simple example. One of the main causes of the failure was that
the control system did not anticipate wildly divergent sensor readings.
Assuming similar readings makes sense for cushions that are operating
properly, where load would be evenly distributed, but cases where this
assumption is violated were never examined. In this failure, the
assumption was violated by a subpar cushion, but what if a sensor was never
connected to the system? Presumably, the testing fixture would behave this
way regardless of the cushion being tested, and situations where sensors
are disconnected briefly for cleaning or moving the fixture may be
extremely common.
After taking the HACPS class, I think the designers of this testing fixture
could have made good use of a specification language like TLA+ and its
model checker, TLC. An analysis of the testing system through a series of
TLA+ modules could avoid these disaster scenarios where sensor readings do
not prompt correct control. One could do this analysis by examining what
'correct' behavior is:
For the testing fixture as described, a few things should ALWAYS
happen (specifications):
1. The sensors on the bottom of the buttocks should ALWAYS experience
more load than the sensors on the side of the buttocks, except at low load
conditions where noise is the dominant signal.
2. The sensors on the bottom of the buttocks should never have a difference
in pressure from the side buttocks sensors greater than some value $\Delta P$.
3. Sensors across symmetries (left vs. right buttock) should also always have
similar values to one another.
If any of these specifications is violated, testing should stop immediately.
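
As a rough illustration, the three specifications can be written as a
runtime check. This is only a Python sketch, not TLA+ and not the rig's
actual controller code: the `Reading` fields, the use of one bottom and one
side sensor per buttock, and all three threshold values are assumptions made
up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One snapshot of sensor pressures (arbitrary units, hypothetical layout)."""
    bottom_left: float
    bottom_right: float
    side_left: float
    side_right: float


# Illustrative thresholds only; real values would come from calibration.
NOISE_FLOOR = 5.0     # below this load, Specification 1 is not enforced
DELTA_P_MAX = 60.0    # Specification 2: largest allowed bottom-minus-side gap
SYMMETRY_TOL = 15.0   # Specification 3: largest allowed left/right mismatch


def violated_specs(r: Reading) -> list[int]:
    """Return the numbers of the specifications that reading `r` violates."""
    violations = []
    pairs = [(r.bottom_left, r.side_left), (r.bottom_right, r.side_right)]

    # Spec 1: bottom load must exceed side load, except when noise dominates.
    if any(max(b, s) >= NOISE_FLOOR and b <= s for b, s in pairs):
        violations.append(1)

    # Spec 2: bottom-minus-side difference must not exceed Delta P.
    if any(b - s > DELTA_P_MAX for b, s in pairs):
        violations.append(2)

    # Spec 3: left and right sensors in the same position must roughly agree.
    if (abs(r.bottom_left - r.bottom_right) > SYMMETRY_TOL
            or abs(r.side_left - r.side_right) > SYMMETRY_TOL):
        violations.append(3)

    return violations


def should_stop(r: Reading) -> bool:
    """Testing stops as soon as any specification is violated."""
    return bool(violated_specs(r))
```

With these made-up thresholds, a healthy reading such as
`Reading(50, 52, 20, 21)` violates nothing, while a disconnected
bottom-left sensor, `Reading(0, 52, 20, 21)`, trips Specifications 1 and 3
and stops the test.
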
How do these specifications actually line up with requirements?
Let's investigate:
**The buttocks should not destroy themselves with excessive force.**
This overarching requirement can be refined into some more specific
requirements:
1. **The testing rig should be safe if a sensor fails.**
This requirement is satisfied by all of the above specifications, but especially
specification number 3. Realistically, all cushions and buttocks will be fairly
symmetric, so if for example a sensor fails and reports a bogus value,
the complementary sensor on the other buttock should throw a flag that
this specification has been violated, and the testing should stop.
Specification 1 also addresses this requirement. If one of the bottom sensors
fails in a way that the measurement is much lower than expected, the
side sensor readings will exceed the bottom sensor measurement and stop
the testing protocol.
2. **A broken or absent cushion should stop testing immediately.**
If a cushion fails or is not present, the bottom sensors will contact the testing
rig frame and increase in pressure while the side sensors will not increase
in value at all. The result is that the difference between the bottom and side
sensors will increase dramatically with additional force. Specification 2 will
prevent failure through this mode, as the $\Delta P$ flag will be thrown and testing
will stop.
This analysis could be more rigorously defined and carried out using all
sensors and with more sophisticated logic in TLA+, along with actually
defining the controller logic that runs the test. Here, we've only discussed
failure conditions, but there are all sorts of logic in the real system that decide
when the buttocks should descend, and when a new fatigue repetition
begins. All of these behaviors can be modeled in TLA+ and checked with its
model checker against safety specifications to ensure that a failure like the
one experienced is not possible from the design level.
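
To give a feel for what a model checker does with such a design, here is a
toy exhaustive state search in Python. It is a sketch under heavy
assumptions, not a TLA+ specification: the force steps, the sensor model in
`delta_p`, and the `DAMAGE_FORCE` limit are all invented for illustration.
The search visits every reachable state of a simplified controller, starting
both with and without a cushion, and reports the first state that breaks the
safety property.

```python
from collections import deque

MAX_FORCE = 10     # highest force step the (hypothetical) protocol commands
DELTA_P_MAX = 5    # Specification 2 threshold in this toy model
DAMAGE_FORCE = 8   # force at which a bare frame would damage the rig


def delta_p(force, cushion_ok):
    """Toy sensor model: bottom-minus-side pressure difference."""
    side = force // 2 if cushion_ok else 0  # no cushion, no side load
    return force - side


def next_states(state, guard_enabled):
    """Controller step: stop if the guard trips, otherwise add one force step."""
    force, cushion_ok, stopped = state
    if stopped:
        return []
    if guard_enabled and delta_p(force, cushion_ok) > DELTA_P_MAX:
        return [(force, cushion_ok, True)]
    if force < MAX_FORCE:
        return [(force + 1, cushion_ok, False)]
    return []


def is_safe(state):
    """Safety property: never reach damaging force without a working cushion."""
    force, cushion_ok, _ = state
    return cushion_ok or force < DAMAGE_FORCE


def find_violation(guard_enabled):
    """Breadth-first search of all reachable states; return an unsafe one or None."""
    initial = [(0, True, False), (0, False, False)]  # cushion present / absent
    seen = set(initial)
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        if not is_safe(state):
            return state
        for nxt in next_states(state, guard_enabled):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None
```

In this toy model, `find_violation(True)` returns `None` because the
$\Delta P$ guard halts the absent-cushion run before the damaging force is
reached, while `find_violation(False)` returns the unsafe state
`(8, False, False)`. A real TLA+ model would replace the invented sensor
model with the actual controller logic.
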
## AI Use Statement
I used ChatGPT to help me brainstorm and refine the organization and wording of my report, including rephrasing complex paragraphs for clarity, as well as rephrasing my first assignment for conciseness. All final content decisions, technical details, and edits were made by me.