PWR-HYBRID-3/journal/entries/2026-04-17-controllers-linear-reach.tex
Dane Sabo 7a1023e252 tight-entry heatup PJ: ALL 6 inv1_holds halfspaces discharged
Second heatup PJ probe with tightened X_entry (T_c width 6K vs
baseline 14K) gives:

  T=60s:  5710 sets in 101s — T_c envelope [281.05, 291.0] 
  T=300s: 12932 sets in 206s — T_c envelope [281.05, 291.0] 

T_c envelope STABLE (identical at 60s and 300s) — the tube reached
steady shape and stopped growing. Low-T_avg trip (280) cleared at
lower bound 281.05, ~1K margin.

**First sound nonlinear reach-avoid proof for any mode of this plant:**
for the tightened entry and T = 300s, every inv1_holds halfspace
holds along the tube.  Sound w.r.t. PJ dynamics (<= 0.1% error vs
full state).

The baseline wider-entry run was loose on T_c low bound (272.4),
confirming that the looseness was entry-box-width driven (14K too
wide for TMJets + orderQ=2) rather than intrinsic to the method.
Entry splitting / refinement is the path to the full baseline set.

Also: LaTeX preamble now has the unicode-to-math literate map
attached to the listing STYLES themselves (not just global \lstset),
so terminal-output listings pasted from Julia with Δ, ≥, °,  etc.
render correctly.  Journal 34 pages, clean build.

OVERNIGHT_NOTES.md updated with tight-entry win.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 15:01:13 -04:00

682 lines
30 KiB
TeX

% ---------------------------------------------------------------------------
% 2026-04-17 --- Controllers + linearization + operation-mode linear reach
% Deep / A-style invention-log entry.
% ---------------------------------------------------------------------------
\session{2026-04-17}{full afternoon, $\sim$5 hr}{Build out the three
missing DRC controllers (shutdown, heatup, scram), add a numerical
linearization, design an LQR, and discharge the operation-mode safety
obligation with a hand-rolled zonotope reach. Try a Lyapunov-ellipsoid
barrier certificate; find it fundamentally too coarse for this plant.}
\section{2026-04-17 --- Controllers, linearization, operation-mode reach}
\label{sec:20260417}
\subsection*{Starting state}
Coming into this session the repository had:
\begin{itemize}
\item A 10-state PKE + thermal-hydraulics plant model
(\texttt{plant-model/pke\_th\_rhs.m}) and its parameters
(\texttt{plant-model/pke\_params.m}).
\item Two controllers: \texttt{ctrl\_null.m} (u=0 baseline) and
\texttt{ctrl\_operation.m} (plain P on $T_{\mathrm{avg}}$).
\item A FRET synthesis pipeline producing a discrete controller
(DRC) with four modes (\texttt{q\_shutdown}, \texttt{q\_heatup},
\texttt{q\_operation}, \texttt{q\_scram}) and seven transitions.
\item An empty \texttt{reachability/} directory. The thesis calls
for per-mode reach analysis; nothing has been done yet.
\end{itemize}
\subsection*{Goal}
Move the reachability thrust from zero to a first working artifact.
To do that, the continuous side needs all four modes' controllers
built out (can't reach-analyze a mode with no controller), a
linearization tool (for LQR design and for linear reach set
propagation), and a choice of reach tool. Build all that, then
discharge one mode's safety obligation.
\subsection*{Plant-model recap (what we're controlling)}
The plant is a lumped point-kinetic reactor coupled to a three-node
thermal loop. State vector $x \in \mathbb{R}^{10}$:
$$x = \bigl[\,n,\; C_1,\; C_2,\; C_3,\; C_4,\; C_5,\; C_6,\; T_f,\;
T_c,\; T_{\mathrm{cold}}\,\bigr]^\top$$
with $n$ normalized power, $C_i$ the six delayed-neutron precursor
concentrations, $T_f$ fuel temperature, $T_c$ average coolant
temperature, $T_{\mathrm{cold}}$ cold-leg coolant temperature.
Hot-leg temperature is algebraic: $T_{\mathrm{hot}} = 2T_c - T_{\mathrm{cold}}$
(linear coolant profile).
\begin{derivation}
The full RHS $\dot x = f(x, u, w)$ is:
\begin{align*}
\rho &= u + \alpha_f(T_f - T_{f0}) + \alpha_c(T_c - T_{c0}) \\
\dot n &= \frac{\rho - \beta}{\Lambda} n + \sum_{i=1}^{6} \lambda_i C_i \\
\dot C_i &= \frac{\beta_i}{\Lambda} n - \lambda_i C_i \quad (i = 1,\dots,6) \\
\dot T_f &= \frac{P_0 n - hA(T_f - T_c)}{M_f c_f} \\
\dot T_c &= \frac{hA(T_f - T_c) - 2 W c_c (T_c - T_{\mathrm{cold}})}{M_c c_c} \\
\dot T_{\mathrm{cold}} &= \frac{W c_c (T_{\mathrm{hot}} - T_{\mathrm{cold}}) - Q_{\mathrm{sg}}(t)}{M_{\mathrm{sg}} c_c}
\end{align*}
$u$ is external rod-worth reactivity (the control input).
$Q_{\mathrm{sg}}(t)$ is the steam-generator heat-removal disturbance.
All other symbols are plant constants (see
\texttt{plant-model/pke\_params.m} for sourcing; most are representative
of a 2-loop Westinghouse-class PWR).
\end{derivation}
\subsection*{Part 1: The three new controllers}
The DRC has four modes; we had only one mode's continuous controller
written. Need to fill in shutdown, heatup, and scram.
\subsubsection*{Shutdown and scram --- trivial, picked for physical defensibility}
Both modes are passive: the reactor is parked deeply subcritical, rods
are fully in, no feedback control is applied. Each is a one-liner.
\begin{lstlisting}[language=Matlab,caption={\texttt{plant-model/controllers/ctrl\_shutdown.m}}]
u = -5 * plant.beta; % ~5 $ subcritical
\end{lstlisting}
\begin{lstlisting}[language=Matlab,caption={\texttt{plant-model/controllers/ctrl\_scram.m}}]
u = -8 * plant.beta; % ~8 $, typical regulatory 8-10 $ rod worth
\end{lstlisting}
\begin{decision}
Constants, not feedback laws. Rationale:
\begin{itemize}
\item Real shutdown/scram behavior is ``rods fully inserted, worth
$\sim$5--10~\$, no further action.'' A feedback law would be
un-physical.
\item Constants are trivial to reach-analyze. Closed-loop matrix
$A_{\mathrm{cl}} = A$ (controller adds nothing to dynamics).
\item At cold-start conditions ($T = T_{\mathrm{cold0}} \approx 290\,^\circ\mathrm{C}$),
the linearized feedback from $\alpha_f, \alpha_c$ is about
$+2.79\,\$$ of ``wants-to-be-supercritical'' reactivity. So
$u = -5\,\$$ gives total $\rho \approx -2.2\,\$$, comfortably
subcritical with margin for uncertainty. $-8\,\$$ on scram
gives $\sim$$-5\,\$$ total --- conservative.
\item We could tighten to $-3\,\$$ and still be subcritical at
warm temperatures, but the thesis wants a safety margin
robust to any plausible state.
\end{itemize}
\end{decision}
\subsubsection*{Heatup --- the interesting controller}
Heatup is the one transition mode that does real control work: drive
$T_{\mathrm{avg}}$ from hot-standby conditions up to operating
conditions while keeping the reactor bounded. This one needs design.
The structure I landed on:
\begin{equation}
u(t, x) = \mathrm{sat}\Bigl(u_{\mathrm{ff}}(x) + K_p\bigl(T_{\mathrm{ref}}(t) - T_c\bigr),\; u_{\min},\; u_{\max}\Bigr)
\end{equation}
with three ingredients:
\textbf{Feedback linearization.} Cancel the intrinsic temperature
feedback exactly:
\begin{equation}
u_{\mathrm{ff}}(x) = -\alpha_f(T_f - T_{f0}) - \alpha_c(T_c - T_{c0})
\end{equation}
so that after substitution into the $\rho$ equation,
\begin{equation}
\rho_{\mathrm{total}} = u_{\mathrm{ff}} + K_p(T_{\mathrm{ref}} - T_c) + \underbrace{\alpha_f(T_f - T_{f0}) + \alpha_c(T_c - T_{c0})}_{\text{cancels } u_{\mathrm{ff}}}
= K_p \cdot e
\end{equation}
where $e = T_{\mathrm{ref}} - T_c$. The cancellation is exact only if
the controller's $(\alpha_f, \alpha_c)$ match the plant's. This is the
seat of the controller's robustness risk; see \cref{sub:alpha-drift}.
\textbf{Ramped reference.}
\begin{equation}
T_{\mathrm{ref}}(t) = \min\bigl(T_{\mathrm{start}} + r \cdot t,\; T_{\mathrm{target}}\bigr)
\end{equation}
with $r$ = 28~\unit{\celsius/\hour} (\num{7.78e-3}~\unit{\celsius/\second}),
the tech-spec heatup rate. Clamped at $T_{\mathrm{target}}$ once the
ramp reaches operating conditions.
\textbf{Saturation on $u$.} Cap $|u|$ at $0.5\beta$ on the positive
side and $5\beta$ on the negative side. The upper cap keeps
$\rho < \beta$ (the prompt-criticality line) under any plausible
state, which bounds the neutron-kinetics excursion rate for later
reach analysis.
\begin{deadend}
\textbf{First-pass heatup controller, before saturation was added.}
The initial design was just feedback-linearization + P with no
saturation, starting from an initial condition $n = 10^{-6}$ (decay-heat
level) at $T_{\mathrm{cold0}} = 290\,^\circ\mathrm{C}$ everywhere.
What went wrong: at that cold IC, the feedback terms give about
$+2.79\,\$$ of positive reactivity. With $u_{\mathrm{ff}}$ cancelling
those, $u = K_p \cdot e$ with $K_p = \num{1e-4}$. Error grows as
$T_{\mathrm{ref}}$ ramps away from $T_c$ (which is still cold because
the reactor has no power to heat it yet). Reactivity builds slowly,
but because $n$ is tiny, power is tiny; the system takes
\textbf{about 20 minutes} to generate enough power to heat the coolant,
during which $T_c$ stays at 290~\unit{\celsius} while $T_{\mathrm{ref}}$
ramps to 297. Once power catches up, the large accumulated error
produces a power spike (visible in the first-pass figure: $n$ peaks
near $0.018 = 1.8\%$), and the thermal mass overshoots
$T_{\mathrm{target}}$ by about 7~\unit{\celsius} before the controller
pulls it back.
Why it doesn't actually match physics: real plant heatup doesn't
start from $n \approx 0$. Operators take the reactor critical at
low power \emph{in shutdown mode}, by slowly withdrawing rods, then
the DRC transitions to heatup with $n \sim 0.1\,\%$ already
established. Two fixes landed:
\begin{enumerate}
\item Saturation on $u$: prevents the accumulated-error spike.
\item Start heatup sims from a ``critical-at-low-power'' IC
($n = 10^{-3}$). \texttt{main\_mode\_sweep.m} uses this IC.
\end{enumerate}
The feedback-linearization controller alone doesn't know any of this
--- it just does what the math says. The fixes are a controller design
change (saturation) and an IC assumption. A fancier heatup controller
would also include ramp-rate feedforward, but we don't need it yet.
\end{deadend}
\begin{decision}
\textbf{No integrator.} We stay with P-only (no PI), even knowing
there will be structural ramp-tracking lag on the order of
$r/K_p/\omega_{\mathrm{thermal}}$. Rationale:
\begin{itemize}
\item The plant already has strong integrators via thermal mass
($M_c c_c$, $M_{\mathrm{sg}} c_c$). PI would double-integrate.
\item PI adds state, turning the 10-state reach problem into 11 states.
Modest cost by itself.
\item The dealbreaker: \textbf{anti-windup saturation makes PI a
hybrid system}. Each saturation region (unsat / low-sat /
high-sat) is its own continuous mode with its own dynamics.
Our outer hybrid automaton is the DRC; nesting an inner hybrid
system inside each DRC mode doubles the verification
complexity.
\item The DRC exits heatup on a predicate window
(\texttt{t\_avg\_in\_range} $\wedge$ \texttt{p\_above\_crit}),
not on zero steady-state error. So the structural lag is
acceptable.
\end{itemize}
\end{decision}
Final controller (after iteration) lives in
\texttt{plant-model/controllers/ctrl\_heatup.m}. The relevant body:
\begin{lstlisting}[language=Matlab,caption={\texttt{ctrl\_heatup.m} body (abbreviated)}]
Kp = 1e-4;
T_f = x(8); T_c = x(9); T_avg = T_c;
u_ff = -plant.alpha_f*(T_f - plant.T_f0) ...
-plant.alpha_c*(T_avg - plant.T_c0);
T_ref = min(ref.T_start + ref.ramp_rate * t, ref.T_target);
e = T_ref - T_avg;
u_unsat = u_ff + Kp * e;
u = min(max(u_unsat, -5*plant.beta), 0.5*plant.beta);
\end{lstlisting}
\subsection*{Part 2: Linearization}
We need $(A, B, B_w)$ matrices for (a) LQR gain synthesis and (b)
linear reach set propagation.
\begin{derivation}
Central finite differences on $f(x, u, w)$ at a chosen operating
point $(x^\star, u^\star, w^\star)$:
\begin{align*}
A_{:,k} &\approx \frac{f(x^\star + h_k e_k, u^\star, w^\star) - f(x^\star - h_k e_k, u^\star, w^\star)}{2 h_k} \\
B &\approx \frac{f(x^\star, u^\star + h_u, w^\star) - f(x^\star, u^\star - h_u, w^\star)}{2 h_u} \\
B_w &\approx \frac{f(x^\star, u^\star, w^\star + h_w) - f(x^\star, u^\star, w^\star - h_w)}{2 h_w}
\end{align*}
with $h_k = \max(\epsilon_{\mathrm{rel}} |x^\star_k|,\; \epsilon_{\mathrm{abs}})$
to handle the wildly different magnitudes in $x$ (e.g.\ $n \sim 1$
vs.\ $C_1 \sim 10^2$ vs.\ $T_f \sim 3 \times 10^2$).
\end{derivation}
Validation: simulate the linear model and the nonlinear model in
parallel under a small $Q_{\mathrm{sg}}$ step and compare deviations
from $x^\star$. \Cref{fig:linearize-sanity} shows the result.
\begin{figure}[h]
\centering
\includegraphics[width=0.85\linewidth]{linearize_sanity.png}
\caption{Linear vs.\ nonlinear sanity check under a 5\% $Q_{\mathrm{sg}}$
down-step at $t = 30$~\unit{\second}, $u = 0$. Linear (red dashed)
tracks nonlinear (blue solid) within $4 \times 10^{-4}$ relative
error at 60~\unit{\second}, improving to $5 \times 10^{-6}$ by
300~\unit{\second} as the system relaxes. The match is best on $n$
(tightly coupled), loosest on $C_3$ (slow precursor). Eigenvalues of
$A$ span $[-65.93, -0.0124]$~\unit{\per\second} --- stiffness ratio
$\sim$5000. Conclusion: linearization is quantitatively trustworthy
for perturbations around $x^\star$ in a $\pm 50\,^\circ\mathrm{C}$
window.}
\label{fig:linearize-sanity}
\end{figure}
The code is a straightforward loop (see
\texttt{plant-model/pke\_linearize.m}); nothing subtle, but the
magnitude-aware step size matters --- using a uniform $h = 10^{-6}$
either loses precision on the small states or truncates on the big
ones.
\subsection*{Part 3: LQR controller for operation mode}
With $(A, B)$ in hand, design a linear quadratic regulator around the
operating point.
\begin{derivation}
Solve the continuous-time algebraic Riccati equation (CARE) for
$X \succeq 0$:
$$A^\top X + X A - X B R^{-1} B^\top X + Q = 0$$
then set gain $K = R^{-1} B^\top X$. The closed-loop control law
$u = -K(x - x^\star)$ minimizes
$$\int_0^\infty \left[(x - x^\star)^\top Q (x - x^\star) + u^\top R u\right] dt$$
and guarantees $A_{\mathrm{cl}} = A - BK$ is Hurwitz whenever $(A,B)$
is stabilizable and $Q \succeq 0$.
\end{derivation}
Weight choices:
\begin{lstlisting}[language=Matlab,caption={LQR weights, from \texttt{ctrl\_operation\_lqr.m}}]
Q = diag([1, % n
1e-3, 1e-3, 1e-3, 1e-3, 1e-3, 1e-3, % precursors C_1..C_6
1e-2, % T_f
1e2, % T_c (the primary objective)
1]); % T_cold
R = 1e6;
\end{lstlisting}
\begin{decision}
$Q(9,9) = 10^2$ for $T_c$ is the primary design knob --- heavy because
we care about the coolant average deviation. Precursor weights at
$10^{-3}$ because they're informational (not directly regulated).
$T_{\mathrm{cold}}$ at 1 because it's secondary but couples to $T_c$.
$R = 10^6$ balances so $|u|$ stays in the few-cent range --- below prompt,
above sensor noise. Ballpark numbers; we can retune if the reach tube
comes out too tight or too loose.
\end{decision}
Head-to-head against plain P on the same 100\% $\to$ 80\%
$Q_{\mathrm{sg}}$ step (\cref{fig:p-vs-lqr}):
\begin{figure}[h]
\centering
\includegraphics[width=0.85\linewidth]{mode_sweep_op_P_vs_LQR.png}
\caption{P vs.\ LQR on operation mode under a 100\%$\to$80\%
$Q_{\mathrm{sg}}$ step at $t = 30$~\unit{\second}. Plain P
(\texttt{ctrl\_operation.m}) has a 5~\unit{\degreeFahrenheit} peak
overshoot and 0.8~\unit{\degreeFahrenheit} steady-state lag. LQR
(\texttt{ctrl\_operation\_lqr.m}) holds $T_{\mathrm{avg}}$
essentially at setpoint throughout. Zero extra state in either;
LQR is just using all ten state measurements instead of one.}
\label{fig:p-vs-lqr}
\end{figure}
\subsection*{Part 4: Running all modes end-to-end}
\texttt{plant-model/main\_mode\_sweep.m} exercises all five controllers
(null, shutdown, heatup, operation-P, operation-LQR, scram) back-to-back
from mode-appropriate ICs, and saves figures. The normalized-power
overview (\cref{fig:power-overview}) captures the whole sweep:
\begin{figure}[h]
\centering
\includegraphics[width=0.95\linewidth]{mode_sweep_power_overview.png}
\caption{Normalized power $n(t)$ across the four DRC modes
plus baseline. \emph{Shutdown} (top left, log scale): $n$ decays
from $10^{-6}$ to $10^{-12}$ over 10~\unit{\minute}. \emph{Heatup}
(top right): small power burst around minute 22 as the reactor
goes critical and heats. \emph{Operation} (bottom left): $Q_{\mathrm{sg}}$
step at 30~\unit{\second} drops $n$ to 78\% briefly before settling
at 80\%. \emph{Scram} (bottom right, log scale): $n$ drops $1 \to 10^{-6}$
via prompt jump + delayed-neutron tail over 10~\unit{\minute}. All
four behave as textbook would predict.}
\label{fig:power-overview}
\end{figure}
\subsection*{Part 5: Tool choice for reach analysis}
Three candidate tools were evaluated:
\begin{itemize}
\item \textbf{CORA} (MATLAB) --- mature, stays in the existing
language, handles linear + nonlinear. Downside: $\sim$0.5~GB
install, heavy.
\item \textbf{JuliaReach} --- newer, faster for large reach sets,
rigorous Taylor-model support. Downside: requires porting
the plant model to Julia.
\item \textbf{Flow* / SpaceEx} --- C++ / no-longer-maintained,
both add toolchain friction.
\end{itemize}
\begin{decision}
For this first artifact: \textbf{write the reach tube by hand in pure
MATLAB}. Rationale: linear reach with bounded disturbance has a clean
analytic form --- matrix exponential on state, augmented-matrix integral
for the disturbance contribution --- that compiles to about 50 lines of
MATLAB. No toolbox install, no language switch. The result is a
sound box-shaped over-approximation.
If/when we need nonlinear reach, that's the point to stand up
JuliaReach. (Spoiler for the reader: this did happen in the
\texttt{2026-04-20-evening} session.)
\end{decision}
\subsection*{Part 6: The zonotope reach propagator}
For the LQR-closed-loop linear system
$\dot{\delta x} = A_{\mathrm{cl}} \delta x + B_w w$ with $\delta x(0) \in X_0$
(axis-aligned box) and $w \in [w_{\mathrm{lo}}, w_{\mathrm{hi}}]$, the
analytic solution splits into a homogeneous piece and a disturbance
integral:
\begin{derivation}
\begin{equation}
\delta x(t) = e^{A_{\mathrm{cl}} t} \delta x(0)
+ \int_0^t e^{A_{\mathrm{cl}}(t-s)} B_w w(s)\, ds.
\end{equation}
Discretize with step $\Delta t$ and let
$A_{\Delta} = e^{A_{\mathrm{cl}} \Delta t}$. The disturbance integral
over one step is
$$G_\Delta = \int_0^{\Delta t} e^{A_{\mathrm{cl}} s} B_w \, ds.$$
\textbf{Van Loan's trick (1978):} expand the matrix exponential of an
augmented matrix,
\begin{equation}
\exp\!\left(\begin{bmatrix} A & B_w \\ 0 & 0 \end{bmatrix} \Delta t\right)
= \begin{bmatrix} e^{A \Delta t} & G_\Delta \\ 0 & 1 \end{bmatrix}.
\end{equation}
Read the upper-right block --- $G_\Delta$ is exact to machine precision
without numerical quadrature.
\end{derivation}
Now propagate a box $\delta x \in [c - h,\ c + h]$ (center $c$,
halfwidth $h$) through the linear map $M$:
\begin{derivation}
If $y = M \delta x$ and $\delta x_j \in [c_j - h_j,\ c_j + h_j]$, then
$$y_i = \sum_j M_{ij} \delta x_j \quad \Longrightarrow \quad y_i \in \bigl[\,M c \mp |M| h\,\bigr]_i.$$
The $\pm$ signs of $M_{ij}$ align with opposite corners of the
input box, so the sum extremum takes the absolute value of each
coefficient times the halfwidth. Conclusion: box propagation uses
\emph{elementwise} $|M|$, not signed $M$.
\end{derivation}
\begin{deadend}
\textbf{First version of \texttt{reach\_linear.m} used signed
$A_\Delta$ for halfwidth propagation.} Result: the reach tube came
out suspiciously tight --- maximum $|T_c - T_{c0}|$ over 600~\unit{\second}
was exactly equal to the initial halfwidth, as if the disturbance
wasn't contributing at all.
Diagnosed after about 15~\unit{\minute} of confusion: when I wrote
$S_{\mathrm{half}} \leftarrow A_\Delta S_{\mathrm{half}} + G_\Delta w_{\mathrm{half}}$
with \emph{signed} $A_\Delta$, the positive and negative entries of
$A_\Delta$ across successive steps happened to cancel, so
$S_{\mathrm{half}}$ stayed near zero. The fix is to use elementwise
absolute values per the derivation above:
$$S_{\mathrm{half}} \leftarrow |A_\Delta| S_{\mathrm{half}} + |G_\Delta| w_{\mathrm{half}}.$$
Once the abs was added the disturbance halfwidth grew monotonically
as expected, and the reach tube gained a sensible envelope. This is
a case of ``the bug is easy to catch in the math, harder to catch
when you copy-pasted from the center-propagation code.''
\end{deadend}
The final propagator is the core of
\texttt{reachability/reach\_linear.m} (about 50~lines). The per-step
update is compact:
\begin{lstlisting}[language=Matlab,caption={Halfwidth-propagation core of \texttt{reach\_linear.m}}]
A_step = expm(A_cl * dt);
A_abs_step = abs(A_step);
M_aug = expm([A_cl, B_w; zeros(1, n+1)] * dt);
G_step = M_aug(1:n, n+1);
G_abs_step = abs(G_step);
for k = 2:M
x_center_now = A_step * x_center_now;
halfwidth_now = A_abs_step * halfwidth_now;
S_center = A_step * S_center + G_step * w_mid;
S_half = A_abs_step * S_half + G_abs_step * w_half;
center(:, k) = x_center_now + S_center;
R_lo(:, k) = center(:, k) - halfwidth_now - S_half;
R_hi(:, k) = center(:, k) + halfwidth_now + S_half;
end
\end{lstlisting}
\subsection*{Part 7: Operation-mode reach --- discharging the safety obligation}
Entry box around $x_{\mathrm{op}}$:
\begin{itemize}
\item $n$: $\pm 1\%$ of nominal.
\item $C_i$: $\pm 0.1\%$ of equilibrium (precursors equilibrate
fast; tight entry).
\item $T_f,\ T_c,\ T_{\mathrm{cold}}$: $\pm 0.1$~\unit{\kelvin}.
\end{itemize}
Disturbance: $Q_{\mathrm{sg}} \in [0.85 P_0,\ 1.00 P_0]$ --- a 15\%
load-down envelope (grid-following). Horizon: 600~\unit{\second}.
The reach envelope on $T_c$ is shown in \cref{fig:reach-tc}.
\begin{figure}[h]
\centering
\includegraphics[width=0.95\linewidth]{reach_operation_Tc.png}
\caption{Operation-mode reach tube on $T_{\mathrm{avg}}$, two views.
\emph{Left:} safety-band scale --- reach tube (pink) is barely visible
because LQR holds it tight against the dashed $\pm 5$~\unit{\celsius}
safety band. \emph{Right:} zoomed to reveal the actual tube.
Halfwidth at $t=0$ is 0.1~\unit{\kelvin} (the entry box);
halfwidth at $t=600$~\unit{\second} is 0.033~\unit{\kelvin}. The
tube \emph{contracts} on the regulated direction --- the signature
of tight feedback control. Max $|\delta T_c|$ over the horizon:
0.1~\unit{\kelvin} (the initial halfwidth dominates).}
\label{fig:reach-tc}
\end{figure}
Per-halfspace margins against \texttt{inv2\_holds} (the operation-mode
safety envelope):
\begin{lstlisting}[style=terminal]
=== Operation-mode reach vs inv2_holds safety limits ===
[fuel_centerline] a'x <= 1200.000 | max a'x = 328.934 | margin = +871.066 OK
[t_avg_high_trip] a'x <= 320.000 | max a'x = 308.449 | margin = +11.551 OK
[t_avg_low_trip] a'x <= -280.000 | max a'x = -308.249 | margin = +28.249 OK
[n_high_trip] a'x <= 1.150 | max a'x = 1.012 | margin = +0.138 OK
[n_low_operation] a'x <= -0.150 | max a'x = -0.842 | margin = +0.692 OK
[cold_leg_subcooled] a'x <= 305.000 | max a'x = 292.852 | margin = +12.148 OK
\end{lstlisting}
All six safety halfspaces pass. Tightest margin is
\texttt{n\_high\_trip} --- LQR lets $n$ swing up to $\sim$1.01 to
compensate for load variation, leaving 12\% margin to the high-flux
trip. Temperatures have 10--870~\unit{\kelvin} margin each.
\textbf{Operation-mode obligation discharged}, subject to the
linearization caveat in \cref{sec:limitations-17}.
Per-state reach-set evolution (final halfwidth vs.\ initial):
\begin{center}
\small
\begin{tabular}{l r r r}
\hline
state & init halfwidth & final halfwidth & ratio \\
\hline
$n$ & $0.010$ & $0.083$ & 8.3$\times$ expand \\
$C_1 \dots C_6$ & varies & varies & 160--200$\times$ expand \\
$T_f$ & 0.1~\unit{\kelvin} & 2.08~\unit{\kelvin} & 21$\times$ expand \\
$\boldsymbol{T_c}$ & $\mathbf{0.1}$~\unit{\kelvin} & $\mathbf{0.033}$~\unit{\kelvin} & $\mathbf{0.33\times}$ \textbf{contract} \\
$T_{\mathrm{cold}}$ & 0.1~\unit{\kelvin} & 1.47~\unit{\kelvin} & 15$\times$ expand \\
\hline
\end{tabular}
\end{center}
$T_c$ \emph{contracts} --- LQR drags the regulated direction toward
setpoint faster than disturbance can push it. Uncontrolled states
drift. Precursor expansion ($\sim$$200\times$) is immaterial for
safety (no predicate uses them).
\subsection*{Part 8: Lyapunov barrier attempt}
A reach tube is a numerical certificate: compute, check. A
\emph{barrier certificate} is analytic: produce a function $B(x)$ such
that $B \leq 0$ on the initial set, $B > 0$ on the unsafe set, and
$\dot B \leq 0$ when $B = 0$. Forward-invariance of the safe set
follows from the dynamics without iterating them.
For linear systems with bounded disturbance, the canonical candidate
is a Lyapunov quadratic.
\begin{derivation}
Let $V(\delta x) = \delta x^\top P \delta x$ with $P \succ 0$ solving
$A_{\mathrm{cl}}^\top P + P A_{\mathrm{cl}} + \bar Q = 0$. Along
trajectories of $\dot{\delta x} = A_{\mathrm{cl}} \delta x + B_w w$:
\begin{align}
\dot V &= -\delta x^\top \bar Q \delta x + 2 \delta x^\top P B_w w.
\end{align}
Bound each piece. The dissipation term using the generalized
eigenvalue
$\mu = \lambda_{\min}\bigl(P^{-1/2} \bar Q P^{-1/2}\bigr)$:
$$-\delta x^\top \bar Q \delta x \leq -\mu V.$$
The disturbance kick, Cauchy-Schwarz in the $P$-metric:
$$2 \delta x^\top P B_w w \leq 2 \bar w \sqrt{B_w^\top P B_w}\sqrt{V}.$$
Combining:
$$\dot V \leq -\mu V + 2 \bar w \sqrt{B_w^\top P B_w}\sqrt{V}.$$
Set $s = \sqrt{V}$, requiring $\dot s \leq 0$:
$$s \geq \frac{2 \bar w \sqrt{B_w^\top P B_w}}{\mu}
\quad \Longleftrightarrow \quad V \geq \left(\frac{2 \bar w \sqrt{B_w^\top P B_w}}{\mu}\right)^2 =: c_{\mathrm{inv}}.$$
The sub-level set $\{ V \leq c_{\mathrm{inv}} \}$ is forward-invariant
under any admissible disturbance.
\end{derivation}
So the recipe is: solve Lyapunov, compute $c_{\mathrm{inv}}$, compute
the worst-case deviation of $T_c$ (or any linear functional of state)
on the resulting ellipsoid, and check whether it fits inside the
safety slab.
\begin{derivation}
Max of a linear functional $a^\top \delta x$ over the ellipsoid
$\{\delta x : \delta x^\top P \delta x \leq \gamma\}$:
$$\max a^\top \delta x = \sqrt{\gamma \cdot a^\top P^{-1} a}.$$
(Lagrange multipliers: the maximum is achieved at
$\delta x = \sqrt{\gamma / a^\top P^{-1} a} \cdot P^{-1} a$.)
\end{derivation}
\textbf{Result:} the best quadratic barrier --- across a sweep of
$\bar Q(9,9)$ from 10 to $10^6$ --- allows max $|T_c - T_{c0}| \approx 33$~\unit{\kelvin},
more than six times the 5~\unit{\kelvin} safety band. On the hard
safety halfspaces (\texttt{inv2\_holds}), it says $n$ could deviate by
$\pm 1242$~$\times$ nominal --- physically meaningless.
\begin{limitation}
\textbf{The quadratic Lyapunov barrier fails on this plant. This is
a structural finding, not a tuning failure.}
The safety spec is anisotropic: a thin slab $|T_c| \leq 5$~\unit{\kelvin}
in a 10-D state space where $T_c$ is one direction and the other nine
are precursors / fuel temperature / cold-leg. Any
quadratic $V$ yields axis-uniform ellipsoidal level sets. To contain
the disturbance-driven invariant set, the ellipsoid must be \emph{fat
in the slow directions} (precursors, where Lyapunov dissipation is
weakest). Projecting that fat ellipsoid back onto $T_c$ yields a
range much larger than the physical $T_c$ behavior.
Confirmed the issue is not controller tuning by running the barrier
open-loop (no LQR): the bound becomes $\pm 27{,}000{,}000\,n$. LQR
helps by $\sim$20{,}000$\times$, but still leaves a 3-4 order-of-magnitude
gap to usability. The ceiling is plant anisotropy: prompt-neutron
timescale $\Lambda = 10^{-4}$~\unit{\second} vs.\ thermal timescales
$\sim$10--100~\unit{\second}, a factor-of-$10^4$ spread that forces
$P$ to be ill-conditioned regardless of controller design.
\textbf{Fix:} non-quadratic barrier. Two candidate approaches:
\begin{itemize}
\item \textbf{Polytopic barrier:} $B(\delta x) = \max_i (a_i^\top \delta x - b_i)$.
Piecewise-linear, can match a slab shape directly.
Requires polyhedral-set tooling.
\item \textbf{SOS polynomial barrier:} $B$ a polynomial of
degree $\geq 4$, coefficients found by sum-of-squares
optimization. Requires an SOS solver (Mosek, SDPT3).
\end{itemize}
Neither is implemented; both are in-scope for the thesis chapter.
The quadratic-Lyapunov failure, stated quantitatively, is the clean
motivation.
\end{limitation}
\subsection*{Session result}
Artifacts landing, in order:
\begin{enumerate}
\item \texttt{plant-model/controllers/ctrl\_shutdown.m},
\texttt{ctrl\_scram.m}, \texttt{ctrl\_heatup.m} (three new
controllers).
\item \texttt{plant-model/pke\_linearize.m} and
\texttt{plant-model/test\_linearize.m}.
\item \texttt{plant-model/controllers/ctrl\_operation\_lqr.m}.
\item \texttt{plant-model/main\_mode\_sweep.m}.
\item \texttt{reachability/reach\_linear.m} and
\texttt{reachability/reach\_operation.m}.
\item \texttt{reachability/barrier\_lyapunov.m}.
\item Eleven figures in \texttt{docs/figures/}.
\end{enumerate}
Key claims established:
\begin{itemize}
\item LQR closed-loop operation mode satisfies the safety obligation
under $\pm 15\%$ $Q_{\mathrm{sg}}$ variation (approximate,
via linear reach).
\item Quadratic Lyapunov barrier insufficient on this plant
(open-loop or closed-loop). Need polytopic or SOS.
\item LQR adds zero state, dominates plain P by $\sim 5\times$
on tracking.
\end{itemize}
\subsection*{Limitations recorded at close}
\label{sec:limitations-17}
\begin{limitation}
All reach results are \emph{approximate}, not sound: they are reach
tubes of the \emph{linearized} closed-loop. The linearization error
--- the gap between $f_{\mathrm{nl}}(x, u)$ and $(A x + B u)$ --- is not
propagated into the tube. For a real safety claim, either
(a)~nonlinear reach directly or (b)~linear reach plus Taylor-remainder
inflation. Neither is done.
\end{limitation}
\begin{limitation}
\label{sub:alpha-drift}
The feedback-linearization in \texttt{ctrl\_heatup} assumes exact
$\alpha_f, \alpha_c$. Real PWRs have $\alpha$ drift: $\sim$20\%
over burnup; $\sim$10$\times$ across soluble-boron dilution; xenon
transients worth 2--3~\$. Robust version would treat $\alpha$ as
parametric interval and propagate through reach.
\end{limitation}
\begin{limitation}
Saturation in \texttt{ctrl\_heatup} is formally a 3-mode piecewise-affine
sub-system. Operation-mode LQR is dormant-saturated (trivially); a
rigorous heatup reach would need explicit region decomposition or a
dormancy proof.
\end{limitation}
\begin{limitation}
The model's $\alpha_f, \alpha_c$ are linearized at the hot full-power
operating point. Trust region is $\pm 50$~\unit{\celsius} around $x^\star$.
True cold shutdown ($\sim$50~\unit{\celsius}) is out of scope. What
the DRC calls \texttt{q\_shutdown} we interpret as hot standby
($\sim$290~\unit{\celsius}).
\end{limitation}
\subsection*{Open at close}
\begin{itemize}
\item Nonlinear reach, at minimum on one mode, to retire the
soundness asterisk.
\item Polytopic or SOS barrier to retire the analytic-certificate
asterisk.
\item Parametric $\alpha$ uncertainty in the reach machinery.
\item Heatup reach --- ramped reference needs LTV or nonlinear
treatment.
\item Shutdown + scram reach (trivial forever-invariance /
bounded-time respectively, but not yet done).
\end{itemize}