PWR-HYBRID-3/journal/entries/2026-04-17-controllers-linear-reach.tex

% ---------------------------------------------------------------------------
% 2026-04-17 --- Controllers + linearization + operation-mode linear reach
% Deep / A-style invention-log entry.
% ---------------------------------------------------------------------------

\session{2026-04-17}{full afternoon, $\sim$5 hr}{Build out the three
missing DRC controllers (shutdown, heatup, scram), add a numerical
linearization, design an LQR, and discharge the operation-mode safety
obligation with a hand-rolled zonotope reach.  Try a Lyapunov-ellipsoid
barrier certificate; find it fundamentally too coarse for this plant.}

\section{2026-04-17 --- Controllers, linearization, operation-mode reach}
\label{sec:20260417}

\subsection*{Starting state}

Coming into this session the repository had:
\begin{itemize}
  \item A 10-state PKE + thermal-hydraulics plant model
        (\texttt{plant-model/pke\_th\_rhs.m}) and its parameters
        (\texttt{plant-model/pke\_params.m}).
  \item Two controllers: \texttt{ctrl\_null.m} (u=0 baseline) and
        \texttt{ctrl\_operation.m} (plain P on $T_{\mathrm{avg}}$).
  \item A FRET synthesis pipeline producing a discrete controller
        (DRC) with four modes (\texttt{q\_shutdown}, \texttt{q\_heatup},
        \texttt{q\_operation}, \texttt{q\_scram}) and seven transitions.
  \item An empty \texttt{reachability/} directory.  The thesis calls
        for per-mode reach analysis; nothing has been done yet.
\end{itemize}

\subsection*{Goal}

Move the reachability thrust from zero to a first working artifact.
To do that, the continuous side needs all four modes' controllers
built out (can't reach-analyze a mode with no controller), a
linearization tool (for LQR design and for linear reach set
propagation), and a choice of reach tool.  Build all that, then
discharge one mode's safety obligation.

\subsection*{Plant-model recap (what we're controlling)}

The plant is a lumped point-kinetic reactor coupled to a three-node
thermal loop.  State vector $x \in \mathbb{R}^{10}$:
$$x = \bigl[\,n,\; C_1,\; C_2,\; C_3,\; C_4,\; C_5,\; C_6,\; T_f,\;
           T_c,\; T_{\mathrm{cold}}\,\bigr]^\top$$
with $n$ normalized power, $C_i$ the six delayed-neutron precursor
concentrations, $T_f$ fuel temperature, $T_c$ average coolant
temperature, $T_{\mathrm{cold}}$ cold-leg coolant temperature.
Hot-leg temperature is algebraic: $T_{\mathrm{hot}} = 2T_c - T_{\mathrm{cold}}$
(linear coolant profile).

\begin{derivation}
The full RHS $\dot x = f(x, u, w)$ is:
\begin{align*}
\rho &= u + \alpha_f(T_f - T_{f0}) + \alpha_c(T_c - T_{c0}) \\
\dot n &= \frac{\rho - \beta}{\Lambda} n + \sum_{i=1}^{6} \lambda_i C_i \\
\dot C_i &= \frac{\beta_i}{\Lambda} n - \lambda_i C_i \quad (i = 1,\dots,6) \\
\dot T_f &= \frac{P_0 n - hA(T_f - T_c)}{M_f c_f} \\
\dot T_c &= \frac{hA(T_f - T_c) - 2 W c_c (T_c - T_{\mathrm{cold}})}{M_c c_c} \\
\dot T_{\mathrm{cold}} &= \frac{W c_c (T_{\mathrm{hot}} - T_{\mathrm{cold}}) - Q_{\mathrm{sg}}(t)}{M_{\mathrm{sg}} c_c}
\end{align*}
$u$ is external rod-worth reactivity (the control input).
$Q_{\mathrm{sg}}(t)$ is the steam-generator heat-removal disturbance.
All other symbols are plant constants (see
\texttt{plant-model/pke\_params.m} for sourcing; most are representative
of a 2-loop Westinghouse-class PWR).
\end{derivation}

\subsection*{Part 1: The three new controllers}

The DRC has four modes; we had only one mode's continuous controller
written.  Need to fill in shutdown, heatup, and scram.

\subsubsection*{Shutdown and scram --- trivial, picked for physical defensibility}

Both modes are passive: the reactor is parked deeply subcritical, rods
are fully in, no feedback control is applied.  Each is a one-liner.

\begin{lstlisting}[language=Matlab,caption={\texttt{plant-model/controllers/ctrl\_shutdown.m}}]
u = -5 * plant.beta;   % ~5 $ subcritical
\end{lstlisting}

\begin{lstlisting}[language=Matlab,caption={\texttt{plant-model/controllers/ctrl\_scram.m}}]
u = -8 * plant.beta;   % ~8 $, typical regulatory 8-10 $ rod worth
\end{lstlisting}

\begin{decision}
Constants, not feedback laws.  Rationale:
\begin{itemize}
  \item Real shutdown/scram behavior is ``rods fully inserted, worth
        $\sim$5--10~\$, no further action.''  A feedback law would be
        un-physical.
  \item Constants are trivial to reach-analyze.  Closed-loop matrix
        $A_{\mathrm{cl}} = A$ (controller adds nothing to dynamics).
  \item At cold-start conditions ($T = T_{\mathrm{cold0}} \approx 290\,^\circ\mathrm{C}$),
        the linearized feedback from $\alpha_f, \alpha_c$ is about
        $+2.79\,\$$ of ``wants-to-be-supercritical'' reactivity.  So
        $u = -5\,\$$ gives total $\rho \approx -2.2\,\$$, comfortably
        subcritical with margin for uncertainty.  $-8\,\$$ on scram
        gives $\sim$$-5\,\$$ total --- conservative.
  \item We could tighten to $-3\,\$$ and still be subcritical at
        warm temperatures, but the thesis wants a safety margin
        robust to any plausible state.
\end{itemize}
\end{decision}

\subsubsection*{Heatup --- the interesting controller}

Heatup is the one transition mode that does real control work: drive
$T_{\mathrm{avg}}$ from hot-standby conditions up to operating
conditions while keeping the reactor bounded.  This one needs design.

The structure I landed on:
\begin{equation}
u(t, x) = \mathrm{sat}\Bigl(u_{\mathrm{ff}}(x) + K_p\bigl(T_{\mathrm{ref}}(t) - T_c\bigr),\; u_{\min},\; u_{\max}\Bigr)
\end{equation}
with three ingredients:

\textbf{Feedback linearization.}  Cancel the intrinsic temperature
feedback exactly:
\begin{equation}
u_{\mathrm{ff}}(x) = -\alpha_f(T_f - T_{f0}) - \alpha_c(T_c - T_{c0})
\end{equation}
so that after substitution into the $\rho$ equation,
\begin{equation}
\rho_{\mathrm{total}} = u_{\mathrm{ff}} + K_p(T_{\mathrm{ref}} - T_c) + \underbrace{\alpha_f(T_f - T_{f0}) + \alpha_c(T_c - T_{c0})}_{\text{cancels } u_{\mathrm{ff}}}
= K_p \cdot e
\end{equation}
where $e = T_{\mathrm{ref}} - T_c$.  The cancellation is exact only if
the controller's $(\alpha_f, \alpha_c)$ match the plant's.  This is the
seat of the controller's robustness risk; see \cref{sub:alpha-drift}.

\textbf{Ramped reference.}
\begin{equation}
T_{\mathrm{ref}}(t) = \min\bigl(T_{\mathrm{start}} + r \cdot t,\; T_{\mathrm{target}}\bigr)
\end{equation}
with $r$ = 28~\unit{\celsius/\hour} (\num{7.78e-3}~\unit{\celsius/\second}),
the tech-spec heatup rate.  Clamped at $T_{\mathrm{target}}$ once the
ramp reaches operating conditions.

\textbf{Saturation on $u$.}  Cap $|u|$ at $0.5\beta$ on the positive
side and $5\beta$ on the negative side.  The upper cap keeps
$\rho < \beta$ (the prompt-criticality line) under any plausible
state, which bounds the neutron-kinetics excursion rate for later
reach analysis.

\begin{deadend}
\textbf{First-pass heatup controller, before saturation was added.}
The initial design was just feedback-linearization + P with no
saturation, starting from an initial condition $n = 10^{-6}$ (decay-heat
level) at $T_{\mathrm{cold0}} = 290\,^\circ\mathrm{C}$ everywhere.

What went wrong: at that cold IC, the feedback terms give about
$+2.79\,\$$ of positive reactivity.  With $u_{\mathrm{ff}}$ cancelling
those, $u = K_p \cdot e$ with $K_p = \num{1e-4}$.  Error grows as
$T_{\mathrm{ref}}$ ramps away from $T_c$ (which is still cold because
the reactor has no power to heat it yet).  Reactivity builds slowly,
but because $n$ is tiny, power is tiny; the system takes
\textbf{about 20 minutes} to generate enough power to heat the coolant,
during which $T_c$ stays at 290~\unit{\celsius} while $T_{\mathrm{ref}}$
ramps to 297.  Once power catches up, the large accumulated error
produces a power spike (visible in the first-pass figure: $n$ peaks
near $0.018 = 1.8\%$), and the thermal mass overshoots
$T_{\mathrm{target}}$ by about 7~\unit{\celsius} before the controller
pulls it back.

Why it doesn't actually match physics: real plant heatup doesn't
start from $n \approx 0$.  Operators take the reactor critical at
low power \emph{in shutdown mode}, by slowly withdrawing rods, then
the DRC transitions to heatup with $n \sim 0.1\,\%$ already
established.  Two fixes landed:
\begin{enumerate}
  \item Saturation on $u$: prevents the accumulated-error spike.
  \item Start heatup sims from a ``critical-at-low-power'' IC
        ($n = 10^{-3}$).  \texttt{main\_mode\_sweep.m} uses this IC.
\end{enumerate}

The feedback-linearization controller alone doesn't know any of this
--- it just does what the math says.  The fixes are a controller design
change (saturation) and an IC assumption.  A fancier heatup controller
would also include ramp-rate feedforward, but we don't need it yet.
\end{deadend}

\begin{decision}
\textbf{No integrator.}  We stay with P-only (no PI), even knowing
there will be structural ramp-tracking lag on the order of
$r/K_p/\omega_{\mathrm{thermal}}$.  Rationale:
\begin{itemize}
  \item The plant already has strong integrators via thermal mass
        ($M_c c_c$, $M_{\mathrm{sg}} c_c$).  PI would double-integrate.
  \item PI adds state, turning the 10-state reach problem into 11 states.
        Modest cost by itself.
  \item The dealbreaker: \textbf{anti-windup saturation makes PI a
        hybrid system}.  Each saturation region (unsat / low-sat /
        high-sat) is its own continuous mode with its own dynamics.
        Our outer hybrid automaton is the DRC; nesting an inner hybrid
        system inside each DRC mode doubles the verification
        complexity.
  \item The DRC exits heatup on a predicate window
        (\texttt{t\_avg\_in\_range} $\wedge$ \texttt{p\_above\_crit}),
        not on zero steady-state error.  So the structural lag is
        acceptable.
\end{itemize}
\end{decision}

Final controller (after iteration) lives in
\texttt{plant-model/controllers/ctrl\_heatup.m}.  The relevant body:

\begin{lstlisting}[language=Matlab,caption={\texttt{ctrl\_heatup.m} body (abbreviated)}]
Kp = 1e-4;
T_f   = x(8); T_c = x(9); T_avg = T_c;
u_ff  = -plant.alpha_f*(T_f - plant.T_f0) ...
       -plant.alpha_c*(T_avg - plant.T_c0);
T_ref = min(ref.T_start + ref.ramp_rate * t, ref.T_target);
e     = T_ref - T_avg;
u_unsat = u_ff + Kp * e;
u = min(max(u_unsat, -5*plant.beta), 0.5*plant.beta);
\end{lstlisting}

\subsection*{Part 2: Linearization}

We need $(A, B, B_w)$ matrices for (a) LQR gain synthesis and (b)
linear reach set propagation.

\begin{derivation}
Central finite differences on $f(x, u, w)$ at a chosen operating
point $(x^\star, u^\star, w^\star)$:
\begin{align*}
A_{:,k} &\approx \frac{f(x^\star + h_k e_k, u^\star, w^\star) - f(x^\star - h_k e_k, u^\star, w^\star)}{2 h_k} \\
B &\approx \frac{f(x^\star, u^\star + h_u, w^\star) - f(x^\star, u^\star - h_u, w^\star)}{2 h_u} \\
B_w &\approx \frac{f(x^\star, u^\star, w^\star + h_w) - f(x^\star, u^\star, w^\star - h_w)}{2 h_w}
\end{align*}
with $h_k = \max(\epsilon_{\mathrm{rel}} |x^\star_k|,\; \epsilon_{\mathrm{abs}})$
to handle the wildly different magnitudes in $x$ (e.g.\ $n \sim 1$
vs.\ $C_1 \sim 10^2$ vs.\ $T_f \sim 3 \times 10^2$).
\end{derivation}

Validation: simulate the linear model and the nonlinear model in
parallel under a small $Q_{\mathrm{sg}}$ step and compare deviations
from $x^\star$.  \Cref{fig:linearize-sanity} shows the result.

\begin{figure}[h]
  \centering
  \includegraphics[width=0.85\linewidth]{linearize_sanity.png}
  \caption{Linear vs.\ nonlinear sanity check under a 5\% $Q_{\mathrm{sg}}$
  down-step at $t = 30$~\unit{\second}, $u = 0$.  Linear (red dashed)
  tracks nonlinear (blue solid) within $4 \times 10^{-4}$ relative
  error at 60~\unit{\second}, improving to $5 \times 10^{-6}$ by
  300~\unit{\second} as the system relaxes.  The match is best on $n$
  (tightly coupled), loosest on $C_3$ (slow precursor).  Eigenvalues of
  $A$ span $[-65.93, -0.0124]$~\unit{\per\second} --- stiffness ratio
  $\sim$5000.  Conclusion: linearization is quantitatively trustworthy
  for perturbations around $x^\star$ in a $\pm 50\,^\circ\mathrm{C}$
  window.}
  \label{fig:linearize-sanity}
\end{figure}

The code is a straightforward loop (see
\texttt{plant-model/pke\_linearize.m}); nothing subtle, but the
magnitude-aware step size matters --- using a uniform $h = 10^{-6}$
either loses precision on the small states or truncates on the big
ones.

\subsection*{Part 3: LQR controller for operation mode}

With $(A, B)$ in hand, design a linear quadratic regulator around the
operating point.

\begin{derivation}
Solve the continuous-time algebraic Riccati equation (CARE) for
$X \succeq 0$:
$$A^\top X + X A - X B R^{-1} B^\top X + Q = 0$$
then set gain $K = R^{-1} B^\top X$.  The closed-loop control law
$u = -K(x - x^\star)$ minimizes
$$\int_0^\infty \left[(x - x^\star)^\top Q (x - x^\star) + u^\top R u\right] dt$$
and guarantees $A_{\mathrm{cl}} = A - BK$ is Hurwitz whenever $(A,B)$
is stabilizable and $Q \succeq 0$.
\end{derivation}

Weight choices:
\begin{lstlisting}[language=Matlab,caption={LQR weights, from \texttt{ctrl\_operation\_lqr.m}}]
Q = diag([1,                                % n
          1e-3, 1e-3, 1e-3, 1e-3, 1e-3, 1e-3,  % precursors C_1..C_6
          1e-2,                              % T_f
          1e2,                               % T_c  (the primary objective)
          1]);                               % T_cold
R = 1e6;
\end{lstlisting}

\begin{decision}
$Q(9,9) = 10^2$ for $T_c$ is the primary design knob --- heavy because
we care about the coolant average deviation.  Precursor weights at
$10^{-3}$ because they're informational (not directly regulated).
$T_{\mathrm{cold}}$ at 1 because it's secondary but couples to $T_c$.
$R = 10^6$ balances so $|u|$ stays in the few-cent range --- below prompt,
above sensor noise.  Ballpark numbers; we can retune if the reach tube
comes out too tight or too loose.
\end{decision}

Head-to-head against plain P on the same 100\% $\to$ 80\%
$Q_{\mathrm{sg}}$ step (\cref{fig:p-vs-lqr}):

\begin{figure}[h]
  \centering
  \includegraphics[width=0.85\linewidth]{mode_sweep_op_P_vs_LQR.png}
  \caption{P vs.\ LQR on operation mode under a 100\%$\to$80\%
  $Q_{\mathrm{sg}}$ step at $t = 30$~\unit{\second}.  Plain P
  (\texttt{ctrl\_operation.m}) has a 5~\unit{\degreeFahrenheit} peak
  overshoot and 0.8~\unit{\degreeFahrenheit} steady-state lag.  LQR
  (\texttt{ctrl\_operation\_lqr.m}) holds $T_{\mathrm{avg}}$
  essentially at setpoint throughout.  Zero extra state in either;
  LQR is just using all ten state measurements instead of one.}
  \label{fig:p-vs-lqr}
\end{figure}

\subsection*{Part 4: Running all modes end-to-end}

\texttt{plant-model/main\_mode\_sweep.m} exercises all five controllers
(null, shutdown, heatup, operation-P, operation-LQR, scram) back-to-back
from mode-appropriate ICs, and saves figures.  The normalized-power
overview (\cref{fig:power-overview}) captures the whole sweep:

\begin{figure}[h]
  \centering
  \includegraphics[width=0.95\linewidth]{mode_sweep_power_overview.png}
  \caption{Normalized power $n(t)$ across the four DRC modes
  plus baseline.  \emph{Shutdown} (top left, log scale): $n$ decays
  from $10^{-6}$ to $10^{-12}$ over 10~\unit{\minute}.  \emph{Heatup}
  (top right): small power burst around minute 22 as the reactor
  goes critical and heats.  \emph{Operation} (bottom left): $Q_{\mathrm{sg}}$
  step at 30~\unit{\second} drops $n$ to 78\% briefly before settling
  at 80\%.  \emph{Scram} (bottom right, log scale): $n$ drops $1 \to 10^{-6}$
  via prompt jump + delayed-neutron tail over 10~\unit{\minute}.  All
  four behave as textbook would predict.}
  \label{fig:power-overview}
\end{figure}

\subsection*{Part 5: Tool choice for reach analysis}

Three candidate tools were evaluated:
\begin{itemize}
  \item \textbf{CORA} (MATLAB) --- mature, stays in the existing
        language, handles linear + nonlinear.  Downside: $\sim$0.5~GB
        install, heavy.
  \item \textbf{JuliaReach} --- newer, faster for large reach sets,
        rigorous Taylor-model support.  Downside: requires porting
        the plant model to Julia.
  \item \textbf{Flow* / SpaceEx} --- C++ / no-longer-maintained,
        both add toolchain friction.
\end{itemize}

\begin{decision}
For this first artifact: \textbf{write the reach tube by hand in pure
MATLAB}.  Rationale: linear reach with bounded disturbance has a clean
analytic form --- matrix exponential on state, augmented-matrix integral
for the disturbance contribution --- that compiles to about 50 lines of
MATLAB.  No toolbox install, no language switch.  The result is a
sound box-shaped over-approximation.

If/when we need nonlinear reach, that's the point to stand up
JuliaReach.  (Spoiler for the reader: this did happen in the
\texttt{2026-04-20-evening} session.)
\end{decision}

\subsection*{Part 6: The zonotope reach propagator}

For the LQR-closed-loop linear system
$\dot{\delta x} = A_{\mathrm{cl}} \delta x + B_w w$ with $\delta x(0) \in X_0$
(axis-aligned box) and $w \in [w_{\mathrm{lo}}, w_{\mathrm{hi}}]$, the
analytic solution splits into a homogeneous piece and a disturbance
integral:

\begin{derivation}
\begin{equation}
\delta x(t) = e^{A_{\mathrm{cl}} t} \delta x(0)
            + \int_0^t e^{A_{\mathrm{cl}}(t-s)} B_w w(s)\, ds.
\end{equation}
Discretize with step $\Delta t$ and let
$A_{\Delta} = e^{A_{\mathrm{cl}} \Delta t}$.  The disturbance integral
over one step is
$$G_\Delta = \int_0^{\Delta t} e^{A_{\mathrm{cl}} s} B_w \, ds.$$
\textbf{Van Loan's trick (1978):} expand the matrix exponential of an
augmented matrix,
\begin{equation}
\exp\!\left(\begin{bmatrix} A & B_w \\ 0 & 0 \end{bmatrix} \Delta t\right)
= \begin{bmatrix} e^{A \Delta t} & G_\Delta \\ 0 & 1 \end{bmatrix}.
\end{equation}
Read the upper-right block --- $G_\Delta$ is exact to machine precision
without numerical quadrature.
\end{derivation}

Now propagate a box $\delta x \in [c - h,\ c + h]$ (center $c$,
halfwidth $h$) through the linear map $M$:

\begin{derivation}
If $y = M \delta x$ and $\delta x_j \in [c_j - h_j,\ c_j + h_j]$, then
$$y_i = \sum_j M_{ij} \delta x_j \quad \Longrightarrow \quad y_i \in \bigl[\,M c \mp |M| h\,\bigr]_i.$$
The $\pm$ signs of $M_{ij}$ align with opposite corners of the
input box, so the sum extremum takes the absolute value of each
coefficient times the halfwidth.  Conclusion: box propagation uses
\emph{elementwise} $|M|$, not signed $M$.
\end{derivation}

\begin{deadend}
\textbf{First version of \texttt{reach\_linear.m} used signed
$A_\Delta$ for halfwidth propagation.}  Result: the reach tube came
out suspiciously tight --- maximum $|T_c - T_{c0}|$ over 600~\unit{\second}
was exactly equal to the initial halfwidth, as if the disturbance
wasn't contributing at all.

Diagnosed after about 15~\unit{\minute} of confusion: when I wrote
$S_{\mathrm{half}} \leftarrow A_\Delta S_{\mathrm{half}} + G_\Delta w_{\mathrm{half}}$
with \emph{signed} $A_\Delta$, the positive and negative entries of
$A_\Delta$ across successive steps happened to cancel, so
$S_{\mathrm{half}}$ stayed near zero.  The fix is to use elementwise
absolute values per the derivation above:
$$S_{\mathrm{half}} \leftarrow |A_\Delta| S_{\mathrm{half}} + |G_\Delta| w_{\mathrm{half}}.$$
Once the abs was added the disturbance halfwidth grew monotonically
as expected, and the reach tube gained a sensible envelope.  This is
a case of ``the bug is easy to catch in the math, harder to catch
when you copy-pasted from the center-propagation code.''
\end{deadend}

The final propagator is the core of
\texttt{reachability/reach\_linear.m} (about 50~lines).  The per-step
update is compact:

\begin{lstlisting}[language=Matlab,caption={Halfwidth-propagation core of \texttt{reach\_linear.m}}]
A_step     = expm(A_cl * dt);
A_abs_step = abs(A_step);
M_aug      = expm([A_cl, B_w; zeros(1, n+1)] * dt);
G_step     = M_aug(1:n, n+1);
G_abs_step = abs(G_step);

for k = 2:M
    x_center_now  = A_step     * x_center_now;
    halfwidth_now = A_abs_step * halfwidth_now;
    S_center = A_step     * S_center + G_step     * w_mid;
    S_half   = A_abs_step * S_half   + G_abs_step * w_half;

    center(:, k) = x_center_now + S_center;
    R_lo(:, k)   = center(:, k) - halfwidth_now - S_half;
    R_hi(:, k)   = center(:, k) + halfwidth_now + S_half;
end
\end{lstlisting}

\subsection*{Part 7: Operation-mode reach --- discharging the safety obligation}

Entry box around $x_{\mathrm{op}}$:
\begin{itemize}
  \item $n$: $\pm 1\%$ of nominal.
  \item $C_i$: $\pm 0.1\%$ of equilibrium (precursors equilibrate
        fast; tight entry).
  \item $T_f,\ T_c,\ T_{\mathrm{cold}}$: $\pm 0.1$~\unit{\kelvin}.
\end{itemize}
Disturbance: $Q_{\mathrm{sg}} \in [0.85 P_0,\ 1.00 P_0]$ --- a 15\%
load-down envelope (grid-following).  Horizon: 600~\unit{\second}.

The reach envelope on $T_c$ is shown in \cref{fig:reach-tc}.

\begin{figure}[h]
  \centering
  \includegraphics[width=0.95\linewidth]{reach_operation_Tc.png}
  \caption{Operation-mode reach tube on $T_{\mathrm{avg}}$, two views.
  \emph{Left:} safety-band scale --- reach tube (pink) is barely visible
  because LQR holds it tight against the dashed $\pm 5$~\unit{\celsius}
  safety band.  \emph{Right:} zoomed to reveal the actual tube.
  Halfwidth at $t=0$ is 0.1~\unit{\kelvin} (the entry box);
  halfwidth at $t=600$~\unit{\second} is 0.033~\unit{\kelvin}.  The
  tube \emph{contracts} on the regulated direction --- the signature
  of tight feedback control.  Max $|\delta T_c|$ over the horizon:
  0.1~\unit{\kelvin} (the initial halfwidth dominates).}
  \label{fig:reach-tc}
\end{figure}

Per-halfspace margins against \texttt{inv2\_holds} (the operation-mode
safety envelope):

\begin{lstlisting}[style=terminal]
=== Operation-mode reach vs inv2_holds safety limits ===
  [fuel_centerline]     a'x <= 1200.000 | max a'x =  328.934 | margin = +871.066  OK
  [t_avg_high_trip]     a'x <=  320.000 | max a'x =  308.449 | margin =  +11.551  OK
  [t_avg_low_trip]      a'x <= -280.000 | max a'x = -308.249 | margin =  +28.249  OK
  [n_high_trip]         a'x <=    1.150 | max a'x =    1.012 | margin =   +0.138  OK
  [n_low_operation]     a'x <=   -0.150 | max a'x =   -0.842 | margin =   +0.692  OK
  [cold_leg_subcooled]  a'x <=  305.000 | max a'x =  292.852 | margin =  +12.148  OK
\end{lstlisting}

All six safety halfspaces pass.  Tightest margin is
\texttt{n\_high\_trip} --- LQR lets $n$ swing up to $\sim$1.01 to
compensate for load variation, leaving 12\% margin to the high-flux
trip.  Temperatures have 10--870~\unit{\kelvin} margin each.
\textbf{Operation-mode obligation discharged}, subject to the
linearization caveat in \cref{sec:limitations-17}.

Per-state reach-set evolution (final halfwidth vs.\ initial):

\begin{center}
\small
\begin{tabular}{l r r r}
\hline
state & init halfwidth & final halfwidth & ratio \\
\hline
$n$ & $0.010$ & $0.083$ & 8.3$\times$ expand \\
$C_1 \dots C_6$ & varies & varies & 160--200$\times$ expand \\
$T_f$ & 0.1~\unit{\kelvin} & 2.08~\unit{\kelvin} & 21$\times$ expand \\
$\boldsymbol{T_c}$ & $\mathbf{0.1}$~\unit{\kelvin} & $\mathbf{0.033}$~\unit{\kelvin} & $\mathbf{0.33\times}$ \textbf{contract} \\
$T_{\mathrm{cold}}$ & 0.1~\unit{\kelvin} & 1.47~\unit{\kelvin} & 15$\times$ expand \\
\hline
\end{tabular}
\end{center}

$T_c$ \emph{contracts} --- LQR drags the regulated direction toward
setpoint faster than disturbance can push it.  Uncontrolled states
drift.  Precursor expansion ($\sim$$200\times$) is immaterial for
safety (no predicate uses them).

\subsection*{Part 8: Lyapunov barrier attempt}

A reach tube is a numerical certificate: compute, check.  A
\emph{barrier certificate} is analytic: produce a function $B(x)$ such
that $B \leq 0$ on the initial set, $B > 0$ on the unsafe set, and
$\dot B \leq 0$ when $B = 0$.  Forward-invariance of the safe set
follows from the dynamics without iterating them.

For linear systems with bounded disturbance, the canonical candidate
is a Lyapunov quadratic.

\begin{derivation}
Let $V(\delta x) = \delta x^\top P \delta x$ with $P \succ 0$ solving
$A_{\mathrm{cl}}^\top P + P A_{\mathrm{cl}} + \bar Q = 0$.  Along
trajectories of $\dot{\delta x} = A_{\mathrm{cl}} \delta x + B_w w$:
\begin{align}
\dot V &= -\delta x^\top \bar Q \delta x + 2 \delta x^\top P B_w w.
\end{align}
Bound each piece.  The dissipation term using the generalized
eigenvalue
$\mu = \lambda_{\min}\bigl(P^{-1/2} \bar Q P^{-1/2}\bigr)$:
$$-\delta x^\top \bar Q \delta x \leq -\mu V.$$
The disturbance kick, Cauchy-Schwarz in the $P$-metric:
$$2 \delta x^\top P B_w w \leq 2 \bar w \sqrt{B_w^\top P B_w}\sqrt{V}.$$
Combining:
$$\dot V \leq -\mu V + 2 \bar w \sqrt{B_w^\top P B_w}\sqrt{V}.$$
Set $s = \sqrt{V}$, requiring $\dot s \leq 0$:
$$s \geq \frac{2 \bar w \sqrt{B_w^\top P B_w}}{\mu}
\quad \Longleftrightarrow \quad V \geq \left(\frac{2 \bar w \sqrt{B_w^\top P B_w}}{\mu}\right)^2 =: c_{\mathrm{inv}}.$$
The sub-level set $\{ V \leq c_{\mathrm{inv}} \}$ is forward-invariant
under any admissible disturbance.
\end{derivation}

So the recipe is: solve Lyapunov, compute $c_{\mathrm{inv}}$, compute
the worst-case deviation of $T_c$ (or any linear functional of state)
on the resulting ellipsoid, and check whether it fits inside the
safety slab.

\begin{derivation}
Max of a linear functional $a^\top \delta x$ over the ellipsoid
$\{\delta x : \delta x^\top P \delta x \leq \gamma\}$:
$$\max a^\top \delta x = \sqrt{\gamma \cdot a^\top P^{-1} a}.$$
(Lagrange multipliers: the maximum is achieved at
$\delta x = \sqrt{\gamma / a^\top P^{-1} a} \cdot P^{-1} a$.)
\end{derivation}

\textbf{Result:} the best quadratic barrier --- across a sweep of
$\bar Q(9,9)$ from 10 to $10^6$ --- allows max $|T_c - T_{c0}| \approx 33$~\unit{\kelvin},
more than six times the 5~\unit{\kelvin} safety band.  On the hard
safety halfspaces (\texttt{inv2\_holds}), it says $n$ could deviate by
$\pm 1242$~$\times$ nominal --- physically meaningless.

\begin{limitation}
\textbf{The quadratic Lyapunov barrier fails on this plant.  This is
a structural finding, not a tuning failure.}

The safety spec is anisotropic: a thin slab $|T_c| \leq 5$~\unit{\kelvin}
in a 10-D state space where $T_c$ is one direction and the other nine
are precursors / fuel temperature / cold-leg.  Any
quadratic $V$ yields axis-uniform ellipsoidal level sets.  To contain
the disturbance-driven invariant set, the ellipsoid must be \emph{fat
in the slow directions} (precursors, where Lyapunov dissipation is
weakest).  Projecting that fat ellipsoid back onto $T_c$ yields a
range much larger than the physical $T_c$ behavior.

Confirmed the issue is not controller tuning by running the barrier
open-loop (no LQR): the bound becomes $\pm 27{,}000{,}000\,n$.  LQR
helps by $\sim$20{,}000$\times$, but still leaves a 3-4 order-of-magnitude
gap to usability.  The ceiling is plant anisotropy: prompt-neutron
timescale $\Lambda = 10^{-4}$~\unit{\second} vs.\ thermal timescales
$\sim$10--100~\unit{\second}, a factor-of-$10^4$ spread that forces
$P$ to be ill-conditioned regardless of controller design.

\textbf{Fix:} non-quadratic barrier.  Two candidate approaches:
\begin{itemize}
  \item \textbf{Polytopic barrier:} $B(\delta x) = \max_i (a_i^\top \delta x - b_i)$.
        Piecewise-linear, can match a slab shape directly.
        Requires polyhedral-set tooling.
  \item \textbf{SOS polynomial barrier:} $B$ a polynomial of
        degree $\geq 4$, coefficients found by sum-of-squares
        optimization.  Requires an SOS solver (Mosek, SDPT3).
\end{itemize}
Neither is implemented; both are in-scope for the thesis chapter.
The quadratic-Lyapunov failure, stated quantitatively, is the clean
motivation.
\end{limitation}

\subsection*{Session result}

Artifacts landing, in order:
\begin{enumerate}
  \item \texttt{plant-model/controllers/ctrl\_shutdown.m},
        \texttt{ctrl\_scram.m}, \texttt{ctrl\_heatup.m} (three new
        controllers).
  \item \texttt{plant-model/pke\_linearize.m} and
        \texttt{plant-model/test\_linearize.m}.
  \item \texttt{plant-model/controllers/ctrl\_operation\_lqr.m}.
  \item \texttt{plant-model/main\_mode\_sweep.m}.
  \item \texttt{reachability/reach\_linear.m} and
        \texttt{reachability/reach\_operation.m}.
  \item \texttt{reachability/barrier\_lyapunov.m}.
  \item Eleven figures in \texttt{docs/figures/}.
\end{enumerate}

Key claims established:
\begin{itemize}
  \item LQR closed-loop operation mode satisfies the safety obligation
        under $\pm 15\%$ $Q_{\mathrm{sg}}$ variation (approximate,
        via linear reach).
  \item Quadratic Lyapunov barrier insufficient on this plant
        (open-loop or closed-loop).  Need polytopic or SOS.
  \item LQR adds zero state, dominates plain P by $\sim 5\times$
        on tracking.
\end{itemize}

\subsection*{Limitations recorded at close}
\label{sec:limitations-17}

\begin{limitation}
All reach results are \emph{approximate}, not sound: they are reach
tubes of the \emph{linearized} closed-loop.  The linearization error
--- the gap between $f_{\mathrm{nl}}(x, u)$ and $(A x + B u)$ --- is not
propagated into the tube.  For a real safety claim, either
(a)~nonlinear reach directly or (b)~linear reach plus Taylor-remainder
inflation.  Neither is done.
\end{limitation}

\begin{limitation}
\label{sub:alpha-drift}
The feedback-linearization in \texttt{ctrl\_heatup} assumes exact
$\alpha_f, \alpha_c$.  Real PWRs have $\alpha$ drift: $\sim$20\%
over burnup; $\sim$10$\times$ across soluble-boron dilution; xenon
transients worth 2--3~\$.  Robust version would treat $\alpha$ as
parametric interval and propagate through reach.
\end{limitation}

\begin{limitation}
Saturation in \texttt{ctrl\_heatup} is formally a 3-mode piecewise-affine
sub-system.  Operation-mode LQR is dormant-saturated (trivially); a
rigorous heatup reach would need explicit region decomposition or a
dormancy proof.
\end{limitation}

\begin{limitation}
The model's $\alpha_f, \alpha_c$ are linearized at the hot full-power
operating point.  Trust region is $\pm 50$~\unit{\celsius} around $x^\star$.
True cold shutdown ($\sim$50~\unit{\celsius}) is out of scope.  What
the DRC calls \texttt{q\_shutdown} we interpret as hot standby
($\sim$290~\unit{\celsius}).
\end{limitation}

\subsection*{Open at close}

\begin{itemize}
  \item Nonlinear reach, at minimum on one mode, to retire the
        soundness asterisk.
  \item Polytopic or SOS barrier to retire the analytic-certificate
        asterisk.
  \item Parametric $\alpha$ uncertainty in the reach machinery.
  \item Heatup reach --- ramped reference needs LTV or nonlinear
        treatment.
  \item Shutdown + scram reach (trivial forever-invariance /
        bounded-time respectively, but not yet done).
\end{itemize}