背景

在前边的文章《连续系统的LQR推导》以及《连续系统的LQR变分法推导》中，我们已经推导出了连续系统的LQR控制器。不过在上述推导中，我们都假定控制目标是 $0$（这从其中代价函数的形式中就可以看出来）。但是在实际应用中，我们往往需要系统能够跟踪某个给定的轨迹。因此，本文将讨论连续系统的LQR轨迹跟踪问题。

问题描述

连续系统的LQR轨迹跟踪问题可以用如下方式描述：

给定一个连续系统：

\[\begin{aligned} \dot{x}(t) = Ax(t) + Bu(t) \end{aligned} \tag{1}\]

我们定义LQR轨迹跟踪问题的代价函数为：

\[\begin{aligned} J &= \frac{1}{2}(x(t_f)-r(t_f))^TP_f(x(t_f)-r(t_f))\\ &+\frac{1}{2}\int_{t_0}^{t_f} \left( (x(t)-r(t))^TQ(x(t)-r(t)) + u^T(t)Ru(t) \right) dt \end{aligned} \tag{2}\]

其中 $r(t)$ 是给定的轨迹，$P_f$ 是最终状态的权重矩阵，$Q$ 是状态的权重矩阵，$R$ 是控制的权重矩阵。

解决方案

根据前述文章中的结论，我们可以得到上述问题的哈密顿量（Hamiltonian）：

\[\begin{aligned} \mathscr{H} &= \frac{1}{2}(x(t)-r(t))^TQ(x(t)-r(t)) + \frac{1}{2}u^TRu + p^T(Ax+Bu)\\ &= \frac{1}{2}x^TQx - x^TQr + \frac{1}{2}u^TRu + p^T(Ax+Bu) \end{aligned} \tag{3}\]

其中 $p$ 是拉格朗日乘子。

根据变分法的结论，我们可以得到最优控制率 $\hat{u}$：

\[\begin{aligned} \hat{u} &= -R^{-1}B^Tp \end{aligned} \tag{4}\]

除此之外还可以得到最优控制率下的状态和协状态的微分方程：

\[\begin{aligned} \dot{\hat x} &= \frac{\partial \hat{\mathscr{H}}}{\partial p} = A\hat x + B\hat u = A\hat x - BR^{-1}B^Tp\\ \dot{\hat p} &= -\frac{\partial \hat{\mathscr{H}}}{\partial x} = -Q\hat x + Qr - A^T\hat p \end{aligned} \tag{5}\]

我们将上述状态和协状态的微分方程写成矩阵形式：

\[\begin{aligned} \begin{bmatrix} \dot{\hat x}\\ \dot{\hat p} \end{bmatrix} &= \begin{bmatrix} A & -BR^{-1}B^T\\ -Q & -A^T \end{bmatrix}\begin{bmatrix} \hat x\\ \hat p \end{bmatrix} + \begin{bmatrix} 0\\ Qr \end{bmatrix} \end{aligned} \tag{6}\]

上述形式在《连续系统的LQR变分法推导》中已经见过类似的形式，我们通过简单的变量替换，不难发现上述微分方程的解具有如下形式：

\[\begin{aligned} \hat p(t) = P(t)\hat x(t) + s(t) \end{aligned} \tag{7}\]

其中关于 $P(t)$ 可以得到同样形式的Riccati方程：

\[\begin{aligned} -\dot P(t)&=A^TP(t)+P(t)A+Q-P(t)BR^{-1}B^TP(t)\\ P(t_f) &= P_f \end{aligned} \tag{8}\]

下面我们需要做的就是求解变量 $s(t)$ 。对公式 $(7)$ 两边同时求微分：

\[\begin{aligned} \dot{\hat p}(t) &= \dot P(t)\hat x(t) + P(t)\dot{\hat x}(t) + \dot s(t) \end{aligned} \tag{9}\]

我们将其代入到公式 $(6)$ 中：

\[\begin{aligned} \dot P \hat x + P\dot{\hat x} + \dot s = \dot P \hat x + PA\hat x - PBR^{-1}B^TP\hat x - PBR^{-1}B^Ts + \dot s &= -Q\hat x + Qr - A^TP\hat x - A^Ts \Rightarrow \\ (\dot P + PA + A^TP - PBR^{-1}B^TP + Q)\hat x &= Qr - A^Ts + PBR^{-1}B^Ts - \dot s \\ \end{aligned} \tag{10}\]

带入Riccati方程 $(8)$ ，可以得到：

\[\begin{aligned} \dot s = -(A^T - PBR^{-1}B^T)s + Qr \end{aligned} \tag{11}\]

这就是变量 $s(t)$ 的微分方程。

下边我们根据边界条件，有：

\[\begin{aligned} \hat p_f = P_f\hat x(t_f) - P_f r(t_f) \end{aligned} \tag{12}\]

于是有：

\[\begin{aligned} s(t_f) = -P_f r(t_f) \end{aligned} \tag{13}\]

因此可以根据边界条件和微分方程 $(11)$ 和 $(8)$ 求解变量 $s(t)$ 和 $P(t)$。最后我们可以得到最优控制率 $\hat u$ 为：

\[\begin{aligned} \hat u = -R^{-1}B^T\hat p = -R^{-1}B^T(P\hat x + s) \end{aligned} \tag{14}\]

卡尔曼状态估计问题和LQR问题的对偶性讨论

拉格朗日量