QUaternion ESTimator as described by Shuster in [Shuster1981] and [Shuster1978].

We start to define the goal of finding an orthogonal matrix \(\mathbf{A}\) that minimizes the loss function:

\[L(\mathbf{A}) = \frac{1}{2}\sum_{i=1}^n |\hat{\mathbf{W}}_i - \mathbf{A}\hat{\mathbf{V}}_i|^2\]

where \(a_i\) are a set of non-negative weights such that \(\sum_{i=1}^na_i=1\), \(\hat{\mathbf{V}}_i\) are nonparallel reference vectors, and \(\hat{\mathbf{W}}_i\) are the corresponding observation vectors.

The gain function \(g(\mathbf{A})\) is defined by

\[g(\mathbf{A}) = 1 - L(\mathbf{A}) = \sum_{i=1}^na_i\,\hat{\mathbf{W}}_i^T\mathbf{A}\hat{\mathbf{V}}_i\]

The loss function \(L(\mathbf{A})\) is at its minimum when the gain function \(g(\mathbf{A})\) is at its maximum. The gain function can be reformulated as:

\[g(\mathbf{A}) = \sum_{i=1}^na_i\mathrm{tr}\big(\hat{\mathbf{W}}_i^T\mathbf{A}\hat{\mathbf{V}}_i\big) = \mathrm{tr}(\mathbf{AB}^T)\]

where \(\mathrm{tr}\) is the trace of a matrix, and \(\mathbf{B}\) is the attitude profile matrix:

\[\mathbf{B} = \sum_{i=1}^na_i\hat{\mathbf{W}}_i^T\hat{\mathbf{V}}_i\]

The quaternion \(\bar{\mathbf{q}}\) representing a rotation is defined by Shuster as:

\[\begin{split}\bar{\mathbf{q}} = \begin{bmatrix}\mathbf{Q} \\ q\end{bmatrix} = \begin{bmatrix}\hat{\mathbf{X}}\sin\frac{\theta}{2} \\ \cos\frac{\theta}{2}\end{bmatrix}\end{split}\]

where \(\hat{\mathbf{X}}\) is the axis of rotation, and \(\theta\) is the angle of rotation about \(\hat{\mathbf{X}}\).


The definition of a quaternion used by Shuster sets the vector part \(\mathbf{Q}\) followed by the scalar part \(q\). This module, however, will return the estimated quaternion with the scalar part first and followed by the vector part: \(\bar{\mathbf{q}} = \begin{bmatrix}q & \mathbf{Q}\end{bmatrix}\)

Because the quaternion works as a versor, it must satisfy:

\[\bar{\mathbf{q}}^T\bar{\mathbf{q}} = |\mathbf{Q}|^2 + q^2 = 1\]

The attitude matrix \(\mathbf{A}\) is related to the quaternion by:

\[\mathbf{A}(\bar{\mathbf{q}}) = (q^2+\mathbf{Q}\cdot\mathbf{Q})\mathbf{I} + 2\mathbf{QQ}^T + 2q\lfloor\mathbf{Q}\rfloor_\times\]

where \(\mathbf{I}\) is the identity matrix, and \(\lfloor\mathbf{Q}\rfloor_\times\) is the antisymmetric matrix of \(\mathbf{Q}\), a.k.a. the skew-symmetric matrix:

\[\begin{split}\lfloor\mathbf{Q}\rfloor_\times = \begin{bmatrix}0 & Q_3 & -Q_2 \\ -Q_3 & 0 & Q_1 \\ Q_2 & -Q_1 & 0\end{bmatrix}\end{split}\]

Now the gain function can be rewritten again, but in terms of quaternions:

\[g(\bar{\mathbf{q}}) = (q^2-\mathbf{Q}\cdot\mathbf{Q})\mathrm{tr}\mathbf{B}^T + 2\mathrm{tr}\big(\mathbf{QQ}^T\mathbf{B}^T\big) + 2q\mathrm{tr}\big(\lfloor\mathbf{Q}\rfloor_\times\mathbf{B}^T\big)\]

A further simplification gives:

\[g(\bar{\mathbf{q}}) = \bar{\mathbf{q}}^T\mathbf{K}\bar{\mathbf{q}}\]

where the \(4\times 4\) matrix \(\mathbf{K}\) is given by:

\[\begin{split}\mathbf{K} = \begin{bmatrix} \mathbf{S} - \sigma\mathbf{I} & \mathbf{Z} \\ \mathbf{Z}^T & \sigma \end{bmatrix}\end{split}\]

using the helper values:

\[\begin{split}\begin{array}{rcl} \sigma &=& \mathrm{tr}\mathbf{B} \\ && \\ \mathbf{S} &=& \mathbf{B} + \mathbf{B}^T \\ && \\ \mathbf{Z} &=& \sum_{i=1}^na_i\big(\hat{\mathbf{W}}_i\times\hat{\mathbf{V}}_i\big) \end{array}\end{split}\]


\(\mathbf{Z}\) can be also defined from \(\lfloor\mathbf{Z}\rfloor_\times = \mathbf{B} - \mathbf{B}^T\)

A new gain function \(g'(\bar{\mathbf{q}})\) with Lagrange multipliers is defined:

\[g'(\bar{\mathbf{q}}) = \bar{\mathbf{q}}^T\mathbf{K}\bar{\mathbf{q}} - \lambda\bar{\mathbf{q}}^T\bar{\mathbf{q}}\]

It is verified that \(\mathbf{K}\bar{\mathbf{q}}=\lambda\bar{\mathbf{q}}\). Thus, \(g(\bar{\mathbf{q}})\) will be maximized if \(\bar{\mathbf{q}}_\mathrm{opt}\) is chosen to be the eigenvector of \(\mathbf{K}\) belonging to the largest eigenvalue of \(\mathbf{K}\):

\[\mathbf{K}\bar{\mathbf{q}}_\mathrm{opt} = \lambda_\mathrm{max}\bar{\mathbf{q}}_\mathrm{opt}\]

which is the desired result. This equation can be rearranged to read, for any eigenvalue \(\lambda\):

\[\lambda = \sigma + \mathbf{Z}\cdot\mathbf{Y}\]

where \(\mathbf{Y}\) is the Gibbs vector, a.k.a. the Rodrigues vector, defined as:

\[\mathbf{Y} = \frac{\mathbf{Q}}{q} = \hat{\mathbf{X}}\tan\frac{\theta}{2}\]

rewriting the quaternion as:

\[\begin{split}\bar{\mathbf{q}} = \frac{1}{\sqrt{1+|\mathbf{Y}|^2}} = \begin{bmatrix}\mathbf{Y}\\ 1 \end{bmatrix}\end{split}\]

\(\mathbf{Y}\) and \(\bar{\mathbf{q}}\) are representations of the optimal attitude solution when \(\lambda\) is equal to \(\lambda_\mathrm{max}\), leading to an equation for the eigenvalues:

\[\lambda = \sigma + \mathbf{Z}^T \frac{1}{(\lambda+\sigma)\mathbf{I}-\mathbf{S}}\mathbf{Z}\]

which is equivalent to the characteristic equation of the eigenvalues of \(\mathbf{K}\)

With the aid of Cayley-Hamilton theorem we can get rid of the Gibbs vector to find a more convenient expression of the characteristic equation:



\[\begin{split}\begin{array}{rcl} a &=& \sigma^2-\kappa \\ && \\ b &=& \sigma^2 + \mathbf{Z}^T\mathbf{Z} \\ && \\ c &=& \Delta + \mathbf{Z}^T\mathbf{SZ} \\ && \\ d &=& \mathbf{Z}^T\mathbf{S}^2\mathbf{Z} \\ && \\ \sigma &=& \frac{1}{2}\mathrm{tr}\mathbf{S} \\ && \\ \kappa &=& \mathrm{tr}\big(\mathrm{adj}(\mathbf{S})\big) \\ && \\ \Delta &=& \mathrm{det}(\mathbf{S}) \end{array}\end{split}\]

To find \(\lambda\) we implement the Newton-Raphson method using the sum of the weights \(a_i\) (in the beginning is constrained to be equal to 1) as a starting value.

\[\lambda_{t+1} \gets \lambda_t - \frac{f(\lambda)}{f'(\lambda)} = \lambda_t - \frac{\lambda^4-(a+b)\lambda^2-c\lambda+(ab+c\sigma-d)}{4\lambda^3-2(a+b)\lambda-c}\]

For sensor accuracies better than 1 arc-min (1 degree) the accuracy of a 64-bit word is exhausted after only one iteration.

Finally, the optimal quaternion describing the attitude is found as:

\[\begin{split}\bar{\mathbf{q}}_\mathrm{opt} = \frac{1}{\sqrt{\gamma^2+|\mathbf{X}|^2}} \begin{bmatrix}\mathbf{X}\\ \gamma \end{bmatrix}\end{split}\]


\[\begin{split}\begin{array}{rcl} \mathbf{X} &=& (\alpha\mathbf{I} + \beta\mathbf{S} + \mathbf{S}^2)\mathbf{Z} \\ && \\ \gamma &=& (\lambda + \sigma)\alpha - \Delta \\ && \\ \alpha &=& \lambda^2 - \sigma^2 + \kappa \\ && \\ \beta &=& \lambda - \sigma \end{array}\end{split}\]

This solution can still lead to an indeterminant result if both \(\gamma\) and \(\mathbf{X}\) vanish simultaneously. \(\gamma\) vanishes if and only if the angle of rotation is equal to \(\pi\), even if \(\mathbf{X}\) does not vanish along.



Shuster, M.D. and Oh, S.D. “Three-Axis Attitude Determination from Vector Observations,” Journal of Guidance and Control, Vol.4, No.1, Jan.-Feb. 1981, pp. 70-77.


Shuster, Malcom D. Approximate Algorithms for Fast Optimal Attitude Computation, AIAA Guidance and Control Conference. August 1978. (

class ndarray | None = None, mag: ndarray | None = None, **kw)#

QUaternion ESTimator

  • acc (numpy.ndarray, default: None) – N-by-3 array with measurements of acceleration in in m/s^2

  • mag (numpy.ndarray, default: None) – N-by-3 array with measurements of magnetic field in mT

  • weights (array-like) – Array with two weights. One per sensor measurement.

  • magnetic_dip (float) – Local magnetic inclination angle, in degrees.

  • gravity (float) – Local normal gravity, in m/s^2.

  • acc (numpy.ndarray) – N-by-3 array with N accelerometer samples.

  • mag (numpy.ndarray) – N-by-3 array with N magnetometer samples.

  • w (numpy.ndarray) – Weights for each observation.


ValueError – When dimension of input arrays acc and mag are not equal.

estimate(acc: ndarray, mag: ndarray) ndarray#

Attitude Estimation.

  • acc (numpy.ndarray) – Sample of tri-axial Accelerometer in m/s^2

  • mag (numpy.ndarray) – Sample of tri-axial Magnetometer in T


q – Estimated attitude as a quaternion.

Return type: