Sebastian Miron, Julien Flamant, Nicolas Le Bihan, Pierre Chainais, David Brie
Quaternions are still largely misunderstood and often considered an “exotic” signal representation without much practical utility, despite having been around the signal and image processing community for more than 30 years now. The main aim of this article is to counter this misconception and to demystify the use of quaternion algebra for solving problems in signal and image processing. To this end, we propose a comprehensive and objective overview of the key aspects of quaternion representations, models, and methods and illustrate our journey through the literature with flagship applications. We conclude with an outlook on the remaining challenges and open problems in quaternion signal and image processing.
Quaternions were first introduced by Irish mathematician Sir William Rowan Hamilton in 1843 as a result of his dedication to generalizing complex numbers in more than two dimensions. He spent many years trying in vain to define a 3D algebra based on a system of triplets. Story has it that he discovered quaternions when walking along the Royal Canal in Dublin, Ireland, on 16 October 1843 and immediately carved the fundamental equation for quaternion algebra in the stone of the nearby Brougham Bridge: \[{\boldsymbol{i}}^{2} = {\boldsymbol{j}}^{2} = {\boldsymbol{k}}^{2} = {\boldsymbol{ijk}} = {-}{1} \tag{1} \] where i, j, and k define imaginary units. While the carving has now disappeared, a plaque honoring Hamilton’s memory can be found at the same place today. Hamilton devoted his last 20 years to the study of his quaternions, which culminated in his book Elements of Quaternions. After his death in 1865, quaternions remained fashionable for some time, but they were rapidly superseded by the advent of linear algebra as we know it today through the work of Gibbs and Heaviside at the end of the 19th century. To learn more about the fascinating history of quaternions and linear algebra, we recommend reading A History of Vector Analysis: Evolution of the Idea of a Vectorial System, by Michael J. Crowe, Dover Publications, 1994. Still, Hamilton was a precursor in many aspects and influenced many. For instance, he invented the term vector well before the advent of modern linear algebra; at the time, it simply referred to the 3D imaginary part of a quaternion.
The set of quaternions is usually denoted by ${\Bbb{H}}$ as a tribute to Hamilton’s discovery. Just as complex numbers are well known to describe algebraically the geometry of the 2D plane, quaternion algebra permits straightforward descriptions of geometric transformations in 3D and 4D spaces. As a generalization of complex numbers to higher dimensions, quaternions are the first and simplest example of hypercomplex numbers. For more details on the topic of hypercomplex algebras, we refer the interested reader to Hypercomplex Numbers: An Elementary Introduction to Algebras, by I.L. Kantor and A.S. Solodovnikov, Springer, 1989. Formally, a quaternion is defined by a real (or scalar) part and an imaginary (or vector) part made of three components along imaginary units i, j, and k. This close relationship between purely imaginary (or, simply, pure) quaternions and vectors in ${\Bbb{R}}^{3}$ is fundamental. In fact, the triplet of imaginary units (i, j, k) can be identified with the canonical Cartesian basis of ${\Bbb{R}}^{3}$ given by ${(}{\bf{e}}_{1},{\bf{e}}_{2},{\bf{e}}_{1}\,{\times}\,{\bf{e}}_{2}{)}$, where × denotes the cross product between vectors of ${\Bbb{R}}^{3}$. Remarkably, quaternion algebra encodes the cross product operation in a natural way since ${\boldsymbol{ij}} = {\boldsymbol{k}}$, ${\boldsymbol{jk}} = {\boldsymbol{i}}$, and ${\boldsymbol{ki}} = {\boldsymbol{j}}$. More generally, the product of two quaternions involves 3D scalar products and cross products. This also explains why the multiplication of two quaternions is noncommutative: it results from the well-known noncommutativity of the cross product and reflects the fact that geometric transformations in 3D and higher dimensions lack commutativity as well. For later reference, Table 1 collects essential definitions, sets, properties, and polar forms related to quaternion algebra.
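These identities are easy to verify numerically. The following minimal sketch (plain NumPy, with quaternions stored as [w, x, y, z] arrays, a storage convention chosen here for illustration) checks that the Hamilton product of two pure quaternions encodes both the scalar product and the cross product:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

i = np.array([0., 1., 0., 0.])
j = np.array([0., 0., 1., 0.])
k = np.array([0., 0., 0., 1.])

# Defining identities: ij = k, jk = i, ki = j, i^2 = j^2 = k^2 = -1
assert np.allclose(qmul(i, j), k)
assert np.allclose(qmul(j, k), i)
assert np.allclose(qmul(k, i), j)
assert np.allclose(qmul(i, i), [-1., 0., 0., 0.])

# For pure quaternions p = (0, u) and q = (0, v): pq = -(u . v) + u x v
u, v = np.array([1., 2., 3.]), np.array([-1., 0., 2.])
pq = qmul(np.append(0., u), np.append(0., v))
assert np.isclose(pq[0], -np.dot(u, v))       # scalar part: minus dot product
assert np.allclose(pq[1:], np.cross(u, v))    # vector part: cross product
```

Note also that `qmul(j, i)` returns `-k`, a direct numerical witness of noncommutativity.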
Table 1. A handbook of quaternion algebra.
Perhaps one of the most striking examples of their utilization in today’s applications lies in quaternions’ ability to represent 3D rotations. Representing a 3D rotation with a single unit quaternion has many benefits over standard Euler angles or rotation matrices: a lower number of parameters, no gimbal lock singularities (this refers to the loss of one degree of freedom that can occur when using Euler angles to parameterize 3D rotations, causing important practical issues when representing a sequence of rotations), and nice interpolation properties between rotations. These advantages have long been acknowledged in robotics [1] and computer graphics [2], where the use of quaternions is well established. In contrast, the use of quaternions in signal and image processing is still blooming, with the first works dating back to the early 1990s [3].
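As a concrete illustration of this rotation representation, here is a minimal sketch (function names and the [w, x, y, z] storage convention are ours) that rotates a 3D vector via the classical sandwich product ${v}' = {q}{v}\bar{q}$, with ${q} = {\cos}{(}{\theta}/{2}{)} + {\sin}{(}{\theta}/{2}{)}{\mu}$ a unit quaternion:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def rotate(v, axis, angle):
    """Rotate v in R^3 by `angle` around `axis` using v' = q v conj(q)."""
    mu = axis / np.linalg.norm(axis)
    q = np.append(np.cos(angle / 2), np.sin(angle / 2) * mu)
    q_conj = q * np.array([1., -1., -1., -1.])
    return qmul(qmul(q, np.append(0., v)), q_conj)[1:]

# Rotating e1 by 90 degrees around e3 gives e2
w = rotate(np.array([1., 0., 0.]), np.array([0., 0., 1.]), np.pi / 2)
assert np.allclose(w, [0., 1., 0.])
```

Composing rotations is just quaternion multiplication of the corresponding unit quaternions, and interpolating between two unit quaternions on the 3-sphere (slerp) yields the smooth rotation interpolation mentioned above.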
This article aims at providing an overview of the current use of quaternions in signal and image processing, ranging from data representation using quaternions to dedicated quaternion domain methods and algorithms. It is intended to demystify the field for newcomers and make it accessible to the many. We hope to demonstrate that, up to the special care required to extend standard signal and image processing tools to quaternion algebra, the use of quaternion domain approaches enables a compact, elegant, and interpretable way to handle geometric properties of signals and images.
Many physical phenomena can be probed using (electronic) sensors. From a very broad viewpoint, sensors transform complex physical properties into electrical properties (such as output voltage or current) that can be processed by further electronics. For that reason, physical measurements always boil down to acquiring real values: intensity of light passing through a color filter or variations of amplitude along one direction in an accelerometer, for instance. Data recordings therefore correspond to arrays of real numbers, such as vectors (e.g., univariate signals) or matrices (e.g., gray-scale images). However, even if raw data are intrinsically real valued, one often takes advantage of other representations to facilitate modeling, analysis, or processing. One of the most striking examples is, perhaps, the use of complex numbers in signal and image processing. Complex numbers arise naturally when transforming raw data by using the (complex) Fourier transform. Such a manipulation enables many insights that would otherwise be (almost) impossible. For instance, complex numbers define unambiguously the essential notions of magnitude and phase, which are pivotal to signal processing practice: spectral analysis, filter design, time-frequency analysis, array processing, and so on. They also provide a compact and elegant way to write pairs of signals, such as in-phase and quadrature components in communications or functional magnetic resonance imaging. These several convenient properties explain the popularity of complex-valued representations in signal and image processing.
Quaternions are no different in that respect. Just like complex numbers, they offer a novel representation space, which exhibits several unique properties, such as polar forms and natural handling of 3D geometry, which can be interesting to exploit in applications. More importantly, quaternions define a (skew) field. This means that except for the noncommutativity of the quaternion product, quaternions have the same desirable properties as the real and complex fields. This ensures that the mathematical foundations crucial to signal processing (the Fourier transform, vector spaces, linear algebra, and so on) can all be defined in a meaningful way. Moreover, the similarity between methodologies developed for quaternion-valued signal and image processing and their real counterparts tends to demonstrate that noncommutativity is not an issue in general, provided that it is handled in an adequate manner. Since their introduction in the signal processing community more than three decades ago, the usage of quaternion-valued representations has focused on two complementary settings, namely, the encoding of 3D and 4D signals and the construction of interpretable algebraic embeddings of signals and images.
This first setting may arguably be seen as the most natural one. The main idea is to encode the components of 3D or 4D vector signals on the three (imaginary only) or four (real and imaginary) parts of a quaternion. This allows us to extend the standard arithmetic operations over real numbers (addition, subtraction, multiplication, and division) to 3D and 4D real vectors. In the case of 2D real vectors, this extension is naturally performed by complex numbers. This way, one can handle vector quantities by using algebraic operations in a way similar to what can be done with scalars. This can be very helpful, especially for the case when 3D or 4D vector data are acquired with respect to one or two diversities (time, space, wavelength, and so on). As an illustrative example, consider the case of a color image defined by the triplet of real matrices ${\left\{{\boldsymbol{R}},{\boldsymbol{G}},{\boldsymbol{B}}\right\}}$ encoding red, green, and blue color channels, respectively. This triplet can be conveniently represented as the pure quaternion matrix ${\boldsymbol{Q}} = {\boldsymbol{iR}} + {\boldsymbol{jG}} + {\boldsymbol{kB}}$. This algebraic representation follows directly from the identification of the imaginary units ${\boldsymbol{i}},{\boldsymbol{j}},{\text{ and }}{\boldsymbol{k}}$ with the canonical Cartesian basis of ${\Bbb{R}}^{3}$. It permits us to separate the internal multivariate nature of the color image (i.e., a 3D vector encoding colors at each pixel) and its external multidimensional nature (an array of M × N spatial pixels) in an elegant way. In comparison, the equivalent real domain representation of such data is often cumbersome and usually handled by stacking the three or four components in a single long vector or matrix. While this stacking procedure is mathematically sound, it may interfere with the intimate relationships between the internal components and the geometric properties of such vector data. 
On the contrary, the algebraic quaternion encoding of 3D and 4D vectors enables natural representations of multidiversity vector data as quaternion vectors and quaternion matrices. This also means that many fundamental signal processing operations for 3D and 4D vector data can be formulated in terms of quaternion linear algebra operations in a rather straightforward way. Building on these advantages, quaternions have been effectively employed to encode vector measurements in seismology [4], wind and temperature forecasting [5], electromagnetics [6], telecommunications [7], and color image processing [8], [9], to name a few.
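The color image encoding ${\boldsymbol{Q}} = {\boldsymbol{iR}} + {\boldsymbol{jG}} + {\boldsymbol{kB}}$ can be sketched in a few lines. The 120° rotation about the gray axis ${\mu} = {(}{\boldsymbol{i}} + {\boldsymbol{j}} + {\boldsymbol{k}}{)}/\sqrt{3}$ used below is a classical quaternion color-processing operation added here purely for illustration (it is not described in the text); it cyclically permutes the three color channels:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

# Tiny synthetic color image: R, G, B are 2 x 2 real matrices
rng = np.random.default_rng(0)
R, G, B = rng.random((3, 2, 2))

# Pure quaternion matrix Q = iR + jG + kB, stored as an M x N x 4 array
Q = np.stack([np.zeros_like(R), R, G, B], axis=-1)

# Unit quaternion of a 120-degree rotation about the gray axis (i+j+k)/sqrt(3)
mu = np.ones(3) / np.sqrt(3)
q = np.append(np.cos(np.pi / 3), np.sin(np.pi / 3) * mu)   # half angle = pi/3
q_conj = q * np.array([1., -1., -1., -1.])

# Rotate every pixel: v -> q v conj(q) maps i -> j -> k -> i
Q_rot = np.empty_like(Q)
for idx in np.ndindex(Q.shape[:2]):
    Q_rot[idx] = qmul(qmul(q, Q[idx]), q_conj)

# The channels are cyclically permuted: (R, G, B) -> (B, R, G)
assert np.allclose(Q_rot[..., 1], B)
assert np.allclose(Q_rot[..., 2], R)
assert np.allclose(Q_rot[..., 3], G)
```

The per-pixel loop keeps the sketch readable; in practice the rotation is linear in the vector part and can be vectorized as a single 3 × 3 matrix product over the whole image.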
This second setting is a little bit more intricate. It relies on carefully designed transforms that map signals or images into the quaternion domain. These transforms define quaternion embeddings, which facilitate the analysis, understanding, and processing of the original data. They ship with highly interpretable parameters, making it possible to decipher geometric features of the original signal or image. So far, most of the research toward interpretable quaternion-valued embeddings has focused on two areas: the construction of quaternion transforms for analyzing local features in gray-scale images and the development of a geometric signal processing toolbox for bivariate signals. It is worth noting that, although apparently unrelated, both approaches consider generalizations of the analytic signal in higher dimensions by exploiting quaternion algebra; they also both leverage extensively quaternion polar forms for meaningful interpretations of the embeddings. These two areas are reviewed in detail in the following.
The importance of the analytic signal for the understanding and modeling of the instantaneous amplitude and phase of real-valued signals has been recognized for a long time in signal processing. This motivated the study of its generalizations in higher dimensions, the most prominent one being the definition of a meaningful 2D “analytic signal” to analyze the local content of gray-scale images. The main difficulty in directly extending definitions from the 1D case is that higher dimensions (notably, 2D) lack a natural multidimensional Hilbert transform. To fill this gap, several approaches have been designed for the 2D case. The most salient ones exploit the higher degrees of freedom offered by quaternion algebra to formulate meaningful 2D counterparts of the 1D analytic signal. A first approach, proposed by Bülow and Sommer [10], uses a carefully designed 2D quaternion Fourier transform (QFT) enjoying desirable symmetry properties for gray-scale images. This permits us to define a one-to-one mapping between a gray-scale image and a quaternion-valued image obtained by restricting its QFT to the first orthant. Further decomposing this quaternion-valued signal by using the Euler quaternion polar form (see Table 1) allows identification of a local amplitude and three phases, which are meaningful for texture analysis. This QFT-based approach was further explored in [11] with the design of a dual-tree quaternion wavelet transform for coherent multiscale analysis of gray-scale images.
Another line of work, perhaps the most popular one, revolves around the monogenic signal. It was first introduced by Felsberg and Sommer [12] as a generalization of the analytic signal to the 2D case. The monogenic signal is a quaternion-valued image built from the original gray-scale image and two Riesz transforms. The interpretation as a 2D “analytic signal” essentially comes from the intuition that “the Riesz transform is to the Hilbert transform what the gradient is to the derivative operator,” to quote [13]. Given a gray-scale image ${f}{(}{\boldsymbol{r}}{)}$ with spatial coordinates ${\boldsymbol{r}} = {(}{r}_{1},{r}_{2}{)}$, the Riesz transform ${\cal{R}}{f} = {(}{\cal{R}}_{1}{f},{\cal{R}}_{2}{f}{)}$ is defined in the spatial domain as \[{\cal{R}}_{i}{f}{(}{\boldsymbol{r}}{)} = {\text{p.v. }} \frac{1}{\pi} \mathop{\iint}\nolimits_{{\Bbb{R}}^{2}} \frac{{(}{r}_{i}{-}{r'}_{i}{)}}{{\left\Vert{\boldsymbol{r}}{-}{\boldsymbol{r'}}\right\Vert}_{2}^{3}}{f}{(}{\boldsymbol{r'}}{)}{d}{\boldsymbol{r'}},\,\,{i} = {1},{2} \tag{2} \] where p.v. stands for Cauchy principal value. The Riesz transform is translation and scale invariant. It also exhibits nice compatibility with 2D rotations, a property known as steerability. The monogenic signal ${\cal{M}}{f}$ is constructed in the quaternion domain as ${\cal{M}}{f} = {f} + {\boldsymbol{i}}{\cal{R}}_{1}{f} + {\boldsymbol{j}}{\cal{R}}_{2}{f}$. Being quaternion valued, it can be uniquely decomposed using the quaternion polar form ${q} = {\left\vert{q}\right\vert}{e}^{{\mu}_{q}{\phi}_{q}}$, where the axis ${\mu}_{q}$ is a pure unit quaternion (i.e., such that ${\mu}_{q}^{2} = {-}{1}$) and ${\phi}_{q}\,{\in}\,{[}{0},{\pi}{)}$ is the phase. Applying this polar decomposition to the monogenic signal enables identification of local features of the image ${f}{(}{\boldsymbol{r}}{)}$ in a straightforward way.
It reads \[{\cal{M}}{f}{(}{\boldsymbol{r}}{)} = {A}{(}{\boldsymbol{r}}{)}{\exp}{(}{\mu}_{\theta}{(}{\boldsymbol{r}}{)}{\phi}{(}{\boldsymbol{r}}{)}{)} \tag{3} \] where ${A}{(}{\boldsymbol{r}}{):} = {\left\vert{\cal{M}}{f}{(}{\boldsymbol{r}}{)}\right\vert}$ defines the local amplitude, ${\phi}{(}{\boldsymbol{r}}{)}$ is the local phase, and the axis ${\mu}_{\theta}{(}{\boldsymbol{r}}{)} = {\boldsymbol{i}}\,{\cos}\,{\theta}{(}{\boldsymbol{r}}{)} + {\boldsymbol{j}}\,{\sin}\,{\theta}{(}{\boldsymbol{r}}{)}$ defines a local orientation ${\theta}{(}{\boldsymbol{r}}{)}\,{\in}\,{[}{-}{\pi},{\pi}{)}$. Note that the axis ${\mu}_{\theta}{(}{\boldsymbol{r}}{)}$ has no k component, as a result of the construction of the monogenic signal ${\cal{M}}{f}{(}{\boldsymbol{r}}{)}$ along the imaginary axes i and j. As a first example, consider a plane wave ${f}{(}{\boldsymbol{r}}{)} = {A}_{0}{\cos}{\left({\kappa}\,{\cdot}\,{\boldsymbol{r}}\right)}$, where ${\kappa} = {(}{\kappa}_{1},{\kappa}_{2}{)}\,{\in}\,{\Bbb{R}}^{2}$ is the wavenumber vector. Direct computations of the Riesz transform yield the monogenic signal ${\cal{M}}{f}{(}{\boldsymbol{r}}{)} = {A}_{0}{\exp}{\left[{(}{\kappa}\,{\cdot}\,{\boldsymbol{r}}{)}{(}{\boldsymbol{i}}\,{\cos}\,{\theta}_{0} + {\boldsymbol{j}}\,{\sin}\,{\theta}_{0}{)}\right]}$, with ${\theta}_{0} = {\arg}{(}{\kappa}_{1} + {\boldsymbol{i}}{\kappa}_{2}{)}$. Hence, the local amplitude is constant ${A}{(}{\boldsymbol{r}}{)} = {A}_{0}$, the local phase ${\phi}{(}{\boldsymbol{r}}{)} = {\kappa}\,{\cdot}\,{\boldsymbol{r}}$ is directly that of the cosine wave, and the local orientation is constant ${\theta}{(}{\boldsymbol{r}}{)} = {\theta}_{0}$, corresponding to that of the wavenumber vector ${\kappa}$ in the 2D plane. Figure 1 depicts a more sophisticated example corresponding to a 2D amplitude modulation (AM)-frequency modulation (FM) mode.
The monogenic signal permits direct identification of the Gaussian AM kernel ${A}{(}{\boldsymbol{r}}{)}$ and the local orientation ${\theta}{(}{\boldsymbol{r}}{)}$. The local phase ${\phi}{(}{\boldsymbol{r}}{)}$ allows us to detect lines ${(}{\phi}{(}{\boldsymbol{r}}{)} = {0}\,{\text{mod}}\,{\pi}{)}$ simultaneously with contours ${(}{\phi}{(}{\boldsymbol{r}}{)} = {\pi} / {2}{)}$.
Figure 1. Monogenic signal analysis of an amplitude modulation-frequency modulation mode. The quaternion polar form enables identification of the local amplitude (Gaussian kernel envelope); local orientation (shown as direction for visualization purposes), which gives the orientation of the tangent vector of contour lines; and local phase, which encodes image lines (zero or ${\pi}$ values) and contours (${\pi} / {2}$ values). The (a) original image, (b) local amplitude, (c) local orientation (mod ${\pi}$), and (d) local phase.
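The plane-wave example above is easy to reproduce numerically. The sketch below reflects our own implementation choices (not detailed in the text): the Riesz transform is applied through its standard Fourier multiplier ${-}{i}{\xi}_{l}/{\Vert}{\xi}{\Vert}$ on a periodic grid. It checks that the local amplitude is constant and that ${(}{\cal{R}}_{1}{f},{\cal{R}}_{2}{f}{)}$ stays aligned with the wavenumber direction:

```python
import numpy as np

N, A0 = 64, 2.0
kap = (3, 5)     # integer wavenumber: the plane wave is periodic on the grid
n1, n2 = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
f = A0 * np.cos(2*np.pi*(kap[0]*n1 + kap[1]*n2) / N)

# Riesz transform via its Fourier multiplier -1j * xi_l / |xi|
xi1, xi2 = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N), indexing="ij")
nrm = np.hypot(xi1, xi2)
nrm[0, 0] = 1.0                  # avoid 0/0 at DC (f has zero mean anyway)
F = np.fft.fft2(f)
R1 = np.fft.ifft2(-1j * xi1 / nrm * F).real
R2 = np.fft.ifft2(-1j * xi2 / nrm * F).real

# Monogenic signal Mf = f + i R1f + j R2f: its local amplitude is constant
A = np.sqrt(f**2 + R1**2 + R2**2)
assert np.allclose(A, A0)

# (R1f, R2f) is parallel to (cos theta0, sin theta0), theta0 = arg(kap1 + i kap2)
theta0 = np.arctan2(kap[1], kap[0])
assert np.allclose(R2*np.cos(theta0) - R1*np.sin(theta0), 0.0, atol=1e-10)
```

Because the plane wave sits exactly on FFT bins, both checks hold to machine precision; the sign convention of the multiplier does not affect either assertion.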
The monogenic signal provides key insights into the geometry of 2D gray-scale images. However, it suffers from the same limitations as the standard 1D analytic signal approach. It exhibits poor performance in noisy settings and fails to capture meaningful local features when considering multicomponent 2D signals, such as the superposition of 2D AM–FM modes. Therefore, an important line of research has focused on extending the monogenic signal approach toward multiscale or multiresolution analysis of gray-scale images. The general idea is to devise multiple filter banks built from monogenic wavelet functions at different scales. Then, at each scale or resolution, one can identify corresponding local features by computing the quaternion polar form of coefficients. For completeness, it is worth noting that not all works subsequent to the original seminal paper [12] use an explicit quaternion formulation of the monogenic signal. However, they largely make use of local angle and axis features, which are naturally connected to the quaternion polar form (3) of the monogenic signal. Therefore, these approaches can be labeled as quaternion based in a broad sense. Hereafter, we mention some of these extensions. One is the monogenic continuous wavelet transform [14], which can be seen as a generalization of the 1D analytic wavelet transform to the case of gray-scale images. In the discrete case, a minimally redundant monogenic multiresolution analysis was proposed in [13], using so-called Riesz–Laplace wavelets. A generalization of the curvelet transform to the monogenic case, called the monogenic curvelet transform, was proposed in [15]. Other proposed approaches include transposing ideas from mode reconstructions in time-frequency analysis to the case of the monogenic signal, leading to the monogenic synchrosqueezing transform [16], or extending monogenic wavelet decompositions to the case of color images [17]. 
Monogenic signal-based approaches have found many applications (e.g., texture segmentation, target recognition, and boundary detection) in various domains, such as medical imaging, synthetic aperture radar imaging, and geophysics.
Bivariate signals appear in a broad range of applications where the joint analysis of two real-valued time series is required: polarized waveforms in seismology and optics, eastward and northward current velocities in oceanography, or even gravitational waves (GWs) emitted by coalescing compact binaries. In such applications, it is crucial to provide clear and straightforward interpretations of the joint geometric and dynamic properties of the two components ${x}_{1}{(t)}$ and ${x}_{2}{(t)}$ that define the bivariate signal. Formally, a bivariate signal can be represented in two equivalent ways: a 2D time-evolving vector ${\bf{x}}{(}{t}{)} = {[}{x}_{1}{(}{t}{),}{x}_{2}{(}{t}{)}{]}\,{\in}\,{\Bbb{R}}^{2}$ or a complex-valued signal ${x}{(}{t}{)} = {x}_{1}{(}{t}{)} + {\boldsymbol{i}}{x}_{2}{(}{t}{)}\,{\in}\,{\Bbb{C}}$ encoding the two components on its real and imaginary parts. While the vector representation is generic (meaning that it is not restricted to the bivariate case), it also hinders a natural understanding of the geometric properties of bivariate signals. On the other hand, the complex representation permits the definition of a meaningful quaternion framework for bivariate signals, relying on 1) a dedicated QFT and 2) the extensive use of quaternion calculus (such as polar forms) to extract relevant physical and geometric information.
The key intuition for a quaternion spectral representation of bivariate signals is rather simple. For real-valued (that is, univariate) signals, the use of the standard complex Fourier transform enables a complex-valued spectral representation. This complex embedding of univariate signals is at the heart of definitions of amplitude and phase, which are crucial to many tasks of signal processing, such as spectral analysis, filtering, or time-frequency analysis. Now, if one represents bivariate signals as complex-valued signals, a quaternion embedding can be constructed in a similar way. First, observe that ${x}{(}{t}{)} = {x}_{1}{(}{t}{)} + {\boldsymbol{i}}{x}_{2}{(}{t}{)}\,{\in}\,{\Bbb{C}}_{i}\,{\subset}\,{\Bbb{H}}$: it is a special case of a quaternion-valued signal. However, contrary to the complex Fourier transform, the QFT has no unique (or canonical) definition. The freedom of definition comes from the position of the exponential, which can appear either left or right of the signal x(t), and from the choice of the axis ${\mu}$ (a pure unit quaternion such that ${\mu}^{2} = {-}{1}$) in the exponential. For instance, by choosing ${\mu} = {\boldsymbol{i}}$, with ${x}{(}{t}{)}\,{\in}\,{\Bbb{C}}_{\boldsymbol{i}}$, one recovers the standard complex Fourier transform. For bivariate signals, the right-sided QFT definition with ${\mu} = {\boldsymbol{j}}$ is usually adopted [18]: \[{X}{\left({f}\right)} = \mathop{\int}\nolimits_{\Bbb{R}}{x}{(}{t}{)}{e}^{{-}{\boldsymbol{j}}{2}{\pi}{ft}}{dt},\,\,{x}{(}{t}{)}\,{\in}\,{\Bbb{C}}_{i}{.} \tag{4} \]
The definition (4) exhibits every desirable property of Fourier transforms: it is well defined for typical bivariate signals, it preserves energy and inner products (the Parseval-Plancherel theorem), and it can be computed efficiently with two fast Fourier transforms by observing that ${X}{\left({f}\right)} = {X}_{1}{\left({f}\right)} + {\boldsymbol{i}}{X}_{2}{\left({f}\right)}$, where ${X}_{1}{\left({f}\right)},{X}_{2}{\left({f}\right)}$ are standard (${\Bbb{C}}_{\boldsymbol{j}}$-valued) complex Fourier transforms of ${x}_{1}{(t)}$ and ${x}_{2}{(t)}$, respectively. More importantly, for bivariate signals viewed as ${(}{\Bbb{C}}_{\boldsymbol{i}}$-valued signals, it exhibits a Hermitian-like symmetry ${X}{\left({-}{f}\right)} = {-}{\boldsymbol{iX}}{\left({f}\right)}{\boldsymbol{i}}$, meaning that only the positive frequency spectrum carries relevant information. This makes it possible to define the quaternion embedding of a bivariate signal by canceling out the negative frequency spectrum. This bivariate analog of the well-known analytic signal of real-valued univariate signals is defined as \[{x}_{+}{(}{t}{)} = \mathop{\int}\nolimits_{{\Bbb{R}}_{+}}{X}{\left({f}\right)}{e}^{{\boldsymbol{j}}{2}{\pi}{ft}}{df}{.} \tag{5} \]
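As noted above, (4) reduces to two standard FFTs. A minimal sketch (quaternion spectra stored as [1, i, j, k] component arrays, a convention of ours) that also verifies the Hermitian-like symmetry ${X}{(}{-}{f}{)} = {-}{\boldsymbol{i}}{X}{(}{f}{)}{\boldsymbol{i}}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 128
x1, x2 = rng.standard_normal((2, N))   # two real components of x(t)

# Right-sided QFT with axis j: X(f) = X1(f) + i X2(f), where X1 and X2 are
# ordinary FFTs whose imaginary unit is read as j
X1, X2 = np.fft.fft(x1), np.fft.fft(x2)
# Quaternion components [1, i, j, k] of X(f)
X = np.stack([X1.real, X2.real, X1.imag, X2.imag], axis=-1)

# Hermitian-like symmetry X(-f) = -i X(f) i: conjugation by i keeps the
# 1 and i components and flips the sign of the j and k components
X_neg = X[(-np.arange(N)) % N]         # spectrum at negated frequencies
assert np.allclose(X_neg[:, 0],  X[:, 0])
assert np.allclose(X_neg[:, 1],  X[:, 1])
assert np.allclose(X_neg[:, 2], -X[:, 2])
assert np.allclose(X_neg[:, 3], -X[:, 3])
```

The symmetry holds because ${x}_{1}$ and ${x}_{2}$ are real, so ${X}_{1}$ and ${X}_{2}$ are individually Hermitian symmetric; this is exactly why the negative frequencies of ${X}{(}{f}{)}$ carry no extra information.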
The signal ${x}_{+}{(t)}$ is quaternion valued. Therefore, at each time instant, it can be decomposed thanks to the Euler polar form of a quaternion ${q} = {ae}^{{\boldsymbol{i}}{\theta}}{e}^{{-}{\boldsymbol{k}}{\chi}}{e}^{{\boldsymbol{j}}{\phi}}$, which identifies a magnitude ${a}{:} = {\left\vert{q}\right\vert}$ and three phases corresponding to successive rotations around axes ${\boldsymbol{i}},{\boldsymbol{j}}$, and k. The Euler polar form plays the same role as the standard polar form for the usual analytic signal. It establishes a one-to-one mapping between the original bivariate signal x(t) and a canonical quadruplet of instantaneous parameters ${[}{a}{(}{t}{),}{\theta}{(}{t}{),}{\chi}{(}{t}{),}{\phi}{(}{t}{)}{]}$, obtained by decomposing the quaternion embedding ${x}_{+}{(t)}$ as \[{x}_{+}{(}{t}{)} = {a}{(}{t}{)}{e}^{{\boldsymbol{i}}{\theta}{(}{t}{)}}{e}^{{-}{\boldsymbol{k}}{\chi}{(}{t}{)}}{e}^{{\boldsymbol{j}}{\phi}{(}{t}{)}}{.} \tag{6} \]
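A quick generative sanity check of (6): one can synthesize ${x}_{+}{(}{t}{)}$ from a chosen quadruplet of instantaneous parameters (the particular trajectories below are arbitrary, for illustration only) and verify that the quaternion modulus recovers ${a}{(}{t}{)}$, since the three exponential factors are unit quaternions:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def qexp(axis, angle):
    """e^{angle * axis} for a pure unit axis in {1: i, 2: j, 3: k}."""
    out = np.zeros((len(angle), 4))
    out[:, 0] = np.cos(angle)
    out[:, axis] = np.sin(angle)
    return out

N = 256
t = np.arange(N) / N
a = 1.0 + 0.5*np.cos(2*np.pi*t)        # instantaneous amplitude
theta = (np.pi/3) * t                  # slowly varying orientation
chi = (np.pi/8) * np.sin(2*np.pi*t)    # ellipticity
phi = 2*np.pi*32*t                     # fast dynamical phase

# x+(t) = a(t) e^{i theta(t)} e^{-k chi(t)} e^{j phi(t)}
Ei, Ek, Ej = qexp(1, theta), qexp(3, -chi), qexp(2, phi)
xp = np.array([a[n] * qmul(qmul(Ei[n], Ek[n]), Ej[n]) for n in range(N)])

# The quaternion modulus recovers the instantaneous amplitude a(t)
assert np.allclose(np.linalg.norm(xp, axis=1), a)
```

Going the other way, from ${x}_{+}{(}{t}{)}$ back to the canonical quadruplet, requires the phase-extraction formulas of the Euler polar form (Table 1) and is not sketched here.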
Under a classical narrowband assumption on x(t), it is now possible to attach a very insightful interpretation to the canonical parameters ${[}{a}{(}{t}{),}{\theta}{(}{t}{),}{\chi}{(}{t}{),}{\phi}{(}{t}{)}{]}$.
Figure 2(a) displays the instantaneous ellipse traced out by x(t) in the ${(}{x}_{1},{x}_{2}{)}$ plane. The ellipse is characterized by its size a(t), orientation ${\theta}{(}{t}{)}$, and shape ${\chi}{(}{t}{)}$: the last canonical parameter ${\phi}{(}{t}{)}$ corresponds to the dynamical phase, i.e., the instantaneous position of x(t) within the ellipse. This shows that the instantaneous parameters have a natural geometric interpretation, which also corresponds to the physical notion of polarization in optics. The geometric insights provided by the quaternion representation do not stop there. Figure 2(b) gives the Poincaré sphere of polarization states, where each point of the sphere corresponds to a given polarization state. Strikingly, it can be shown that the two quadratic quantities appearing in the Parseval–Plancherel theorem for the QFT correspond directly to the physical Stokes parameters ${S}_{0},{S}_{1},{S}_{2},{\text{ and }}{S}_{3}$, a set of four real-valued parameters widely used in optics to describe the polarization properties of light. The Poincaré sphere representation is particularly helpful, as it shows that normalized Stokes parameters ${S}_{1} / {S}_{0},{S}_{2} / {S}_{0},{\text{ and }}{S}_{3} / {S}_{0}$ are Cartesian coordinates for polarization states. For instance, the poles of the Poincaré sphere correspond to counterrotating circularly polarized states, whereas its equator contains linear polarization states, with orientation given by the longitude. This key property enables the definition of power spectral densities or time-frequency representations (such as spectrograms or scalograms) that have a straightforward interpretation in terms of physical polarization parameters.
Figure 2. (a) Instantaneous ellipse parameters ${[}{a},{\theta},{\chi},{\phi}{]}$ associated to the bivariate signal ${x}{(t)}$ thanks to the Euler polar form (6) of its quaternion embedding. (b) The Poincaré sphere of polarization states, a visual tool appearing naturally in the quaternion spectral representation of bivariate signals. Second-order properties of bivariate signals have a straightforward interpretation in terms of physical Stokes parameters ${S}_{0},{S}_{1},{S}_{2}$, and ${S}_{3}$. In turn, normalized Stokes parameters ${S}_{1} / {S}_{0},{S}_{2} / {S}_{0}$, and ${S}_{3} / {S}_{0}$ correspond to Cartesian coordinates of points on the Poincaré sphere. (c) A typical GW bivariate signal emitted by a compact binary black hole system, exhibiting the precession of the orbital plane. (d) A quaternion spectrogram of the bivariate signal depicted in (c). It describes the time-frequency content of the signal, both in terms of energy ${(}{S}_{0}{)}$ and polarization properties (normalized Stokes parameters ${S}_{1} / {S}_{0},{S}_{2} / {S}_{0}$, and ${S}_{3} / {S}_{0}$).
Figures 2(c) and (d) showcase the time-frequency analysis of a GW bivariate signal thanks to the QFT framework. GWs are bivariate signals, with two components ${x}_{1}{(t)}$ and ${x}_{2}{(t)}$ corresponding to plus- and cross-GW polarizations, respectively. Figure 2(c) depicts a simulated GW emitted by a binary black hole, a typical GW astrophysical source. The instantaneous variations of the interrelations between the amplitude and phase of each component yield polarization modulation, which provides crucial insights into the orbital motion of the source (e.g., precession or orbital eccentricity). Figure 2(d) represents the quaternion spectrogram of the bivariate GW signal depicted in Figure 2(c). The quaternion spectrogram is a natural generalization of the classical spectrogram: it relies on the definition of the quaternion short-term Fourier transform (QSTFT), which follows directly by considering the QFT (4) of windowed time-translated signal segments ${x}{(}{t}{)}{g}{(}{t}{-}{\tau}{)}$, where ${\tau}$ is a time shift and g(t) is usually a real symmetric window with compact support [18]. Second-order conservation properties of the QSTFT lead to representing and interpreting the quaternion spectrogram in terms of time-frequency Stokes parameters. These four time-frequency maps must be analyzed jointly: ${S}_{0}$ is like the standard well-known spectrogram and gives the energy distribution of the signal in the time-frequency plane; the normalized Stokes parameters ${S}_{1} / {S}_{0},{S}_{2} / {S}_{0},{\text{ and }}{S}_{3} / {S}_{0}$ reveal the instantaneous polarization state along ridges (i.e., lines of maximal energy in ${S}_{0}$).
More generally, the quaternion spectral representation of bivariate signals enables the definition of a full algebraic signal processing framework. It puts polarization at the heart of many fundamental tools of bivariate signal processing, such as general time-frequency or timescale representations [18], the definition of power spectral densities and their estimation [19], and the design of linear time-invariant filters [20]. Quaternions provide considerable insights into the geometric properties of bivariate signals by leveraging the well-established language of polarization in optics, without sacrificing the usual mathematical guarantees or the availability of computationally cheap implementations.
The interesting benefits of quaternion-valued representations immediately raise a natural question: Is it possible to extend standard signal processing tools [e.g., low-rank decompositions, minimization of cost functions, least-mean-squares (LMS) algorithms, and so on] to the quaternion setting in some natural way? If the answer were negative, then quaternion-valued representations would be of little practical use since the insights gained by choosing quaternion representations would be canceled out by cumbersome processing. Fortunately, this is not the case: standard tools of signal and image processing extend nicely over quaternion algebra. Still, this is not straightforward: the noncommutativity of the multiplication of two quaternions usually prevents direct extensions from the real and complex cases. Over the past three decades, researchers have dedicated a lot of energy to establishing a meaningful signal processing toolbox to deal with quaternion-valued data. We propose a tour of these important tools and illustrate their relevance on flagship applications.
Among the numerous applications of linear algebra in signal processing, low-rank approximation plays a central role, and algorithms for performing eigenvalue decomposition (EVD) or singular value decomposition (SVD) of data matrices are used on a daily basis in a wide range of applications. The existence of EVD and SVD for quaternion matrices has been known for a long time, even though standard concepts of linear algebra have to be considered with caution because of the noncommutativity of the quaternion product. For instance, one must distinguish between right and left eigenvalues, depending on the side position of the scalar in eigenvalue problems. Given a square quaternion-valued matrix ${\boldsymbol{A}}\,{\in}\,{\Bbb{H}}^{{N}\,{\times}\,{N}}$, its right eigenvalues ${\lambda}_{r}\,{\in}\,{\Bbb{H}}$ are defined as solutions of ${\boldsymbol{Ax}} = {\boldsymbol{x}}{\lambda}_{r}$, where ${\boldsymbol{x}}\,{\in}\,{\Bbb{H}}^{N}$ is a nonzero quaternion vector. Similarly, left eigenvalues ${\lambda}_{l}\,{\in}\,{\Bbb{H}}$ are defined as solutions of ${\boldsymbol{Ax}} = {\lambda}_{l}{\boldsymbol{x}}$. Right eigenvalues are well understood and have been used in many quaternion signal processing applications [21], [22], [23]. On the contrary, left eigenvalues are much more cumbersome; see, e.g., [24] for more details. For signal processing purposes, dealing with right eigenvalues is sufficient: they define natural quaternion counterparts of usual real and complex eigenvalues and can be computed efficiently. They also share some similar properties with their classical counterparts: for instance, a Hermitian matrix A [i.e., ${\boldsymbol{A}}^{\sf{H}} = {\boldsymbol{A}}$, where ${(}\cdot{)}^{\sf{H}}$ is the usual conjugate transpose operator] necessarily has real-valued right eigenvalues. 
Moreover, a necessary and sufficient condition for Hermitian A to be positive semidefinite (i.e., ${\boldsymbol{x}}^{\sf{H}}{\boldsymbol{Ax}}\,{\geq}\,{0}$ for any x) is to have all its right eigenvalues nonnegative.
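As a concrete illustration, right eigenvalues of a quaternion Hermitian matrix can be computed with standard complex linear algebra through the complex adjoint representation. The sketch below (the helper construction and random test matrix are our own choices, not from the article) writes ${\boldsymbol{A}} = {\boldsymbol{A}}_{1} + {\boldsymbol{A}}_{2}\,{\boldsymbol{j}}$ in Cayley-Dickson form and checks that the eigenvalues of the adjoint are real and appear in pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Cayley-Dickson splitting: A = A1 + A2*j with A1, A2 complex N x N.
# A is quaternion Hermitian iff A1 is Hermitian and A2 is antisymmetric.
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A1 = (B + B.conj().T) / 2                      # Hermitian part
C = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A2 = (C - C.T) / 2                             # antisymmetric part

# Complex adjoint of A = A1 + A2*j: chi(A) = [[A1, A2], [-conj(A2), conj(A1)]].
chi = np.block([[A1, A2], [-A2.conj(), A1.conj()]])

# chi(A) is a 2N x 2N Hermitian complex matrix; its eigenvalues are the
# right eigenvalues of A, each appearing twice.
lam = np.linalg.eigvalsh(chi)
print(lam.reshape(N, 2))   # real values, grouped in (numerically) identical pairs
```

Each row of the printed array collects one right eigenvalue of A with its duplicate, confirming that the Hermitian quaternion matrix has real right eigenvalues.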
The quaternion SVD (Q-SVD) of a rectangular quaternion-valued matrix ${\boldsymbol{A}}\,{\in}\,{\Bbb{H}}^{{N}\,{\times}\,{P}}$ is defined as [24] \[{\boldsymbol{A}} = {\boldsymbol{U}}\,{\bigtriangleup}\,{\boldsymbol{V}}^{\sf{H}},{\text{ with }}{\boldsymbol{U}}\,{\in}\,{\Bbb{H}}^{{N}\,{\times}\,{N}},\,{\boldsymbol{V}}\,{\in}\,{\Bbb{H}}^{{P}\,{\times}\,{P}},{\text{ and }}\,{\bigtriangleup}\,{\in}\,{\Bbb{R}}_{+}^{{N}\,{\times}\,{P}}{.} \tag{7} \]
The expression of Q-SVD looks very familiar: ${\bigtriangleup}$ is a rectangular diagonal matrix with nonnegative real numbers (the singular values of A) on its diagonal ${\bigtriangleup}_{ii} = {\delta}_{i}$, and U and V are unitary quaternion matrices, meaning that ${\boldsymbol{U}}^{\sf{H}}{\boldsymbol{U}} = {\boldsymbol{I}}_{N}$ and ${\boldsymbol{V}}^{\sf{H}}{\boldsymbol{V}} = {\boldsymbol{I}}_{P}$, where I is the identity matrix. Columns of U define the left singular vectors of A, whereas columns of V give the right singular vectors. Moreover, Q-SVD enables the computation of the best rank-${\ell}$ approximation of a quaternion matrix in the Frobenius norm: one simply truncates the Q-SVD by keeping only the first ${\ell}$ terms corresponding to the ${\ell}$ largest singular values. As an important special case of Q-SVD, consider the case of a Hermitian matrix ${\boldsymbol{A}}\,{\in}\,{\Bbb{H}}^{{N}\,{\times}\,{N}}$. Its Q-SVD reads ${\boldsymbol{A}} = {\boldsymbol{U}}{\Sigma}{\boldsymbol{U}}^{\sf{H}}$, where ${\Sigma} = {\text{diag}}{\left({\left\vert{\lambda}_{1}\right\vert},{\ldots},{\left\vert{\lambda}_{N}\right\vert}\right)}$ and ${\lambda}_{1},{\ldots},{\lambda}_{N}\,{\in}\,{\Bbb{R}}$ are the (right) eigenvalues of A. Therefore, if A is positive semidefinite, then its right eigenvalues are real nonnegative, and the Q-SVD of A is identical to its quaternion right EVD (Q-EVD). In particular, columns of U correspond to the (right) eigenvectors of A. As a result, computing the Q-EVD of a positive definite Hermitian matrix comes down to computing its Q-SVD, for which efficient algorithms exist [25]. This result is particularly important for subspace methods in array processing, which rely on the diagonalization of covariance matrices built from quaternion-valued samples, such as quaternion MUSIC (Q-MUSIC), detailed below.
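The Q-SVD can likewise be reached through the complex adjoint representation: the singular values of the adjoint are those of the quaternion matrix, each doubled. The sketch below (our own construction, with arbitrary sizes) checks this pairing together with the Frobenius norm identity ${\left\Vert{\boldsymbol{A}}\right\Vert}_{F}^{2} = \frac{1}{2}{\left\Vert{\chi}{(}{\boldsymbol{A}}{)}\right\Vert}_{F}^{2}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 5, 3

# Random quaternion matrix A = A1 + A2*j (Cayley-Dickson form).
A1 = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))
A2 = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))

# Complex adjoint chi(A); its singular values are those of A, each doubled.
chi = np.block([[A1, A2], [-A2.conj(), A1.conj()]])
s = np.linalg.svd(chi, compute_uv=False)       # sorted in descending order

sv = s[0::2]                                   # quaternion singular values of A
print(sv)

# Frobenius norm consistency: ||A||_F^2 = 1/2 ||chi(A)||_F^2.
frob2 = np.sum(np.abs(A1)**2 + np.abs(A2)**2)
print(np.isclose(frob2, 0.5 * np.sum(s**2)))
```

Truncating the adjoint SVD (keeping the $2\ell$ leading terms) then yields the best rank-${\ell}$ quaternion approximation discussed above.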
Q-SVD plays a crucial role in the definition of any subspace methods over quaternions. In vector sensor array processing, for example, Q-SVD allows us to separate quaternion data into signal and noise subspaces and to take advantage of vector sensor measurements to perform source localization [25], [26], [27], [28]. However, one may question the relevance of using quaternions for subspace estimation. The mathematical soundness of quaternion subspace methods relies on the orthogonality constraint inherited from the scalar product over ${\Bbb{H}}^{N}$. The columns of U and of V in (7) are orthogonal, meaning that ${\boldsymbol{u}}_{\ell}^{\sf{H}}{\boldsymbol{u}}_{\ell'} = {0}$ and ${\boldsymbol{v}}_{\ell}^{\sf{H}}{\boldsymbol{v}}_{\ell'} = {0}$ for ${\ell}\,{\ne}\,{\ell'}$. This actually imposes that the real and three imaginary parts of this scalar product cancel out simultaneously. Such a constraint is much stronger than complex orthogonality (considering a pair of complex-valued vectors rather than quaternion vectors). As a consequence, quaternion subspace methods are usually more robust to model errors than their long-vector complex counterparts.
An illustrative example of the application of subspace methods in signal processing is the estimation of the direction of arrival (DOA) of polarized sources impinging on a vector sensor array by using the Q-MUSIC algorithm introduced in [26]. For 2D vector sensors (crossed dipoles, geophones, acoustic vector sensors, and so on), a narrowband fully polarized source can be represented as the quaternion embedding of a bivariate signal (6) with linear phase ${\phi}{(}{t}{)} = {\omega}_{0}{t}$, slowly varying amplitude a(t), and constant polarization parameters ${[}{\theta},{\chi}{]}$. Consider a uniform linear array made of M 2D vector sensors, identically oriented, with equal spacing ${\bigtriangleup}{x}$. Assuming a single far-field source, the response of the mth sensor reads ${x}_{m}{(}{t}{)} = {a}{(}{t}{)}{\exp}{[}{\boldsymbol{i}}{\theta}{]}{\exp}{[}{-}{\boldsymbol{k}}{\chi}{]}{\exp}{[}{\boldsymbol{j}}{\omega}_{0}{t}{]}{\exp}{[}{-}{\boldsymbol{j}}{\tau}_{m}{(}{\alpha}{)}{]}$, where ${\tau}_{m}{(}{\alpha}{)} = {(}{2}{\pi} / {\lambda}{)}{(}{m}{-}{1}{)}{\bigtriangleup}{x}\,{\sin}\,{\alpha}$ is the phase delay between the first and mth sensors. Here, ${\lambda}$ denotes the wavelength and ${\alpha}$ the DOA of the source in the array plane. Therefore, a snapshot of the array at time instant t can be written as the quaternion-valued vector ${\boldsymbol{x}}{(}{t}{)}\,{\in}\,{\Bbb{H}}^{M}$ such that \[{\boldsymbol{x}}{(}{t}{)} = {x}_{0}{(}{t}{)}{\boldsymbol{v}}{(}{\alpha}{),}{\text{ with }}{\boldsymbol{v}}{(}{\alpha}{)} = {\left[{1},{e}^{{-}{\boldsymbol{j}}{\tau}_{2}{(}{\alpha}{)}},{\ldots},{e}^{{-}{\boldsymbol{j}}{\tau}_{M}{(}{\alpha}{)}}\right]}^{\top} \tag{8} \] where ${x}_{0}{(t)}$ is the quaternion-valued signal measured by the first sensor and ${(}\cdot{)}^{\top}$ denotes the transpose of a matrix. 
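A minimal numerical sketch of the single-source snapshot model (8) follows, with quaternions stored as length-4 real arrays; the `qmul` helper, the `expq` helper, and all parameter values are our own illustrative choices:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as [..., 4] real arrays."""
    pa, pb, pc, pd = np.moveaxis(p, -1, 0)
    qa, qb, qc, qd = np.moveaxis(q, -1, 0)
    return np.stack([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa], axis=-1)

M, dx_over_lambda, alpha = 10, 0.25, np.deg2rad(20.0)
tau = 2*np.pi * dx_over_lambda * np.arange(M) * np.sin(alpha)  # phase delays

# v(alpha): entries exp(-j*tau_m) = cos(tau_m) - sin(tau_m)*j, stored as
# quaternions [real, i, j, k].
v = np.zeros((M, 4))
v[:, 0] = np.cos(tau)
v[:, 2] = -np.sin(tau)

def expq(axis, angle):     # exp(mu*angle) for a canonical axis mu in {i, j, k}
    q = np.zeros(4); q[0] = np.cos(angle); q[axis] = np.sin(angle)
    return q

# Reference-sensor signal: a exp(i*theta) exp(-k*chi) exp(j*w0*t).
theta, chi, w0, t, a = np.pi/3, np.pi/5, 2*np.pi*5.0, 0.1, 1.0
x0 = a * qmul(qmul(expq(1, theta), expq(3, -chi)), expq(2, w0*t))

x = qmul(x0[None, :], v)   # snapshot x(t) = x0(t) v(alpha), entrywise product
print(x.shape)             # (10, 4)
```

Since all factors are unit quaternions here (a = 1), every entry of the snapshot has unit modulus, and the first entry reduces to ${x}_{0}{(t)}$ because ${\tau}_{1}{(}{\alpha}{)} = {0}$.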
Generalizing this model to a noise-corrupted linear superposition of K far-field polarized sources, a snapshot of the output of the sensor array can be expressed as \[{\boldsymbol{y}}{(}{t}{)} = \mathop{\sum}\limits_{{k} = {1}}\limits^{K}{x}_{0k}{(}{t}{)}{\boldsymbol{v}}_{k}{(}{\alpha}_{k}{)} + {\boldsymbol{n}}{(}{t}{)} \tag{9} \] with ${\boldsymbol{v}}_{k}$ the steering vector associated to source k of DOA ${\alpha}_{k}$ and where ${\boldsymbol{n}}{(}{t}{)}\,{\in}\,{\Bbb{H}}^{M}$ is the noise vector on the array. Following the rationale of subspace methods developed in the real and complex cases, one forms the quaternion-valued covariance matrix of the data ${\boldsymbol{C}}_{\boldsymbol{yy}} = {E}{\{}{\boldsymbol{y}}{(}{t}{)}{\boldsymbol{y}}^{H}{(}{t}{)}{\}}\,{\in}\,{\Bbb{H}}^{{M}\,{\times}\,{M}}$, where ${E}{\{}{\cdot}{\}}$ denotes the mathematical expectation. Then, under some commonly accepted assumptions (uncorrelated sources and white noise), the matrix ${\boldsymbol{C}}_{\boldsymbol{yy}}$ is shown [26] to be written as \[{\boldsymbol{C}}_{\boldsymbol{yy}} = {\boldsymbol{D}}{\Sigma}{\boldsymbol{D}}^{\sf{H}} + {\sigma}_{n}^{2}{\boldsymbol{I}}_{M} \tag{10} \] where the columns of ${\boldsymbol{D}}\,{\in}\,{\Bbb{H}}^{{M}\,{\times}\,{K}}$ correspond to the K source vectors; ${\boldsymbol{d}}_{k}{(}{\theta}_{k},{\chi}_{k},{\alpha}_{k}{)} = {e}^{{\boldsymbol{i}}{\theta}_{k}}{e}^{{-}{\boldsymbol{k}}{\chi}_{k}}{\boldsymbol{v}}_{k}{(}{\alpha}_{k}{),}\,{\Sigma} = {\text{diag}}{(}{\sigma}_{1}^{2},{\ldots},{\sigma}_{K}^{2}{)}$, with ${\sigma}_{k}^{2} = {\left\vert{x}_{0k}{(t)}\right\vert}^{2}$ the kth source power; and ${\sigma}_{n}^{2}$ is the noise power. Now, classical arguments yield that the signal subspace of the observations is spanned by the K quaternion eigenvectors associated to the K largest eigenvalues of ${\boldsymbol{C}}_{\boldsymbol{yy}}$. The remaining M – K eigenvectors form an orthonormal basis in the noise subspace. 
Recall that in this case, the Q-EVD of ${\boldsymbol{C}}_{\boldsymbol{yy}}$ boils down to the Q-SVD (since it is Hermitian positive semidefinite). By leveraging the orthogonality between the signal and noise subspaces, the expression of the Q-MUSIC functional is \[{\cal{S}}_{\text{Q-MUSIC}}{(}{\theta},{\chi},{\alpha}{)} = \frac{1}{{\boldsymbol{d}}^{\sf{H}}{(}{\theta},{\chi},{\alpha}{)}{\bf{\Pi}}{\boldsymbol{d}}{(}{\theta},{\chi},{\alpha}{)}} \tag{11} \] where ${\boldsymbol{d}}{(}{\theta},{\chi},{\alpha}{)} = {e}^{{\boldsymbol{i}}{\theta}}{e}^{{-}{\boldsymbol{k}}{\chi}}{\boldsymbol{v}}{(}{\alpha}{)}$ is the steering vector and ${\bf{\Pi}} = {\boldsymbol{U}}_{{K} + {1}{:}}{\boldsymbol{U}}_{{K} + {1}{:}}^{\sf{H}}$ is the orthogonal projector on the noise subspace spanned by ${\boldsymbol{U}}_{{K} + {1}{:}} = {[}{\boldsymbol{u}}_{{K} + {1}},{\ldots},{\boldsymbol{u}}_{M}{]}$. Similar to the standard MUSIC algorithm for scalar sensors, the K highest local maxima of (11) correspond to the DOAs and polarization parameters of the K sources.
Classically, when dealing with 2D vector sensors, the two complex-valued components of size M of the array are simply concatenated in a long vector of size 2M. Then, the standard MUSIC algorithm is applied to this enhanced data vector. This is commonly known as long-vector MUSIC (LV-MUSIC). Compared to LV-MUSIC, where classical orthogonality is imposed globally over the two stacked components of the array, the quaternion vector orthogonality associated with Q-MUSIC implies stronger orthogonality constraints (component-wise) between signal and noise subspaces. It was shown in [26] that these quaternion constraints enable a better separation between subspaces. This results in improved performance (in terms of estimation error and resolution) compared to LV-MUSIC while reducing the size of the data covariance representation. Figure 3 provides a brief illustration of the advantages of Q-MUSIC compared to its long-vector counterpart. Following these results, several other quaternion methods [4], [6], [29] extending the standard subspace-based DOA estimation algorithms to the quaternion framework have been proposed.
Figure 3. A comparison of Q-MUSIC and LV-MUSIC for a uniform linear array with ${M} = {10}$ sensors with spacing ${\Delta}{x} = {\lambda} / {4}$. The signal-to-noise ratio (SNR) was set to 0 dB, assuming ${\Bbb{H}}$-proper Gaussian noise. A sample covariance matrix was computed using 32 samples in all experiments. (a) Plots of the Q-MUSIC (red) and LV-MUSIC (blue) functionals for two far-field polarized sources with DOAs ${\alpha}_{1}$ and ${\alpha}_{2}$. The two panels depict values of functionals with respect to angle ${\alpha}$ along two slices corresponding to polarization parameters of each source (left: ${\theta} = {\pi} / {3},{\chi} = {\pi} / {5}$; right: ${\theta} = {-}{\pi} / {6},{\chi} = {-}{\pi} / {5}$). (b) A performance comparison in terms of the mean square error (MSE) on the DOA in a single-source scenario. The curves depict average values computed using 1,000 independent runs.
Modern signal and image processing problems can often be formulated as solutions of optimization problems. Quaternion domain problems are no exception. For instance, the quaternion LMS (QLMS) algorithm described in the “Statistics for Quaternion Random Variables” section or the sparse coding of 3D/4D data (see the following) can all be formulated as the problem of minimizing a real-valued cost function f with quaternion arguments. Formally, one is interested in solving optimization problems of the form \[{\hat{q}} = \mathop{\arg\min}\limits_{{q}\,{\in}\,{\Bbb{H}}}{f}{(}{q}{)} \tag{12} \] where ${f}{:}{\Bbb{H}}\,{\rightarrow}\,{\Bbb{R}}$ is a cost function (e.g., a quadratic form) and ${\hat{q}}$ is the solution of the minimization problem. This formulation extends easily to the case of quaternion vectors ${\boldsymbol{q}}\,{\in}\,{\Bbb{H}}^{M}$ and quaternion matrices ${\boldsymbol{Q}}\,{\in}\,{\Bbb{H}}^{{M}\,{\times}\,{N}}$ variables. Now, how can one solve (12)? If q were a standard real-valued variable, it would be sufficient to explore the stationary points of f (points where the derivative of f vanishes, at least in the differentiable case) to obtain the different local minima of the cost function f. Unfortunately, the same approach cannot be applied directly with quaternion variables. The main reason is that quaternion differentiability (in a quaternion analysis sense) can be defined only for analytic functions ${f}{:}{\Bbb{H}}\,{\rightarrow}\,{\Bbb{H}}$. In particular, cost functions, which are of interest for signal and image processing, are not analytic, as they take values only in ${\Bbb{R}}$. 
On the other hand, the real-valued cost function f in (12) of quaternion argument ${q} = {q}_{a} + {\boldsymbol{i}}{q}_{b} + {\boldsymbol{j}}{q}_{c} + {\boldsymbol{k}}{q}_{d}$ can also be considered a function of its four real-valued components ${q}_{a},{q}_{b},{q}_{c}{,}{\text{ and }}{q}_{d}$ such that ${f}{(}{q}{)} = {f}{(}{q}_{a},{q}_{b},{q}_{c},{q}_{d}{)}$, with little abuse of notation. This formal identification suggests that one could compute the derivative of the function f as if it were a function of ${\Bbb{R}}^{4}$. This apparent contradiction between real-domain and quaternion domain derivatives is addressed by the theory of (generalized) ${\Bbb{HR}}$ calculus, developed in the first half of the 2010s [30], [31], [32], [33]. In essence, it forms a complete set of calculus rules that permit computations of derivatives of functions of quaternion variables as if they were functions of their real and imaginary parts. Still, it operates directly in the quaternion domain, meaning that the burden of transforming (12) into an equivalent real-domain optimization problem can be completely avoided.
Before going further, let us mention that the construction of ${\Bbb{HR}}$ calculus is closely related to the framework of ${\Bbb{CR}}$ calculus (also known as Wirtinger calculus) for solving optimization problems in complex variables, which are often encountered in complex-valued signal processing. Still, the derivation of ${\Bbb{HR}}$ calculus is far from being trivial. Many differences and subtleties arise when generalizing from complex variables to quaternion variables, in particular due to the noncommutativity of the quaternion product. While we survey only the basics of ${\Bbb{HR}}$ calculus here to illustrate its usefulness, we refer the interested reader to the seminal papers [30], [31], and [32] for detailed proofs and extensive results.
Let us consider a cost function ${f}{:}{\Bbb{H}}\,{\rightarrow}\,{\Bbb{R}}$. We suppose, for convenience, that f is real differentiable [32]; that is, the function ${f}{(}{q}_{a},{q}_{b},{q}_{c},{q}_{d}{)}$ is differentiable with respect to its four variables ${q}_{a},{q}_{b},{q}_{c}$, and ${q}_{d}$. The ${\Bbb{HR}}$ derivatives of f with respect to q and ${\bar{q}}$ are defined as [33] \begin{align*} \frac{\partial{f}}{\partial{q}} & = {\frac{1}{4}}{\left(\frac{\partial{f}}{\partial{q}_{a}}{-}\frac{\partial{f}}{\partial{q}_{b}}{\boldsymbol{i}}{-}\frac{\partial{f}}{\partial{q}_{c}}{\boldsymbol{j}}{-}\frac{\partial{f}}{\partial{q}_{d}}{\boldsymbol{k}}\right)} \tag{13} \\ \frac{\partial{f}}{\partial{\bar{q}}} & = {\frac{1}{4}}{\left(\frac{\partial{f}}{\partial{q}_{a}} + \frac{\partial{f}}{\partial{q}_{b}}{\boldsymbol{i}} + \frac{\partial{f}}{\partial{q}_{c}}{\boldsymbol{j}} + \frac{\partial{f}}{\partial{q}_{d}}{\boldsymbol{k}}\right)}{.} \tag{14} \end{align*}
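Definitions (13) and (14) can be checked numerically. The snippet below (a finite-difference sketch of our own) evaluates the ${\Bbb{HR}}$ derivatives of ${f}{(}{q}{)} = {\left\vert{q}\right\vert}^{2}$ and compares them with the closed forms that follow directly from (13) and (14), namely ${\partial}{f} / {\partial}{\bar{q}} = {q} / {2}$ and ${\partial}{f} / {\partial}{q} = {\bar{q}} / {2}$:

```python
import numpy as np

# f(q) = |q|^2 viewed as a function of the four real components of q.
f = lambda qa, qb, qc, qd: qa**2 + qb**2 + qc**2 + qd**2

q = np.array([0.3, -1.2, 0.7, 0.5])            # [qa, qb, qc, qd]
h = 1e-6

# Partial derivatives of f w.r.t. each real component (central differences).
parts = np.array([(f(*(q + h*e)) - f(*(q - h*e))) / (2*h)
                  for e in np.eye(4)])

# HR derivatives (13)-(14): quaternion valued, stored as [real, i, j, k].
df_dq    = 0.25 * parts * np.array([1, -1, -1, -1])
df_dqbar = 0.25 * parts

# Closed forms implied by (13)-(14): df/dqbar = q/2 and df/dq = conj(q)/2.
print(np.allclose(df_dqbar, q / 2))
qbar = q * np.array([1, -1, -1, -1])
print(np.allclose(df_dq, qbar / 2))
```

Note how the 1/4 prefactor of (13) and (14) produces the factor 1/2 (rather than 1) in both derivatives.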
More generally, it is also possible to define ${\Bbb{HR}}$ derivatives with respect to canonical involutions ${q}^{\boldsymbol{i}},{q}^{\boldsymbol{j}}{,}{\text{ and }}{q}^{\boldsymbol{k}}$ and their respective conjugates. Equations (13) and (14) show that ${\Bbb{HR}}$ derivatives are quaternion valued, as they collect partial derivatives of f(q) with respect to each one of the components of q along the canonical basis ${\{}{1},{\boldsymbol{i}},{\boldsymbol{j}},{\boldsymbol{k}}{\}}$. Note that since we assume f to be real valued, the two ${\Bbb{HR}}$ derivatives are related through conjugation ${\overline{{(}\partial{f} / \partial{q}{)}}} = {\partial{f}} / {\partial{\bar{q}}}$. However, to turn the fundamental definitions (13) and (14) into a practical framework for computing quaternion derivatives, it is necessary to consider additional derivatives with respect to ${q}^{\mu}$ and ${\overline{{q}^{\mu}}}$, where ${\mu}\,{\in}\,{\Bbb{H}}$ and ${\mu}\,{\ne}\,{0}$. This leads to generalized ${\Bbb{HR}}$ calculus [32], which equips (13) and (14) with convenient product and chain rules. It is worth noting that these properties hold for arbitrary quaternion-valued functions ${f}{:}{\Bbb{H}}\,{\rightarrow}\,{\Bbb{H}}$; i.e., the theory is not limited to cost functions. Importantly, the (generalized) ${\Bbb{HR}}$ calculus framework has been extended [30] to compute derivatives with respect to quaternion vector and matrix variables. Vector and matrix counterparts of ${\partial{f}} / {\partial{q}}$ and ${\partial{f}} / {\partial{\bar{q}}}$ are denoted in quite a natural way; i.e., ${\nabla}_{\boldsymbol{q}}{f}$ and ${\nabla}_{\bar{\boldsymbol{q}}}{f}$ denote ${\Bbb{HR}}$ gradients of ${f}{(}{\boldsymbol{q}}{)}$, while ${\nabla}_{\boldsymbol{Q}}{f}$ and ${\nabla}_{\bar{\boldsymbol{Q}}}{f}$ correspond to matrix derivatives of the function ${f}{(}{\boldsymbol{Q}}{)}$. Table 2 collects such derivatives for some elementary cost functions in the scalar, vector, and matrix cases. 
Only derivatives with respect to conjugates ${\bar{q}},{\bar{\boldsymbol{q}}}$, and ${\bar{\boldsymbol{Q}}}$ are given; since the considered functions are real valued, it suffices to take the quaternion conjugate to obtain derivatives with respect to ${q},{\boldsymbol{q}}$, or Q. Moreover, observe the similarity between ${\Bbb{HR}}$ derivatives and usual real-domain functions: the only noticeable difference is the unusual 1/4 prefactor arising from definitions (13) and (14).
Table 2. The ${\Bbb{HR}}$ derivatives of usual cost functions of quaternion scalar, vector, and matrix arguments.
As explained earlier, solving (12) requires the ability to find stationary points of the cost function. The ${\Bbb{HR}}$ calculus enables their characterization as one would expect: for instance, a vector ${\boldsymbol{q}}_{\star}$ is a stationary point of ${f}{(}{\boldsymbol{q}}{)}$ if the ${\Bbb{HR}}$ gradient vanishes at this point; i.e., ${\nabla}_{\boldsymbol{q}}{f}{(}{\boldsymbol{q}}_{\star}{)} = {\nabla}_{\bar{\boldsymbol{q}}}{f}{(}{\boldsymbol{q}}_{\star}{)} = {\boldsymbol{0}}$. Moreover, the direction of the maximum rate of change of f at point q is given by ${\nabla}_{\bar{\boldsymbol{q}}}{f}{(}{\boldsymbol{q}}{)}$, that is, the ${\Bbb{HR}}$ gradient with respect to ${\bar{\boldsymbol{q}}}$. This immediately yields the steepest descent method in the quaternion domain, with iterations \[{\boldsymbol{q}}^{{(}{k} + {1}{)}} = {\boldsymbol{q}}^{(k)}{-}{\eta}_{k}{\nabla}_{\bar{\boldsymbol{q}}}{f}{\left({\boldsymbol{q}}^{(k)}\right)} \tag{15} \] where ${\eta}_{k}\,{>}\,{0}$ is the step size at iteration k. Equation (15) is one of the first building blocks of quaternion domain optimization. For instance, it represents a cornerstone of the derivation of the QLMS algorithm and its variants, as we explain in detail in the “Statistics for Quaternion Random Variables” section. The generalized ${\Bbb{HR}}$ derivatives framework can be further refined. For instance, it is possible to define a meaningful quaternion Hessian matrix, thus paving the way to extend well-known second-order algorithms, such as the Newton method, to the quaternion domain. We refer the interested reader to [31] for further details on this topic.
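The iteration (15) takes only a few lines for a scalar toy problem. The following sketch (helper functions and numerical values are our own) minimizes ${f}{(}{q}{)} = {\left\vert{aq}{-}{y}\right\vert}^{2}$, whose ${\Bbb{HR}}$ gradient is ${\nabla}_{\bar{q}}{f} = {(}{1} / {2}{)}{\bar{a}}{(}{aq}{-}{y}{)}$ (the scalar analogue of the least-squares gradient), and checks convergence to ${q}_{\star} = {a}^{{-}{1}}{y}$:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of two quaternions stored as length-4 real arrays."""
    pa, pb, pc, pd = p
    qa, qb, qc, qd = q
    return np.array([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa])

conj = lambda q: q * np.array([1, -1, -1, -1])

# Minimize f(q) = |a*q - y|^2 with the steepest descent iteration (15),
# using grad_{qbar} f = (1/2) * conj(a) * (a*q - y).
a = np.array([1.0, 0.5, -0.3, 0.2])
y = np.array([0.7, -0.1, 0.4, 1.1])

q, eta = np.zeros(4), 0.5
for _ in range(200):
    grad = 0.5 * qmul(conj(a), qmul(a, q) - y)
    q = q - eta * grad

# The minimizer is q* = a^{-1} y = conj(a) * y / |a|^2.
q_star = qmul(conj(a), y) / np.dot(a, a)
print(np.allclose(q, q_star))
```

Because ${\bar{a}}{a} = {\left\vert{a}\right\vert}^{2}$ is real, each iteration contracts the error by the factor ${1}{-}{\eta}{\left\vert{a}\right\vert}^{2} / {2}$, so convergence is geometric for any sufficiently small step size.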
To further illustrate the generalized ${\Bbb{HR}}$ calculus framework, let us consider the quaternion sparse coding problem. Let ${\boldsymbol{y}}\,{\in}\,{\Bbb{H}}^{M}$ be a vector of observations that one seeks to represent in a dictionary ${\boldsymbol{D}}\,{\in}\,{\Bbb{H}}^{{M}\,{\times}\,{N}}$ as ${\boldsymbol{y}}\,{\approx}\,{\boldsymbol{Dq}}$, where q is a sparse vector and N > M. This problem appears naturally in color imaging, where y is a color patch encoded as a pure quaternion vector, D is an overcomplete collection of color atoms, and q is a sparse quaternion vector. Just like in the real and complex cases, this problem can be solved using greedy algorithms based on the ${\ell}_{0}$ penalty, such as quaternion orthogonal matching pursuit [8], [34]. On the other hand, following standard practice, one can relax the nonconvex ${\ell}_{0}$ penalty into a convex ${\ell}_{1}$-norm regularization, leading to the quaternion least absolute shrinkage and selection operator (LASSO) problem [35], [36] (or equivalently, quaternion basis pursuit denoising): \[\mathop{\arg\min}\limits_{{\boldsymbol{q}}\,{\in}\,{\Bbb{H}}^{M}}{\left\Vert{\boldsymbol{y}}{-}{\boldsymbol{Dq}}\right\Vert}_{2}^{2} + {\lambda}{\left\Vert{\boldsymbol{q}}\right\Vert}_{1},\,\,{\lambda}\,{\geq}\,{0}{.} \tag{16} \]
In (16), the quaternion ${\ell}_{1}$ norm of vector ${\boldsymbol{q}}\,{\in}\,{\Bbb{H}}^{M}$ is defined as the sum of moduli of its entries ${\left\Vert{\boldsymbol{q}}\right\Vert}_{1}{:} = {\Sigma}_{{m} = {1}}^{M}{\left\vert{q}_{m}\right\vert}$. Interestingly, since the quaternion ${\ell}_{1}$ norm collects a sum of moduli, it can be interpreted as a mixed norm ${\ell}_{2,1}$ on the real matrix A obtained by concatenating the four real-valued components of q such that ${\boldsymbol{A}} = {\left[{\boldsymbol{q}}_{a}{\boldsymbol{q}}_{b}{\boldsymbol{q}}_{c}{\boldsymbol{q}}_{d}\right]}\,{\in}\,{\Bbb{R}}^{{M}\,{\times}\,{4}}$; i.e., ${\left\Vert{\boldsymbol{q}}\right\Vert}_{1} = {\left\Vert{\boldsymbol{A}}\right\Vert}_{2,1}$. This means that the Q-LASSO (16) can be interpreted in the real domain as a group LASSO with M groups of size four.
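This equivalence between the quaternion ${\ell}_{1}$ norm and the mixed ${\ell}_{2,1}$ norm is immediate to verify numerically (a small sketch with random data of our own choosing):

```python
import numpy as np

rng = np.random.default_rng(2)
Q = rng.standard_normal((6, 4))      # 6 quaternions as rows [qa, qb, qc, qd]

# Quaternion l1 norm: sum of the quaternion moduli of the entries.
l1 = np.sum(np.linalg.norm(Q, axis=1))

# Mixed l2,1 norm of A = [q_a q_b q_c q_d]: l2 norm of each row, then sum.
l21 = np.sum(np.sqrt(np.sum(Q**2, axis=1)))

print(np.isclose(l1, l21))
```

Both expressions reduce to the same sum of row norms, which is precisely why the Q-LASSO (16) maps to a group LASSO with M groups of size four.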
The problem (16) defines a convex optimization problem in quaternion variables. In practice, quaternion convex optimization problems can be solved by translating usual real-domain algorithms to the quaternion case by leveraging generalized ${\Bbb{HR}}$ calculus. Still, this generalization is not trivial and requires special care [37]. In particular, the general form of equality constraints in quaternion convex optimization problems is widely affine; i.e., it reads ${\boldsymbol{A}}_{1}{\boldsymbol{q}} + {\boldsymbol{A}}_{2}{\boldsymbol{q}}^{i} + {\boldsymbol{A}}_{3}{\boldsymbol{q}}^{j} + {\boldsymbol{A}}_{4}{\boldsymbol{q}}^{k} = {\boldsymbol{b}}$, where matrices and vectors are quaternion valued with appropriate sizes. In comparison, in real-domain convex optimization, only affine equality constraints of the form ${\boldsymbol{Ax}} = {\boldsymbol{b}}$, where vectors are all real valued, are considered. To solve the (unconstrained) problem (16), one can adapt the celebrated iterative shrinkage-thresholding algorithm (ISTA) to handle quaternion variables. Generalized ${\Bbb{HR}}$ calculus makes it possible to derive the quaternion ISTA iterations in an intuitive way. Letting ${f}{(}{\boldsymbol{q}}{)} = {\left\Vert{\boldsymbol{y}}{-}{\boldsymbol{Dq}}\right\Vert}_{2}^{2}$, the iterations read \[{\boldsymbol{q}}^{{(}{k} + {1}{)}} = {\cal{T}}_{{{\lambda}{\eta}}_{k}}{\left\{{\boldsymbol{q}}^{(k)}{-}{\eta}_{k}{\nabla}_{\bar{\boldsymbol{q}}}{f}{(}{\boldsymbol{q}}^{(k)}{)}\right\}} \tag{17} \] where ${\eta}_{k}\,{>}\,{0}$ is the step size at iteration k and ${\cal{T}}_{\beta}$ is the soft thresholding operator (i.e., the proximal operator associated to the quaternion ${\ell}_{1}$ norm) given entry-wise by [38]: ${\cal{T}}_{\beta}{(}{q}{)} = {\max}{(}{0},{1}{-}{\beta} / {\left\vert{q}\right\vert}{)}{q}$.
Note that in this case, the gradient can be directly computed, thanks to Table 2, as ${\nabla}_{\bar{\boldsymbol{q}}}{f}{(}{\boldsymbol{q}}{)} = {(}{1} / {2}{)}{\boldsymbol{D}}^{\sf{H}}{(}{\boldsymbol{Dq}}{-}{\boldsymbol{y}}{)}$.
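Putting (16) and (17) together gives a self-contained quaternion ISTA sketch, with quaternions stored as length-4 real arrays; the helper names, problem sizes, and step size are our own choices (the step is taken small enough for the descent to be stable):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as [..., 4] real arrays."""
    pa, pb, pc, pd = np.moveaxis(p, -1, 0)
    qa, qb, qc, qd = np.moveaxis(q, -1, 0)
    return np.stack([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa], axis=-1)

conj = lambda q: q * np.array([1., -1., -1., -1.])

def qmatvec(D, x):        # quaternion matrix-vector product, D: (M, N, 4)
    return qmul(D, x[None, :, :]).sum(axis=1)

def qherm(D):             # conjugate transpose of a quaternion matrix
    return conj(D).transpose(1, 0, 2)

def soft(q, beta):        # entrywise proximal operator of the quaternion l1 norm
    mod = np.linalg.norm(q, axis=-1, keepdims=True)
    return np.maximum(0.0, 1.0 - beta / np.maximum(mod, 1e-30)) * q

rng = np.random.default_rng(3)
M, N = 8, 16
D = rng.standard_normal((M, N, 4)) / np.sqrt(4 * M)
q_true = np.zeros((N, 4)); q_true[[2, 9]] = rng.standard_normal((2, 4))
y = qmatvec(D, q_true)    # noise-free sparse observations

# Quaternion ISTA (17), with grad_{qbar} f(q) = (1/2) D^H (Dq - y).
q, eta, lam = np.zeros((N, 4)), 0.1, 1e-3
for _ in range(1000):
    grad = 0.5 * qmatvec(qherm(D), qmatvec(D, q) - y)
    q = soft(q - eta * grad, lam * eta)

print(np.linalg.norm(qmatvec(D, q) - y))   # residual shrinks below ||y||
```

Note that the soft thresholding acts on the quaternion modulus of each entry, which is exactly the group-wise shrinkage predicted by the ${\ell}_{2,1}$ interpretation of the quaternion ${\ell}_{1}$ norm.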
Of course, more sophisticated optimization problems can be formulated and solved directly in the quaternion domain. For instance, in the case of 3D data sparse coding (such as color images) that uses pure quaternions, it might be relevant to impose that the solution ${\boldsymbol{q}}_{\star}$ satisfies ${\text{Re}}{(}{\boldsymbol{Dq}}_{\star}{)} = {\boldsymbol{0}}$. This constraint is widely linear, meaning that constraining the problem (16) preserves its convex nature. Solving this type of constrained quaternion optimization problem can be carried out within the same general framework, for instance, using the quaternion alternating direction method of multipliers, as explained in [37].
Statistics for quaternion random variables and vectors have been developed to extend classical signal processing algorithms relying on probabilistic models. As 4D variables, quaternions can be understood either as 4D real vector-valued random variables, i.e., variables in ${\Bbb{R}}^{4}$, or as 2D complex random vectors, i.e., variables in ${\Bbb{C}}^{2}$. Just like for complex-valued random variables and vectors, detecting and taking into account symmetries in the probability density function (pdf) of quaternion random variables is essential in devising powerful signal processing tools. The notion of properness (also known as second-order circularity) captures such rotation invariance of the pdf. It was considered in many scenarios and exploited in several algorithms either as an extra parameter in the signal model or as a signature of the absence/presence of a targeted signal hidden in a noisy environment. A solid body of literature is available for the complex case (see, e.g., [39] and the references therein).
In the quaternion case, the original study and definition of properness traces back to Vakhania’s work [40] before being considered by the signal processing community [41], [42], [43], [44]. The major difference between the complex and quaternion cases is the existence of different levels of properness over ${\Bbb{H}}$, while only one level can be identified over ${\Bbb{C}}$. The three levels of properness of a quaternion-valued random variable are denoted ${\Bbb{R}}$ properness, ${(}{1},{\mu}{)}$ properness (where ${\mu}$ is a pure unit quaternion, also denoted ${\Bbb{C}}^{\mu}$ properness), and ${\Bbb{H}}$ properness [42], [44]. (The notion can be directly extended to vectors, but we illustrate here the concept on scalar-valued variables for clarity.) These levels correspond to different invariance properties of the random variable distribution. In the Gaussian case, properness means invariance properties of the 4 × 4 real (augmented) covariance matrix ${\bf{\Gamma}}_{\Bbb{R}} = {E}{\{}{\boldsymbol{q}}_{\Bbb{R}}{\boldsymbol{q}}_{\Bbb{R}}^{\top}{\}}$, where ${\boldsymbol{q}}_{\Bbb{R}} = {\left[{q}_{a},{q}_{b},{q}_{c},{q}_{d}\right]}^{\top}\,{\in}\,{\Bbb{R}}^{4}$.
Thus, symmetries of the quaternion Gaussian pdf can be interpreted as symmetries of the equivalent real covariance matrix of the augmented vector ${\boldsymbol{q}}_{\Bbb{R}}$ gathering the real and imaginary parts of ${q}\,{\in}\,{\Bbb{H}}$. Figure 4 gives an illustration of the structure induced by two levels of properness on the 4 × 4 real covariance matrix ${\bf{\Gamma}}_{\Bbb{R}}$ in the case of centered quaternion-valued Gaussian variables. While the ${\Bbb{H}}$ properness level is the “simplest” (the four components of the quaternion variables are uncorrelated and have the same variance), the ${(}{1},{\mu}{)}$-proper case has a much more complicated correlation structure (here, ${\mu} = {\boldsymbol{j}}$ for illustration purposes).
Figure 4. Two different properness levels for a centered quaternion scalar Gaussian variable: (a) ${\Bbb{H}}$ proper and (b) ${(}{1},{\boldsymbol{j}}{)}$ proper. The insets show the corresponding structure of the augmented covariance matrix ${\bf{\Gamma}}_{\Bbb{R}}$, where identical colors indicate equal values.
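These covariance structures can be probed empirically. In the sketch below (sample size and mixing coefficients are arbitrary choices of ours), four i.i.d. components give an ${\Bbb{H}}$-proper variable whose augmented covariance is close to a scalar matrix, while correlating components produces an improper variable with a structured ${\bf{\Gamma}}_{\Bbb{R}}$:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 200_000

# H-proper Gaussian: four i.i.d. real components with equal variance.
q_R = rng.standard_normal((T, 4))          # rows: [qa, qb, qc, qd]
Gamma = q_R.T @ q_R / T                    # sample augmented covariance

print(np.round(Gamma, 2))                  # approximately the identity

# Correlating components (here qa with qc, and qb with qd) breaks
# H-properness: the augmented covariance is no longer a scalar matrix.
q_R2 = q_R.copy()
q_R2[:, 2] = 0.8 * q_R[:, 0] + 0.6 * q_R[:, 2]
q_R2[:, 3] = 0.8 * q_R[:, 1] + 0.6 * q_R[:, 3]
Gamma2 = q_R2.T @ q_R2 / T
print(np.round(Gamma2, 2))                 # structured off-diagonal blocks
```

The second matrix exhibits the kind of repeated off-diagonal pattern sketched in Figure 4(b), with unit variances preserved on the diagonal.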
Accounting for properness is essential in quaternion-valued signal processing. It means that the symmetries, or invariances, of the distribution are correctly taken into account. Therefore, an important research effort has been dedicated to developing and formulating statistical tools to deal with proper and improper quaternion signals. This includes extensions of the generalized likelihood ratio method to test properness levels of quaternion variables [45] and the development of quaternion independent component analysis based on second-order statistics [46]. In estimation, accounting for properness naturally leads to the notion of widely linear processing [42], [47], which exploits the dependence between a quaternion random variable x and its three canonical involutions ${\boldsymbol{x}}^{\boldsymbol{i}}$, ${\boldsymbol{x}}^{\boldsymbol{j}}$, and ${\boldsymbol{x}}^{\boldsymbol{k}}$ to capture the full second-order information, hence providing optimal estimation performance.
Another important application of quaternion-valued signal processing is provided by the quaternion implementation of the popular LMS adaptive filter. When dealing with 3D and 4D signals, using quaternion representations along with the linear quaternion model allows exploiting the geometric coupling between channels in a natural and structured manner. This is what motivated the development of the QLMS algorithm. The first formulation of the QLMS was proposed in [5] and applied to temperature and 3D wind data measurements. Subsequently, various forms of quaternion adaptive filters have been introduced along with their performance analysis [21], [23], [32], [48], [49], [50].
Hereafter, we provide an overview of the derivation of the update expression for the QLMS filter weights. The QLMS algorithm is generally used to estimate the quaternion-valued vector w of coefficients of a desired filter by minimizing the LMS of the error signal (the difference between the desired and actual signal). If n is the number of the current input sample ${\boldsymbol{x}}{(n)}$, the QLMS aims at minimizing the real-valued cost function of quaternion variables ${\cal{J}}{(}{n}{)} = {\left\vert{e}{(}{n}{)}\right\vert}^{2}$, where ${e}{(}{n}{)} = {d}{(}{n}{)}{-}{y}{(}{n}{)}$ is the error between the desired signal d(n) and the filter output ${y}{(}{n}{)} = {\boldsymbol{w}}{(}{n}{)}^{\sf{H}}{\boldsymbol{x}}{(}{n}{)}$. This minimization is performed using a steepest-descent procedure, and the QLMS weights update is given by [see (15)]: \[{\boldsymbol{w}}{(}{n} + {1}{)} = {\boldsymbol{w}}{(}{n}{)}{-}{\eta}{\nabla}_{\bar{\boldsymbol{w}}}{\cal{J}}{(}{n}{)} \tag{18} \] where ${\eta}\,{>}\,{0}$ is the step size. Using the ${\Bbb{HR}}$ calculus rules from Table 2 and the general quaternion calculus rules, it can be easily shown that ${\nabla}_{\bar{\boldsymbol{w}}}{\cal{J}}{(}{n}{)} = {-}{\boldsymbol{x}}{(}{n}{)}{\bar{e}}{(n)}/{2}$. Thus, absorbing the scalar factor in the step size ${\eta}$, the QLMS update of the weight vector reads \[{\boldsymbol{w}}{(}{n} + {1}{)} = {\boldsymbol{w}}{(}{n}{)} + {\eta}{\boldsymbol{x}}{(}{n}{)}{\bar{e}}{(}{n}{)}{.} \tag{19} \]
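The steepest-descent update ${\boldsymbol{w}}{(}{n} + {1}{)} = {\boldsymbol{w}}{(}{n}{)} + {\eta}{\boldsymbol{x}}{(}{n}{)}{\bar{e}}{(}{n}{)}$, obtained from the gradient expression above, is straightforward to implement. Below is a toy system-identification sketch (quaternions as length-4 real arrays; the filter length, step size, and noise-free setting are our own choices) where the QLMS recovers an unknown quaternion filter:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as [..., 4] real arrays."""
    pa, pb, pc, pd = np.moveaxis(p, -1, 0)
    qa, qb, qc, qd = np.moveaxis(q, -1, 0)
    return np.stack([pa*qa - pb*qb - pc*qc - pd*qd,
                     pa*qb + pb*qa + pc*qd - pd*qc,
                     pa*qc - pb*qd + pc*qa + pd*qb,
                     pa*qd + pb*qc - pc*qb + pd*qa], axis=-1)

conj = lambda q: q * np.array([1., -1., -1., -1.])

def qdotH(w, x):                 # filter output w^H x = sum_m conj(w_m) x_m
    return qmul(conj(w), x).sum(axis=0)

rng = np.random.default_rng(5)
L = 4                            # filter length
w_opt = rng.standard_normal((L, 4))   # unknown quaternion filter to identify

w, eta = np.zeros((L, 4)), 0.02
for n in range(5000):
    x = rng.standard_normal((L, 4))   # H-proper Gaussian input regressor
    d = qdotH(w_opt, x)               # noise-free desired output
    e = d - qdotH(w, x)               # a priori error
    w = w + eta * qmul(x, conj(e)[None, :])   # steepest-descent QLMS update

print(np.linalg.norm(w - w_opt))     # misalignment shrinks toward zero
```

In this noise-free setting, each iteration contracts the misalignment along the current regressor direction by a factor ${1}{-}{\eta}{\left\Vert{\boldsymbol{x}}\right\Vert}^{2}$, so the weights converge to the true filter for a sufficiently small step size.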
One can observe that the quaternion update (19) has a form similar to the one obtained in the complex case. It is well known that the performance of the LMS filter depends on the second-order statistics of the input vector x. When x is ${\Bbb{H}}$ proper, the standard Hermitian covariance matrix ${\boldsymbol{C}}_{\boldsymbol{xx}} = {E}{\{}{\boldsymbol{xx}}^{\sf{H}}{\}}$ contains the full second-order information of the data. However, when x is not ${\Bbb{H}}$ proper, it is necessary to consider the so-called augmented statistics, i.e., the correlations between x and its three involutions ${\boldsymbol{x}}^{\boldsymbol{i}},{\boldsymbol{x}}^{\boldsymbol{j}}$, and ${\boldsymbol{x}}^{\boldsymbol{k}}$, to capture the full second-order information. This leads to the widely linear QLMS (WL-QLMS) [32], [49], [50], which has the same form as (19) but with the input vector x replaced by the augmented input ${\boldsymbol{x}}_{a} = {[}{\boldsymbol{x}}^{T},{(}{\boldsymbol{x}}^{\boldsymbol{i}}{)}^{T},{(}{\boldsymbol{x}}^{\boldsymbol{j}}{)}^{T},{(}{\boldsymbol{x}}^{\boldsymbol{k}}{)}^{T}{]}^{T}$. A detailed analysis of QLMS performance for different types of properness can be found in [49]. Applications of the QLMS and WL-QLMS to benchmark 3D data (the Lorenz attractor) as well as to real-world 3D and 4D data [5] have shown that the quaternion approach outperforms the real and complex LMS in terms of accuracy, demonstrating the ability of quaternion models to fuse heterogeneous data sources.
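The augmented input used by the WL-QLMS is straightforward to construct: the involution ${\boldsymbol{q}}^{\boldsymbol{\mu}} = {-}{\boldsymbol{\mu}}{\boldsymbol{q}}{\boldsymbol{\mu}}$ (for ${\boldsymbol{\mu}}\in\{{\boldsymbol{i}},{\boldsymbol{j}},{\boldsymbol{k}}\}$) keeps the real part and the ${\boldsymbol{\mu}}$ component and negates the other two imaginary parts. A minimal NumPy sketch, again storing quaternions as length-4 real arrays [w, x, y, z] with helper names of our own choosing:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (..., 4) arrays [w, x, y, z]."""
    pw, px, py, pz = (p[..., a] for a in range(4))
    qw, qx, qy, qz = (q[..., a] for a in range(4))
    return np.stack([pw*qw - px*qx - py*qy - pz*qz,
                     pw*qx + px*qw + py*qz - pz*qy,
                     pw*qy - px*qz + py*qw + pz*qx,
                     pw*qz + px*qy - py*qx + pz*qw], axis=-1)

def involution(q, axis):
    """q^mu = -mu q mu: keep the real part and component `axis`, negate the rest."""
    mask = -np.ones(4)
    mask[0] = mask[axis] = 1.0
    return q * mask

def augment(x):
    """Stack x with its three involutions: x_a = [x; x^i; x^j; x^k]."""
    return np.concatenate([x] + [involution(x, a) for a in (1, 2, 3)], axis=0)

# Sanity check of the involution against its definition q^i = -i q i.
rng = np.random.default_rng(1)
q = rng.standard_normal(4)
i_unit = np.array([0.0, 1.0, 0.0, 0.0])
```

Feeding `augment(x)` in place of x into a QLMS-style update yields the widely linear variant; the sign-flip implementation of the involutions is exactly equivalent to the algebraic definition, as the sanity check confirms.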
It has been more than three decades since the first appearance of quaternions in the signal and image processing literature. Still, the use of quaternions in these domains regularly raises skepticism. For instance, the question "Do quaternion-valued approaches perform better than standard real- or complex-valued methods?" is perhaps the one most frequently encountered. Even though this might be an interesting question to ask, it misses a more fundamental point. Rather, we believe one should first ask why a quaternion-valued representation should be preferred over a more conventional real-valued one. Answering this question explicitly is crucial. It justifies the use of quaternions in the first place, which can be motivated, e.g., by the geometric insights offered by the algebra or by the ease of formulating compact and explicit models for the data at hand. In essence, one must identify the unique properties offered by quaternion algebra that are relevant to the considered setting. Once the motivation for using quaternion representations is clearly established, the comparison of quaternion methods with other real-valued approaches can be conducted. It is important to keep in mind that quaternion models can always be transformed into equivalent (often structured) real-domain models. In that case, comparisons between quaternion and real approaches are usually meaningless since they describe the same model or methodology, just expressed using different algebra rules. However, one can legitimately wonder whether such an equivalent real-domain model could have been directly formulated without the design of a quaternion model in the first place. Therefore, we believe that discussions should focus on the pros and cons of the respective modeling approaches.
Quaternion representations call for dedicated tools. Indeed, if one wishes to advocate for the benefits of quaternion-valued representations and models, then one should also develop the associated signal processing methodologies in the quaternion domain. Doing otherwise, e.g., resorting to an equivalent real-domain formulation to solve the problem at hand, would imply losing any advantages gained using the original quaternion formulation. The tour of the current quaternion-valued signal and image processing toolbox presented in the “The Quaternion Toolbox for Signal Processing” section paves the way for end-to-end quaternion domain approaches. Nonetheless, many questions remain open. In a nutshell, these define exciting research directions for quaternion-valued signal and image processing: the search for new insightful quaternion embeddings, the design of dedicated quaternion-valued statistical models, novel decompositions for large quaternion datasets, and machine learning with quaternions. These promising avenues of research are briefly outlined in the following:
Almost 180 years after their discovery near Brougham Bridge, quaternions are still surrounded by a hint of mystery for the newcomer. We hope the present article helps demystify their use and benefits for researchers in signal and image processing. While a large body of work has already illustrated the power of quaternions to provide elegant and interpretable solutions to difficult problems, further effort is needed to better understand and exploit their potential. Quaternions will certainly continue to intrigue and inspire the signal processing community in the years to come.
This work was supported, in part, by the French National Research Agency through Grant RICOCHET ANR-21-CE48-0013.
Sebastian Miron (sebastian.miron@univ-lorraine.fr) received his Ph.D. degree in signal, image, and speech processing from the Grenoble Institute of Technology, France, in 2005. He is currently an associate professor with the CRAN, Université de Lorraine, F-54000 Nancy, France. His research interests include multidimensional signal processing, tensor algebra, quaternion signal processing, spatiotemporal data analysis, polarization processing, and neural network compression.
Julien Flamant (julien.flamant@univ-lorraine.fr) received his Ph.D. degree from Ecole Centrale de Lille, France, in 2018. He is a CNRS research scientist with the CRAN, Université de Lorraine, F-54000 Nancy, France. His research interests include bivariate signal processing, polarimetric imaging, quaternion-valued signal processing, and applications to physical sciences.
Nicolas Le Bihan (nicolas.le-bihan@gipsa-lab.fr) received his Ph.D. degree in signal processing in 2001 from the Grenoble Institute of Technology. He is a CNRS research scientist and currently works at the Gipsa-lab, F-38000 Grenoble, France. His research interests include statistical signal processing on groups and manifolds, multivariate time series analysis, and applications in polarized wave physics.
Pierre Chainais (pierre.chainais@centralelille.fr) received his Ph.D. degree in physics from the Ecole Normale Superieure de Lyon, France, in 2001. He joined the University Blaise Pascal at Clermont-Ferrand as an associate professor in signal processing in 2002. He moved to the Centrale Lille Institute, F-59000 Lille, France, in 2011, where he is currently a full professor in signal processing at the CRIStAL Laboratory. His research interests include statistical signal processing and machine learning, with applications to physical systems.
David Brie (david.brie@univ-lorraine.fr) received his Ph.D. degree in signal processing from Université Henri Poincaré, Nancy, France, in 1992. He is currently a full professor with the Université de Lorraine, F-54000 Nancy, France, conducting his research at the CRAN, Nancy, France. His research interests include multidimensional statistical signal processing, low-rank matrix and tensor decompositions, and inverse problems.
[1] J. Chou, “Quaternion kinematic and dynamic differential equations,” IEEE Trans. Robot. Autom., vol. 8, no. 1, pp. 53–64, Feb. 1992, doi: 10.1109/70.127239. [Online] . Available: https://ieeexplore.ieee.org/document/127239
[2] K. Shoemake, “Animating rotation with quaternion curves,” in Proc. 12th Annu. Conf. Comput. Graph. Interactive Techn. (SIGGRAPH), ACM Press, 1985, pp. 245–254, doi: 10.1145/325334.325242. [Online] . Available: http://portal.acm.org/citation.cfm?doid=325334.325242
[3] H.-D. Schutte and J. Wenzel, “Hypercomplex numbers in digital signal processing,” in Proc. IEEE Int. Symp. Circuits Syst., New Orleans, LA, USA: IEEE, 1990, pp. 1557–1560, doi: 10.1109/ISCAS.1990.112431. [Online] . Available: http://ieeexplore.ieee.org/document/112431/
[4] M. Hobiger, N. Le Bihan, C. Cornou, and P. Bard, “Multicomponent signal processing for Rayleigh wave ellipticity estimation: Application to seismic hazard assessment,” IEEE Signal Process. Mag., vol. 29, no. 3, pp. 29–39, May 2012, doi: 10.1109/MSP.2012.2184969. [Online] . Available: http://ieeexplore.ieee.org/document/6179823/
[5] C. Took and D. Mandic, “The quaternion LMS algorithm for adaptive filtering of hypercomplex processes,” IEEE Trans. Signal Process., vol. 57, no. 4, pp. 1316–1327, Apr. 2009, doi: 10.1109/TSP.2008.2010600. [Online] . Available: http://ieeexplore.ieee.org/document/4696053/
[6] H. Chen, W. Wang, and W. Liu, “Augmented quaternion ESPRIT-type DOA estimation with a crossed-dipole array,” IEEE Commun. Lett., vol. 24, no. 3, pp. 548–552, Mar. 2020, doi: 10.1109/LCOMM.2019.2962463. [Online] . Available: https://ieeexplore.ieee.org/document/8943327/
[7] J.-F. Gu and K. Wu, “Quaternion modulation for dual-polarized antennas,” IEEE Commun. Lett., vol. 21, no. 2, pp. 286–289, Feb. 2017, doi: 10.1109/LCOMM.2016.2614512. [Online] . Available: http://ieeexplore.ieee.org/document/7579660/
[8] Q. Barthelemy, A. Larue, and J. I. Mars, “Color sparse representations for image processing: Review, models, and prospects,” IEEE Trans. Image Process., vol. 24, no. 11, pp. 3978–3989, Nov. 2015, doi: 10.1109/TIP.2015.2458175. [Online] . Available: http://ieeexplore.ieee.org/document/7161332/
[9] Y. Chen, X. Xiao, and Y. Zhou, “Low-rank quaternion approximation for color image processing,” IEEE Trans. Image Process., vol. 29, pp. 1426–1439, 2020, doi: 10.1109/TIP.2019.2941319. [Online] . Available: https://ieeexplore.ieee.org/document/8844978/
[10] T. Bulow and G. Sommer, “Hypercomplex signals-a novel extension of the analytic signal to the multidimensional case,” IEEE Trans. Signal Process., vol. 49, no. 11, pp. 2844–2852, Nov. 2001, doi: 10.1109/78.960432. [Online] . Available: http://ieeexplore.ieee.org/document/960432/
[11] W. L. Chan, H. Choi, and R. Baraniuk, “Coherent multiscale image processing using dual-tree quaternion wavelets,” IEEE Trans. Image Process., vol. 17, no. 7, pp. 1069–1082, Jul. 2008, doi: 10.1109/TIP.2008.924282. [Online] . Available: http://ieeexplore.ieee.org/document/4526699/
[12] M. Felsberg and G. Sommer, “The monogenic signal,” IEEE Trans. Signal Process., vol. 49, no. 12, pp. 3136–3144, Dec. 2001, doi: 10.1109/78.969520. [Online] . Available: http://ieeexplore.ieee.org/document/969520/
[13] M. Unser, D. Sage, and D. Van De Ville, “Multiresolution monogenic signal analysis using the Riesz–Laplace wavelet transform,” IEEE Trans. Image Process., vol. 18, no. 11, pp. 2402–2418, Nov. 2009, doi: 10.1109/TIP.2009.2027628. [Online] . Available: http://ieeexplore.ieee.org/document/5164973/
[14] S. Olhede and G. Metikas, “The monogenic wavelet transform,” IEEE Trans. Signal Process., vol. 57, no. 9, pp. 3426–3441, Sep. 2009, doi: 10.1109/TSP.2009.2023397. [Online] . Available: http://ieeexplore.ieee.org/document/4956988/
[15] M. Storath, “Directional multiscale amplitude and phase decomposition by the monogenic curvelet transform,” SIAM J. Imag. Sci., vol. 4, no. 1, pp. 57–78, Jan. 2011, doi: 10.1137/100803924.
[16] M. Clausel, T. Oberlin, and V. Perrier, “The monogenic synchrosqueezed wavelet transform: A tool for the decomposition/demodulation of AM–FM images,” Appl. Comput. Harmon. Anal., vol. 39, no. 3, pp. 450–486, Nov. 2015, doi: 10.1016/j.acha.2014.10.003. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S1063520314001274
[17] R. Soulard and P. Carré, “Elliptical monogenic wavelets for the analysis and processing of color images,” IEEE Trans. Signal Process., vol. 64, no. 6, pp. 1535–1549, Mar. 2016, doi: 10.1109/TSP.2015.2505664. [Online] . Available: http://ieeexplore.ieee.org/document/7347476/
[18] J. Flamant, N. Le Bihan, and P. Chainais, “Time–frequency analysis of bivariate signals,” Appl. Comput. Harmon. Anal., vol. 46, no. 2, pp. 351–383, Mar. 2019, doi: 10.1016/j.acha.2017.05.007. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S1063520317300507
[19] J. Flamant, N. Le Bihan, and P. Chainais, “Spectral analysis of stationary random bivariate signals,” IEEE Trans. Signal Process., vol. 65, no. 23, pp. 6135–6145, Dec. 2017, doi: 10.1109/TSP.2017.2736494. [Online] . Available: http://ieeexplore.ieee.org/document/8003393/
[20] J. Flamant, P. Chainais, and N. Le Bihan, “A complete framework for linear filtering of bivariate signals,” IEEE Trans. Signal Process., vol. 66, no. 17, pp. 4541–4552, Sep. 2018, doi: 10.1109/TSP.2018.2855659. [Online] . Available: https://ieeexplore.ieee.org/document/8410469/
[21] C. C. Took and D. P. Mandic, “A quaternion widely linear adaptive filter,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4427–4431, Aug. 2010, doi: 10.1109/TSP.2010.2048323. [Online] . Available: https://ieeexplore.ieee.org/document/5447699
[22] C. C. Took and D. P. Mandic, “Quaternion-valued stochastic gradient-based adaptive IIR filtering,” IEEE Trans. Signal Process., vol. 58, no. 7, pp. 3895–3901, Jul. 2010, doi: 10.1109/TSP.2010.2047719. [Online] . Available: https://ieeexplore.ieee.org/document/5444945
[23] F. Ortolani, D. Comminiello, M. Scarpiniti, and A. Uncini, “Frequency domain quaternion adaptive filters: Algorithms and convergence performance,” Signal Process., vol. 136, pp. 69–80, Jul. 2017, doi: 10.1016/j.sigpro.2016.11.002. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S0165168416302948
[24] F. Zhang, “Quaternions and matrices of quaternions,” Linear Algebr. Its Appl., vol. 251, pp. 21–57, Jan. 1997, doi: 10.1016/0024-3795(95)00543-9. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/0024379595005439
[25] N. Le Bihan and J. Mars, “Singular value decomposition of quaternion matrices: A new tool for vector-sensor signal processing,” Signal Process., vol. 84, no. 7, pp. 1177–1199, Jul. 2004, doi: 10.1016/j.sigpro.2004.04.001. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S0165168404000647
[26] S. Miron, N. Le Bihan, and J. Mars, “Quaternion-MUSIC for vector-sensor array processing,” IEEE Trans. Signal Process., vol. 54, no. 4, pp. 1218–1229, Apr. 2006, doi: 10.1109/TSP.2006.870630. [Online] . Available: http://ieeexplore.ieee.org/document/1608539/
[27] X. Gong, Y. Xu, and Z. Liu, “Quaternion ESPRIT for direction finding with a polarization sensitive array,” in Proc. 9th Int. Conf. Signal Process., Beijing, China: IEEE Press, Oct. 2008, pp. 378–381. [Online] . Available: http://ieeexplore.ieee.org/document/4697149/
[28] A. Sajeva and G. Menanno, “Characterisation and extraction of a Rayleigh-wave mode in vertically heterogeneous media using quaternion SVD,” Signal Process., vol. 136, pp. 43–58, Jul. 2017, doi: 10.1016/j.sigpro.2016.11.011. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S0165168416303255
[29] X. Gong, Y. Xu, and Z. Liu, “Quaternion ESPRIT for direction finding with a polarization sensitive array,” in Proc. 9th Int. Conf. Signal Process., Beijing, China: IEEE Press, 2008, pp. 378–381. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/4697149
[30] D. Xu and D. P. Mandic, “The theory of quaternion matrix derivatives,” IEEE Trans. Signal Process., vol. 63, no. 6, pp. 1543–1556, Mar. 2015, doi: 10.1109/TSP.2015.2399865. [Online] . Available: https://ieeexplore.ieee.org/document/7029694/
[31] D. Xu, Y. Xia, and D. P. Mandic, “Optimization in quaternion dynamic systems: Gradient, Hessian, and learning algorithms,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 2, pp. 249–261, Feb. 2016, doi: 10.1109/TNNLS.2015.2440473. [Online]. Available: http://ieeexplore.ieee.org/document/7124503/
[32] D. Xu, C. Jahanchahi, C. C. Took, and D. P. Mandic, “Enabling quaternion derivatives: The generalized HR calculus,” Roy. Soc. Open Sci., vol. 2, no. 8, Aug. 2015, Art. no. 150255, doi: 10.1098/rsos.150255.
[33] D. P. Mandic, C. Jahanchahi, and C. C. Took, “A quaternion gradient operator and its applications,” IEEE Signal Process. Lett., vol. 18, no. 1, pp. 47–50, Jan. 2011, doi: 10.1109/LSP.2010.2091126. [Online] . Available: http://ieeexplore.ieee.org/document/5623300/
[34] Y. Xu, L. Yu, H. Xu, H. Zhang, and T. Nguyen, “Vector sparse representation of color image using quaternion matrix analysis,” IEEE Trans. Image Process., vol. 24, no. 4, pp. 1315–1329, Apr. 2015, doi: 10.1109/TIP.2015.2397314. [Online] . Available: http://ieeexplore.ieee.org/document/7024169/
[35] C. Zou, K. I. Kou, and Y. Wang, “Quaternion collaborative and sparse representation with application to color face recognition,” IEEE Trans. Image Process., vol. 25, no. 7, pp. 3287–3302, Jul. 2016, doi: 10.1109/TIP.2016.2567077. [Online] . Available: http://ieeexplore.ieee.org/document/7468463/
[36] C. Zou, K. I. Kou, Y. Wang, and Y. Y. Tang, “Quaternion block sparse representation for signal recovery and classification,” Signal Process., vol. 179, Feb. 2021, Art. no. 107849, doi: 10.1016/j.sigpro.2020.107849. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S0165168420303935
[37] J. Flamant, S. Miron, and D. Brie, “A general framework for constrained convex quaternion optimization,” IEEE Trans. Signal Process., vol. 70, pp. 254–267, 2022, doi: 10.1109/TSP.2021.3137746. [Online] . Available: https://ieeexplore.ieee.org/document/9662233/
[38] T.-S. T. Chan and Y.-H. Yang, “Complex and quaternionic principal component pursuit and its application to audio separation,” IEEE Signal Process. Lett., vol. 23, no. 2, pp. 287–291, Feb. 2016, doi: 10.1109/LSP.2016.2514845. [Online] . Available: https://ieeexplore.ieee.org/document/7372422
[39] P. J. Schreier and L. L. Scharf, Statistical Signal Processing of Complex-Valued Data: The Theory of Improper and Noncircular Signals. Cambridge, U.K.: Cambridge Univ. Press, 2010.
[40] N. N. Vakhania, “Random vectors with values in quaternion Hilbert spaces,” Theory Probability Its Appl., vol. 43, no. 1, pp. 99–115, Jan. 1999, doi: 10.1137/S0040585X97976696.
[41] P.-O. Amblard and N. L. Bihan, “On properness of quaternion valued random variables,” in Proc. IMA Conf. Math. Signal Process., 2004, pp. 23–26.
[42] J. Via, D. Ramirez, and I. Santamaria, “Properness and widely linear processing of quaternion random vectors,” IEEE Trans. Inf. Theory, vol. 56, no. 7, pp. 3502–3515, Jul. 2010, doi: 10.1109/TIT.2010.2048440. [Online] . Available: http://ieeexplore.ieee.org/document/5484995/
[43] C. Cheong Took and D. P. Mandic, “Augmented second-order statistics of quaternion random signals,” Signal Process., vol. 91, no. 2, pp. 214–224, Feb. 2011, doi: 10.1016/j.sigpro.2010.06.024. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S0165168410002719
[44] N. Le Bihan, “The geometry of proper quaternion random variables,” Signal Process., vol. 138, pp. 106–116, Sep. 2017, doi: 10.1016/j.sigpro.2017.03.017. [Online] . Available: https://linkinghub.elsevier.com/retrieve/pii/S0165168417301093
[45] J. Via, D. P. Palomar, and L. Vielva, “Generalized likelihood ratios for testing the properness of quaternion Gaussian vectors,” IEEE Trans. Signal Process., vol. 59, no. 4, pp. 1356–1370, Apr. 2011, doi: 10.1109/TSP.2010.2101067. [Online] . Available: http://ieeexplore.ieee.org/document/5672618/
[46] J. Via, D. P. Palomar, L. Vielva, and I. Santamaria, “Quaternion ICA from second-order statistics,” IEEE Trans. Signal Process., vol. 59, no. 4, pp. 1586–1600, Apr. 2011, doi: 10.1109/TSP.2010.2101065. [Online] . Available: http://ieeexplore.ieee.org/document/5672617/
[47] J. Navarro-Moreno, R. M. Fernandez-Alcala, and J. C. Ruiz-Molina, “A quaternion widely linear model for nonlinear Gaussian estimation,” IEEE Trans. Signal Process., vol. 62, no. 24, pp. 6414–6424, Dec. 2014, doi: 10.1109/TSP.2014.2364790. [Online] . Available: http://ieeexplore.ieee.org/document/6936344/
[48] B. Che Ujang, C. C. Took, and D. P. Mandic, “Quaternion-valued nonlinear adaptive filtering,” IEEE Trans. Neural Netw., vol. 22, no. 8, pp. 1193–1206, Aug. 2011, doi: 10.1109/TNN.2011.2157358. [Online] . Available: http://ieeexplore.ieee.org/document/5934421/
[49] M. Xiang, S. Kanna, and D. P. Mandic, “Performance analysis of quaternion-valued adaptive filters in nonstationary environments,” IEEE Trans. Signal Process., vol. 66, no. 6, pp. 1566–1579, Mar. 2018, doi: 10.1109/TSP.2017.2787102. [Online] . Available: https://ieeexplore.ieee.org/document/8239856
[50] M. Xiang, Y. Xia, and D. P. Mandic, “Performance analysis of deficient length quaternion least mean square adaptive filters,” IEEE Trans. Signal Process., vol. 68, pp. 65–80, 2020, doi: 10.1109/TSP.2019.2955831. [Online] . Available: https://ieeexplore.ieee.org/document/8911458/
[51] J. Flamant, S. Miron, and D. Brie, “Quaternion non-negative matrix factorization: Definition, uniqueness, and algorithm,” IEEE Trans. Signal Process., vol. 68, pp. 1870–1883, Feb. 2020, doi: 10.1109/TSP.2020.2974651. [Online] . Available: https://ieeexplore.ieee.org/document/9001262/
[52] O. Imhogiemhe, J. Flamant, X. Luciani, Y. Zniyed, and S. Miron, “Low-rank tensor decompositions for quaternion multiway arrays,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., Rhodes Island, Greece: IEEE Press, Jun. 2023, pp. 1–5, doi: 10.1109/ICASSP49357.2023.10094816. [Online] . Available: https://ieeexplore.ieee.org/document/10094816
[53] T. Parcollet, M. Morchid, and G. Linarès, “A survey of quaternion neural networks,” Artif. Intell. Rev., vol. 53, no. 4, pp. 2957–2982, Apr. 2020, doi: 10.1007/s10462-019-09752-1.
Digital Object Identifier 10.1109/MSP.2023.3278071