Qinqin Yang, Zi Wang, Kunyuan Guo, Congbo Cai, Xiaobo Qu
Deep learning (DL) has driven innovation in the field of computational imaging. One of its bottlenecks is unavailable or insufficient training data. This article reviews an emerging paradigm, imaging physics-based data synthesis (IPADS), that can provide large amounts of training data for biomedical magnetic resonance (MR) with few or no real data. Following the physical law of MR, IPADS generates signals from differential equations or analytical solution models, making learning more scalable and explainable and better protecting privacy. Key components of IPADS learning, including signal generation models, basic DL network structures, enhanced data generation, and learning methods, are discussed. Great IPADS potential has been demonstrated by representative applications in fast imaging, ultrafast signal reconstruction, and accurate parameter quantification. Finally, open questions and future work are discussed.
DL has empowered computational imaging with fast sampling, ultrafast signal reconstruction, and straightforward parameter quantification [1], [2]. A milestone work of DL fast MR imaging (MRI) was presented in [1]. In the age of DL, a large amount of high-quality data is essential to achieve excellent performance. However, in biomedical imaging, these data may be hard to acquire in challenging applications, e.g., blurred images of moving organs, the lengthy measurement of quantitative physical parameters, and the irreversible acquisition of physiological processes. Thus, new data generation and learning schemes are highly desired to boost biomedical imaging applications.
Recently, synthetic data have begun to attract attention in computational imaging [3], [4], [5], [14], [17]. Learning with synthetic data could reduce the dependence on paired real-world data, quickly generate massive data, overcome the difficulties and even the impossibility of collecting real data, and protect privacy in biomedicine. Here, we focus on IPADS because it follows a plausible physical model and enables good interpretability [4]. To make the discussions compact, we mainly review IPADS learning in biomedical MR since IPADS learning has become a frontier in this area, such as with fast quantitative imaging [5], [6], [7], [8], [9], [10], [11], [12], [13], signal reconstruction [14], [15], [16], [17], [18], [19], [20], and pulse sequence optimization [21]. The concept of IPADS learning could be generalized to other computational imaging modalities as long as an appropriate physical model and learning network are included. This article gives an overview of IPADS for DL MR. Based on whether there is physical signal evolution, we first divide IPADS into two lines, including physical signal evolution and analytical modeling (Figure 1), and then discuss enhanced learning with realistic data adaption and advanced network structures. Representative applications and future work are provided.
Figure 1. Two lines of IPADS in biomedical MR. Various synthetic images can be generated from numerical tissue parameters through physical evolution or analytical models. (a) The tissue parameters. (b) The physical evolution (top) and analytical model (bottom). (c) The synthetic images. PD: proton density.
The physics of MR governs signal formation, which involves spin dynamics, quantum mechanics, and electromagnetism. For example, MRI presents spatially structural information with different contrasts, while MR spectroscopy (MRS) focuses on the spectral signals of multiple molecules. MR signal evolution may have analytical solutions under some conditions or may not have them in general. Accordingly, IPADS methods are divided into three categories, and some representative works are summarized in Table 1.
Table 1. A summary of IPADS in biomedical MR.
The Bloch equation [22] describes the time-dependent evolution of the magnetization vector ${\bf{M}}(t)\in{\bf{R}}^{3}$ under an external magnetic field ${\bf{B}}(t)\in{\bf{R}}^{3}$. Two key phenomena of magnetized spins, i.e., precession and relaxation, are captured by the differential equation \[\frac{d{\bf{M}}(t)}{dt} = \underbrace{\gamma{\bf{B}}(t)\times{\bf{M}}(t)}_{\text{Precession}} + \underbrace{R({\bf{M}}(t))}_{\text{Relaxation}}, \tag{1}\] where $\gamma$ is the gyromagnetic ratio (e.g., 42.6 MHz/T for hydrogen nuclei). For the precession term, the motion of spins is determined by many factors, including the main static field ${B}_{0}$, the radio frequency (RF) pulse field ${\bf{B}}_{1}(t)$, the chemical shift (frequency) $\Delta\omega_{0}$, and the linear gradient fields ${\bf{G}}(t)$ at location ${\bf{r}}$, as \[{\bf{B}}(t) = \left({B}_{0} - \frac{\Delta\omega_{0}}{\gamma} + {\bf{G}}(t)\cdot{\bf{r}}\right)\hat{z} + {\bf{B}}_{1}(t), \tag{2}\] where the thermal equilibrium magnetization vector is tipped from the ${B}_{0}$ direction ($z$) into the transverse plane ($xy$) under the effect of ${\bf{B}}_{1}(t)$. The gradient ${\bf{G}}(t)$ is usually used to modulate the phase or encode different spatial locations. In contrast, the relaxation term describes the process of the magnetization vector returning to its equilibrium state. Two relaxation time constants, ${T}_{1}$ and ${T}_{2}$, characterize the regrowth of the longitudinal magnetization (${M}_{z}$) and the decay of the transverse magnetization (${M}_{x,y}$), respectively. These relaxation parameters are very valuable in clinics, such as for lesion diagnosis [7].
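For readers who want to experiment with the physical evolution described by (1) and (2), the following minimal sketch integrates the Bloch equation for a single spin with an explicit Euler step in the rotating frame; the field strength, relaxation times, and step size are illustrative assumptions and are not taken from any of the cited simulators.

```python
import numpy as np

GAMMA = 2 * np.pi * 42.58e6  # gyromagnetic ratio of 1H in rad/s/T

def bloch_step(M, B, T1, T2, M0=1.0, dt=1e-6):
    """One explicit Euler step of the Bloch equation (1).

    M, B : length-3 arrays (magnetization and effective field, rotating frame)
    T1, T2 : relaxation times in seconds
    """
    precession = GAMMA * np.cross(B, M)          # dM/dt = gamma * B x M, as in (1)
    relaxation = np.array([-M[0] / T2,           # transverse decay (Mx)
                           -M[1] / T2,           # transverse decay (My)
                           (M0 - M[2]) / T1])    # longitudinal regrowth (Mz)
    return M + dt * (precession + relaxation)

# Example: free precession after a 90-degree pulse has tipped M into the xy plane.
M = np.array([1.0, 0.0, 0.0])                    # magnetization along x
B = np.array([0.0, 0.0, 1e-6])                   # small residual off-resonance field (T)
signal = []
for _ in range(5000):                            # 5 ms of evolution at dt = 1 us
    M = bloch_step(M, B, T1=1.0, T2=0.1)
    signal.append(M[0] + 1j * M[1])              # complex transverse signal Mx + i*My
```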
The MR pulse sequence describes a series of physical RF pulses applied to objects, resulting in a particular image or spectrum appearance. It usually consists of a series of varying ${\bf{B}}_{1}{(}{t}{)}$ and ${\bf{G}}{(}{t}{)}$ in the form of a timing diagram. Typically, the signal model for a simple MR pulse sequence has an analytical solution if a steady state is assumed. However, as the complexity of the pulse sequence increases, analytical solutions are hard to obtain due to spin history effects at unsteady states and system imperfections. For example, in MR fingerprinting [8], various sequence components are arranged in a pseudorandom pattern, and MR signals are not analyzed using the analytic expression but through dictionary matching.
The analytical model (AM) provides a closed-form solution of the MR signal and can approximate the physical evolution (PE) of MR signals under certain assumptions or simplifications. In MRI, consider one of the most common pulse sequences, the spin–echo sequence, whose image contrast has the analytical solution \[S({\bf{r}}, TE, TR) = {M}_{0}({\bf{r}})\cdot\left(1 - {e}^{-TR/{T}_{1}({\bf{r}})}\right)\cdot{e}^{-TE/{T}_{2}({\bf{r}})}, \tag{3}\] assuming that the initial magnetization vector undergoes the action of 90° and 180° RF pulses and that the repetition time (TR) is much larger than the echo time (TE). Here, ${M}_{0}({\bf{r}})$ represents the equilibrium longitudinal magnetization. By adjusting the TR and TE, different image contrasts can be generated once the tissue parameters ${M}_{0}({\bf{r}})$, ${T}_{1}({\bf{r}})$, and ${T}_{2}({\bf{r}})$ are provided.
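As a small illustration of how (3) turns tissue parameters into image contrasts, the sketch below evaluates the spin-echo model on hypothetical ${M}_{0}$, ${T}_{1}$, and ${T}_{2}$ maps; the map size and parameter ranges are assumptions for demonstration only.

```python
import numpy as np

def spin_echo_contrast(M0, T1, T2, TR, TE):
    """Analytical spin-echo signal of (3): S = M0 * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return M0 * (1.0 - np.exp(-TR / T1)) * np.exp(-TE / T2)

# Hypothetical tissue-parameter maps (ms for T1/T2, arbitrary units for M0).
rng = np.random.default_rng(0)
shape = (128, 128)
M0 = rng.uniform(0.0, 1.0, shape)
T1 = rng.uniform(300.0, 2000.0, shape)   # assumed T1 range in ms
T2 = rng.uniform(20.0, 700.0, shape)     # T2 range quoted later for the OLED templates

# Different (TR, TE) choices yield T1- or T2-weighted synthetic contrasts.
t1_weighted = spin_echo_contrast(M0, T1, T2, TR=500.0, TE=15.0)
t2_weighted = spin_echo_contrast(M0, T1, T2, TR=4000.0, TE=100.0)
```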
In addition to relaxation parameters, the magnetic susceptibility ${\chi}({\bf{r}})$ represents the ability of a substance to become magnetized under an applied magnetic field. Tissue-specific magnetic susceptibility variations within the MRI scanner can cause inhomogeneity of the static magnetic field ${B}_{0}$. To describe the field variations caused by ${\chi}({\bf{r}})$, a dipole convolution model is typically used as [10], [11] \[\Delta{B}_{0}({\bf{r}}) = {B}_{0}\cdot{\chi}({\bf{r}})\ast{D}({\bf{r}}), \tag{4}\] in which the induced field inhomogeneity $\Delta{B}_{0}({\bf{r}})$ is expressed as the convolution between the spatial distribution of susceptibility ${\chi}({\bf{r}})$ and a unit dipole response ${D}({\bf{r}})$.
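The forward dipole model in (4) is conveniently evaluated in k-space, where the unit dipole response has the closed form ${D}({\bf{k}}) = 1/3 - {k}_{z}^{2}/|{\bf{k}}|^{2}$ for ${B}_{0}$ along $z$. The following sketch applies (4) to a hypothetical susceptibility map; the grid size, field strength, and spherical inclusion are illustrative assumptions.

```python
import numpy as np

def dipole_field(chi, B0=3.0, voxel_size=(1.0, 1.0, 1.0)):
    """Forward dipole model of (4): field perturbation induced by susceptibility chi.

    The spatial convolution chi * D is evaluated in k-space, where the unit
    dipole kernel has the closed form D(k) = 1/3 - kz^2 / |k|^2 (B0 along z).
    """
    nx, ny, nz = chi.shape
    kx = np.fft.fftfreq(nx, d=voxel_size[0])
    ky = np.fft.fftfreq(ny, d=voxel_size[1])
    kz = np.fft.fftfreq(nz, d=voxel_size[2])
    KX, KY, KZ = np.meshgrid(kx, ky, kz, indexing="ij")
    k2 = KX**2 + KY**2 + KZ**2
    with np.errstate(invalid="ignore", divide="ignore"):
        D = 1.0 / 3.0 - KZ**2 / k2
    D[k2 == 0] = 0.0    # the k = 0 term of the dipole kernel is conventionally set to zero
    return B0 * np.real(np.fft.ifftn(D * np.fft.fftn(chi)))

# Example: a small spherical susceptibility inclusion (values are illustrative).
chi = np.zeros((64, 64, 64))
x, y, z = np.ogrid[-32:32, -32:32, -32:32]
chi[x**2 + y**2 + z**2 < 8**2] = 0.1
delta_B0 = dipole_field(chi)
```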
Beyond images, in MRS, the spectral signal of each voxel comes from multiple molecules. This signal is commonly modeled as [14], [15], [16] \[S({\bf{r}},t) = \sum_{m=1}^{M}{c}_{m}({\bf{r}}){v}_{m}(t){e}^{-t/{T}_{2,m}({\bf{r}})}, \tag{5}\] where $m$ denotes the $m$th molecule and ${c}_{m}({\bf{r}})$ and ${v}_{m}(t)$ are its concentration and basis function, respectively. For ex vivo biological MRS, which is used to determine the concentrations of metabolites or the structures of proteins, the objects to be acquired (usually in liquid or solid form) are placed in a tube and treated as a whole. Thus, the spatial location ${\bf{r}}$ is commonly ignored, and only $S(t)$ is acquired from scanners. In biological MRS, the basis function can be expressed as a linear combination of multiple exponentials (${J}_{m}$) as [14], [15], [16] \[{v}_{m}(t) = \sum_{j=1}^{{J}_{m}}\left({a}_{j,m}{e}^{i{\phi}_{j,m}}\right){e}^{i2\pi{f}_{j,m}t}, \tag{6}\] where $i$ is the imaginary unit and ${a}_{j,m}$, ${f}_{j,m}$, and ${\phi}_{j,m}$ are the amplitude, frequency, and phase of the $j$th spectral peak, respectively. By performing the Fourier transform on $S(t)$, a spectrum is obtained, and the spectral peaks follow the Lorentzian line shape [14], [15], [16].
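The sketch below synthesizes a single-voxel free induction decay (FID) according to (5) and (6) and Fourier transforms it into a spectrum with Lorentzian peaks; the number of points, dwell time, and decay constants are arbitrary choices for illustration.

```python
import numpy as np

def synthetic_fid(n_points=4096, dwell=1e-4, rng=None):
    """Synthesize a single-voxel FID following (5) and (6): a random sum of
    damped complex exponentials whose Fourier transform has Lorentzian peaks."""
    if rng is None:
        rng = np.random.default_rng()
    t = np.arange(n_points) * dwell
    n_peaks = rng.integers(1, 11)                      # 1 to 10 peaks
    amp = rng.uniform(0.05, 1.0, n_peaks)              # normalized amplitudes
    freq = rng.uniform(0.01, 0.99, n_peaks) / dwell    # frequencies within the bandwidth
    phase = rng.uniform(0.0, 2 * np.pi, n_peaks)
    T2 = rng.uniform(0.01, 0.2, n_peaks)               # assumed decay constants (s)
    fid = np.zeros(n_points, dtype=complex)
    for a, f, p, tau in zip(amp, freq, phase, T2):
        fid += a * np.exp(1j * p) * np.exp(1j * 2 * np.pi * f * t) * np.exp(-t / tau)
    return t, fid

t, fid = synthetic_fid(rng=np.random.default_rng(1))
spectrum = np.fft.fftshift(np.fft.fft(fid))            # Lorentzian line shapes in the spectrum
```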
For in vivo MRS, (5) is suboptimal since it does not consider imperfect but real imaging conditions, e.g., field inhomogeneity and motion. More practically, the MRS signal can be modeled as [17], [18], [25] \[S({\bf{r}},t) = \sum_{m=1}^{M}{c}_{m}({\bf{r}}){v}_{m}(t){e}_{m}(t;{\theta}_{m}({\bf{r}})) + b({\bf{r}},t), \tag{7}\] where $b({\bf{r}},t)$ is a baseline signal, mainly contributed by macromolecules and commonly following a Gaussian line shape, and ${e}_{m}(t;{\theta}_{m}({\bf{r}}))$ captures molecule-dependent time domain modulation functions that can be described by the experimental and physiological parameters in ${\theta}_{m}({\bf{r}})$.
PE-based IPADS and learning rely on simulating discrete spin motion at short time intervals using the Bloch equation. This process can be described by successive operators according to a specific MR pulse sequence. Signal formation, however, is computationally expensive since it involves large-scale matrix operations, integration, and differentiation. Fortunately, several MR physical simulation tools have been developed, e.g., simulation with product operator matrix (SPROM) (https://doi.org/10.6084/m9.figshare.19754836.v2), the Jülich Extensible MRI Simulator (JEMRIS) (https://www.jemris.org), and MRiLab (http://mrilab.sourceforge.net) [23]. With these tools, one still has to set proper physical parameters to make IPADS data generation as realistic as possible.
Physical parameters include object-specific and experiment-specific parameters. The former describe the nature of the scanned objects, such as the proton density, relaxation times (${T}_{1}$, ${T}_{2}$, and ${T}_{2}^{\ast}$), and diffusion and electromagnetic properties at a particular spatial location in MRI, and the amplitudes, resonance frequencies, and metabolite concentrations in MRS. The latter take the pulse sequence and real imaging conditions into account, such as the TR, the TE, the flip angle, the strength of the main magnetic field (${B}_{0}$) and its inhomogeneity ($\Delta{B}_{0}$), the RF field inhomogeneity (${B}_{1}$), and eddy currents.
Up to now, most PE-IPADS learning has been applied to imaging [5], [6], [7], [8], [9], [19], [20], [21], especially quantitative parametric imaging [5], [6], [7], [8], [9]. A representative application of PE-IPADS learning is ${T}_{2}$ mapping with overlapping-echo detachment (OLED) [5]. OLED is an ultrafast imaging sequence that encodes the information of multiple images into one image, allowing parametric imaging within a very short time (within 10 s for a whole brain). The main problem is how to estimate reliable quantitative parameters within this very short imaging time. DL has proved powerful, but the lack of real-world pairs of quantitative parameters and images must be addressed.
A typical flow of data generation for OLED is illustrated in Figure 2 and "Supplementary Data S1" (available at https://doi.org/10.1109/MSP.2022.3183809). First, object-shape templates are provided. They are commonly created by randomly filling blank templates with hundreds of different 2/3D basic geometric shapes [5], [6], [11], [19], [20]. Second, tissue-specific parameters, such as the proton density and ${T}_{2}$ relaxation at particular spatial locations, are set in the ranges of [0, 1] and [20, 700] ms, respectively. Then, the pulse sequence is programmed, and imaging parameters (echo times = 22, 52, 82, and 110 ms; flip angle = 30°) are set to be consistent with real imaging experiments. Finally, an imperfect imaging condition, the RF field inhomogeneity (${B}_{1}$), is generated with random polynomial functions.
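The template-generation step can be sketched as follows; the use of random ellipses and the specific shape count are simplifying assumptions for illustration rather than the exact templates of [5] and [6]. The resulting maps would then be passed, together with the programmed sequence and a random ${B}_{1}$ map, to a Bloch simulator such as MRiLab.

```python
import numpy as np

def random_parametric_template(size=256, n_shapes=200, rng=None):
    """Fill a blank template with random ellipses carrying tissue parameters:
    proton density in [0, 1] and T2 in [20, 700] ms, as in the OLED setup."""
    if rng is None:
        rng = np.random.default_rng()
    pd_map = np.zeros((size, size))
    t2_map = np.zeros((size, size))
    yy, xx = np.mgrid[0:size, 0:size]
    for _ in range(n_shapes):
        cx, cy = rng.uniform(0, size, 2)              # ellipse center
        ax, ay = rng.uniform(2, size / 8, 2)          # semi-axes
        mask = ((xx - cx) / ax) ** 2 + ((yy - cy) / ay) ** 2 < 1.0
        pd_map[mask] = rng.uniform(0.0, 1.0)          # proton density
        t2_map[mask] = rng.uniform(20.0, 700.0)       # T2 in ms
    return pd_map, t2_map

pd_map, t2_map = random_parametric_template(rng=np.random.default_rng(2))
```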
Figure 2. The physical evolution-based data generation. (a) The parametric templates that have geometric shapes and object parameters. (b) The MR pulse sequence needs to be programmed according to the real-world experiment. (c) The software (here, it is MRiLab [23]) that enables physical signal evolution. (d) The massive synthetic data.
Once the data are generated with physical simulation tools, a network structure should be designed to conduct the learning with PE-IPADS data. A direct and commonly chosen approach is to learn the mapping from the signal to quantitative parameters in an end-to-end way [5], [6], [7], [8], [9]. For example, OLED images are mapped to ${T}_{2}$ parameters in Figure 3. In general, for a DL network ${\cal{N}}$ with trainable network parameters ${\theta}$, optimal learning minimizes the loss $L(\cdot)$ of the estimated quantitative object parameters ${\hat{\bf{p}}}_{n} = {\cal{N}}({\bf{s}}_{n};{\theta})$ according to \[\hat{\theta} = \arg\min_{\theta}\sum_{n=1}^{N}L\left({\bf{p}}_{n} - {\cal{N}}({\bf{s}}_{n};{\theta})\right), \tag{8}\]
Figure 3. The ${T}_{2}$ mapping with the OLED imaging sequence [5], [6], [7]. (a) The network architecture learns the end-to-end mapping from the images to ${T}_{2}$ parameters. (b) and (c) The reconstructed ${T}_{2}$ maps of OLED and the corresponding references of conventional acquisition in phantom and in vivo brain data, respectively. (d) A clinical sequence and the ${T}_{2}$ map of OLED from an epilepsy patient. ReLU: rectified linear unit.
where ${\bf{p}}_{n}$ is the ground truth consisting of simulated quantitative parameters, ${\bf{s}}_{n}$ is the generated image, and $n$ denotes the $n$th sample (their total number is $N$). Thus, PE-IPADS learning tries to approximate the inverse of the process mapping the physical quantitative parameters of scanned objects to the output signal.
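A minimal PyTorch sketch of minimizing (8) on synthetic pairs is given below; the small CNN, the data shapes, and the random tensors standing in for IPADS pairs are assumptions, not the published OLED network.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# A placeholder CNN standing in for the mapping network N(s; theta) in (8);
# the architecture and shapes are assumptions, not the OLED network of [5], [6].
class ParamMapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, s):
        return self.net(s)

# Random tensors stand in for IPADS pairs: s_n (real/imaginary image channels)
# and p_n (quantitative parameter maps, e.g., T2).
s_all = torch.randn(64, 2, 64, 64)
p_all = torch.rand(64, 1, 64, 64)
loader = DataLoader(TensorDataset(s_all, p_all), batch_size=8, shuffle=True)

model = ParamMapper()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                       # the loss L(.) in (8), here an l2 loss

for epoch in range(5):                       # a few epochs for illustration
    for s_n, p_n in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(s_n), p_n)      # minimize (8) over the synthetic pairs
        loss.backward()
        optimizer.step()
```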
Once the network is trained with sufficient samples, estimating the quantitative parameters ${\bf{p}} = {\cal{N}}({\bf{s}};\hat{\theta})$ from a target image ${\bf{s}}$ becomes a straightforward and fast process. For PE-IPADS learning with OLED imaging, the network was trained on 800 images in around 22 h, and less than 1 s was needed to obtain faithful ${T}_{2}$ maps [Figure 3(b) and (c)]. High fidelity was achieved, with correlation coefficients of 0.999 and 0.997 (in phantom and human brain data, respectively) to the ${T}_{2}$ maps of a conventional imaging pulse sequence, while the data acquisition time was reduced from 17 min to 10 s [6]. Besides, OLED with PE-IPADS avoids challenging motion artifacts in an epilepsy patient [Figure 3(d)] [7].
PE-IPADS has been applied to many other imaging scenarios due to its high flexibility [8], [9], [19], [20], [21]. In early stage research, the network structure was not the primary focus; most approaches adopted mainstream structures, e.g., a fully connected network in MR fingerprinting [8] and a residual network in cardiac motion tag tracking [20]. Due to the complex nature of MR signals, operations in networks, such as convolution, should be treated carefully, and a complex-valued form can reduce the reconstruction error [24]. Through simulating the imaging process of water and fat, spatiotemporally encoded water–fat separation [19] trained a U-Net on the generated data and quickly separated water and fat in 0.46 s on a personal computer (a traditional method costs 30.31 s). Additionally, with a 3D convolutional neural network (CNN), the common Bloch equation evolution can be combined with electromagnetic simulation to obtain coil-specific RF profiles and phases [9]. This approach leverages PE-IPADS for electrical property tomography, extending traditional MRI to new imaging modalities.
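As a generic illustration of complex-valued convolution in the spirit of [24] (not the exact layer used there), a complex convolution can be assembled from two real-valued convolutions:

```python
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    """Complex convolution (a + ib) * (x + iy) = (ax - by) + i(ay + bx)
    implemented with two real-valued convolution layers."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.conv_re = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
        self.conv_im = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)

    def forward(self, x_re, x_im):
        out_re = self.conv_re(x_re) - self.conv_im(x_im)
        out_im = self.conv_re(x_im) + self.conv_im(x_re)
        return out_re, out_im

# Example: a complex-valued MR image split into real and imaginary parts.
x_re = torch.randn(1, 1, 64, 64)
x_im = torch.randn(1, 1, 64, 64)
y_re, y_im = ComplexConv2d(1, 8)(x_re, x_im)
```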
AM-based IPADS skips the complex PE process. It directly and quickly generates a large amount of training data by adjusting the physical parameters in a closed-form solution [10], [12], [13], [14], [15], [16]. Interestingly, most AM-IPADS learning targets spectra in MRS [13], [14], [15], [16], with few applications in MRI [10], [12]. Thus, in the following, we mainly discuss spectral applications. In ex vivo biological MRS, the spectral peaks usually have Lorentzian line shapes, and the effect of system imperfection is negligible in an ideal scenario, i.e., only object-specific parameters need to be considered. Specifically, synthetic data are generated using general exponential functions [14], [15], [16].
AM-IPADS is commonly used in spectrum reconstruction [14], [15], [16], which does not involve the PE process. Reconstruction aims at estimating a high-quality spectrum from undersampled or low-signal-to-noise data. Although DL has shown astonishing performance in MRI image reconstruction, the methodology was developed relatively late in MRS due to the lack of paired realistic data. IPADS relaxes this requirement through data generation according to (5)–(7). Since physical parameters, such as amplitudes, resonance frequencies, and metabolite concentrations, can be simulated, IPADS strongly increases the flexibility of DL. In general, the network ${\cal{N}}$ learns the trainable parameters ${\theta}$ to minimize the total difference between the fully sampled (or noise-free) label signal $\tilde{\bf{s}}$ and the network output ${\cal{N}}({\bf{d}}_{n};{\theta})$, as follows: \[\hat{\theta} = \arg\min_{\theta}\sum_{n=1}^{N}L\left(\tilde{\bf{s}}_{n} - {\cal{N}}({\bf{d}}_{n};{\theta})\right), \tag{9}\] where $N$ is the number of training samples and $L$ is the loss function, such as the ${l}_{2}$ norm loss. After obtaining the optimal network parameters $\hat{\theta}$, a target signal ${\bf{s}}$ is reconstructed via ${\bf{s}} = {\cal{N}}({\bf{d}};\hat{\theta})$ for a given undersampled (or noisy) realistic input ${\bf{d}}$. The reconstruction is ultrafast in realistic experiments, e.g., only 0.04/2.75 s for reconstructing 2/3D protein spectra, which is about 30 times faster than conventional compressed sensing methods [16]. The basic network architectures are the CNN [16] and its densely connected version [14], [15].
Without any real data involved in training, a representative application of AM-IPADS learning is ultrafast MRS reconstruction [14], [15], [16]. First, based on the exponential model in (5), spectral parameters are varied according to a uniform distribution, such as discretely randomizing the number of peaks from one to 10, the normalized amplitude from 0.05 to one, and the normalized frequency from 0.01 to 0.99 [14], [15], [16]. In total, 40,000 pairs of synthetic data (the inputs are undersampled time domain free induction decay signals, and the output labels are fully sampled spectra) are generated within several seconds. Then, a DL MR network (DLNMR) [14] is trained with the synthetic data in 5∼31 h. The DLNMR flowchart [Figure 4(a)] shows that the spectral artifacts introduced by undersampling are first removed with a dense CNN and that the intermediate spectra are then further refined to maintain data consistency with the sampled free induction decay signals. As the number of network phases increases, artifacts are gradually removed, and finally, a clean spectrum is reconstructed. DLNMR is good at restoring high-intensity peaks [the second row of Figure 4(c)]. It has been applied to reconstruct many real protein spectra, achieving a peak correlation of up to 0.9996 in 2D spectra and a sampling acceleration factor of up to 10 in 3D spectra, even though small peaks [the second row of Figure 4(c)] may be deformed and even become worse [the second row of Figure 4(d)] if a mismatch exists between the training and target data. These observations imply that, without any real data in training, AM-IPADS has great potential to boost DL but still needs further improvements.
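A simplified sketch of this data-generation step is given below, building on the exponential model in (5); the signal length, decay constants, and purely random sampling mask are assumptions, while the input/label definitions follow the description above.

```python
import numpy as np

def make_training_pair(n_points=256, keep_ratio=0.25, rng=None):
    """One IPADS training pair for spectrum reconstruction: the input is an
    undersampled, zero-filled FID, and the label is the fully sampled spectrum.
    Peak number, amplitude, and frequency follow the uniform randomization
    described in the text; the random mask is a simplifying assumption."""
    if rng is None:
        rng = np.random.default_rng()
    t = np.arange(n_points)
    n_peaks = rng.integers(1, 11)
    fid = np.zeros(n_points, dtype=complex)
    for _ in range(n_peaks):
        a = rng.uniform(0.05, 1.0)                    # normalized amplitude
        f = rng.uniform(0.01, 0.99)                   # normalized frequency
        tau = rng.uniform(10.0, 100.0)                # decay constant (in samples, assumed)
        fid += a * np.exp(1j * 2 * np.pi * f * t - t / tau)
    mask = np.zeros(n_points)
    mask[rng.choice(n_points, int(keep_ratio * n_points), replace=False)] = 1.0
    label_spectrum = np.fft.fft(fid)                  # fully sampled label
    input_fid = fid * mask                            # undersampled, zero-filled input
    return input_fid, label_spectrum

pairs = [make_training_pair(rng=np.random.default_rng(i)) for i in range(1000)]
```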
Figure 4. The exponential signal reconstructions with mismatched training data [14], [15]. (a) The recursive DLNMR framework that alternates between the dense CNN and data consistency. (b) The uniform distribution of peak intensities (distribution 1) for training or reconstruction and a nonuniform distribution (distribution 2) that is only for reconstruction. (c) and (d) The reconstructed target signals that satisfy uniform and nonuniform distributions, respectively. Note that the sampling rate is 25%. The network structure of the Deep Hankel Matrix Factorization (DHMF) imitates the iterative process of low-rank matrix factorization MRS reconstruction. DC: data consistency.
In addition to signal reconstruction, AM-IPADS DL has been explored in physical parameter quantification, such as quantitative susceptibility mapping [10] and water fraction estimation [12]. Since the loss function in (8) is defined on generated data and does not require the PE, that model can also be applied here: AM-IPADS enables inverse learning from an image to physical parameters. However, this learning alone does not guarantee sufficiently faithful results. A possible remedy is to use the forward AM ${\cal{F}}$ to bridge the missing connection from the physical parameters ${\bf{p}}$ to the image ${\bf{s}} = {\cal{F}}({\bf{p}})$ and then regularize the solution to have small errors in both the physical parameters and the images, as \begin{align*}\hat{\theta} = & \arg\min_{\theta}\sum_{n=1}^{N}L\left({\bf{p}}_{n} - {\cal{N}}({\bf{s}}_{n};{\theta})\right) \\ & + {L}_{\text{model}}\left[{\cal{F}}({\bf{p}}_{n}) - {\cal{F}}({\cal{N}}({\bf{s}}_{n};{\theta}))\right], \tag{10}\end{align*} where ${L}_{\text{model}}$ is the additional loss with the AM. For example, in Quantitative Susceptibility Mapping by Deep Neural Network Plus [10], ${\bf{p}}$ is a susceptibility map, ${\bf{s}}$ is a local ${B}_{0}$ inhomogeneity image, and ${\cal{F}}$ is the analytical differentiable dipole model defined in (4).
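A hedged PyTorch sketch of the model-regularized loss in (10) is shown below, using the k-space dipole kernel of (4) as the differentiable forward operator ${\cal{F}}$; the function names and the simple ${l}_{2}$ losses are assumptions for illustration, not the published QSMnet+ implementation.

```python
import torch

def dipole_forward(chi, D_kspace):
    """Differentiable forward model F of (4): field = IFFT( D(k) * FFT(chi) ),
    applied over the last three (spatial) dimensions."""
    k = torch.fft.fftn(chi, dim=(-3, -2, -1))
    return torch.real(torch.fft.ifftn(D_kspace * k, dim=(-3, -2, -1)))

def model_regularized_loss(net, s_batch, p_batch, D_kspace, weight=1.0):
    """Loss of (10): parameter error plus consistency of the forward-modeled fields."""
    p_hat = net(s_batch)
    param_loss = torch.mean((p_batch - p_hat) ** 2)
    model_loss = torch.mean((dipole_forward(p_batch, D_kspace)
                             - dipole_forward(p_hat, D_kspace)) ** 2)
    return param_loss + weight * model_loss
```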
Hybrid approach (HB) IPADS means that IPADS learning integrates PE and an AM. The former is used for the parts that involve the Bloch equation, specific MR pulse sequences, and resonance structures, while the latter directly generates a large amount of training data from an AM with randomized parameters. HB-IPADS has been successfully applied to in vivo MRS spectrum reconstruction [17], [18], [25], which considers complex, nonideal acquisition conditions. The basis functions ${v}_{m}(t)$ in (7), which are invariant across subjects, are first generated using quantum mechanical simulation tools, such as the Java-based magnetic resonance user interface (jMRUI) (https://github.com/isi-nmr/jMRUI) and the FID appliance (FID-A) (https://github.com/CIC-methods/FID-A). Then, all other physical signals, which are related to the metabolite concentrations of objects, unexpected macromolecule signals, and system imperfections, are directly generated according to the analytical expressions in (7) by varying parameters over an empirical range [17], [18], [25].
After minimizing the same loss function in (9), the trained network can be applied to MRS denoising [17], [25] and separation [18]. With an autoencoder network, the mapping from the low-signal-to-noise-ratio (SNR) spectrum to the high-SNR one is learned on IPADS data and then applied to the SNR improvement of realistic MRS [17]. Compared with the conventional spatial smoothness and subspace methods, the autoencoder reduces the mean square error of denoised 31P spectra on a numerical phantom by 90 and 58%, respectively [17].
Moreover, HB-IPADS has also been extended to MRI DL quantification, such as superparamagnetic iron oxide (SPIO) particle quantification [11]. This method first uses the AM in (4) to obtain the inhomogeneous local ${B}_{0}$ field induced by the SPIO concentration and then generates a slice-modulated MRI image through Bloch simulation. The network learns the mapping from the generated images to the wavelet coefficients of the spatial concentration distribution, and the SPIO prediction is finally obtained by performing the inverse wavelet transform. The network consists of an encoder, a bottleneck, and four decoder subnetworks. This HB-IPADS method significantly outperforms traditional algorithms, which are unreliable when a high concentration of SPIO strongly perturbs the inhomogeneous field and phase.
Enhancing IPADS data to fit real applications is discussed in this section. Experiment-specific characteristics, especially system imperfections, should be matched to real applications. For example, in chemical exchange saturation transfer imaging, the inhomogeneity of ${B}_{0}$ was carefully chosen for phantoms [–0.4∼0.4 parts per million (ppm)] and human skeletal muscle (–0.25∼0.25 ppm), respectively [13]. Besides, using the same range of imperfection parameters in the training and testing data can greatly improve DL performance. In real-time spiral MRI [26], a major limitation is image blurring introduced by off resonance. A neural network (NN) trained with the same off-resonance range significantly increases the peak SNR of deblurred images from 11.04 to 24.57 dB [26]. Furthermore, MRI images are usually acquired from multiple coils, and their correlation is reflected by coil sensitivity maps. To adapt to real parallel imaging, these maps can be obtained from a few real experimental data (five subjects) from one scanner and then multiplied with single-channel images from other scanners or public databases [7].
For object-specific parameters, attention should be paid to the nonuniform distributions found in real data. Many physical parameters of IPADS data are chosen from a given range with equal probability, resulting in uniform distributions of these parameters in the training data [14]. This type of data generation provides an unbiased representation when no other prior knowledge is available. For example, uniform distributions of amplitudes and frequencies were used for IPADS spectrum training and have been successfully applied to many real protein spectra [14], [15], [16]. However, reconstruction performance will be compromised if the true distribution is nonuniform. For example, many low-intensity peaks [the second row of Figure 4(d)] are lost in reconstruction if they do not have a high ratio in the training data. Thus, a good estimation of the distribution of real data is valuable.
In imaging, the spatial and temporal distributions of physical parameters should be carefully tuned. In quantitative susceptibility mapping, hemorrhagic lesions have higher susceptibility than healthy tissues. To better fit patient data, susceptibility values were widened by multiplying different factors at different regions of the estimated maps, resulting in much better artifact removal [10] (Figure 5). Random synthetic shapes assigned with restricted parametric values have been used for network training in OLED ${T}_{2}$ mapping [5], [6] and water–fat imaging [19]. Spatial distributions with texture information from real data could also be introduced to enhance IPADS learning; more realistic texture was achieved than with a deep network trained on arbitrary random synthetic shapes [7], [19] (Figure 6). In dynamic cardiac imaging, more motion was set at the beginning of the cycle than at the end, making the synthetic motion paths more realistic [20].
Figure 5. The enhancement of the susceptibility distribution to a wider range [10]. (a) The original susceptibility distribution trained from healthy controls and the enhanced one. (b) The reconstructed quantitative susceptibility mapping (QSM) from hemorrhagic patients without and with data enhancement. The artifacts are marked by yellow arrows. ppm: parts per million. (Data source: Jongho Lee and Woojin Jung [10]; used with permission.)
Figure 6. The representative examples of spatial distribution-enhanced IPADS learning. Top: Rich texture information from natural images was added into random shape-based templates for better water–fat imaging [19]. Bottom: More realistic textures were captured when real spatial distribution was introduced in ${T}_{2}$ mapping [7]. (a) The reference. (b) A zoomed-in view. (c) The view without enhanced textures. (d) The view with enhanced textures.
Narrowing the gap between IPADS and real data is a main task of the network design. Regularizing the solution with general signal priors can generalize the IPADS network to practical data. These priors include smoothness [18] and low-rankness [15], which are widely adopted in traditional model-based reconstruction. A direct scheme is to take the output of a pretrained network as an intermediate solution and then improve it with a traditional optimization model. For example, deep autoencoders are first trained to separate two signal components in MRSI, and then image smoothness is enforced with a finite difference operator [18]. This scheme allows convergence characterization and thus makes the algorithm more tractable.
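The plug-in refinement scheme can be sketched generically as follows, where a pretrained network output is improved by gradient descent on a quadratic smoothness-regularized problem; this is a schematic stand-in under assumed hyperparameters, not the exact formulation of [18].

```python
import numpy as np

def laplacian(x):
    """Five-point finite-difference Laplacian with circular boundaries."""
    return (np.roll(x, 1, 0) + np.roll(x, -1, 0)
            + np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4.0 * x)

def refine_with_smoothness(x_net, lam=0.1, n_iter=100, step=0.2):
    """Refine a network output x_net by gradient descent on the generic problem
    min_x ||x - x_net||^2 + lam * ||grad x||^2 (a schematic stand-in for the
    model-based refinement of [18], not its exact formulation)."""
    x = x_net.copy()
    for _ in range(n_iter):
        grad = (x - x_net) - lam * laplacian(x)   # gradient of the objective (up to a factor of 2)
        x -= step * grad
    return x

x0 = np.random.rand(64, 64)                       # stand-in for a pretrained network output
x_refined = refine_with_smoothness(x0)
```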
Another scheme is to design a network structure by imitating the iterative process in model-based reconstruction, e.g., low-rank MRS reconstruction (Figure 4) [15]. This approach better preserves low-intensity peaks and may handle shifted distributions of spectral intensities between the IPADS training data and the target data (see "Supplementary Data S2," available at https://doi.org/10.1109/MSP.2022.3183809). Besides, interpretable network behavior, such as progressive low-rank approximation (see [15, Fig. 4]), is also provided. Thus, the general signal prior remains powerful for improving the IPADS network.
Refining the network parameters may repair knowledge missed in IPADS learning. Under the principle of meta learning, introducing a subnetwork to adjust the hyperparameters of the original network can adapt the network well to real data, enabling it to learn more information beyond the IPADS training set [16]. For instance (see Figure 7 and "Supplementary Data S3," available at https://doi.org/10.1109/MSP.2022.3183809), the threshold in the backbone of a sparse learning network can be adjusted so that each input has its own threshold to eliminate undersampling artifacts. Specifically, as the sampling rate of the input signals increases (i.e., as undersampling artifacts are reduced), the learned threshold decreases. Moreover, with more network phases, the thresholds become smaller, and artifacts are gradually removed. Thus, meta learning can remedy the loss of reconstruction performance under sampling rates unseen during IPADS training [16].
Figure 7. The hyperparameter adjustment with meta learning in spectrum reconstruction [16]. (a) The backbone network with soft thresholding and the hypernetwork that adjusts the hyperparameter (threshold). (b) The threshold varies with different undersampling rates. (c) The robust reconstruction to unseen sampling rates in IPADS. (d) A region of reconstructed spectra. Note: A higher peak intensity correlation means better reconstruction of spectral peaks. ppm: parts per million; SR: sampling rate.
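A schematic of this hyperparameter-adjustment idea is sketched below: a small hypernetwork predicts a per-input soft threshold from a global summary of the input. The architecture and feature dimensions are assumptions and do not reproduce the network of [16].

```python
import torch
import torch.nn as nn

class AdaptiveSoftThreshold(nn.Module):
    """Soft thresholding whose threshold is predicted per input by a small
    hypernetwork; a schematic of the hyperparameter-adjustment idea in [16],
    not the published architecture."""
    def __init__(self, feature_dim=64):
        super().__init__()
        self.hyper = nn.Sequential(              # maps a global feature to a positive threshold
            nn.Linear(feature_dim, 16), nn.ReLU(),
            nn.Linear(16, 1), nn.Softplus(),
        )

    def forward(self, x):
        # global average pooling as a crude summary of the input (e.g., artifact level)
        feat = x.flatten(2).mean(dim=2)          # (batch, channels)
        tau = self.hyper(feat).view(-1, 1, 1, 1) # per-sample threshold
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

x = torch.randn(4, 64, 32, 32)                   # hypothetical feature maps
y = AdaptiveSoftThreshold(feature_dim=64)(x)
```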
Another method is to introduce a few real data to adjust the network parameters by transfer learning. This approach has been explored in conventional network training without IPADS [27]. For example, thousands of natural images were combined with phase and coil information and then used to train an MRI reconstruction network. By transfer learning from tens of real MRI images, reconstruction performance nearly identical to that of training on thousands of real images was achieved [27]. Since IPADS learning has already approximated the physical laws of biomedical MR, much less uncertainty needs to be addressed in transfer learning. Thus, IPADS learning may require fewer real data than conventional networks.
Directly incorporating physical laws into the loss function may improve performance or relax the requirements for paired data sets. A good example is the recently proposed physics-informed NN for myocardial perfusion MRI quantification [28]. By modeling the evolution of the contrast agent concentration with differential equations, the physics-informed network simultaneously estimates contrast agent concentrations and quantifies kinetic parameters [28]. This strategy does not need a large database of paired samples but incurs a relatively long running time (1 h/imaging slice) even with a shallow four-layer network [28]. Since the solutions satisfy the underlying physical laws, the degrees of freedom needed to describe the imaging physics are reduced, and the requirement for paired data may thus be relaxed. Besides, IPADS data could be used to train the initial network parameters offline, which would be very helpful for reducing the computational time and would allow deeper networks to improve parameter quantification performance.
IPADS has also been applied successfully in other imaging modalities [3], [29], [30]. For example, in endoscope imaging, a monocular camera model was applied to numerical organs to generate synthetic endoscope images [29]. The synthetic-to-real gap was reduced by an adversarial generative network that removes patient-specific details while retaining diagnostic information, improving depth estimation performance by 78.7%. Another example is Doppler ultrasound imaging, where a hemodynamic model was used to generate synthetic blood flow data for training a physics-informed NN [30]. Combined with 3D MRI angiography, the framework can provide brain hemodynamic parameters as 4D flow MRI does, and the difference among predicted results was reduced by 20–50% compared with the conventional method. Thus, the IPADS strategy is applicable to other imaging modalities.
DL has driven innovation in the field of computational imaging. One of its bottlenecks is the lack of labels in challenging applications. This article reviewed recent progress in biomedical MR on DL with IPADS. No or few real data are required since IPADS generates signals from partial differential equations or analytical solution models following physical laws, making learning more scalable and explainable and better protecting privacy. Great IPADS potential has been demonstrated in fast imaging, ultrafast signal reconstruction, and accurate parameter quantification.
Mitigating the difference between synthetic and real data is still at an early stage. Many current methods highly depend on the accuracy of physical modeling and the selection of parameter ranges. Although good performance has been achieved on unseen real data, unpredictability in clinical situations remains the biggest risk of synthetic data-based approaches. To generalize synthetic data learning to real biomedical and clinical applications, more effort should be made to mine the data/parameter distributions of real data, design robust networks, and characterize performance. Few subjective evaluations from radiologists can be found in existing IPADS learning, and this needs to be addressed.
Besides, a powerful, open, and always-accessible IPADS computing platform is highly desirable. As programming physical signal evolution is nontrivial, an online platform that can quickly and accurately generate data with user-defined parameters will allow more researchers to obtain huge IPADS data sets. These data can be used as training sets for learning networks or as stress test data for developed DL imaging approaches. Scalable programming, such as graphic pulse sequence design for MRI, and high-precision simulation, e.g., of nonrigid motion, proton diffusion, and intravoxel multispin dephasing, will enable challenging imaging applications. It is worth noting that we have developed a prototype of such an IPADS cloud platform (Figure 8), called CloudBrain (https://csrc.xmu.edu.cn/XCloudAiImaging.html), and will continue to improve it with online signal tools to better serve the community.
Figure 8. The vision map of CloudBrain. AI: artificial intelligence.
This work was supported, in part, by the National Natural Science Foundation of China (grants 62122064, 61871341, and 82071913) and Xiamen University Nanqiang Outstanding Talents Program. The authors are grateful to China Mobile for cloud computing resources. They thank Jongho Lee and Woojin Jung for providing the QSM data; Yirong Zhou, Jun Liu, Haoming Fang, Jiayu Li, and Bangjun Chen for building a prototype of the CloudBrain; Haitao Huang, Xinran Chen, and Chen Qian for providing experimental results and valuable discussions; and the guest editors and reviewers for providing valuable suggestions. This article has supplementary downloadable material available at https://doi.org/10.1109/MSP.2022.3183809, provided by the authors. The material includes videos. Xiaobo Qu is the corresponding author.
Qinqin Yang (qqyang@stu.xmu.edu.cn) received his B.S. degree in optoelectronic engineering from Fujian Normal University in 2018. He is working toward his Ph.D. degree in medical physics at Xiamen University, Xiamen, 361005, China. He was the recipient of the Magna Cum Laude Merit Award at the 28th International Society for Magnetic Resonance in Medicine Annual Meeting. His research interests include artificial intelligence, fast magnetic resonance imaging, quantitative imaging, and image reconstruction.
Zi Wang (wangzi1023@stu.xmu.edu.cn) received his B.S. degree in electronic engineering from Zhejiang Gongshang University, Hangzhou, China, in 2019. He is working toward his Ph.D. degree in the Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China. His research interests include optimization, deep learning, magnetic resonance imaging, and magnetic resonance spectroscopy.
Kunyuan Guo (33320210155976@stu.xmu.edu.cn) received his B.S. degree in software engineering from Inner Mongolia University in 2021. He is working toward his Ph.D. degree in electronic science at Xiamen University, Xiamen, 361005, China. His research interests include deep learning, cloud computing, fast magnetic resonance imaging, and biomedical image analysis and its applications.
Congbo Cai (cbcai@xmu.edu.cn) received his Ph.D. degree in physics from Xiamen University, Xiamen, 361005, China, where he has been a professor since 2017. He has been awarded four grants from the Natural Science Foundation of China, and as the main contributor to several teams, he has won numerous provincial and municipal scientific and technological progress awards in China. His research interests include ultrafast pulse sequence design, multiparametric mapping, and deep learning magnetic resonance imaging reconstruction.
Xiaobo Qu (quxiaobo@xmu.edu.cn) received his Ph.D. degree in communication engineering from Xiamen University, Xiamen, 361005, China, where he is a professor and the vice director of the Department of Electronic Science, the director of the Biomedical Intelligent Cloud Research and Development Center, and the vice director of the Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance. He received the Distinguished Young Scholar Award from the Natural Science Foundation of Fujian Province of China (2018), He Yici Chair Professor Award (2018) at Xiamen University, and Excellent Young Scientists Fund Award from the National Natural Science Foundation of China (2021). He is an associate editor of IEEE Transactions on Computational Imaging and a senior editor of BMC Medical Imaging. His research interests include magnetic resonance imaging and spectroscopy, computational imaging, machine learning, artificial intelligence, and cloud computing. Qu is a member of IEEE and the IEEE Signal Processing Society.
[1] S. S. Wang, Z. H. Su, L. Ying, X. Peng, S. Zhu, F. Liang, D. G. Feng, and D. Liang, "Accelerating magnetic resonance imaging via deep learning," in Proc. IEEE 13th Int. Symp. Biomed. Imag. (ISBI), Apr. 2016, pp. 514–517, doi: 10.1109/ISBI.2016.7493320.
[2] M. Jacob, J. C. Ye, L. Ying, and M. Doneva, "Computational MRI: Compressive sensing and beyond," IEEE Signal Process. Mag., vol. 37, no. 1, pp. 21–23, Jan. 2020, doi: 10.1109/msp.2019.2953993.
[3] A. F. Frangi, S. A. Tsaftaris, and J. L. Prince, "Simulation and synthesis in medical imaging," IEEE Trans. Med. Imag., vol. 37, no. 3, pp. 673–679, Mar. 2018, doi: 10.1109/tmi.2018.2800298.
[4] R. J. Chen, M. Y. Lu, T. Y. Chen, D. F. K. Williamson, and F. Mahmood, "Synthetic data in machine learning for medicine and healthcare," Nature Biomed. Eng., vol. 5, no. 6, pp. 493–497, Jun. 2021, doi: 10.1038/s41551-021-00751-8.
[5] C. Cai et al., "Single-shot T2 mapping using overlapping-echo detachment planar imaging and a deep convolutional neural network," Magn. Reson. Med., vol. 80, no. 5, pp. 2202–2214, Nov. 2018, doi: 10.1002/mrm.27205.
[6] J. Zhang, J. Wu, S. Chen, Z. Zhang, S. Cai, C. Cai, and Z. Chen, "Robust single-shot T2 mapping via multiple overlapping-echo acquisition and deep neural network," IEEE Trans. Med. Imag., vol. 38, no. 8, pp. 1801–1811, Aug. 2019, doi: 10.1109/tmi.2019.2896085.
[7] Q. Yang et al., "MOdel-based SyntheTic Data-driven Learning (MOST-DL): Application in single-shot T2 mapping with severe head motion using overlapping-echo acquisition," IEEE Trans. Med. Imag., early access, 2022, doi: 10.1109/TMI.2022.3179981.
[8] O. Cohen, B. Zhu, and M. S. Rosen, "MR fingerprinting Deep RecOnstruction NEtwork (DRONE)," Magn. Reson. Med., vol. 80, no. 3, pp. 885–894, Sep. 2018, doi: 10.1002/mrm.27198.
[9] S. Gavazzi et al., "Deep learning-based reconstruction of in vivo pelvis conductivity with a 3D patch-based convolutional neural network trained on simulated MR data," Magn. Reson. Med., vol. 84, no. 5, pp. 2772–2787, Nov. 2020, doi: 10.1002/mrm.28285.
[10] W. Jung, J. Yoon, J. Y. Choi, J. M. Kim, Y. Nam, E. Y. Kim, J. Lee, and S. Ji, "Exploring linearity of deep neural network trained QSM: QSMnet+," Neuroimage, vol. 211, p. 116,619, May 2020, doi: 10.1016/j.neuroimage.2020.116619.
[11] G. Della Maggiora et al., "DeepSPIO: Super paramagnetic iron oxide particle quantification using deep learning in magnetic resonance imaging," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 1, pp. 143–153, Jan. 2022, doi: 10.1109/tpami.2020.3012103.
[12] S. Jung, H. Lee, K. Ryu, J. E. Song, M. Park, W. J. Moon, and D. H. Kim, "Artificial neural network for multi-echo gradient echo-based myelin water fraction estimation," Magn. Reson. Med., vol. 85, no. 1, pp. 394–403, Jan. 2021, doi: 10.1002/mrm.28407.
[13] L. Chen et al., "In vivo imaging of phosphocreatine with artificial neural networks," Nature Commun., vol. 11, no. 1, p. 1072, Feb. 2020, doi: 10.1038/s41467-020-14874-0.
[14] X. Qu, Y. Huang, H. Lu, T. Qiu, D. Guo, T. Agback, V. Orekhov, and Z. Chen, "Accelerated nuclear magnetic resonance spectroscopy with deep learning," Angewandte Chemie Int. Ed., vol. 59, no. 26, pp. 10,297–10,300, Jun. 2020, doi: 10.1002/anie.201908162.
[15] Y. Huang, J. Zhao, Z. Wang, V. Orekhov, D. Guo, and X. Qu, "Exponential signal reconstruction with deep Hankel matrix factorization," IEEE Trans. Neural Netw. Learn. Syst., early access, 2021, doi: 10.1109/tnnls.2021.3134717.
[16] Z. Wang et al., "A sparse model-inspired deep thresholding network for exponential signal reconstruction-Application in fast biological spectroscopy," IEEE Trans. Neural Netw. Learn. Syst., early access, 2022, doi: 10.1109/tnnls.2022.3144580.
[17] F. Lam, Y. Li, and X. Peng, "Constrained magnetic resonance spectroscopic imaging by learning nonlinear low-dimensional models," IEEE Trans. Med. Imag., vol. 39, no. 3, pp. 545–555, Mar. 2020, doi: 10.1109/tmi.2019.2930586.
[18] Y. Li, Z. Wang, R. Sun, and F. Lam, "Separation of metabolites and macromolecules for short-TE H-1-MRSI using learned component-specific representations," IEEE Trans. Med. Imag., vol. 40, no. 4, pp. 1157–1167, Apr. 2021, doi: 10.1109/tmi.2020.3048933.
[19] X. Chen, W. Wang, J. Huang, J. Wu, L. Chen, C. Cai, S. Cai, and Z. Chen, "Ultrafast water-fat separation using deep learning-based single-shot MRI," Magn. Reson. Med., vol. 87, no. 6, pp. 2811–2825, Jun. 2022, doi: 10.1002/mrm.29172.
[20] M. Loecher, L. E. Perotti, and D. B. Ennis, "Using synthetic data generation to train a cardiac motion tag tracking neural network," Med. Image Anal., vol. 74, p. 102,223, Dec. 2021, doi: 10.1016/j.media.2021.102223.
[21] A. Loktyushin et al., "MRzero: Automated discovery of MRI sequences using supervised learning," Magn. Reson. Med., vol. 86, no. 2, pp. 709–724, Aug. 2021, doi: 10.1002/mrm.28727.
[22] F. Bloch, W. W. Hansen, and M. Packard, "Nuclear induction," Phys. Rev., vol. 69, no. 3-4, pp. 127–127, 1946, doi: 10.1103/PhysRev.69.127.
[23] F. Liu, J. V. Velikina, W. F. Block, R. Kijowski, and A. A. Samsonov, "Fast realistic MRI simulations based on generalized multi-pool exchange tissue model," IEEE Trans. Med. Imag., vol. 36, no. 2, pp. 527–537, Feb. 2017, doi: 10.1109/tmi.2016.2620961.
[24] S. S. Wang, H. T. Cheng, L. Ying, T. H. Xiao, Z. W. Ke, H. R. Zheng, and D. Liang, "DeepcomplexMRI: Exploiting deep residual network for fast parallel MR imaging with complex convolution," Magn. Reson. Imag., vol. 68, pp. 136–147, May 2020, doi: 10.1016/j.mri.2020.02.002.
[25] H. H. Lee and H. Kim, "Intact metabolite spectrum mining by deep learning in proton magnetic resonance spectroscopy of the brain," Magn. Reson. Med., vol. 82, no. 1, pp. 33–48, Jul. 2019, doi: 10.1002/mrm.27727.
[26] Y. Lim, Y. Bliesener, S. Narayanan, and K. S. Nayak, "Deblurring for spiral real-time MRI using convolutional neural networks," Magn. Reson. Med., vol. 84, no. 6, pp. 3438–3452, Dec. 2020, doi: 10.1002/mrm.28393.
[27] S. U. Dar, M. Ozbey, A. B. Catli, and T. Cukur, "A transfer-learning approach for accelerated MRI using deep neural networks," Magn. Reson. Med., vol. 84, no. 2, pp. 663–685, Aug. 2020, doi: 10.1002/mrm.28148.
[28] R. L. M. van Herten, A. Chiribiri, M. Breeuwer, M. Veta, and C. M. Scannell, "Physics-informed neural networks for myocardial perfusion MRI quantification," Med. Image Anal., vol. 78, p. 102,399, May 2022, doi: 10.1016/j.media.2022.102399.
[29] F. Mahmood, R. Chen, and N. J. Durr, "Unsupervised reverse domain adaptation for synthetic medical images via adversarial training," IEEE Trans. Med. Imag., vol. 37, no. 12, pp. 2572–2581, Dec. 2018, doi: 10.1109/TMI.2018.2842767.
[30] M. Sarabian, H. Babaee, and K. Laksari, "Physics-informed neural networks for brain hemodynamic predictions using medical imaging," IEEE Trans. Med. Imag., early access, 2022, doi: 10.1109/TMI.2022.3161653.
Digital Object Identifier 10.1109/MSP.2022.3183809