Logo of halHAL - Archives Ouvertes - Home page.
Neuroimage. 2007 October; 38(1): 124–37.
Published online 2007 August 7. doi: 10.1016/j.neuroimage.2007.07.025.

In electroencephalographic (EEG) measurements performed during functional Magnetic Resonance Imaging (fMRI), imaging and cardiac artefacts strongly contaminate the EEG signal. Several algorithms have been proposed to suppress these artefacts and most of them have shown important improvements with respect to uncorrected signals. However, the relative performances of these algorithms have not been properly assessed. In particular, it is not known to what extent such algorithms deteriorate the EEG signal of interest. In this study, we propose to cross-validate different methods proposed for artefact correction, using a forward model to generate EEG and MR-related artefacts. The methods are assessed under various experimental conditions (described in terms of EEG sampling rate, artefacts amplitude, frequency band of interest, etc.). Using experimental data, we also tested the performance of the correction methods for alpha rhythm imaging and for epileptic spike reconstruction. Results show that most of the methods allow the observation of the modulation of alpha rhythms and the identification of spikes, despite subtle differences between algorithms. They also show that over-filtering the data may degrade the EEG. Our results indicate that the optimal artefact removal technique should be chosen according to whether one is interested in fast (>10 Hz) vs. slow (<10 Hz) oscillations or in evoked vs. ongoing activity.

MESH keywords: Adult, Algorithms, Artifacts, Brain Mapping, methods, Diagnosis, Computer-Assisted, methods, Electroencephalography, methods, Epilepsy, diagnosis, Humans, Magnetic Resonance Imaging, methods, Reproducibility of Results, Sensitivity and Specificity, Software, Software Validation

Simultaneous use of electroencephalography (EEG) and functional Magnetic Resonance Imaging (fMRI) has been actively developed over the last years (Gotman et al., 2004; Hamandi et al., 2004; Salek-Haddadi et al., 2003). Recording EEG during fMRI (EEG/fMRI) enable using fMRI in applications whereby the regressors of interest are to be derived from brain electrical activity. For statistical analysis of fMRI data, EEG signals play then the role of the stimulus function or of other extrinsic inputs, as is classically performed in cognitive psychology when modelling stimuli or changes in the experimental context. Examples of such applications are studies of ongoing oscillatory activity (Goldman et al., 2002; Laufs et al., 2003), of sleep (Czisch et al., 2002) and of epilepsy (Bénar et al., 2003; Krakow et al., 2001; Lemieux et al., 2001).

Recording EEG during fMRI is challenging because signals due to underlying neuronal activity may be small in comparison to signals of instrumental origin and of other physiological nature. Both the static field (B0) and the time-varying fields (generated by the radio-frequency excitations and by the imaging gradients) generate artefacts in EEG measurements. These two types of artefacts are referred to as ballistocardiogram (BCG) (Bonmassar et al., 2002) and imaging artefacts (Allen et al., 2000; Felblinger et al., 1999), respectively.

The BCG, or pulse artefact, is caused by pulsations of the scalp arteries (Allen et al., 1998; Ives et al., 1993). The ensuing motion of the scalp electrodes and of the attached leads produces an electromotive force proportional to the strength of the static field. It may be considerably higher than the brain-related scalp EEG (up to 200 μV at 3T). It is difficult to remove the BCG artefact mainly because it is highly non-stationary: its duration and amplitude, although correlated in time, appear to differ stochastically between successive heartbeats. In addition to this important non-stationarity, other difficulties originate from the facts that: (i) most of BCG power lies in the frequency range (1–10 Hz) corresponding to where most of the EEG power occurs (theta, delta and alpha bands, evoked potentials); (ii) BCG waveforms are often similar to interictal spikes and thus represent an important confound in EEG/fMRI applied to epilepsy; (iii) BCG amplitude varies considerably between EEG channels, depending on the particular spatial configuration of electrodes and of leads within the static magnetic field; (iv) BCG amplitude and morphology vary within and between subjects.

Imaging artefacts are induced by the MR sequences, i.e. by the radio-frequency pulses applied for spin excitation and by the switching of gradients of magnetic field (B) applied for spatial encoding of the MR signal. The electromotive forces that produce imaging artefacts are proportional to the cross-sectional area of wire loops and to the time derivative of the magnetic field (dB/dt). Furthermore, acoustic vibrations of the scanner due to gradient switching cause small movements of the subject and of the equipment, thus contributing to the artefacts. This occurs in addition to intrinsic subject’s motion, which produces a slowly varying modulation of the amplitude of imaging artefacts because the spatial configuration of conductors in the magnetic field is modified. Overall, the amplitude of imaging artefacts is huge, up to several mV. Without any correction, it is usually not possible to use EEG recordings for further analysis.

Several methods have been proposed to remove imaging artefacts (Allen et al., 2000; Bénar et al., 2003; Felblinger et al., 1999; Garreffa et al., 2003; Hoffmann et al., 2000; Negishi et al., 2004; Niazy et al., 2005; Sijbers et al., 1999; Wan et al., 2006a) and/or BCG artefacts (Allen et al., 1998; Bénar et al., 2003; Briselli et al., 2006; Ellingson et al., 2004; Goldman et al., 2000; Kim et al., 2004; Nakamura et al., 2006; Niazy et al., 2005; Sijbers et al., 2000; Srivastava et al., 2005; Wan et al., 2006b). Each of these algorithms has its own pros and cons. They are difficult to evaluate on the basis of the literature as most of these methods have been validated by simple visual tests, i.e. checking whether artefacts were still visible after correction (Bénar et al., 2003; Briselli et al., 2006; Hoffmann et al., 2000; Negishi et al., 2004; Sijbers et al., 2000; Srivastava et al., 2005; Wan et al., 2006a) or whether artificial spikes were present in the corrected signal (Allen et al., 1998; Allen et al., 2000; Bonmassar et al., 2002). Similarly to EEG modelling studies, comparing the EEG spectrum obtained before and after artefact suppression has also been used to validate results (Bonmassar et al., 2002; Niazy et al., 2005; Srivastava et al., 2005). It is likely, however, that correction methods might suppress significant EEG signal, in addition to BCG and imaging artefacts. Suppression of EEG activity in the signal cannot be assessed merely by visual inspection or computation of EEG spectra as the EEG is often undistinguishable from coloured noise (amplitude spectrum in 1/f where f is the frequency).

Experimental solutions to attenuate the limitations induced by imaging artefacts can be interleaved (Bonmassar et al., 2002) or EEG-triggered (Krakow et al., 1999) acquisitions. In interleaved acquisitions, pauses in fMRI acquisitions are introduced to record EEG during artefact-free time windows. In EEG-triggered acquisitions, MRI scanning is triggered by EEG events, such as epileptic spikes. However, these methods have important drawbacks: (i) EEG events occurring during scanning may be lost in EEG-triggered acquisition because not detected online; (ii) a refractory period after scanning is needed in triggered acquisitions for complete T1 relaxation; (iii) the scanning time is considerably lengthened. Clearly, continuous EEG/fMRI acquisition associated with efficient artefact correction is the optimal approach.

In this study, we use simulated data to assess to what extent the true EEG signal is recovered or deteriorated by the different correction algorithms. First, we develop a forward model of the EEG and of the different sources of artefacts. Then, we describe briefly the different correction procedures selected from the literature. In the Results section, we present the results obtained from the simulations and we show how these results translate to experimental data. We conclude by proposing general recommendations for optimal processing of experimental data.

We developed a simple forward model to generate EEG data and associated MR-related artefacts. As shown below, we consider ongoing EEG activity as being well approximated by additive stochastic processes occurring in different frequency bands. On top of that, we assume additive MR-related artefacts. The dynamics of these artefacts were chosen so as to reproduce experimental data obtained in our centre. Such forward modelling is aimed at evaluating the influence of different parameters on the quality of artefacts rejection. Evaluation was done by comparing the EEG activity without any artefact to the output of the artefact removal algorithms.

II.1.1. Electroencephalogram Ongoing EEG activity is thought to be mainly, but not only, generated by the depolarisation of apical dendrites of cortical principal cells (Nunez and Srinivasan, 2005). Accordingly, biophysical models of the neural mass have been proposed (David et al., 2007). Because neural networks generating the EEG are extremely complex, and not well understood, biophysical models do not generally capture easily all possible EEG dynamics. Alternatively, one might be interested in modelling the EEG dynamical properties only, with a higher precision than what biophysical models can generally do, using simple generative procedures.

Because we were mainly interested in EEG dynamical properties, and not so much in the biophysical origins of EEG, we have chosen to reproduce the spectrum of ongoing EEG activity in a simple way. We simulated EEG as a linear mixture of seven Gaussian distributions. Each Gaussian distribution was bandpass filtered in different frequency bands within the range between 1 Hz and 70 Hz. The amplitude and variance of each distribution was adjusted to fit experimental data obtained in one subject (Figure 1). We modelled a dropout of the EEG signal in the 45–55 Hz band to mimic the effect of a notch filter for the line noise (assumed to be at 50 Hz here). We furthermore assumed a modulation of amplitude in the alpha band (8–12 Hz) to simulate a standard EEG/fMRI paradigm of alpha rhythm imaging where subjects open and close their eyes successively every 20 seconds. This was accomplished by modulating with a sine wave the amplitude of the Gaussian distribution representing the alpha band between a low value (eyes opened, amplitude of alpha oscillations=10 μV) and a high value (eyes closed, amplitude of alpha oscillations=30 μV). To model the spatial correlation between EEG channels due to the diffusion of the electric potential on the scalp, we finally correlated the different EEG time series. To do so in a general way, i.e. not assuming a specific neuronal configuration which would correspond to a specific spatial pattern of correlation, we assumed a circular connectivity between EEG channels and applied a smoothing convolution kernel at each time bin. This kernel was a Gaussian function with a standard deviation equal to 4 channels. In summary, our EEG model is not biophysical, but it allows to generate signals showing realistic spatial-temporal statistical properties, at least for first order dynamics.

II.1.2. Imaging artefacts A template of imaging artefacts was derived by averaging experimental EEG recordings. They were obtained in our 3T scanner (3T Bruker BioSpin, Bruker Medizintechnik GmbH, Ettlingen, Germany) using an MR compatible EEG amplifier (SD32, Micromed, Treviso, Italy) with 17 c-shaped electrodes positioned according to the 10/20 system (O1 and O2 were not used to preserve subjects’ comfort). A gradient-echo Echo Planar Imaging (GE-EPI) sequence was used (TR = 3 s, TE = 30 ms, flip angle = 80°, RF pulse duration = 1.536 ms, RF modulation bandwidth = 3516 Hz, FOV = 216 mm × 216 mm, matrix 72×72, 41 adjacent slices 3.5 mm thick). The EEG acquisition sampling rate was 1024 Hz. An antialiasing hardware low-pass filter was implemented by a Sigma-Delta AD converter at 268.8 Hz. EEG signals were calibrated with a square wave of 100 μV using an external calibrator plugged on all inputs.

Despite the fact that EEG amplifiers are generally equipped with low-pass filters to avoid aliasing of radio-frequency signals (Abacherli et al., 2005; Anami et al., 2003; Hamandi et al., 2004), a radio-frequency component is usually observed in imaging artefacts, albeit of less amplitude than the gradient component. For simplicity, we did not take into account the radio-frequency artefact, mainly because its duration (usually less than 5 ms) is negligible in comparison to the duration of gradient artefacts (usually about 60 ms).

The template of imaging artefacts was then over sampled at 50 kHz (Figures 2A and 2B, left). Figure 2C shows its amplitude spectrum to indicate the overlap with the EEG spectrum (Figure 1B). Finally, the amplitude of the imaging artefact was varied between channels, from 0 up to 7000 μV peak-to-peak, to model observed differences in experimental data.

II.1.2.1. Asynchronicity of MRI and EEG clocks The use of different clocks in the MR console and in the EEG acquisition system is a major source of residual gradient-induced spikes after correction (Allen et al., 2000). These residual spikes are particularly important when the EEG sampling rate is low. One approach to improve suppression of these artefacts is to use the EEG clock as a trigger for the EPI sequence. It appears that this procedure is not easy to implement practically on most current MR acquisition systems. The alternative is to adjust the repetition time (TR) to the sampling rate of the EEG.

In simulations, we introduced a slight timing offset of 152 μs per second between EEG and MRI clocks, similarly to what we noticed in our recordings. We then sampled down the data to various sampling rates, from 256 Hz to 8192 Hz (Figure 2B, middle). These data have been used to evaluate to what extent correction methods are sensitive to the sampling rate, given a slight mismatch between EEG and MRI clocks.

II.1.2.2. Slow modulation of the imaging artefact amplitude In addition to a fast modulation of the imaging artefact waveform that could be attributed to the timing offset problem, we noticed a slow modulation (up to 500 μV) of the amplitude of the imaging artefact in experimental recordings. This drift may be due to several sources of temporal fluctuations such as subject’s motion and electronic or mechanical instabilities due to temperature variation. We thus modelled this effect by modulating the amplitude of the imaging artefact by a sine wave (period=200 s, amplitude adjusted to simulations as indicated in Section II.3.1) (Figure 2B, right).
II.1.3. Ballistocardiogram (BCG) We modelled the BCG to match as well as possible global features of experimental recordings obtained at 3T, in the absence of MR measurements. First, the heart rate was modulated smoothly between 65 and 85 beats per minute using a sine wave of one minute period (Figure 3A). Second, the mean BCG amplitude of a channel was varied from 10 to 200 μV to investigate different levels of EEG contamination (Figure 3B). Third, the amplitude of the BCG was varied between two successive cardiac events by taking into account equally the amplitude of the last event and a random fluctuation which followed a normal distribution with a standard deviation of 15% of the BCG mean amplitude (Figure 3C). Fourth, we modelled phase lags between vessel pulsations over the head, by introducing a latency of several milliseconds (random latency of 15 ms standard deviation) between the BCG of the different channels (Figure 3D). Finally, we noticed a jitter (about 20 ms) on the timing of the QRS complex as determined using available automatic techniques for QRS detection. We modelled this jitter by adding a random latency for each QRS event (Figure 3E). The distribution of the latencies was normal, zero-mean and with a variance which was varied to test for the sensitivity of the correction methods to suboptimal detection of cardiac events.
II.2. Artefact correction methods
In this section, we present the different artefact correction methods that were tested. All methods have been thoroughly described in the literature. We used the original code from (Niazy et al., 2005). Otherwise, we used our own implementation which followed as faithfully as possible published equations. We will indicate below the parameters we have chosen for each algorithm.
II.2.1. Imaging artefacts
II.2.1.1. Image Artefact Reduction (IAR) Image Artefact Reduction (IAR) algorithm was proposed in (Allen et al., 2000). This method is composed of two successive steps: (i) average artefact waveform subtraction followed by (ii) adaptive noise cancellation (ANC). For the averaging step, the EEG is assumed to be uncorrelated between MR volume acquisitions. That assumption is usually valid as the autocorrelation function of the EEG is much narrower than that of the imaging artefacts (particularly true for higher frequencies).

In our implementation of the IAR algorithm, EEG data were first interpolated to get a sampling rate of 10 kHz. This interpolation allowed a sufficiently small time resolution to proceed to a slice timing alignment to reduce the phase drift caused by the different clocks of EEG and MRI systems. Artefact waveforms were obtained using a moving average of 25 repetition times and subtracted from the noisy EEG. The corrected EEG was then down-sampled to the initial sampling frequency and low-pass filtered (finite impulse response filter, cut-off 50 Hz) to reduce aliasing. Finally, an ANC process was used to estimate the artefact residuals in the EEG by adjusting the weights of an ANC filter using a least mean square (LMS) algorithm. This filter minimises the correlation between the source of noise, estimated during the first step, and the noisy EEG.

II.2.1.2. FMRI Artefact Slice Template Removal (FASTR) FMRI Artefact Slice Template Removal (FASTR) algorithm was proposed in (Niazy et al., 2005). Similarly to other approaches (Negishi et al., 2004), it is based on temporal Principal Component Analysis (PCA). It decomposes the spatio-temporal EEG matrix into orthogonal temporal components, the number of which is equal to the number of EEG channels. Components are sorted according to the variance they explain in the EEG. Since the imaging artefacts are uncorrelated to neuronal activity and of much higher amplitude, they are usually captured in the very first PCA components.

The FASTR algorithm proceeds into four steps. The first two steps are identical to those applied in the IAR algorithm: (i) realignment following interpolation and slice-timing; (ii) subtraction of local artefact templates. The third step is a decomposition of the residuals using several basis functions (those that explain most of the variance of the residuals). The basis functions are obtained by PCA applied to a matrix containing the artefact segments of each channel, independently. The second and third steps are complementary because the second step capture large changes in the artefact shape whereas the third step capture more subtle variations in the artefact shape. Finally, the fourth and last step is ANC filtering similar to that described in the previous method (IAR).

We used the implementation of the FMRIB plug-in for EEGLAB (http://www.sccn.ucsd.edu/eeglab/, Salk Institute, La Jolla, CA) provided by the University of Oxford Centre for Functional MRI of the Brain (FMRIB) with the following parameters: low-pass filter cut-off frequency = 50 Hz, averaging window length = 25 and up-sampling frequency = 10 kHz. The principal components were sorted according to their eigenvalues, in decreasing order. The number of components removed was equal to the number of eigenvalues for which the relative difference with the following eigenvalue was greater than a fixed threshold (empirically adjusted). This method evaluated the number of components describing artefacts to be usually comprised between 2 and 4.

II.2.1.3. Independent Component Analysis (ICA) Independent Component Analysis (ICA) is a statistical approach for blind source separation, which is often used to remove EEG artefacts (Jung et al., 2000), in particular in simultaneous EEG/fMRI recordings (Bénar et al., 2003; Briselli et al., 2006; Nakamura et al., 2006; Srivastava et al., 2005). The purpose of temporal ICA is to identify components that present maximal temporal statistical independency. This appears a priori as a valid approach to separate neuronal EEG and imaging artefacts because these signals are generated by different (uncorrelated) processes.

ICA decomposition was done with the EEGLAB toolbox from the Computational Neurobiology Laboratory (http://www.sccn.ucsd.edu/eeglab/, Salk Institute, La Jolla, CA), using an approach based on the Infomax ICA algorithm (Bell and Sejnowski, 1995). We selected the components which were correlated with the imaging artefact template. This was done by keeping the components with a normalised cross-correlation coefficient that was higher than the average plus one standard deviation of that coefficient computed for all the components. Selected components were excluded from the reconstruction: EẼG = EEG − (WS)−1 I0 A where EEG is the noisy EEG, W is the ICA matrix of weights, S is the data sphering matrix, A is the activation time course of the output components and I0 is a diagonal matrix of ones or zeros indicating which components were kept and which were excluded (the corresponding elements of the diagonal were set either to one or to zero, respectively). WS is known as the unmixing matrix.

II.2.1.3. Filtering in the frequency domain using the Fourier Transform (FT) Imaging artefacts can be removed in the frequency rather than in the time domain, because EPI artefacts are periodic and distributed over a limited range of frequencies (Hoffmann et al., 2000). Filtering in the frequency domain is particularly efficient if imaging artefacts and EEG have non-overlapping frequency spectra. In other words, this method is optimal for MR sequences designed so as to generate imaging artefacts at frequencies non overlapping EEG frequencies of interest. A priori, EEG signals should be sampled fast enough to avoid aliasing effects.

To filter imaging artefacts in the frequency domain, we first calculated the Fourier transform (FT) of the imaging artefact template N = FT{artefact}. We then created a vector W that contained the weights W(i) applied to each spectral component of the Fourier transform of the noisy EEG. If the ith coefficient N(i) exceeded a given threshold, W(i) was set to the inverse of that coefficient (W(i) = 1/N(i), where N(i) is much greater than 1 according to the threshold used). Otherwise, W(i) was set equal to one. For each channel, we calculated the Fourier transform of the noisy EEG (E = FT{EEG}) and we multiplied the results by the weighting vector W to attenuate the coefficients corresponding to the artefact ( = EW). Finally, we applied the inverse Fourier transform to obtain the corrected EEG: EẼG = FT−1 {}. This process is very similar to the algorithm proposed in (Hoffmann et al., 2000). The only difference is that we introduced a weighting inversely proportional to Fourier transform coefficients of the artefact whereas these coefficients were set to zero in Hoffmann’s approach. In a preliminary study (not shown in the Results section), we found that weighting in comparison to zeroing improved the results for two reasons: (i) mainly, less signal of interest was removed, (ii) less importantly, the ringing artefact (Bénar et al., 2003) which may occur using this type of filtering was potentially reduced.

II.2.2. Ballistocardiogram
II.2.2.1. Average Artefact Subtraction (AAS) Similarly to what can be done with imaging artefacts, subtracting an averaged artefact template of cardiac activity is probably the simplest way to reduce the BCG. An explicit assumption here is that the BCG is reproducible between successive heartbeats. This may be less valid than for imaging artefacts.

We used the Average Artefact Subtraction (AAS) algorithm proposed in (Allen et al., 1998). In our implementation, we calculated a moving average artefact template over 30 successive heartbeats, and we then subtracted this template from the data. AAS algorithm includes identification of ECG peaks and rejection of section contaminated by strong artefacts (such as eye blinks or muscular activity). These steps were not used in simulations because synthetic data were free from ocular and muscular artefacts and because the onsets of QRS complexes were known a priori (no problem of QRS detection). It has been proposed to replace the moving average by a weighted moving average (Goldman et al., 2000) or by a median filter (Ellingson et al., 2004; Sijbers et al., 2000). Using these adaptations of the AAS algorithm, we obtained similar results (not shown below) to those obtained using the original AAS approach. Therefore we will only present below results obtained with the first formulation of AAS.

II.2.2.2. Kalman Adaptive Filtering Kalman adaptive filtering has been proposed to filter the BCG (Bonmassar et al., 2002). This method uses a piezoelectric motion sensor located over the temporal artery to obtain an indirect measure of the BCG waveform alone (assumed to be a mechanical artefact). The algorithm uses the correlation between the motion sensor signal and the EEG to remove the BCG signal from EEG by adaptive filtering.

The shape of the BCG and of the motion sensor signal is usually different. For simulations, we assumed that the relationship between those two signals was instantaneous and nonlinear, as might be found in electro-mechanical coupling. We therefore postulated a sigmoid transfer function. This transformation converts the BCG signal into a model of the motion signal of different shape that could have been recorded by the sensor. The sigmoid function used was: where x(t) is the BCG averaged over all channels and a = 0.1. The parameter of the sigmoid function was chosen so as to induce a strong saturation of motion signal, thereby modelling strong nonlinear effects. Introducing such nonlinear effects was important for us to evaluate the robustness of Kalman filtering. Under this condition, the choice of the sigmoid function was not critical and any other nonlinear transfer function could have been used instead for simulations.

The Kalman adaptive filter computes a linear minimum mean-square estimate of the Finite Impulse Response (FIR) filter coefficients using a one-step predictor algorithm. The Kalman filter which was used (Bonmassar et al., 2002) is expressed in matrix form as:

(1)
where t is the index of the current algorithm iteration, m(t) is the buffered motion signal at step t, P(t) is the correlation matrix of state estimation error (initial variance set to 1), k(t) is the vector of Kalman gains at step t, (t) is the filter-tap estimate at step t, (t) is the filtered output at step t, e(t) is the estimation error at step t, s(t) is the desired response at step t, QM is the correlation matrix of the measurement noise (set to 100) and QP is the correlation matrix of the process noise (variance set to 10−8). In our simulations, the FIR filter length was equal to 80.

II.2.2.3. Principal Component Analysis (PCA) In addition to being used for removal of imaging artefacts, temporal PCA has also been used to remove BCG (Bénar et al., 2003; Negishi et al., 2004; Niazy et al., 2005). The main advantage of this method in comparison to the subtraction of an average artefact is that it allows for slight variations in the shape of successive BCG artefacts. This is achieved by selecting different components that can be used as a basis data set. In other words, the first components of a PCA are assumed to capture most of the variance introduced by the BCG. However, removing too many components deteriorates the EEG data as EEG and BCG are not strictly orthogonal. We have therefore chosen to use only the first 3 components, including the average BCG using the FMRIB plug-in for EEG, provided by the University of Oxford Centre for Functional MRI of the Brain. The principle for removing the BCG with temporal PCA is similar to what has been discussed earlier with respect to imaging artefacts.
II.2.2.4. Independent Component Analysis (ICA) Similarly to what has been described above for imaging artefacts, temporal ICA can be used to remove the BCG (Bénar et al., 2003; Briselli et al., 2006; Nakamura et al., 2006; Srivastava et al., 2005). Using the same algorithm as for imaging artefacts, we excluded from the reconstruction the components which were the most correlated with cardiac activity.
II.3. Performance evaluation using simulations
Using the forward model described above, several simulations have been designed to investigate the influence of different key parameters on the quality of artefact correction, for the different methods described above. The quality of artefact correction was evaluated as follows. We first created a model of noisy EEG by adding together the models of the original EEG data and of the artefacts (either BCG or imaging artefacts). Then, we applied the correction routines to this noisy EEG to obtain an estimated EEG (EẼG) that was eventually compared to the original EEG (EEG). The correction quality was quantified using the signal to noise ratio (SNR), the value of which is noted SNR:
(2)

The SNR is calculated as the ratio between the standard deviation (std) of original EEG (signal) and the standard deviation of the difference between EEG and EẼG (noise). It is a summary of discrepancies between two signals. However, because it is normalised, it does not quantify the absolute value of residual noise. Moreover, it is limited to difference detection and does not allow distinguishing between signal attenuation or amplification.

For each simulation, we generated 10 data sets and averaged the results to increase the precision and to reduce the effects of outliers. It appeared that the outcome of the simulations was highly reproducible, however. Unless otherwise specified, (i) EEG signals were simulated over 20 channels, during 3 minutes, and were sampled at 1024 Hz; (ii) the amplitude of imaging artefacts was slowly modulated (10% of the mean amplitude) when correcting imaging artefacts; (iii) a QRS detection jitter was set to 25 ms (standard deviation) when correcting BCG. Given these parameters, we have successively varied one of them to investigate key phenomena, as listed just below.

II.3.1. Imaging artefacts
  • EEG sampling rate: The EEG sampling rate was varied from 256 to 8192 Hz.
  • Artefact amplitude modulation: The artefact amplitude was slowly modulated. The amplitude of modulation was varied from 0% up to 25% of the artefact mean amplitude.
  • Deterioration of the EEG spectrum after correction: When studying ongoing activity in EEG/fMRI, it is important to evaluate how the artefact correction influences the estimation of the signal power in the frequency domain of interest. The optimal correction method may indeed depend on the particular frequency band one is mainly interested in. In this simulation, the SNR was estimated in successive frequency bands (Hz): [0–4]; [4–8]; [8–12]; [12–16]; [16–20]; [20–25]; [25–30]; [30–35]; [35–40]; [40–45]; [45–50]; [50–55]; [55–60]; [60–65]; [65–70]. First, we corrected the artefacts as described above. Second, we filtered the corrected EEG in successive frequency bands before computing the SNR.
II.3.2. Ballistocardiogram
  • BCG amplitude: The mean amplitude of the BCG was varied between 10 to 200 μV, corresponding to BCG detected in magnetic fields up to at least 3T.
  • QRS detection jitter: The standard deviation of the QRS detection jitter was varied from 0 to 50 ms.
  • Deterioration of the EEG spectrum after correction: Identically to what we did when assessing imaging artefacts, we evaluated the influence of the BCG removal algorithms on the EEG spectrum.
To evaluate the performance of correction algorithms in experimental data, we analysed experimental EEG data acquired at 3T. The EPI sequence and EEG apparatus were the same as those used in simulations (GE-EPI sequence: TR = 3 s, TE = 30 ms, flip angle = 80°, RF pulse duration = 1.536 ms, RF modulation bandwidth = 3516 Hz, FOV = 216 mm × 216 mm, matrix 72×72, 41 adjacent slices 3.5 mm thick; Micromed SD32, 17 EEG electrodes + 2 ECG electrodes). First, imaging artefacts were removed using the different algorithms. Second, the BCG was corrected, following imaging artefact correction with the method that performed best in the previous step. For removal of the BCG artefacts, it was not possible to test the adaptive filtering approach because we did not have any motion sensor. All experiments were approved by the ethical committee of the Grenoble University Hospital.
II.4.1. Alpha rhythm imaging A healthy subject was asked to open and close his eyes every 15 seconds during 5 minutes. The experiment was performed with an EEG sampling rate equal to 1024 Hz. The EEG data, referenced to Cz, were corrected with the different methods previously described and the alpha power (between 8–12 Hz) was estimated on electrode T3, which was the electrode showing most alpha power modulation during measurements performed outside the MR scanner. To evaluate the quality of the EEG correction, we calculated for each method the correlation coefficient between the estimated alpha power and the experimental block design.
II.4.2. Interictal spike morphology A young adult patient, suffering from drug-resistant partial epilepsy since childhood, exhibited a substantial focal interictal spiking activity on standard EEG recordings (10 interictal spikes per minute). Interictal discharges were recorded in this patient outside the MR scanner just before the EEG/fMRI scan for epilepsy mapping. During EPI acquisition, EEG data were recorded for 15 minutes with the patient at rest, using a very low EEG sampling rate of 256Hz. To evaluate the quality of the different correction methods, we computed the cross-correlation coefficient between twenty interictal spikes acquired on the same electrodes (dipolar derivations F8-T4 and T4-T6), ten outside the scanner and ten during EPI acquisition. For each method, we then averaged 100 correlation coefficients to measure the similarity between spikes acquired outside and inside the scanner.
In this section, results of simulations are summarised in Figures showing the average and standard deviation of the SNR, obtained from 10 simulations, as indicators of the accuracy of the estimated EEG after correction. In all simulations, the standard deviation of the original EEG was set to 10.9 μV. From Equation 2, the residual noise after correction can be easily calculated knowing the SNR reported in Figures. For instance, SNR=2 (typical value obtained in simulations) gives a residual noise of standard deviation equal to 5.45 μV.
III.1.1. Imaging artefacts Figure 4A shows the effect of the EEG sampling rate on the performance of imaging artefact removal algorithms. For each correction method assessed, results improve with increasing sampling rate. However, beyond a sampling rate of 2 kHz, improvements are not substantial for most methods. Only ICA seems to require very high sampling rate (4 kHz) for optimal performances. ICA is by far the method presenting the best results. FASTR and IAR are approximately equivalent, except for low sampling rates (<1 kHz) where FASTR behaves better. FT is significantly less powerful.

Figure 4B shows the effect of the slow modulation of the gradient artefact amplitude. As intuitively anticipated, the accuracy of the EEG estimation decreases with increasing modulation. Interestingly, the FT approach is much less sensitive than other approaches to this modulation, up to the point that it might become the optimal approach for recordings with large modulation, i.e. with large subject’s motion.

Figure 4C shows the performance of the different methods as a function of the EEG frequency. In the lowest part of the frequency spectrum, ICA appears to be the most powerful on average. However, in high frequencies (>30 Hz), the FASTR approach becomes competitive and might even be optimal for highest frequencies.

Overall, the simulations indicate that ICA performs best for correcting imaging artefacts, on average. However, the standard deviation of ICA results is much higher than that of other approaches, showing that this method is not highly reproducible and might be unstable. We will come back to this issue when analysing experimental data.

III.1.2. Ballistocardiogram The results obtained for correction of the cardiac artefacts with the methods previously described are summarised in Figure 5 (see description of algorithms in Methods Section for parameter values).

Figure 5A shows the effect of the mean amplitude of the BCG. All methods show a dependency on this parameter, except FASTR. At low field strength, ICA and adaptive filtering appear to represent the best choice. At high field strength, FASTR and adaptive filtering should be clearly favoured. Overall, the AAS method poorly performs in comparison to other methods.

As shown in Figure 5B, the AAS method also appears to behave poorly for increasing QRS detection jitter. Other methods present much less sensitivity to variations of this parameter. This is particularly the case for adaptive filtering. This technique is not influenced by the jitter because the correction algorithm does not rely upon this type of information. However, if QRS are well detected, then AAS is the optimal approach.

Figure 5C shows the influence of the BCG removal algorithms on the different frequency bands of the estimated EEG. Above 10 Hz, the corrected EEG is approximately equivalent (for adaptive filtering) or worse (for the other correction methods) to the uncorrected EEG in terms of resemblance with the true EEG. This means that no cardiac correction should be applied when studying frequencies above 10 Hz. The ballistocardiogram removal is useful only for frequencies below 10 Hz.

To sum up, it appears that adaptive filtering is the optimal approach to suppress the ballistocardiogram when assuming some difficulties to perfectly detect QRS events. This technique necessitates the use of an additional motion sensor, however. Also, optimal parameters of the Kalman filter must be estimated, which is relatively easy in the case of simulations because we know a priori the EEG without artefacts. In the case of experimental data, it may be difficult to find optimal filter parameters and the results should be affected by this imprecision. Alternatively, the PCA approach offers a relative robustness. In the case of an ECG of high quality, the AAS technique is by far the most powerful and should be chosen. In any case, ICA behaves poorly on average for BCG correction. Again, ICA showed high standard deviations in comparison to other methods. This suggests instability issues which might be even more important in experimental data.

III.2. Experimental recordings
III.2.1. Alpha rhythm Figure 6 shows a good cross-correlation (r>0.8) between the block paradigm and the alpha power (8–12 Hz) after removing imaging artefacts with IAR or FASTR. The correlation is smaller when using the FT approach (r=0.689) or ICA (r=0.560). These results are not in line with those obtained from simulations, where ICA was found to be optimal for imaging artefact correction. This suggests that instabilities of ICA results observed in simulations translate into a lack of robustness when applied to experimental data. We will come back to that point in the discussion. Without any correction, no correlation could be detected. Also, as anticipated on the basis of our simulations (BCG correction unnecessary for frequencies around 10 Hz), the BCG correction did not improve the correlation between the paradigm and the modulation of alpha power.
III.2.2. Spike morphology Figure 7 shows the average of ten interictal spikes analysed (black curves), following application of different artefact correction approaches. The time series corresponding to one of the spikes is superimposed on the average spike (grey curves). After correction of the imaging artefacts (Figure 7, middle line), spikes can usually be easily detected between two cardiac events using each method. However, it remains difficult to differentiate them from cardiac artefacts. According to correlation coefficients (mean and standard deviation shown in Figure 7), averaged subtraction (IAR) and frequency based (FT) methods appears to outperform other approaches. The failure of ICA might again be related to robustness issues of the decomposition. The fact that FASTR is less powerful than when analysing a block-design might be related to the fact that interictal spikes are not orthogonal to residual spikes of gradient. PCA is therefore likely to remove too much signal. The curves on the bottom line of Figure 7 illustrate the benefit obtained from correcting also the BCG, following removal of the imaging artefacts with the frequency based (FT) method (spikes are much more evident than before BCG correction). Clearly, the AAS and PCA are better than ICA in removing the BCG. Besides, these methods do not induce additional distortion of the interictal spike.

Fusion of simultaneous EEG and fMRI recordings is an interesting imaging approach to study spontaneous brain activity. However, EEG measurements in a magnetic environment are accompanied with strong cardiac and imaging artefacts. Removal of these artefacts is a crucial step for further data analysis. A large number of correction methods have been proposed in the ten last years. Template subtraction methods were first proposed, both for pulse artefacts (Allen et al., 1998) and imaging artefacts (Allen et al., 2000). Other approaches were then developed to increase robustness. They used (i) filtering in the frequency domain (Hoffmann et al., 2000), (ii) adaptive filtering (Bonmassar et al., 2002; Wan et al., 2006a), (iii) sets of basis functions based on principal components analysis (Niazy et al., 2005), (iv) independent component analysis (Bénar et al., 2003; Briselli et al., 2006; Nakamura et al., 2006; Srivastava et al., 2005), (v) nonlinear filtering (Wan et al., 2006b). As suggested by the list of signal processing techniques, the development of new correction methods has been often performed by increasing the algorithm complexity, so as it is now difficult to evaluate what is necessary and what is not.

In this study, we evaluated various imaging and cardiac artefacts removal algorithms, using simulated as well as experimental data. Assessing performance by the means of simulated data presents three main advantages: (i) one can operationally determine the effects of the removal process on the EEG signal since it is known a priori; (ii) one can test the effects of different experimental or empirical parameters, such as EEG sampling rate, amplitude and stationarity of artefacts or imprecision in heartbeat detection; (iii) the discrepancies between the results obtained from simulations and from experimentations permit to get an insight into the invalidity of certain hypotheses on the signal properties. Besides, the main drawback of using simulated data is the limitation of the model, which can partially explain the differences observed between simulated and experimental data.

IV.1. EEG forward modelling
Ongoing EEG activity is thought to be mainly, but not only, generated by the depolarisation of apical dendrites of cortical principal cells (Nunez and Srinivasan, 2005). Those cells belong to distributed thalamocortical neural networks. Biophysical models of the neural mass have been proposed to model EEG activity. They can be decomposed into two classes of models: (i) models which consider the cortex as a physical continuum in which travelling and standing waves of neural activity take place (Jirsa and Haken, 1996; Nunez, 2000; Nunez et al., 2001; Robinson et al., 2001; Wilson and Cowan, 1972); (ii) models which adopt a more local approach in which small neuronal clusters generating intrinsic dynamics in certain regions are coupled together (David et al., 2007; David and Friston, 2003; Jansen and Rit, 1995; Sotero et al., 2007; Suffczynski et al., 2001). Once the cortical dynamics has been modelled, it is transferred on the scalp by the means of a head model (Jirsa et al., 2002; Mosher et al., 1999). Such head model is simply a linear combination of the activity of the different neural ensembles, weighted by the properties of propagation of the electric potential in the head. Because true neural networks generating the EEG are extremely complex, and not well understood, biophysical models do not generally capture easily all possible EEG dynamics. Alternatively, one might be interested in modelling the EEG dynamical properties only, with a higher precision than what biophysical models can generally do. This third class of models is the most common. The simplest model is to suppose that ongoing EEG activity is a random process which generates time series with more power in low frequencies (power spectrum in 1/f2). According to those assumptions, a random number generator associated to a low-pass filter is then perfectly sufficient to model EEG. Also, one might go further and generate EEG according to some chaotic assumptions on EEG time series (Perea et al., 2006). All these models are usually evaluated according to spatial-temporal similarities between simulated signals and recorded signals. To do so, visual similarity in the time domain and comparison of power spectra obtained from isolated time series are often used. More rarely, spatial correlation in terms of spatial power spectral density (Freeman et al., 2003) and of synchronisation (Astolfi et al., 2005; David et al., 2004) are also investigated.

Because we were mainly interested in EEG dynamical properties, and not so much in the biophysical origins of EEG, we have chosen to reproduce the spectrum of ongoing EEG activity. To that end, we used a linear mixture of seven Gaussian distributions which were bandpass filtered in different frequency bands.

IV.2. Simulation results
For imaging artefacts, the simulations indicate that an EEG sampling rate of 1 or 2 kHz is sufficient for most methods, except for ICA in which a sampling rate of at least 4 kHz is required to obtain optimal results. Experimentally, the most efficient solution to reduce the confounding effects of sampling rate is to synchronise EEG and fMRI clocks (Mandelkow et al., 2006). Using such synchronisation, it is possible to reduce further the EEG sampling rate without causing significant deterioration of EEG correction quality. Slow variation of the imaging artefact amplitude during the scan decreases the performance of the algorithms. The frequency-based approach (FT) (Hoffmann et al., 2000) was the less sensitive to this confound. Simulations also suggest that the particular imaging artefact removal algorithm to be used should be chosen according to the frequency band of interest. If one is interested in alpha rhythms, for instance, ICA (Nakamura et al., 2006) or IAR (Allen et al., 2000) appear more suited than FT (Hoffmann et al., 2000) or FASTR (Niazy et al., 2005). For high frequencies (>50 Hz), however, FASTR appears to be the most robust approach.

For cardiac artefact correction, algorithm performances decrease when artefact amplitude increases, except for PCA. PCA thus appears to be the optimal choice when BCG is strongly contaminates recordings (>100 μV), e.g. usually when recording at high field strength (3T or more). Template artefact subtraction (AAS) (Allen et al., 1998) is highly sensitive to the QRS detection jitter (temporal imprecision in detecting heart beat by automatic signal analysis). However, when QRS detection is perfect (ECG is of high quality), then AAS outperforms all other approaches. Adaptive filtering (Bonmassar et al., 2002) and ICA (Nakamura et al., 2006) are not affected by the heartbeat detection because these method do not use this kind of information. Simulations also demonstrated that for frequencies higher than 10 Hz, any correction algorithm deteriorates the neuronal component of the EEG. This suggests not trying to remove BCG for experiments aimed at detecting ongoing activity in upper alpha, beta and gamma bands.

IV.3. Experimental results
Experimental results obtained when studying alpha rhythms modulation in a healthy subject (Figure 6) confirmed that IAR was particularly robust to detect alpha oscillations after correction of imaging artefacts. Indeed, the correlation between estimated alpha power and the experimental design was best for IAR. Similar results were obtained using FASTR. As anticipated from simulations, further correction of the BCG did not improve significantly estimated alpha power.

Interictal spike identification was successful with all methods used (Figure 7). IAR (Allen et al., 2000) and FT (modified version of (Hoffmann et al., 2000)) approaches showed the best correlation between spikes recorded outside and inside the MR scanner after imaging artefacts correction. This suggests that both methods were the most efficient to suppress artificial spikes due to remaining gradient artefacts. Note that we noticed that the FT method may tend to remove more signal of interest than IAR does. For that reason, we suggest to use IAR instead of FT to maximise the chance of detecting events of small amplitude, as already shown in (Bénar et al., 2003). Further processing of the BCG was best performed by AAS (Allen et al., 1998). According to simulations, this indicates that QRS events were accurately defined.

Overall, these experimental results agree with simulations except on the fact that ICA, which appeared very efficient on average in simulations (but with a large variability), did show the less satisfactory experimental results, both for imaging and cardiac artefacts. We will come back to that issue in Section IV.4.

IV.4. Summary of algorithm properties
Among imaging artefact methods, artefact template subtraction (IAR) (Allen et al., 2000) shows excellent results in experimental data but seems to be very sensitive to distortions according to simulations. FASTR (Niazy et al., 2005), which is based on IAR with the addition of a PCA decomposition for residual artefacts, showed similar results to those of IAR (significant improvement for low sampling rate only). Experimentally, FASTR results were significantly less satisfactory than when IAR was used, in particular for spike reconstruction. This suggests that the assumption of orthogonality between residual artefacts and EEG events was not valid in the case of interictal spikes. In other words, interictal spikes and gradient spikes look the same, and therefore FASTR should not be applied when trying to detect interictal spikes. We found that the frequency-based approach (FT) (Hoffmann et al., 2000) was efficient in experimental data. This would confirm the result obtained from simulation that this method is the less sensitive to modulation of gradient artefact amplitude, which is closely related to subject’s motion. However, in the presence of gaps in between EPI acquisition volumes, this method may introduce ringing artefacts due to discontinuities in signals to be corrected (Bénar et al., 2003). Although very seducing in simulations, ICA (Nakamura et al., 2006) behaved badly in experimental data.

For cardiac artefact correction, average artefact template subtraction (AAS) (Allen et al., 1998) showed very interesting results for experimental data. In simulations however, results indicated that it is very sensitive to a bad QRS detection. PCA following AAS (Niazy et al., 2005) allowed getting better results, in the case of poor QRS detection only. ICA (Nakamura et al., 2006) showed poor results in removing cardiac artefacts both in experimental and simulated data, probably due to the same reasons as for imaging artefacts (see below). Adaptive filtering (Bonmassar et al., 2002) showed interesting results in removing artefact in simulated data and may constitute a very efficient approach. However, we could not use this method experimentally because we did not have the necessary motion sensor. Bad points about this approach are: (i) optimal parameters of Kalman filter may be difficult to determine empirically; (ii) important computational time in comparison to other approaches.

IV.5. Possible improvements of the forward model
The discrepancies in ICA results between simulation and experimentation highly suggest that the forward EEG model we used is lacking an important property of physiological signals. The most obvious limitation of our EEG model is non-stationarity. Although we introduced some degrees of non-stationarity in the model, we did so mainly at the level of artefacts (a modulation of alpha power was used). It is highly plausible, however, that neural signals are highly non-stationary and, thus, violate underlying assumptions of ICA decomposition.

The notion of statistical independency between imaging/cardiac artefacts and EEG is very compelling and applied well to our forward model for simulated data. Practically, however, it appeared difficult to distinguish the components representing artefacts from those representing the EEG in the experimental data. First, given the much greater variability of ICA results in simulations compared to those obtained with other approaches, this confirms the impression, that ICA decomposition with the algorithm we used (runica.m from EEGLAB) show a certain degree of instability. Second, this suggests that the linear mixture model underlying temporal ICA may not be applicable to estimate efficiently independent components in long time series such as those acquired in EEG/fMRI. A theoretical reason for that may be the following. When applying temporal ICA on a time segment, the implicit assumption is that underlying sources are stationary in space. However, we have mentioned above that some biophysical models (Jirsa and Haken, 1996; Nunez, 2000; Nunez et al., 2001; Robinson et al., 2001; Wilson and Cowan, 1972) explain EEG data by the means of travelling waves, the modes of propagation of which are likely to be modified endogenously by fast plastic mechanisms constantly occurring (Turrigiano and Nelson, 2004). Clearly, ICA assumptions do not conform to these biophysical properties. Because ICA was not successful with BCG correction either, a forward model of BCG based on non-stationary propagating waves would probably constitute a significant improvement in forward modelling. To our knowledge, such model does not exist yet.

Therefore, using biophysical models assuming EEG and artefact sources non-stationary in space would be a possibility to increase the similarity between simulated and experimental results.

When data are acquired in ideal conditions, methods based on an average template subtraction (Allen et al., 1998; Allen et al., 2000) are extremely efficient in removing the artefact without deteriorating too much the EEG neuronal component. When one suspects subject’s motion or when cardiac events are difficult to detect, more sophisticated approaches are necessary. We found that sampling the EEG at frequencies higher than 1 kHz or 2 kHz does not improve significantly the EEG estimation and increases unnecessarily the size of the data and the computational burden. We also found that it is unnecessary to correct cardiac artefacts for the detection of alpha or higher rhythms, while it is critical when estimating focal events such interictal spikes or evoked potentials. In other words, there is no correction algorithm that is generally optimal. Depending on the type of data analysis pursued, certain algorithms may be preferred.

 
Figure 1Figure 1
Generative model of EEG. (A) Addition of the 7 Gaussian distributions in various frequency bands to emulate a realistic EEG spectrum. (B) Comparison of the log-spectrum between simulated (dotted line) and experimental EEG (plain line). (C) Example of (more ...)
Figure 2Figure 2
Construction of the model of imaging artefact. (A) Imaging artefact template for a volume sampled at 50 kHz (TR= 3 s, 40 slices per volume). (B) Left: Zoom on 6 slices of the imaging artefact template (50 kHz as in A). Middle: Imaging artefact template (more ...)
Figure 3Figure 3
The different parts of the cardiac artefact model. (A) Slow variations of the heart rate between 65 and 85 heartbeats per minute during the recording. (B) Variation of the cardiac artefact mean amplitude between the different channels. (C) Variation of (more ...)
Figure 4Figure 4
Plots of the SNR obtained from simulations of the imaging artefact removal. (A) SNR as a function of the EEG acquisition sampling rate (from 256 Hz to 8 kHz). (B) SNR as a function of the artefact amplitude slow modulation (from 0 to 25% of the mean amplitude). (more ...)
Figure 5Figure 5
Plots of the SNR obtained from simulations of the cardiac artefact removal. (A) SNR as a function of the cardiac artefact mean amplitude (from 10 to 200 μV). A rough indication of the corresponding magnetic field strength (1.5T and 3T) is provided as (more ...)
Figure 6Figure 6
Alpha rhythm imaging. (A) Comparison between the experimental block-paradigm (eyes closed/eyes open) and the normalised power in the alpha band. Top line: no artefact correction. Middle line: imaging artefact correction. Bottom line: cardiac artefact (more ...)
Figure 7Figure 7
Spike morphology. Top line: One interictal spike (in grey) and the average over ten interictal spikes (in black) recorded outside the scanner. On the middle and the bottom lines, the same interictal spike (in grey) and the average over the same ten spikes (more ...)