WARNING this page is being edited in preparation of a new release and may be in an inconsistent state

The eddy executables

eddy is a very computationally intense application, so in order to speed things up it has been parallelised. This has been done in two ways, resulting in two different executables

Hence, there is no longer an executable named eddy, and when I refer to the eddy command in the rest of this users guide it is implied that this is either eddy_openmp or eddy_cuda. The eddy_cuda version is potentially much faster than eddy_openmp, and not all new features will be available for the OpenMP version. This is because its slow speed makes it almost impossible to test the more time-consuming options thoroughly. I warmly recommend investing in a couple of CUDA cards.
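A wrapper script could pick whichever executable is installed, preferring the CUDA build. The following is a minimal sketch of that idea. The function name pick_eddy is hypothetical, and it assumes the executables are installed under exactly the names eddy_cuda and eddy_openmp used in this guide, which may not hold on every installation.

```shell
# Print the name of the first eddy executable found on the PATH,
# preferring the CUDA build. pick_eddy is a hypothetical helper,
# and the two executable names are assumed to match your installation.
pick_eddy() {
    for exe in eddy_cuda eddy_openmp; do
        if command -v "$exe" >/dev/null 2>&1; then
            echo "$exe"
            return 0
        fi
    done
    echo "no eddy executable found on PATH" >&2
    return 1
}

# Usage: eddy_cmd=$(pick_eddy) && "$eddy_cmd" --help
```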

Running eddy

Running eddy is a little bit more complicated than running for example its predecessor eddy_correct. The reasons for this are

The need to pass more information to eddy results in a more complicated command line. Here I will outline a typical use of topup and eddy (they are really intended to be used together) on a "typical" data set suited for use with eddy.

The data

The data for this example consists of one set of volumes acquired with phase-encoding A>>P consisting of 5 b=0 volumes and 59 diffusion weighted volumes

data.nii.gz

A2P_five_volumes.jpeg

First b=0 volume and the first four dwis of the A>>P data

and one single b=0 volume with phase-encoding P>>A.

P2A_b0.nii.gz

P2A_b0.jpeg

The P>>A data

Note how the shape of the b=0 scan is different for the two different acquisitions. This is what topup will use in order to calculate the susceptibility induced off-resonance field.

Running topup on the b=0 volumes

The first thing we do is to run topup to estimate the susceptibility induced off-resonance field. In order to prepare the data for topup we issue the following commands

fslroi data A2P_b0 0 1
fslmerge -t A2P_P2A_b0 A2P_b0 P2A_b0
printf "0 -1 0 0.0646\n0 1 0 0.0646" > acqparams.txt

The first two commands will produce a file called A2P_P2A_b0.nii.gz containing the two b=0 volumes, and the third command will create a file (named acqparams.txt) that informs topup/eddy of how the data was collected. This file is described here, here and in more detail here.

Now it is time to run topup which we do with the command

topup --imain=A2P_P2A_b0 --datain=acqparams.txt --config=b02b0.cnf --out=my_topup_results --iout=my_hifi_b0

which will give us as our main result a file named my_topup_results_fieldcoef.nii.gz which contains an estimate of the susceptibility induced off-resonance field.

Running eddy

Before we can run eddy we need to make a couple more preparations. First of all we need a mask that separates brain from non-brain. This is no different from for example the mask that dtifit needs. Since eddy will work in a non-distorted space we will base the mask on my_hifi_b0.nii.gz (the secondary output from our topup command above). We generate this mask with the commands

fslmaths my_hifi_b0 -Tmean my_hifi_b0
bet my_hifi_b0 my_hifi_b0_brain -m

which results in the file my_hifi_b0_brain_mask.nii.gz. It may be a good idea to check this stage to ensure bet has done a good job of extracting the brain.

The final thing we need to do is to create an index file that tells eddy which of the lines in the acqparams.txt file are relevant for the data passed into eddy. In this case all the volumes in data.nii.gz are acquired A>>P, which means that the first line of acqparams.txt describes the acquisition for all the volumes. We specify that by passing a text file with as many ones as there are volumes in data.nii.gz. One way of creating such a file would be to type the following commands

indx=""
for ((i=1; i<=64; i+=1)); do indx="$indx 1"; done
echo $indx > index.txt

where 64 is the total number of volumes in data.nii.gz and needs to be replaced by the number of volumes in your data.
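As an alternative to hard-coding the number of volumes, the count can be taken from the bvals file. The sketch below assumes that bvals holds exactly one whitespace-separated b-value per volume in data.nii.gz. The helper name make_index is hypothetical.

```shell
# Write an index file with one "1" per entry in a bvals file.
# Assumes bvals has one whitespace-separated b-value per volume.
# make_index is a hypothetical helper, not part of FSL.
make_index() {
    bvals_file=$1
    index_file=$2
    # Count the b-values; tr strips the padding some wc versions add.
    nvols=$(wc -w < "$bvals_file" | tr -d '[:space:]')
    indx=""
    i=1
    while [ "$i" -le "$nvols" ]; do
        indx="$indx 1"
        i=$((i + 1))
    done
    # Left unquoted on purpose so the leading space is dropped.
    echo $indx > "$index_file"
}

# Usage: make_index bvals index.txt
```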

We are now in a position to run eddy using the command

eddy --imain=data --mask=my_hifi_b0_brain_mask --acqp=acqparams.txt --index=index.txt --bvecs=bvecs --bvals=bvals --topup=my_topup_results --out=eddy_corrected_data

You may be in for quite a long wait as eddy is quite CPU intensive and also memory hungry. It has been written using OpenMP to take advantage of multiple processors, which may or may not be available to you depending on how it was built on your system. A rule of thumb for how much memory eddy will use is 8*m*nx*ny*nz bytes, where m is the number of volumes in --imain, nx is the matrix size in the x-direction, ny is the matrix size in the y-direction and nz is the number of slices.
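The rule of thumb can be evaluated with plain shell arithmetic. In this sketch the helper name eddy_mem_bytes and the image dimensions (64 volumes of 96x96x60 voxels) are made up for illustration; substitute the dimensions of your own data.

```shell
# Rule-of-thumb memory estimate for eddy: 8 * m * nx * ny * nz bytes.
# eddy_mem_bytes is a hypothetical helper, not part of FSL.
eddy_mem_bytes() {
    m=$1; nx=$2; ny=$3; nz=$4
    echo $((8 * m * nx * ny * nz))
}

# Hypothetical dimensions: 64 volumes of 96x96x60 voxels.
bytes=$(eddy_mem_bytes 64 96 96 60)
echo "approx. $((bytes / 1024 / 1024)) MiB"
```

For these dimensions the estimate comes to 283115520 bytes, i.e. 270 MiB.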

eddy with outlier replacement

When (not if) a subject makes a movement that coincides in time with the diffusion encoding part of the sequence, there will be partial or complete signal dropout. The dropout will affect the whole (the most common case) or parts of a slice. In the presence of out-of-plane rotations these slices will turn into diagonal bands when eddy rotates the volume back. If uncorrected this will affect any measures derived from the data.

As of version 5.0.10 (or the 5.0.9 eddy-patch) eddy has a mechanism for detecting these dropout-slices and replacing them with Gaussian Process predictions. All one needs to do, for instance in the example above, is to add --repol to the command line.

eddy --imain=data --mask=my_hifi_b0_brain_mask --acqp=acqparams.txt --index=index.txt --bvecs=bvecs --bvals=bvals --topup=my_topup_results --repol --out=eddy_corrected_data

The exact details of how the outlier replacement is performed can be specified by the user, and in particular if one's data has been acquired with multi-band it can be worth taking a look here.

The pertinent reference for when using the --repol functionality is at the main eddy page.

Example of before and after outlier replacement

OLR_before_after.gif

This mini-movie flips between before and after outlier replacement
for two consecutive planes. In the before case one can
clearly see the telltale diagonal bands caused by "missing" slices
being rotated out of plane. The after case shows how the
missing slices were replaced in the native space before being
rotated into the reference space (the first volume of --imain).

Correcting slice-to-volume movement with eddy

A common assumption for movement correction methods is that the subject remains still during the time it takes to acquire a volume (TR, typically 2-8 seconds for diffusion) and that any movement occurs between volumes. This is of course not true, but as long as the movement is slow relative to the TR, it is a surprisingly good approximation.

However, for some subjects (for example small children) that assumption no longer holds. The result is a "corrupted" volume where the individual slices no longer stack up to a "valid" volume. This leads to a telltale zig-zag pattern in a coronal or sagittal view when the slices are acquired in an interleaved fashion (which is the typical way to acquire diffusion data).

Example of zig-zag pattern from within-volume movement

ExampleOfS2VZigZagPattern.001.jpeg

This sagittal slice along the midline demonstrates
the typical zig-zag pattern associated with
within-plane movement and interleaved acquisition.
Note that this is not signal dropout
(except for one of the slices), but a case
of signal being rotated in or out of the mid-sagittal plane.
Hence, the signal is not lost and only needs to be
relocated to its "proper" location.

Version 5.0.11 of eddy has the ability to correct for such within-volume (or "slice-to-volume") movement. This is done by setting the --mporder option to a value greater than 0. If one for example specifies --mporder=4 the movement during each volume is modelled by the 5 first terms of a DCT basis-set (and is hence defined by 6*5=30 parameters instead of the usual 6 rigid-body parameters). Note that the DCT-set is a function of time rather than slice, which means that the user needs to specify the relation between time and slice. A command line for correcting intra-volume movement in a data set with lots of movement can look something like

eddy --imain=data --mask=my_hifi_b0_brain_mask --acqp=acqparams.txt --index=index.txt --bvecs=bvecs --bvals=bvals --topup=my_topup_results --niter=8 --fwhm=10,8,4,2,0,0,0,0 --repol --out=eddy_corrected_data --mporder=6 --slspec=my_slspec.txt --s2v_niter=5 --s2v_lambda=1 --s2v_interp=trilinear

where the 5 last options are all related to the intra-volume correction.
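The number of movement parameters per volume follows directly from the description above: 6 rigid-body parameters for each of the (mporder + 1) DCT terms. A minimal sketch of that arithmetic, with a hypothetical helper name:

```shell
# Number of movement parameters per volume for a given --mporder:
# 6 rigid-body parameters for each of the (mporder + 1) DCT terms.
# s2v_nparams is a hypothetical helper, not part of FSL.
s2v_nparams() {
    echo $((6 * ($1 + 1)))
}

s2v_nparams 4   # --mporder=4, the example in the text
s2v_nparams 6   # --mporder=6, as in the command line above
```

This prints 30 for --mporder=4 and 42 for --mporder=6.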

Example of before and after intra-volume movement correction

olr_vs_s2v_movie.gif

This example shows a sagittal plane of a volume that was either
corrected for "everything" (susceptibility, eddy currents,
inter-volume movement and outliers) except intra-volume movement
or corrected for "everything" including intra-volume movement.
The little movie flips back and forth between the two cases.

The pertinent reference for when using the slice-to-volume functionality is at the main eddy page.

Understanding eddy output

The --out parameter specifies the basename for all output files of eddy; each output file uses this basename with a different extension. If we assume that the user specified --out=my_eddy_output, the files that are always written are

The following file is only written if the --repol flag was set.

The following file is only written if the --mporder option has been set to a value greater than zero.

The following files are only written if the corresponding flags are set.

List of parameters

Parameters explained

--imain

Should specify a 4D image file with all your images acquired as part of a diffusion protocol. I.e. it should contain both your dwis and your b=0 images. If you have collected your data with reversed phase-encode blips, data for both blip-directions should be in this file.

--mask

Single volume image file with ones and zeros specifying brain (one) and no-brain (zero). Typically obtained by running BET on the first b=0 image. If you have previously run topup on your data I suggest you run BET on the first volume (or the average of all volumes) of the --iout output and use that.

--acqp

A text-file describing the acquisition parameters for the different images in --imain. The format of this file is identical to that used by topup (though the parameter is called --datain there) and described in detail here.

--index

A text-file that determines the relationship between on the one hand the images in --imain and on the other hand the acquisition parameters in --acqp and (optionally) the subject movement information in --topup. It should be a single column (or row) with one entry per volume in --imain. We will use a small (simplified) example to make it clear.

eight_original_images.png

The image above shows a selected slice from each of the eight volumes in --imain. The associated --acqp file is

-1 0 0 0.051
1 0 0 0.051

which specifies that phase-encoding is performed in the x-direction, sometimes traversing k-space left->right (-1) and sometimes right->left (1). Finally the --index file is

1 1 1 1 2 2 2 2

which specifies that the first four volumes in --imain were acquired using the acquisition parameters on the first row (index 1) of the --acqp file, and that volumes 5--8 were acquired according to the second row (index 2).

There are cases when it may be advantageous to have more than two lines in the --acqp file, and in these cases there will be more than two different index values in the --index file. These cases are explained here.
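Since every entry in the --index file must point at an existing row of the --acqp file, a quick consistency check before a long eddy run can save time. The following is a sketch only; check_index is a hypothetical helper, not an FSL tool, and it checks nothing beyond the index range.

```shell
# Check that every entry in an --index file refers to an existing row
# of the --acqp file. check_index is a hypothetical helper, not part of FSL.
check_index() {
    acqp_file=$1
    index_file=$2
    # Count the non-empty rows in the acqp file.
    nrows=$(awk 'NF > 0 { n++ } END { print n + 0 }' "$acqp_file")
    # Fail if any index entry is not an integer in the range 1..nrows.
    awk -v nrows="$nrows" '
        { for (i = 1; i <= NF; i++)
              if ($i !~ /^[0-9]+$/ || $i < 1 || $i > nrows) bad = 1 }
        END { exit bad + 0 }' "$index_file"
}

# Usage: check_index acqparams.txt index.txt || echo "index/acqp mismatch"
```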

--bvecs

A text file with normalised vectors describing the direction of the diffusion weighting. This is the same file that you would use for FDT.

--bvals

A text file with b-values (in s/mm^2) describing the "amount of" diffusion weighting. This is the same file that you would use for FDT.

--topup

This should only be specified if you have previously run topup on your data and should be the same name that you gave as an argument to the --out parameter when you ran topup.

--field

If there is no topup output available for your study you may alternatively use a "traditional" fieldmap in its place. This can for example be a dual echo-time fieldmap that has been prepared using PRELUDE. Note that in contrast to for example FUGUE it expects the fieldmap to be scaled in Hz. For boring reasons the filename has to be given without an extension. For example --field=my_field, not --field=my_field.nii.gz.
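If the fieldmap name comes from a script variable that includes the extension, the extension can be stripped with shell parameter expansion before the name is passed to --field. A minimal sketch, with a hypothetical helper name:

```shell
# Strip a trailing .nii.gz or .nii so the name can be passed to --field.
# strip_nii_ext is a hypothetical helper, not part of FSL.
strip_nii_ext() {
    case $1 in
        *.nii.gz) echo "${1%.nii.gz}" ;;
        *.nii)    echo "${1%.nii}" ;;
        *)        echo "$1" ;;
    esac
}

# Usage: eddy ... --field=$(strip_nii_ext my_field.nii.gz) ...
strip_nii_ext my_field.nii.gz
```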

Warning: There are two important caveats with --field, which is why we strongly recommend using a topup derived field if at all possible. These are

--field_mat

Specifies a flirt style rigid body matrix that specifies the relative locations of the field specified by --field and the first volume in the file specified by --imain. If my_field is the field specified by --field, my_ima is the first volume of --imain and my_mat is the matrix specified by --field_mat, then the command flirt -ref my_ima -in my_field -init my_mat -applyxfm should be the command that puts my_field in the space of my_ima.

--out

Specifies the basename of the output. Let us say --out="basename". The output will then consist of a 4D image file named <basename>.nii.gz containing all the corrected volumes and a text-file named <basename>.eddy_parameters with parameters defining the field and movement for each scan.

--flm

This parameter takes the values linear, quadratic or cubic. It specifies how "complicated" we believe the eddy current-induced fields may be.

Setting it to linear implies that we think that the field caused by eddy currents will be some combination of linear gradients in the x-, y- and z-directions. It is this model that is the basis for the claim "eddy current distortions is a combination of shears, a zoom and a translation". It is interesting (and surprising) how successful this model has been in describing (and correcting) eddy current distortions since not even the fields we intend to be linear (i.e. our gradients) are particularly linear on modern scanners.

The next model in order of "complication" is quadratic which assumes that the eddy current induced field can be modelled as some combination of linear and quadratic terms (x, y, z, x^2, y^2, z^2, xy, xz and yz). This is almost certainly also a vast oversimplification but our practical experience has been that this model successfully corrects for example the HCP data (which is not well corrected by the linear model).

The final model is cubic which in addition to the terms in the quadratic model also has cubic terms (x^3, x^2y, etc). We have yet to find a data set where the cubic model performs significantly better than the quadratic one. Note also that the more complicated the model, the longer eddy will take to run.

--slm

"Second level model" that specifies the mathematical form for how the diffusion gradients cause eddy currents. For high quality data with 60 directions, or more, sampled on the whole sphere we have not found any advantage of performing second level modelling. Hence our recommendation for such data is to use none, and that is also the default.

If the data has quite few directions and/or has not been sampled on the whole sphere it can be advantageous to specify --slm=linear.

--fwhm

Specifies the FWHM of a gaussian filter that is used to pre-condition the data before using it to estimate the distortions. In general the accuracy of the correction is not strongly dependent on the FWHM. Empirical tests have shown that ~1-2mm might be best, but by so little that the default has been left at 0.

One exception is when there is substantial subject movement, which may mean that eddy fails to converge in 5 iterations if run with fwhm=0. In such cases we have found that --fwhm=10,0,0,0,0 works well. It means that the first iteration is run with a FWHM of 10mm, which helps the algorithm to take a big step towards the true solution. The remaining iterations are run with a FWHM of 0mm, which offers high accuracy.
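When the number of iterations is changed, the schedule has to change with it. The sketch below builds such a list for any iteration count, assuming (as the examples in this guide suggest) that the list passed to --fwhm should hold one value per iteration. The helper name fwhm_schedule is hypothetical.

```shell
# Build a --fwhm schedule: one large FWHM for the first iteration
# followed by zeros, with one value per iteration (an assumption
# based on the examples in this guide).
# fwhm_schedule is a hypothetical helper, not part of FSL.
fwhm_schedule() {
    first=$1
    niter=$2
    sched=$first
    i=2
    while [ "$i" -le "$niter" ]; do
        sched="$sched,0"
        i=$((i + 1))
    done
    echo "$sched"
}

# Usage: eddy ... --niter=5 --fwhm=$(fwhm_schedule 10 5) ...
fwhm_schedule 10 5
```

With the arguments shown this prints 10,0,0,0,0, the schedule recommended above.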

--niter

eddy does not check for convergence. Instead it runs a fixed number of iterations given by --niter, 5 by default. This is not unusual for registration algorithms where each iteration is expensive (i.e. takes a long time).

If, on visual inspection, one finds residual movement or EC-induced distortions it is possible that eddy has not fully converged. In that case we primarily recommend that one uses --fwhm=10,0,0,0,0, as described above, to speed up convergence. Only if that fails do we recommend increasing the number of iterations.

--fep

Stands for "Fill Empty Planes". For reasons that are not completely clear to us the reconstructed EPI images from some manufacturers contain one or more empty "planes". A "plane" in this context does not necessarily mean a "slice". Instead it can be for example the "plane" that constitutes the last voxel along the PE-direction for each "PE-direction column". The presence/absence of these "empty planes" seems to depend on the exact details of the image encoding part of the sequence.

If --fep is set eddy will attempt to identify the empty planes and "fill them in". The filling will consist of duplicating the previous plane if the plane is perpendicular to the frequency-encode direction and by interpolation between the previous and the "wrap-around plane" if the plane is perpendicular to the PE-direction.

--interp

Specifies the interpolation model used during the estimation phase, and during the final resampling if --resamp=jac is used. We strongly recommend staying with spline, which is also the default.

--resamp

Specifies how the final resampling is performed. The options are

--nvoxhp

Specifies how many voxels (randomly selected within the brain mask) are used when estimating the hyperparameters of the Gaussian Process used to make predictions. The default is 1000 voxels, and that is more than sufficient for typical data with resolution of 2x2x2mm or lower. For very high resolution data, such as for example the HCP 7T data, with relatively low voxel-wise SNR one may need to increase this number. The only "adverse" effect of increasing this number is an increase in execution time.

--ff

This should be a number between 1 and 10 and determines the level of Q-space smoothing that is used by the prediction maker during the estimation of the movement/distortions. Empirical testing has indicated that any number above 5 gives best results. We have set the default to 10 to be on the safe side.

--dont_sep_offs_move

All our models for the EC-field contain a component that is constant across the field and that results in a translation of the object in the PE-direction. Depending on how the data has been acquired it can be more or less difficult to distinguish between this constant component and subject movement. It matters because it affects how the diffusion weighted images are aligned with the b=0 images. Therefore eddy attempts to distinguish the two by fitting a second level model to the estimated constant component, and everything that is not explained by that model will be attributed to subject movement. As of release 5.0.11 it will also perform a Mutual Information based alignment along the PE-direction.

If you set this flag eddy will not do that estimation. The option to turn this off is a remnant from when we did not know how well it would work and it is very unlikely you will ever use this flag. It will eventually be deprecated.

--dont_peas

The motion correction within eddy has greatest precision within a shell and has a bigger uncertainty between shells. There is no estimation of movement between the first b=0 volume and the first diffusion weighted volume. Instead it is assumed that these have been acquired very close in time and that there was no movement between them.

If there are multiple shells or if the assumption of no movement between the first b=0 and the first diffusion weighted volume is not fulfilled it can be advantageous to perform a "Post Eddy Alignment of Shells" (peas). Our testing indicates that the peas has an accuracy of ~0.2-0.3mm, i.e. it is associated with some uncertainty. This precision is still such that peas is performed as default.

But, if one has a data set with a single shell (i.e. a single non-zero shell) and the assumption of no movement between the first b=0 and the first diffusion weighted image is true it can be better to avoid that uncertainty. And in that case it may be better to turn off peas by setting the --dont_peas flag.

--repol

When set this flag instructs eddy to remove any slices deemed as outliers and replace them with predictions made by the Gaussian Process. Exactly what constitutes an outlier is affected by the parameters --ol_nstd, --ol_nvox, --ol_type, --ol_pos and --ol_sqr. If the defaults are used for all those parameters an outlier is defined as a slice whose average intensity is at least four standard deviations lower than the expected intensity, where the expectation is given by the Gaussian Process prediction.

The default is to not do outlier replacement since we don't want to risk people using it "unawares". However, our experience and tests indicate that it is always a good idea to use --repol.

--ol_nstd

This parameter determines how many standard deviations away a slice needs to be in order to be considered an outlier. The default value of 4 is a good compromise between type 1 and 2 errors for a "standard" data set of 50-100 directions. Our tests also indicate that the parameter is not terribly critical and that any value between 3 and 5 is good for such data. For data of very high quality, such as for example HCP data with 576 dwi volumes, one can use a higher value (for example 5). Conversely, for data with few directions a lower threshold can be used.

--ol_nvox

This parameter determines the minimum number of intracerebral voxels a slice needs to have in order to be considered in the outlier estimation. Consider for example a slice at the very top of the brain with only ten brain voxels. The average (based on only ten voxels) difference from the prediction will be very poorly estimated and to try to determine if it was an outlier or not on that basis would be very uncertain. For that reason there is a minimum number of brain voxels (as determined by the --mask) for a slice to be considered. The default is 250.

--ol_type

The normal behaviour for eddy is to consider each slice in isolation when assessing outliers. When acquiring multi-band (mb) data each slice in the group will have a similar signal dropout when it is caused by gross subject movement (as opposed to pulsatile movement of e.g. the brain stem). It therefore makes sense to consider an mb-group as the unit when assessing outliers. There are therefore three options for the unit of outliers: sw, gw and both.

Default is sw.

--ol_pos

By default eddy only considers signal dropout, i.e. the negative end of the distribution of differences. One could conceivably also have positive outliers, i.e. slices where the signal is greater than expected. This could for example be caused by spiking or other acquisition related problems. If one wants to detect and remove this type of outliers one can use the --ol_pos flag.

In general we don't encourage its use since we believe that type of artefacts should be detected and corrected at source. We have looked at a large number of FMRIB and HCP data sets and not found a single believable positive outlier.

--ol_sqr

Similarly to the --ol_pos flag above it extends the scope of outliers that eddy considers. If the --ol_sqr flag is set eddy will look for outliers also in the distribution of "sums of squared differences between observations and predictions". This means it will detect also artefacts that don't cause a change in mean intensity.

Similarly to --ol_pos we don't encourage the use of --ol_sqr. Artefacts that fall into this category should be identified and the causes corrected at the acquisition stage.

--mb

If --ol_type=gw or --ol_type=both eddy needs to know how the multi-band groups were acquired and to that end --mb has to be set. If for example the total number of slices is 15 and mb=3 it will be assumed that slices 0,5,10 were acquired as one group, 1,6,11 as the next group etc (slices numbered 0,1,...14).
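The grouping described above can be written out for any slice count. The sketch below lists the groups implied by --mb, assuming the number of slices is an exact multiple of the multi-band factor. The helper name mb_groups is hypothetical.

```shell
# List the slice groups implied by --mb for a volume with nslices slices:
# group g holds slices g, g + n, g + 2n, ... where n = nslices / mb.
# Assumes nslices is an exact multiple of mb.
# mb_groups is a hypothetical helper, not part of FSL.
mb_groups() {
    nslices=$1
    mb=$2
    n=$((nslices / mb))
    g=0
    while [ "$g" -lt "$n" ]; do
        row=$g
        k=1
        while [ "$k" -lt "$mb" ]; do
            row="$row $((g + k * n))"
            k=$((k + 1))
        done
        echo "$row"
        g=$((g + 1))
    done
}

# The example from the text: 15 slices, mb=3.
mb_groups 15 3
```

For the example in the text this lists the groups 0 5 10, 1 6 11, 2 7 12, 3 8 13 and 4 9 14, one per line.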

Warning: The current version of eddy allows for a more detailed description of the multi-band structure (see --slspec below), and if one wants to do slice-to-volume motion correction it is necessary to use that description. It is still possible to use --mb, but it is discouraged and it will be deprecated in the next release.

--mb_offs

If a slice has been removed at the top or bottom of the volumes in --imain the group structure given by --mb will no longer hold. If the bottom slice has been removed --mb_offs should be set to --mb_offs=-1. For the example above it would mean that the first group of slices would consist of slices 4,9 and the second group of 0,5,10, where the slice numbering refers to the new (with missing slice) volume numbered 0,1,...13. Correspondingly, if the top slice was removed it should be set to 1.

Warning: The current version of eddy allows for a more detailed description of the multi-band structure (see --slspec below), and if one wants to do slice-to-volume motion correction it is necessary to use that description. It is still possible to use --mb_offs, but it is discouraged and it will be deprecated in the next release.

--mporder

If one wants to do slice-to-vol motion correction --mporder should be set to an integer value greater than 0 and less than the number of excitations in a volume. Only when --mporder > 0 will any of the parameters prefixed by --s2v_ be considered. The larger the value of --mporder, the more degrees of freedom for modelling movement. If --mporder is set to N-1, where N is the number of excitations in a volume, the location of each slice/MB-group is individually estimated. We don't recommend going that high and in our tests we have used values of N/4 -- N/2. The underlying temporal model of movement is a DCT basis-set of order N.
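The N/4 to N/2 range used in our tests can be worked out from the number of slices and the MB-factor. The sketch below does this with integer arithmetic (rounding down), for a hypothetical acquisition of 72 slices with an MB-factor of 3. The helper name mporder_range is also hypothetical, and the range is a starting point rather than a rule.

```shell
# Suggest an --mporder range of N/4 to N/2, where N is the number of
# excitations per volume (number of slices divided by the MB-factor).
# Integer division rounds down. mporder_range is a hypothetical helper.
mporder_range() {
    nslices=$1
    mb=$2
    n=$((nslices / mb))
    echo "$((n / 4)) $((n / 2))"
}

# Hypothetical acquisition: 72 slices, MB-factor 3, so N = 24.
mporder_range 72 3
```

For this acquisition the suggested range is 6 to 12.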

Slice-to-vol motion correction is computationally very expensive so it is only implemented for the CUDA version.

--s2v_niter

Specifies the number of iterations to run when estimating the slice-to-vol movement parameters. In our tests we have used 5--10 iterations with good results, and possibly a small advantage of 10 over 5. The slice-to-volume alignment is computationally expensive so expect N iterations of slice-to-volume movement estimation to take an order of magnitude longer than N iterations of volumetric movement estimation.

--s2v_lambda

Determines the strength of temporal regularisation of the estimated movement parameters. This is especially important for single-band data with "empty" slices at the top/bottom of the FOV. We have used values in the range 1--10 with good results.

--s2v_interp

Determines the interpolation model in the slice-direction for the estimation of the slice-to-volume movement parameters. In theory spline is a better interpolation method, but in this particular context (interpolation of irregularly spaced data) the computational cost of using spline is very large. In our tests we have not been able to see any actual advantage of using spline, so we recommend using trilinear. For the final re-sampling spline is always used regardless of how --s2v_interp is set.

--slspec

Specifies a text-file that describes how the slices/MB-groups were acquired. This information is necessary for eddy to know how a temporally continuous movement translates into location of individual slices/MB-groups. Let us say a given acquisition has N slices and that m is the MB-factor (also known as Simultaneous Multi-Slice (SMS)). Then the file pointed to by --slspec will have N/m rows and m columns. Let us for example assume that we have a data-set which has been acquired with an MB-factor of 3, 15 slices and interleaved slice order. The file would then be

0 5 10
2 7 12
4 9 14
1 6 11
3 8 13

where the first row "0 5 10" specifies that the first, sixth and 11th slice are acquired first and together, followed by the third, eighth and 13th slice etc. For single-band data and for multi-band data with an odd number of excitations/MB-groups it is trivial to work out the --slspec file using the logic of the example. For an even number of excitations/MB-groups it is considerably more difficult and we recommend using a DICOM->NIfTI converter that writes the exact slice timings into a .JSON file. This can then be used to create the --slspec file.
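For the odd case, the logic of the example can be sketched as a small shell function. It assumes an ascending interleaved order that starts at slice 0 (even-numbered groups first, then odd-numbered ones) and a slice count that is an exact multiple of the MB-factor. The helper name make_slspec is hypothetical, and the result should still be checked against the actual slice timings of your sequence.

```shell
# Write an --slspec style table for an interleaved acquisition with an
# odd number of excitations: even-numbered groups first, then odd ones.
# Assumes ascending order starting at slice 0 and nslices divisible by mb.
# make_slspec is a hypothetical helper, not part of FSL.
make_slspec() {
    nslices=$1
    mb=$2
    n=$((nslices / mb))          # number of excitations (rows)
    if [ $((n % 2)) -eq 0 ]; then
        echo "even number of excitations: use slice timings instead" >&2
        return 1
    fi
    for start in 0 1; do         # pass 0: even groups, pass 1: odd groups
        g=$start
        while [ "$g" -lt "$n" ]; do
            row=$g
            k=1
            while [ "$k" -lt "$mb" ]; do
                row="$row $((g + k * n))"
                k=$((k + 1))
            done
            echo "$row"
            g=$((g + 2))
        done
    done
}

# The example from the text: 15 slices, MB-factor 3.
make_slspec 15 3
```

For 15 slices and an MB-factor of 3 this reproduces the five rows shown above. Redirect the output to a file (for example my_slspec.txt) to use it with --slspec.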

--data_is_shelled

At the moment eddy works for single- or multi-shell diffusion data, i.e. it doesn't work for DSI data. In order to ensure that the data is shelled eddy "checks it", and only proceeds if it is happy that it is indeed shelled. The checking is performed through a set of heuristics such as i) how many shells are there? ii) what are the absolute numbers of directions for each shell? iii) what are the relative numbers of directions for each shell? etc. It will for example be suspicious of too many shells, too few directions for one of the shells etc.

It has emerged that some popular schemes get caught in this test. Some groups will for example acquire a "mini shell" with low b-value and few directions and that has failed to pass the "check", even though it turns out eddy works perfectly well on the data. For that reason we have introduced the --data_is_shelled flag.

If set, it will bypass any checking and eddy will proceed as if data was shelled. Please be aware that if you have to use this flag you may be in untested territory and that it is a good idea to check your data extra carefully after having run eddy on it.

--verbose

Turns on printing of information about the algorithm's progress to the screen.

--help

 

eddy/UsersGuide (last edited 15:10:10 30-08-2017 by JesperAndersson)