Resting state FSLnets practical

Introduction

FSLNets is a toolbox for carrying out basic network modelling from FMRI timeseries data. It is a collection of MATLAB or Octave scripts that run a connectivity analysis between sets of timeseries.

Estimating network matrices and performing statistical analyses on these requires the following steps:

Extracting subject-specific timeseries relating to a given set of spatial maps of nodes. For example, using dual regression stage 1 outputs
Looking for group-level noise components that you want to remove
Calculating correlations (full or partial) between all pairs of timeseries to build the network matrix (netmat) for each subject
Looking at the group-average network matrix and how nodes cluster together into a hierarchy
Performing group-level statistical analysis

Before running FSLnets: Defining nodes and edges to run network modelling analysis
Networks estimation: Estimating network matrices from dual regression outputs
Group-average netmat summaries: Calculating group-average netmats
Cross-subject comparison with netmats: Comparing individual edge strengths between subject groups
(Optional) Multivariate cross-subject analysis: Multivariate comparison of whole netmats across subject groups

In the next sections we will go through an example of how perform network modelling using resting-state data (i.e. the output from melodic followed by stage 1 of dual regression). However, with FSLNets you can analyse any set of timeseries!

Before running FSLnets

Before you run FSLnets, you need to prepare several things:

Define 'nodes'

The nodes are the spatial maps that define the regions included in our network modelling. Nodes can be defined in a number of ways, but for this practical we have already generated the nodes for you using the following group ICA command to extract 100 components (please do not run this again):

melodic -i inputlist_new.txt -o groupICA100 \
  --tr=0.72 --nobet -a concat \
  --bgimage=$FSLDIR/data/standard/MNI152_T1_2mm_brain.nii.gz \
  -m $FSLDIR/data/standard/MNI152_T1_2mm_brain_mask.nii.gz \
  --report --Oall -d 100

Nodes could also be defined using anatomical or functional atlases, although anatomical atlases may not provide good functional parcellations as the structural boundaries are not always a good estimate of functional boundaries.

Why do we typically use a higher dimensionality for network modelling analyses? If we used a dimensionality of 25 to define our 'nodes', what would the network modelling analysis be looking at?

Extract timeseries from nodes

To look at the temporal relationship between our nodes, we must first extract the timeseries from each node. Again, the timeseries extraction can be done in a number of ways. For this practical, we have already extracted the timeseries from the 100 nodes by running dual regression with the command below, and using the stage 1 output.

dual_regression groupICA100/melodic_IC 1 -1 0 groupICA100.dr `cat inputlist_new.txt`

ICA is an example of soft parcellation. What does soft parcellation mean? What effect does soft parcellation have on the extracted timeseries?

Create images of nodes

For visualiation purposes, FSLnets requires a PNG image of each node. We have created these for you (located in ~/fsl_course_data/rest/Nets/group100.sum) using the following command:

slices_summary groupICA100/melodic_IC 4 $FSLDIR/data/standard/MNI152_T1_2mm groupICA100.sum -1

This command takes the 4D image containing the group ICA components (groupICA100/melodic_IC), sets a minimum threshold of 4, specifies the MNI152_T1_2mm as the background image, and creates the output directory group100.sum, which contains the png images of each node. The -1 flag specifies that single-slice summaries should be generated instead of 3-slice summaries.

MATLAB/Octave configuration

cd to the working directory of this practical, and start MATLAB/Octave:

cd ~/fsl_course_data/rest/Nets
matlab &

cd ~/fsl_course_data/rest/Nets
octave-cli
# or, for a graphical Octave session
octave

From here on, commands will be entered into the MATLAB command window, or from the Octave prompt.

Before starting the analysis, we need to tell MATLAB where to find FSL, and in particular the code for FSLNets. First of all, make sure that your $FSLDIR environment variable is set, by typing:

getenv('FSLDIR')

IMPORTANT: If you get something like '/usr/local/fsl', then all is well and good. Otherwise you will need to set the $FSLDIR and $PATH variables yourself - run the following line, replacing /path_to_fsl with the appropriate path for your system:

setenv('FSLDIR', '/path_to_fsl')
setenv('PATH', sprintf('%s/bin:%s', getenv('FSLDIR'), getenv('PATH')))
setenv('FSLOUTPUTTYPE', 'NIFTI_GZ')

Now add the folder containing the scripts we are going to use to the path, and set up a few variables:

addpath ~/fsl_course_data/rest/FSLNets
addpath(sprintf('%s/etc/matlab', getenv('FSLDIR')))

If you are using Octave, you need to run a couple more commands, to add some libraries that are built into MATLAB, but not Octave:

addpath ~/fsl_course_data/rest/octave/statistics-1.2.4/inst
addpath ~/fsl_course_data/rest/octave/libsvm/matlab

Networks estimation

Now we are ready to start setting up some important parts for our network modelling analysis. To load in all subjects' timeseries data files from the dual-regression output directory, run:

ts = nets_load('groupICA100.dr', 0.72, 0);

For those less familar with MATLAB, in the command ts = nets_load('groupICA100.dr', 0.72, 0), ts is an output variable defined by the function nets_load . The brackets contain parameters required by the function. You can get more information on what these parameters mean by opening the function.

Type ts and hit enter to have a look at the contents of the structure that was created using this command.

ts.ts contains the stage 1 dual regression timeseries from all subjects.

digraph G { rankdir=LR; node [shape=box]; 1 [label="dual regression output folder ('groupICA100.dr')"]; 2 [label="repetition time (TR) (0.72)"]; 3 [label="normalisation of timeseries (0) here none"]; 4 [label="nets_load"]; 5 [label="output structure with subfields (ts)"]; 1 -> 4; 2 -> 4; 3 -> 4; 4 -> 5; }

The data you just loaded using the nets_load command comprises N node timecourses of T time points from S subjects. Can you figure out the values of N, S and T for this dataset?

We will now have a quick look at the temporal spectra of the resting state networks (RSNs), as a quick check that these look reasonable. Running the following command will open a figure in which the left part of the plot shows one spectrum per group-ICA component, each averaged across all subjects. The right part shows the same thing, but with all spectra overlapping.

ts_spectra = nets_spectra(ts);

Click here for more information on how to read the figure

Why do the figures show that all nodes have high power in the low-frequency range?

Cleaning components

There is the option to remove components' (nodes') timeseries that correspond to artefacts rather than plausible nodes. Similarly to the last practical, you need to decide this by looking at the spatial maps, timeseries, and frequency spectra. In addition to looking at the spectra with the nets_spectra, you can also look at the spatial maps (groupICA100/melodic_IC.nii.gz) in FSLeyes.

When looking for corresponding components in FSLNets and FSLeyes remember to subtract 1 from each number, as timepoint counting in FSLeyes starts from 0 not 1!

To save time, we have listed the good components for you. Run the following commands to list the good components and apply the cleanup:

ts.DD = [1:3,5,6:9,11:13,17:23,25:38,40,42,43,47:50,52,53,55:59,61,...
62,64:66,70:74,77,80,81,86,87,93,97];
ts = nets_tsclean(ts,1);

What does the output variable ts.DD contain?

digraph G { rankdir=LR; node [shape=box]; 1 [label="input structure (ts) loaded previously"]; 2 [label="aggressive or soft cleanup (1) i.e. aggressive"]; 3 [label="nets_tsclean"]; 4 [label="output structure (ts) overwriting original ts"]; 1 -> 3; 2 -> 3; 3 -> 4; }

What is considered an 'aggressive' cleanup

Calculating netmats for each subject

Now you are ready to compute a network matrix for each subject, which is in general a matrix of correlation strengths (correlation coefficients). We will compute two different versions of these netmats for each subject: a simple full correlation ('corr'), and a partial correlation that has been regularised in order to potentially improve the mathematical robustness of the estimation ('ridgep'). The partial correlation matrix should do a better job of only estimating the direct network connections than the full correlation does. The extra parameter in 'ridgep' controls the strength of the regularisation.

Fnetmats = nets_netmats(ts,1,'corr');
Pnetmats = nets_netmats(ts,1,'ridgep',0.1);

digraph G { rankdir=LR; node [shape=box]; 1 [label="inputs structure (ts)"]; 2 [label="apply Fisher's r-to-z transformation? (1) i.e. yes"]; 3 [label="method for netmat estimation ('ridgep') i.e. regularised partial correlation"]; 4 [label="regularisation parameter (0.1)"]; 5 [label="nets_netmats"]; 6 [label="output variable (Pnetmats)"]; 1 -> 5; 2 -> 5; 3 -> 5; 4 -> 5; 5 -> 6; }

Fill in the blanks, to show that you understand what each network matrix represents. This question refers to dimensions of the network matrix, not to the contents of the netmats structure.
For each subject, each row (and each column) relates to a group-level _______________________________. These are also referred to as network "nodes".
The elements in the network matrix contain the _______________________ values between each pair of nodes, and are also referred to as network "edges".

So far, these commands have calculated netmats for each individual subject included in the input sample.

The full and partial netmats are now calculated for all subjects. Now run this command to look at the size of the Fnetmats variable:

size(Fnetmats)

Can you figure out how the netmats of all subjects are saved in this? Answer.

In the next section, you are going to compute the group average and take a look at the average network matrix.

Group-average netmat summaries

We have computed the full and partial network matrices for each individual subject. The next step is to perform a simple group-level analysis to look at the mean network matrix across all subjects. This can be done using the command below, which saves out both the simple average of netmats accross all subjects (Mnet) and the results of a simple one-group t-test (against zero) across subjects as Z values (Znet).

To calculate the group average netmat, each individual's netmat is unwrapped into a row and combined to create a subjects * edges matrix. The average edge strength is then calcualted across subjects.

[Znet_F,Mnet_F]=nets_groupmean(Fnetmats,0);
[Znet_P,Mnet_P]=nets_groupmean(Pnetmats,1);

The second input to this command indicates whether or not to display a summary figure. This is set to 1 for the second command run on the partial netmats. On the left, this figure shows the results from the group t-test, and on the right is a consistency scatter plot showing how similar the results from each subject are to the group (i.e. the more this looks like a diagonal line, the more consistent the relevant netmat is across subjects).

Click here for more information on how to read the figure

The value of Mnet_P in row 3, column 27 is ~6.6 (you can check this by typing Mnet_P(3,27) into Octave and hitting enter). What does this number represent?

The mean full correlation strength between node number 3 and node number 27 (after performing r-to-Z transformation for each subject before averaging) Incorrect! Remember that we are visualising the results for the partial netmats!
The mean partial correlation strength between node number 3 and node number 27 (after performing r-to-Z transformation for each subject before averaging) Correct! The group-averaged strength of the partial correlation between the pair of nodes is shown here (which is also called the 'edge' between the two nodes). The number is higher than 1, because these are averaged z-transformed correlation coefficients.
That the connection is 6.59 times stronger in patients than in healthy controls Incorrect! We have not yet done a group comparison, this is just the average partial correlation strength between the pair of nodes (which is called an 'edge').

Group average network hierarchy

The next thing we can look at is how nodes cluster together to form larger resting state networks. For this we run a clustering method that groups nodes together based on their covariance structure. To view this network hierarchy, run:

nets_hierarchy(Znet_F,Znet_P,ts.DD,'groupICA100.sum');

Click here for more information on how to read the figure

digraph G { rankdir=LR node [shape=box]; 1 [label="group-averaged netmat to drive clustering (Znet_F) shown below the diagonal"]; 2 [label="group-average netmat to show (doesn't drive clustering) (Znet_P) shown above the diagonal"]; 3 [label="list of good components (ts.DD) entered earlier"]; 4 [label="directory containing png summary figures ('groupICA100.sum') to create run: slices_summary groupICA100/melodic_IC 4 $FSLDIR/data/standard/MNI152_T1_2mm groupICA100.sum -1"]; 5 [label="nets_hierarchy = the FSLnets command we are running"]; 1 -> 5; 2 -> 5; 3 -> 5; 4 -> 5; }

The first parameter (full correlation group netmat) is shown below the diagonal (such netmats are symmetric about the diagonal, so we only need to show one half of them). Above the diagonal the second parameter is shown, here the partial correlation group netmat. You should be able to see that the group-level netmats is "sparser" for partial correlation than full correlation. This is because partial correlation is aiming to estimate only the direct connections strengths, and not just any (potentially indirect) correlation value.

The last parameter tells the command the location of a set of "thumbnail" images previously created from the original group-ICA spatial maps; these were generated earlier.

You can see, for example, that the nodes grouped together in the dark blue tree on the far left are part of a large-scale resting state network called the default mode network that you may have heard about.

Cross-subject comparison with netmats

We are now able to test whether the netmats differ significantly between healthy controls and patients with a tumor using a two-sample t-test. This is a 'univariate' test, as we will test each network matrix edge separately for a group-difference, and then we will estimate p-values for these tests, correcting for multiple comparisons across all edges. By analogy to high-level task-fMRI analyses: you can think of each subject's netmat as being an NxN image of voxels, and the univariate testing as modelling each voxel (in isolation from each other) across subjects.

As with the dual regression practical, we have 6 healthy controls and 6 patients with tumours. We have already created the design files for you to run the two-sample t-test.

Open the GLM GUI (in a new terminal window) and load the ~/fsl_course_data/rest/Nets/design/unpaired_ttest_1con.fsf file to look at the design (ignore the error message, go to the 'stats' tab and click 'full model setup' to be able to see the design).

The design includes one contrast. What is this contrast testing?

To perform the comparison run the following command (which calls randomise from within MATLAB/Octave, with 5000 permutations for each contrast):

[p_uncorr,p_corr]=nets_glm(Pnetmats,'design/unpaired_ttest_1con.mat','design/unpaired_ttest_1con.con',1);

digraph G { rankdir=LR; node [shape=box]; 1 [label="input to t-test (Pnetmats)"]; 2 [label="EVs of group-level design ('design/paired_ttest.mat') use GLM gui to create this"]; 3 [label="contrasts of group-level design ('design/paired_ttest.con') use GLM gui to create this"]; 4 [label="create output figure? (1) i.e. yes"]; 5 [label="nets_glm"]; 6 [label="variable containing uncorrected p-values (p_uncorr)"]; 7 [label="variable containing p-values corrected for multiple comparisons (p_corr)"] 1 -> 5; 2 -> 5; 3 -> 5; 4 -> 5; 5 -> 6; 5 -> 7; }

Once randomise has finished, you will see a figure showing "netmats" containing corrected p-values. The results above the diagonal show edges where the two-group t-test is significant, at corrected-p<0.05.

Click here for more information on how to read the figure

Displaying significant group differences

We will now run a command that shows which nodes were linked to the significant differences between the groups:

nets_edgepics(ts,'groupICA100.sum',Znet_P,reshape(p_corr,ts.Nnodes,ts.Nnodes),6);

Click here for more information on how to read the figure

Each pair of thumbnails corresponds to one position in the NxN network matrix and the node numbers are listed in the text captions. The coloured bar joining each pair of nodes tells you what the overall group-average connection strength is: thicker means a stronger connection; red means it's positive, and blue means that the connection is "negative" (meaning that the two nodes tend to anti-correlate on average). The "value" numbers tell you the 1-p-values - so the higher these are, the more significantly different this edge strength is between the two groups. Anything less than 0.95 is not significant, after correcting for multiple comparisons.

Displaying boxplots

In addition, we also want to show how the partial correlation differs between the patients and the controls and these two significant edges. To do this, run:

nets_boxplots(ts,Pnetmats,57,33,6);

The boxplots summarize the distributions of the correlation values (connection strengths) in the two groups - A being healthy controls and B being tumour patients - for this one particular node-pair (57,33).

Click here for more information on how to read the figure

Using this plot, how would you interpret what caused the significant group difference for the edge between node 57 and 33?

Now generate boxplots for the 4 netmat elements (edges) that have the highest 1-p-values for the difference between patients and healthy controls (for the purpose of this practical it does not matter if the difference is not significant) and try to associate the corresponding interpretation of the between-group difference referring to the examples described below (1,2,3,4):

the connection was positive in both A and B, but higher in A
the connection was negative in both A and B, and more negative in B
the connection was not present (on average) in condition A, and was negative in condition B
the connection was positive in condition A, and zero (or negative) in condition B

Type exit in the terminal to exit MATLAB/octave if you are finishing the practical here and not doing the optional section below.

(Optional) Multivariate cross-subject analysis

Finally, we will now carry out multivariate cross-subject analysis - meaning that, instead of considering each netmat edge in isolation (like we did above) we will consider the whole netmat in one analysis.

For example, we can attempt to classify subjects into patients or controls using machine learning methods, such as support vector machines (SVM) or linear discriminant analysis (LDA). Such methods look at the overall pattern of values in the netmat, and try to learn how the overall pattern changes between the two groups.

The following command feeds the regularised partial correlation netmats from both groups into LDA. It uses a method known as leave-one-out cross-validation to train and test a classifier, and reports in what percentage of tests it was successful at discriminating between patients and controls:

nets_lda(Pnetmats,6,2)

You can also try using other classifiers, for example by changing the last entry from 2 to 1 (type help nets_lda for further details).

What does the value output from the nets_lda command represent? What does it tell you about your netmats and groups? How does changing the classifier method change the result, and what might this mean?

One "downside" of such multivariate testing is that you can no longer make strong statistical claims about individual edges in the network - the whole pattern of edges has been used, so we don't know which individual edges are significantly different in the two groups.

Type exit in the terminal to exit MATLAB/octave.

The End.

Resting state FSLnets practical

Introduction

Contents:

Before running FSLnets

MATLAB/Octave configuration

Networks estimation

Cleaning components

Calculating netmats for each subject

Group-average netmat summaries

Group average network hierarchy

Cross-subject comparison with netmats

Displaying significant group differences

Displaying boxplots

(Optional) Multivariate cross-subject analysis