Project: PROJECT | Report date: 2023-11-07 | Lab data analyst(s): OAA/LN/RMH/MK | Report author(s): JLB



1 Descriptive plots

Be aware that it may take a while before all plots are uploaded if the data set contains many metabolites.

1.1 Raincloud plots

These plots show the distribution of data points (without imputation of missing values, if any) by combining elements from half-violin, box, and dot plots.

1.1.1 Log2


Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.


1.1.2 Untransformed


Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.


1.2 Correlation plots

These plots are based on Spearman correlations and the coloured squares show statistically significant correlations.
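As a rough illustration of the statistic behind these plots, the sketch below computes a Spearman correlation as the Pearson correlation of average ranks. It is written in plain Python for readability and is not the code used to produce the report; the function names are hypothetical, and the significance test that decides which squares are coloured is not covered here.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Illustrative values only (not taken from the report's data).
metabolite_a = [1.2, 3.4, 2.2, 8.9, 5.0]
metabolite_b = [0.5, 1.9, 1.1, 30.0, 4.2]
rho = spearman(metabolite_a, metabolite_b)  # monotone relationship, so rho is 1 up to rounding
```

Because only ranks enter the calculation, the result is unchanged by monotone transformations such as a log transformation.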

1.2.1 All measured platforms

1.2.1.1 Raw (non-imputed) data


Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with only zero values are excluded.
Cases with missing data point(s) in at least one variable are excluded.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.
The correlation plot may be omitted for raw data.
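The two exclusion rules for raw data can be sketched as follows. This is a minimal illustration in plain Python, assuming the data are held as a mapping from metabolite name to a list of values with None marking a missing data point; the function names and example values are hypothetical and do not come from the report's pipeline.

```python
def drop_all_zero_metabolites(table):
    """Keep only metabolites with at least one observed non-zero value.

    Note: under this sketch a metabolite with no observed values at all
    is dropped as well.
    """
    return {
        name: values
        for name, values in table.items()
        if any(v is not None and v != 0 for v in values)
    }

def complete_cases(table):
    """Keep only the samples (positions) observed for every metabolite."""
    names = list(table)
    n_samples = len(table[names[0]]) if names else 0
    keep = [i for i in range(n_samples)
            if all(table[name][i] is not None for name in names)]
    return {name: [table[name][i] for i in keep] for name in names}

# Illustrative values only.
raw = {
    "glucose":   [5.1, None, 4.8, 5.5],
    "lactate":   [1.2, 1.0, None, 1.4],
    "unknown_x": [0, 0, 0, 0],
}
filtered = complete_cases(drop_all_zero_metabolites(raw))
```

Dropping the all-zero metabolite first matters for the order of operations in general: a metabolite that is later excluded should not cause samples to be discarded for missing values in that metabolite.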


1.2.1.2 GSimp-imputed data


Here, missing data points, if any, are imputed by the GSimp approach as outlined here.
Missing values were initialized by QRILC (quantile regression imputation of left-censored data). The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. Elastic net from the R package ‘glmnet’ was used as the prediction model. For left-censored missing values, we used the minimum observed value of each variable with missing values as an informative upper truncation point and -Inf as a non-informative lower truncation point. Before GSimp, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with only zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.
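The pre-imputation filter described above (remove metabolites with >60% missing values or <25 observations) could look roughly like this. The sketch assumes that "observations" means non-missing values and that None marks a missing data point; both are assumptions for illustration, and the function name is hypothetical.

```python
def passes_missingness_filter(values, max_missing_fraction=0.60, min_observed=25):
    """Return True if a metabolite is kept for imputation.

    A metabolite is removed when more than 60% of its values are missing
    or when it has fewer than 25 observed values (assumed interpretation).
    """
    n_total = len(values)
    if n_total == 0:
        return False
    n_observed = sum(v is not None for v in values)
    missing_fraction = (n_total - n_observed) / n_total
    return missing_fraction <= max_missing_fraction and n_observed >= min_observed

# Illustrative cases only.
kept_at_limit = passes_missingness_filter([1.0] * 40 + [None] * 60)       # exactly 60% missing
removed_missing = passes_missingness_filter([1.0] * 39 + [None] * 61)     # 61% missing
removed_too_few = passes_missingness_filter([1.0] * 20 + [None] * 10)     # only 20 observed
```

A metabolite with exactly 60% missing values is kept in this sketch, because the rule as stated removes only those with more than 60% missing.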


1.2.1.3 QRILC-imputed data


This figure is empty if the QRILC procedure is not performed.
Here, missing data points, if any, are imputed by QRILC (quantile regression imputation of left-censored data).
The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. The imputation used the R package ‘MsCoreUtils’ (function ‘impute_matrix’ with method ‘QRILC’). Before QRILC, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with only zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.
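The reason for imputing on the natural log scale can be illustrated with a small sketch: whatever real number is imputed on the log scale, the exponential back-transformation returns a positive value on the original scale. The code below is not QRILC and does not call ‘MsCoreUtils’; the imputation rule `below_minimum` is a deliberately simple, hypothetical stand-in used only to show the transform and back-transform steps.

```python
import math

def impute_on_log_scale(values, impute_log):
    """Fill missing entries (None) by imputing on the natural log scale.

    impute_log is a placeholder callable that receives the observed
    log-values and returns one log-scale value for a missing entry.
    """
    observed_logs = [math.log(v) for v in values if v is not None]
    result = []
    for v in values:
        if v is None:
            # exp() of any real number is positive, so the imputed value
            # is positive on the original scale.
            result.append(math.exp(impute_log(observed_logs)))
        else:
            result.append(v)
    return result

def below_minimum(observed_logs):
    """Stand-in rule (NOT QRILC): one log unit below the smallest observed log-value."""
    return min(observed_logs) - 1.0

# Illustrative values only.
concentrations = [0.8, None, 2.5, 0.3]
imputed = impute_on_log_scale(concentrations, below_minimum)
```

Here the imputed entry equals 0.3 divided by e, which is positive and below the smallest observed value, consistent with treating the missing value as left-censored.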


1.2.2 Separate platforms

1.2.2.1 Raw (non-imputed) data





Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with only zero values are excluded.
Cases with missing data point(s) in at least one variable are excluded.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.
The correlation plot(s) may be omitted for raw data.


1.2.2.2 GSimp-imputed data





Here, missing data points, if any, are imputed by the GSimp approach as outlined here.
Missing values were initialized by QRILC (quantile regression imputation of left-censored data). The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. Elastic net from the R package ‘glmnet’ was used as the prediction model. For left-censored missing values, we used the minimum observed value of each variable with missing values as an informative upper truncation point and -Inf as a non-informative lower truncation point. Before GSimp, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with only zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot(s).
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.


1.2.2.3 QRILC-imputed data

These figures are empty if the QRILC procedure is not performed.
Here, missing data points, if any, are imputed by QRILC (quantile regression imputation of left-censored data).
The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. The imputation used the R package ‘MsCoreUtils’ (function ‘impute_matrix’ with method ‘QRILC’). Before QRILC, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with only zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot(s).
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.


1.3 Network plots

These plots are based on Spearman correlations and the paths show correlations equal to or higher than 0.2.
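The selection of paths for a network plot can be sketched as a simple threshold on the pairwise correlations. The example below assumes the Spearman correlations have already been computed and stored in a mapping keyed by metabolite pair; it applies the cut-off to the signed correlation exactly as worded above, since the report does not state whether absolute values are used. The function name and the example values are hypothetical.

```python
def network_edges(correlations, threshold=0.2):
    """Return (metabolite_a, metabolite_b, rho) for pairs with rho >= threshold.

    correlations maps a pair of metabolite names to their Spearman rho.
    Self-pairs are skipped; the threshold is applied to the signed value
    (an assumption, following the wording of the report).
    """
    return sorted(
        (a, b, rho)
        for (a, b), rho in correlations.items()
        if a != b and rho >= threshold
    )

# Illustrative values only.
pairwise_rho = {
    ("alanine", "glycine"): 0.45,
    ("alanine", "urea"): 0.20,
    ("glycine", "urea"): -0.60,
    ("serine", "urea"): 0.19,
}
edges = network_edges(pairwise_rho)
```

With these example values, only the first two pairs become paths: the pair at exactly 0.2 is included, while the pair at 0.19 and the negatively correlated pair are not.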

1.3.1 All measured platforms

1.3.1.1 Raw (non-imputed) data


Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with biologically meaningful zero values are excluded.
Cases with missing data point(s) in at least one variable are excluded.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.
The network plot may be omitted for raw data.


1.3.1.2 GSimp-imputed data


Here, missing data points, if any, are imputed by the GSimp approach as outlined here.
Missing values were initialized by QRILC (quantile regression imputation of left-censored data). The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. Elastic net from the R package ‘glmnet’ was used as the prediction model. For left-censored missing values, we used the minimum observed value of each variable with missing values as an informative upper truncation point and -Inf as a non-informative lower truncation point. Before GSimp, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with biologically meaningful zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.


1.3.1.3 QRILC-imputed data


This figure is empty if the QRILC procedure is not performed.
Here, missing data points, if any, are imputed by QRILC (quantile regression imputation of left-censored data).
The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. The imputation used the R package ‘MsCoreUtils’ (function ‘impute_matrix’ with method ‘QRILC’). Before QRILC, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with biologically meaningful zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.


1.3.2 Separate platforms

1.3.2.1 Raw (non-imputed) data





Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with biologically meaningful zero values are excluded.
Cases with missing data point(s) in at least one variable are excluded.
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.
The network plot(s) may be omitted for raw data.


1.3.2.2 GSimp-imputed data





Here, missing data points, if any, are imputed by the GSimp approach as outlined here.
Missing values were initialized by QRILC (quantile regression imputation of left-censored data). The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. Elastic net from the R package ‘glmnet’ was used as the prediction model. For left-censored missing values, we used the minimum observed value of each variable with missing values as an informative upper truncation point and -Inf as a non-informative lower truncation point. Before GSimp, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with biologically meaningful zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot(s).
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.


1.3.2.3 QRILC-imputed data

These figures are empty if the QRILC procedure is not performed.
Here, missing data points, if any, are imputed by QRILC (quantile regression imputation of left-censored data).
The data were natural log-transformed before QRILC to improve imputation accuracy and to ensure positive values on the original scale after back-transformation. The imputation used the R package ‘MsCoreUtils’ (function ‘impute_matrix’ with method ‘QRILC’). Before QRILC, we did not apply the ‘80% rule’ or the ‘modified 80% rule’; instead, we removed metabolites with >60% missing values or <25 observations.
Metabolites measured only in matrices other than plasma/serum may be excluded from this figure.
Metabolites with biologically meaningful zero values are excluded.
If we have data on group allocation, the data set with missing values imputed within groups is used in the plot(s).
If not all metabolites have been measured for all groups/samples in the data set, the plot may only show correlations for the group with the largest sample size.