The authors have declared that no competing interests exist.
The current pandemic emergence of novel coronavirus disease (COVID-19) poses a relevant threat to global health. SARS-CoV-2 infection is characterized by a wide range of clinical manifestations, ranging from absence of symptoms to severe forms that need intensive care treatment. Here, plasma-EDTA samples of 30 patients compared with age- and sex-matched controls were analyzed via untargeted nuclear magnetic resonance (NMR)-based metabolomics and lipidomics. With the same approach, the effect of tocilizumab administration was evaluated in a subset of patients. Despite the heterogeneity of the clinical symptoms, COVID-19 patients are characterized by common plasma metabolomic and lipidomic signatures (91.7% and 87.5% accuracy, respectively, when compared to controls). Tocilizumab treatment resulted in at least partial reversion of the metabolic alterations due to SARS-CoV-2 infection. In conclusion, NMR-based metabolomic and lipidomic profiling provides novel insights into the pathophysiological mechanism of human response to SARS-CoV-2 infection and to monitor treatment outcomes.
The current COVID-19 pandemic caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is markedly affecting the world population. Here we report about the small-molecule profile of patients hospitalized during the first wave of the COVID-19 pandemic in Florence (Italy). Using magnetic resonance spectroscopy, we showed that the infection induces profound changes in the metabolome. The analysis of the specific metabolite changes and correlations with clinical data enabled the identification of potential biochemical determinants of the disease fingerprint. We also followed how metabolic alterations revert towards those of the control group upon treatment with tocilizumab, a recombinant humanized monoclonal antibody against the interleukin-6 receptor. These results open up possibilities for the monitoring of novel patients and their individual response to treatment.
The World Health Organization announced coronavirus disease 2019 (COVID-19) outbreak a pandemic in March 2020 [1,2]. At the beginning of October 2020 over thirty-four millions of patients have been diagnosed by SARS-CoV-2 infection and about 1 million deaths are reported all over the world [3]. The SARS-CoV-2 infection is characterized by a wide range of clinical manifestations, ranging from absence of symptoms to severe forms that need intensive care treatment. About 20% of patients, particularly the older ones and those affected by chronic comorbidities such as hypertension, diabetes mellitus, renal and heart diseases, may develop interstitial pneumonia and respiratory distress requiring oxygen therapy or mechanical ventilation [4]. In addition to interstitial pneumonia and acute respiratory distress syndrome (ARDS), COVID-19 is associated with other life-threatening complications such as sepsis, thromboembolism and multi-organ failure [5]. Patients with the highest rate of morbidity and mortality following SARS-CoV-2 infection develop a hyperinflammatory syndrome due to the overproduction of early response proinflammatory cytokines (such as IL-1β, IL-6, TNFα, MCP-1)–the so called “cytokine storm”–leading to an increased vascular permeability, activation of coagulation pathways, dysregulation of T cells with associated lymphopenia, multiorgan injury and rapid clinical deterioration [6–9].
Metabolomics and lipidomics can contribute a system level picture, thus expanding the options that chemists can explore to help fight the pandemic [10]. The human metabolome is composed by an ensemble of several thousands of small molecules (<1500–2000 Da) present on a very ample range of concentrations (from <1 nM to >1 μM) and produced by the genome of the host organism and by the genomes of its microflora, as well as deriving from exogenous factors like medical treatments [11]. Blood plasma is a primary carrier of small molecules in the body, the relative concentrations of which reflect the physio-pathological state of the organism and thus possible tissue lesions and organ dysfunctions. The overall picture is complemented by alterations in the lipid components [12]. As a consequence, metabolomics and lipidomics of serum and plasma are increasingly used for successful patient stratification in various diseases [13–18]. Herein, a strong metabolomic and lipidomic signature of COVID-19 is revealed via untargeted nuclear magnetic resonance (NMR) spectroscopy of plasma-EDTA [19,20] from 30 patients compared with age- and sex-matched controls. Moreover, in a subset of patients, the metabolic effects due to tocilizumab administration were successfully investigated. This study had no sample-size calculation; the analysis included all patients who were admitted at the Infective and Tropical Diseases and at the Intensive Care Unit of the Careggi University Hospital, Florence (IT), in the period between March 10 and March 30 2020, before the rapid decline of hospitalizations for COVID-19.
We analyzed via 1H NMR spectroscopy the metabolomic and lipidomic profiles of plasma-EDTA samples obtained from 30 patients affected by COVID-19. SARS-CoV-2 infection was confirmed by positive RT-PCR on nasopharyngeal swab specimens. The plasma-EDTA samples available for the metabolomic analysis were collected between 2–23 days from clinical onset (mean 9 days). Samples from 30 non COVID-19 subjects, one-to-one matched for age and sex, were used as control group (CTR). Tocilizumab, a humanized anti-IL-6 receptor monoclonal antibody, was administered to 8 of the 30 COVID-19 patients enrolled and another plasma-EDTA sample for each patient was collected after 2–18 days of treatment (mean 5 days). Demographic and clinical characteristics of enrolled patients are reported in S1 Table. Our analyses considered 21 quantified metabolites and 114 lipoprotein-related parameters [21]. Lipoprotein quantification of plasma samples of two COVID-19 patients (COVID-19-025 and COVID-19-027) was not possible for the presence of an interfering signal in the spectra, thus also their respective matched controls (CTR-4 and CTR-7) were removed from the lipoprotein analyses.
No outliers were identified using principal component analysis (PCA) on the entire population, both for metabolite and lipoprotein profiles (S1 Fig).
Plasma metabolite and lipoprotein profiles of COVID-19 patients and CTRs were compared using the Random Forest (RF) algorithm. The eight samples collected post-tocilizumab treatment are not included in these analyses. The RF model built on metabolite concentrations shows a significant differential clustering with 91.7% accuracy, 93.3% sensitivity, and 90.0% specificity (Fig 1A and 1C and S2A and S2 Table). In particular, a panel of 11 metabolites (Fig 1D and S3 Table) displays significant alterations between COVID-19 patients and CTRs. One of them, giving rise to a detectable multiplet in the region between 7.21–7.30 ppm has not been assigned and is referred as “unknown”. However, even if this signal is removed from the statistical model, the discrimination accuracy between COVID-19 patients and CTRs does not change significantly.


Metabolomic/lipidomic alterations in COVID-19 patients.
(A-B) Proximity plots of the RF model discriminating COVID-19 patients (red dots), and CTR subjects (green dots) using A) the 21 quantified metabolites and B) the lipoprotein-related parameters. (C) Confusion matrices with predictive accuracy values of each model. (D) Values of Log2 Fold Change (FC) of quantified metabolites. Grey bars represent p-values < 0.05 after FDR correction. (E) Values of Log2(FC) of lipoprotein-related parameters significantly different (p-value < 0.05 after FDR correction) between COVID-19 patients and controls. Metabolites/lipoproteins with Log2(FC) positive/negative values have higher/lower concentration in plasma samples from COVID-19 patients with respect to controls.
The RF model built on lipoprotein-related parameters shows a significant differential clustering with 87.5% accuracy, 85.7% sensitivity, and 89.3% specificity (Fig 1B and 1C and S2B and S2 Table). Forty-eight features (Fig 1E and S4 Table) display significant alterations between COVID-19 patients and CTRs. These results demonstrate that COVID-19 patients are characterized by higher levels of VLDL particles, and lower levels of Apo A1, Apo A2, cholesterol and free-cholesterol HDL and LDL subfractions. In particular, the subfractions HDL-3, HDL-4, LDL-4, LDL-5 of cholesterol are the most affected.
Correlations between clinical and metabolomic parameters were calculated and the results are reported in Fig 2.


Heatmap correlations between clinical and metabolomic parameters.
R values are shown as different degree of color intensity (red, positive correlations; blue, negative correlation). R values are reported in the plot only for statistically significant correlations (p-value < 0.05 after FDR correction).
Phenylalanine significantly correlates with C-reactive protein (CRP) and interleukin-6. Inflammation and immune activation impair the conversion of phenylalanine to tyrosine, as observed in patients suffering from sepsis, cancer, or HIV-1 infection [22–25]; accordingly, we found higher phenylalanine levels and a trend in lower tyrosine amounts in patients than in controls. Interestingly, a positive correlation between phenylalanine/tyrosine ratio and high CRP levels, has been already described by Murr and colleagues [26] in patients affected by coronary artery disease (CAD). These data are in accordance with ours, since SARS-CoV-2 infection is characterized not only by immune activation and systemic phlogosis, but also by microvascular endothelial damage and activation of coagulative cascade, as happens in CAD. Alveolar-arterial O2 gradient anticorrelates with citrate, accordingly the partial pressure of arterial oxygen and fraction of inspired oxygen (PaO2/FiO2) ratio (known as Horowitz index) and the ratio between oxygen saturation and fraction of inspired oxygen (SaO2/FiO2) positively correlates with citrate. This metabolite is known for its anti-oxidative, anti-coagulant and anti-inflammatory properties [27–29]. SARS-CoV-2 infection can cause lung damage, leading to ARDS, due not only to alveolar damage but also to diffuse microvascular endothelial damage and clot activation, mainly driven by pro-inflammatory cytokines, including IL-6 [30,31]. The protective role of citrate on endothelial integrity was recently reported by Dellepiane and colleagues [32]. Moreover, the consumption of citrate and other carboxylates is promoted by hypoxic conditions in red blood cells [33].
Formate shows the inverse pattern of correlation with respect to citrate, and it significantly correlates also with lactate dehydrogenase and FiO2. Regarding lipoprotein-related parameters only LDL cholesterol significantly correlates with the SaO2/FiO2 ratio.
In line with these observations, metabolites provide an optimal discrimination (accuracy 90.0%, 100.0% sensitivity, 83.3% specificity) between COVID-19 patients treated and non-treated with invasive ventilation (S3A and S3B Fig), with formate and citrate as the most important features of the model. Instead, no significant clustering is present in the model calculated with lipoprotein-related parameters (S3C and S3D Fig).
Despite plasma samples of COVID-19 patients are characterized by higher levels of VLDL and associated triglycerides, we observed a general reduction of HDL and LDL cholesterol-related parameters. Downregulation of lipids in COVID-19 blood sera has already been observed [34–37] and it has been hypothesized that lipids (in particular cholesterol and fatty acids) could play a pivotal role in virus replication and assembly [38,39]. Our data suggest that only LDL and HDL could be implied in this mechanism.
Accordingly, in a recent study, LDL and HDL levels were inversely correlated to disease severity and poor prognosis [40]. Furthermore, overproduction of VLDL has been linked with the processes inducing insulin resistance in COVID-19 patients [35].
We also detected an accumulation of mannose in the plasma of COVID-19 patients and a significant positive correlation between plasma mannose levels and neutrophils and between mannose and the neutrophils to lymphocytes ratio. An increment of mannose could be related to different reasons: it could be associated to its binding to lectin in order to promote complement activation [34], or it could be linked to insulin resistance. Indeed, plasma mannose levels are elevated in subjects with insulin resistance independently of obesity [41] and there are increasing evidences that a bidirectional relationship between COVID-19 and diabetes exists [42].
The increment of pyruvate and 3-hydroxybutyrate, along with the strong decrement of citrate and free amino acids (alanine, glycine, glutamine, histidine) in plasma of COVID-19 patients can be ascribed to an impairment of the energetic metabolism. Indeed, during inflammatory states amino acids can be used to provide energy and materials for the proliferation and phagocytosis of immune cells. It is important to underline that pyruvate is a metabolite particularly sensitive to pre-analytical procedures, thus further investigations to confirm its alteration are needed [43,44].
Among the 30 COVID-19 patients enrolled, 18 patients presented ARDS. Thus, the possibility that ARDS could significantly alter the profile of COVID-19 was examined. The RF models calculated both on metabolites and lipoprotein-related parameters can only slightly cluster ARDS patients with respect to the other COVID-19 patients (metabolite model: accuracy 76.7%, 88.9% sensitivity, 58.3% specificity; lipoprotein model: accuracy 75.0%, 81.2% sensitivity, 66.7% specificity). These results demonstrate that the metabolomic profile of COVID-19 patients is mainly dictated by the pathology or by the host response to the virus infection, rather than by the concomitant presence of ARDS.
Multilevel partial least square discriminant analysis (mPLS-DA) was used to analyze NMR data of pre- and post-tocilizumab samples in a pairwise multivariate fashion. The mPLS-DA model built on metabolites shows significant differential clustering, yielding a discrimination accuracy of 80.3% (Fig 3A). The two pairs of samples collected from patients who died (COVID-19-020 and COVID-19-021) present the smallest shift within the metabolomic subspace. The same trend is observed for COVID-19-018 patient who, unfortunately, was transferred to another hospital and no follow-up and outcome information was available. Univariate analysis enables the identification of a panel of eight metabolites (Fig 3B–3I and S5 Table) significantly different (before FDR correction) between pre- and post-tocilizumab patients. The post-treatment levels of these metabolites partially or completely revert towards the levels of CTR subjects.


Tocilizumab treatment reverts metabolomic/lipidomic alterations in COVID-19 patients.
(A) Score plot (of the first two principal components) and accuracy of the mPLS-DA model discriminating COVID-19 patients at pre- (red dots) and post- (orange dots) tocilizumab treatment using the 21 quantified metabolites. Patients 18, 20 and 21 are marked with *. (B-I) Boxplots of the statistically significant metabolites discriminating of pre- (red) and post- (orange) tocilizumab samples, p-values obtained using Wilcoxon signed-rank test are also reported. Boxplots of controls (green) and the p-values (Wilcoxon-Mann-Whitney test) for the comparison between pre-treatment and CTR are reported. P-values adjusted for FDR are reported in S5 Table.
The mPLS-DA model built using lipoprotein-related parameters shows a significant discrimination between samples collected pre- and post-tocilizumab treatment (accuracy of 82.8%) (S4 Fig). Univariate analysis identifies 19 lipoprotein parameters (S6 Table) significantly different between pre- and post-tocilizumab treatment. In particular, HDL-1 subfractions of cholesterol, phospholipids, and Apo A2 showed lower levels at post-treatment, whereas LDL-5, HDL-4, IDL, VLDL-1, and VLDL-2 of many subfractions are higher at post-treatment. The general increment of lipoprotein subfractions after treatment confirms the metabolic reversion and supports the key role of lipids in the metabolism of COVID-19 patients.
In summary, in this study a strong plasma metabolomic and lipidomic signature of Sars-CoV-2 infection is identified, in agreement with other studies where NMR or mass spectrometry have been used to study cohorts of COVID-19 patients from different countries and characterized by different degrees of severity/clinical manifestations [34–37]. Although the two analytical platforms do not address the same sets of molecules, common metabolic dysfunctions emerge from the comparison of all these studies, which include lipid metabolism, protein glycosylation and amino acid metabolism. Correlations between clinical parameters and some metabolites are shown, which include mannose and phenylalanine levels. In addition, here we found that some molecules, whose levels correlate with alveolar-capillary membrane injury are affected by mechanical ventilation. Of note, in the case of patients who underwent tocilizumab treatment, metabolic alterations revert towards those of the control group.
The study was approved by the Comitato Etico Regionale- sezione area vasta centro (protocol 16859) and it complies with the 1964 Declaration of Helsinki and its later amendments. Written informed consent was obtained from recruited patients.
In the period between March 10 and March 30, 2020 we enrolled 30 COVID-19 patients that were admitted at the Infective and Tropical Diseases Unit and at the Intensive Care Unit of the Careggi University Hospital, Florence, Italy. All patients were of Caucasian ethnicity. Demographic and clinical features of the enrolled patients are summarized in S1 Table.
Plasma samples were collected from all the subjects enrolled in the study, according to standard procedures [43,44]. Blood samples were collected in ethylene diamine tetra-acetate (EDTA)-coated blood collection tubes and stored at room temperature. Ficoll, a neutral highly branched polymer formed by the co-polymerization of sucrose and epichlorohydrin, was used for blood separation. 14 mL of blood were gently placed in 25 mL tubes containing 9 mL of Ficoll. Tubes were centrifugated 1500 g for 20 minutes. Plasma was collected and rapidly stored in a −20°C freezer pending NMR analysis.
NMR samples were prepared according to standard procedures [19,20,45]. NMR spectra for all samples were acquired using a Bruker 600 MHz spectrometer (Bruker BioSpin) operating at 600.13 MHz proton Larmor frequency and equipped with a 5 mm PATXI 1H-13C-15N and 2H-decoupling probe including a z-axis gradient coil, an automatic tuning-matching (ATM) and an automatic refrigerated sample changer (SampleJet, Bruker BioSpin). A BTO 2000 thermocouple served for temperature stabilization at the level of approximately 0.1 K of the sample. Before measurement, to equilibrate temperature at 310 K, samples were kept for at least 5 minutes inside the NMR probe head.
For each plasma sample, two one-dimensional 1H NMR spectra were acquired with water peak suppression and different pulse sequences that allowed the selective detection of different molecular components. The spectra were: 1) a standard NOESY using 32 scans, 98,304 data points, a spectral width of 18,028 Hz, an acquisition time of 2.7 s, a relaxation delay of 4 s and a mixing time of 0.01 s. This kind of spectrum is made up of signals arising from low molecular weight molecules (metabolites) and signals arising from macromolecules such as lipoproteins and lipids; 2) a standard spin echo Carr-Purcell-Meiboom-Gill (CPMG) using 32 scans, 73,728 data points, a spectral width of 12,019 Hz and a relaxation delay of 4 s. This NMR sequence allows the selective detection of signals arising only from low molecular weight metabolites.
Before applying Fourier transform, free induction decays were multiplied by an exponential function equivalent to a 0.3 Hz line-broadening factor. Transformed spectra were automatically corrected for phase and baseline distortions and calibrated to the anomeric glucose doubled at δ 5.24 ppm, using TopSpin 3.6.2 (Bruker BioSpin) [19,20].
All data analyses were performed using the “R” statistical environment. Metabolites, whose peaks in the CPMG spectra were well defined and resolved, were assigned and their concentrations analyzed. The assignment procedure was performed using an 1H NMR spectra library of pure organic compounds (BBIOREFCODE, Bruker BioSpin), public databases, e.g. Human Metabolome Database [11], storing reference 1H NMR spectra of metabolites, and using literature data when available. The spectral regions associated to the 21 assigned metabolites (S7 Table) were integrated using a R script in-house developed. Quantification of 114 lipid main fractions and subfractions was performed using the Bruker IVDr platform [21].
Both metabolites and lipoprotein-related parameters were analyzed via multivariate analysis. Unsupervised Principal Component Analysis (PCA) was used as first exploratory analysis to obtain a preliminary data outlook (i.e. cluster detection and screening for outliers). The Random Forest (RF) algorithm [46] was used for classification in the comparison between COVID-19 patients and CTR. RF is a classification algorithm that uses an ensemble of unpruned decision trees (forest), each of which is built on a bootstrap sample of the training data using a randomly selected subset of variables (metabolites or lipoprotein-related parameters) [47,48]. The percentage of trees in the forest that assign one sample to a specific class can be inferred as a probability of belonging to a given class [13]. In our case, each tree was used to predict whether a sample represents a COVID-19 patient or a CTR subject. Because the out-of-bag (OOB) observations were not used in the fitting of the trees, the OOB estimates are cross-validated, accuracy estimates, and represent an unbiased estimation of the generalization error [46]. Accuracy, sensitivity, and specificity of all calculated models were assessed according to the standard definitions. For all calculations, the R package ‘Random Forest’ [46] was used to grow a forest of 1000 trees, using the default settings.
Pairwise comparisons between pre- and post-treatment samples were performed using multilevel Partial Least Square Discriminant Analysis (mPLS-DA) and results validated using a Monte Carlo Cross-Validation scheme (MCCV, script in house developed): each dataset was randomly divided by 1000 times into a training set (80% of the data) which was used to build the model and a test set (20% of the data) which was used to test the integrity of the model. Accuracy, sensitivity, and specificity of all calculated models were assessed according to the standard definitions.
On the biological assumption that metabolite and lipoprotein concentrations are not normally distributed, non-parametric tests were used for the univariate analysis. Wilcoxon-Mann-Whitney test was used to infer differences between the metabolite/lipid levels in the comparison between COVID-19 patients and CTR. Instead for pairwise comparison between pre- and post-treatment samples the Wilcoxon signed-rank test was utilized [49]. P-values were adjusted for multiple testing using the false discovery rate (FDR) procedure with Benjamini-Hochberg [50] correction at α = 0.05. The effect size (Ef) was also calculated [51] to aid in the identification of the meaningful signals giving an estimation of the magnitude of the separation between the different groups. The magnitude is assessed using the thresholds provided in Romano et al. [52], i.e. |Ef| < 0.147 “negligible”, |Ef| < 0.33 “small”, |Ef| < 0.474 “medium”, otherwise “large”. Pearson correlations, adjusted for FDR using BH methods, were also calculated.
P.T., C.L., L.T., V.G., A.V. and G.M acknowledge the helpful discussions with our partners of the Metabolomics based COVID-19 Research Network (CV19 Research Network) and the support and the use of resources of Instruct-ERIC, a Landmark ESFRI project, and specifically the CERM/CIRMMP Italy Centre.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52