PLoS ONE
Home Lipidomic and metabolomic profiles of Coffea canephora L. beans cultivated in Southwestern Nigeria
Lipidomic and metabolomic profiles of <i>Coffea canephora</i> L. beans cultivated in Southwestern Nigeria
Lipidomic and metabolomic profiles of Coffea canephora L. beans cultivated in Southwestern Nigeria

Competing Interests: The authors have declared that no competing interests exist.

Article Type: research-article Article History
Abstract

Coffee (Coffea spp.) is one of the most popular refreshing beverages globally. Coffee lipid diversity has untapped potential for improving coffee marketability because lipids contribute significantly to both the health benefits and cup quality of coffee. However, in spite of its potential importance, there have not been extensive studies of lipids among C. canephora genotypes. In this study, ultra-performance liquid chromatography coupled with mass spectrometry (UPLC–MS) profiling of lipid molecules was performed for 30 genotypes consisting of 15 cultivated and 15 conserved genotypes of C. canephora in Southwestern Nigeria. We identified nine classes of lipids in the 30 genotypes which belong to the ‘Niaouli’, ‘Kouillou’ and ‘Java Robusta’ group: among these, the most abundant lipid class was the triacylglycerols, followed by the fatty acyls group. Although ‘Niaouli’ diverged from the ‘Kouillou’ and ‘Java Robusta’ genotypes when their lipid profiles were compared, there was greater similarity in their lipid composition by multivariate analysis, compared to that observed when their primary metabolites and especially their secondary metabolite profiles were examined. However, distinctions could be made among genotypes. Members of the fatty acyls group had the greatest power to discriminate among genotypes, however, lipids that were low in abundance e.g. a cholesterol ester (20:3), and phosphotidylethanolamine (34:0) were also helpful to understand the relationships among C. canephora genotypes. The divergent lipid profiles identified among the C. canephora genotypes, correlated with their Single Nucleotide Polymorphism grouping as assessed by genotype-by-sequencing, and will be exploited to improve coffee cup quality.

Anagbogu,Zhou,Olasupo,Baba Nitsa,Beckles,and Ratnasekhar: Lipidomic and metabolomic profiles of Coffea canephora L. beans cultivated in Southwestern Nigeria

Introduction

“Lipids” are any organic compounds that are insoluble in water, but soluble in organic solvents [1]. Represented in this group are a broad range of molecules such as fatty acids, triacylglycerols, and phospholipids. Lipids serve diverse and critical biological roles in plants such as maintaining cellular membrane integrity and homeostasis, acting as key components in cell signaling pathways and being used as an energy store [2]. In coffee beans, lipids are important reserve compounds and can make up to 10–17% of fresh weight [2, 3].

The lipids accumulated in coffee beans are important contributors to beverage organoleptic properties [46], and are excellent discriminators of coffee quality [711]. Lipids determine beverage foam and emulsion formation and stabilization, which in turn influences flavour and aroma enhancement, especially in espresso brewing [4, 12]. Coffee foam and emulsions are responsible for the characteristic pleasant aroma of coffee by trapping and retaining the volatile compounds [13]. Further, the foam created during coffee brewing also influences beverage creaminess and texture [14, 15], which further enhances coffee sensory perception.

Coffee bean lipid composition is influenced by the environment e.g. altitude, shade and temperature [8, 9, 14, 16] but is largely genotype-dependent [11, 17]. Differences are apparent in C. arabica and C. canephora, the two most consumed types of coffee [11]. Arabica coffees generally have lipid contents of 15% which is higher than that in Robusta which averages 10% [2, 3]. Arabica also has higher palmitic (16:0), arachidic (20:0), stearic (18:0) and linolenic acids (18:3), but is lower in oleic acid (18:3) compared to C. canephora [18, 19].

The value of drawing correlations between the metabolomic and genetic profiles of plant tissues have been comprehensively discussed by Harrigan, et al. [20] and Beckles and Roessner [21]. Knowledge of the metabolome can help to cement the gap between genotype and phenotype [22]. Identifying genetic polymorphisms in the genes encoding components of metabolic pathways, is a new area of interest to scientists, including researchers working to improve cup quality in coffee [6]. A recent genome wide association study of 107 C. arabica accessions found single nucleotide polymorphisms positioned within or near genomic regions coding for proteins involved in lipid and diterpenes metabolic pathways [23], indicating the potential genetic diversity of these key metabolites.

Lipidomics in plants is still in its infancy compared to the broader metabolomic profiling, and the data for C. canephora is comparatively less than that available for C. arabica. C. arabica is known to have a high lipid content compared to C. canephora [2, 3]. However, there have not been extensive studies of lipids of C. canephora genotypes, so the variability relative to C. arabica still remains largely undiscovered. This is important because although C. canephora is not as popular as C. arabica, its production and consumption is increasing, making it important to understand its organoleptic properties [24]. Further, genetic diversity among tetraploid C. arabica is not as high as diploid C. canephora [25] and this may be reflected in the metabolomic and lipid species diversity within these species.

The aim of this study was to determine the lipid and metabolomics profiles of C. canephora genotypes conserved in the Cocoa Research Institute of Nigeria (CRIN) coffee repository, and those used as a source of subsistence by farmers in the southwestern region of Nigeria. Because of the multifaceted role lipids play in determining coffee sensory quality, we wished to identify potential lipid markers differentiating among these C. canephora genotypes. These markers would be critical for future coffee improvement programs in Nigeria. We also wished to broaden the current understanding of lipid profiles in diverse coffee species for which there is relatively scarce data. An ultraperformance liquid chromatography method that incorporated a charged surface hybrid (CSH) solid phase was used to separate the different lipid molecules, followed by electrospray ionization (ESI) quadrupole time of flight (QTOF) tandem mass spectrometry (MS/MS) i.e. CSH-ESI QTOF MS/MS. These compounds were subjected to univariate and multivariate analyses to understand their relative abundance. They were also compared to the primary and secondary metabolites profiled in these same coffees, and their relatedness based on lipidomics was compared to that based on Genotyping-by-Sequencing data. We therefore were able to build a comprehensive, multidimensional overview of bean lipids in Southwestern Nigerian coffees.

Materials and methods

Plant material

Coffee beans (Coffea canephora) were harvested from coffee germplasm repository of Cocoa Research Institute of Nigeria (CRIN) and coffee farmers’ field and no specific permission was required. Similar genotypes were used for metabolomics [26] consisting of ‘Niaouli’, ‘Kouillou’ and ‘Java Robusta’ varieties. They were classified into six groups based on the result from Single Nucleotide Polymorphism-Genotype by Sequencing analysis [26]. ‘Niaouli’ is comprised of three genotypes from farmers’ field: Nia_1, Nia_2 and Nia_3, (respectively, classified as Groups 1, 2 and 3), ‘Kouillou’ is comprised of two genotypes from CRIN: C111 and C36, (respectively, classified as Groups 4, and 5) and, ‘Java Robusta’ is comprised of one genotype from CRIN: T1049, and was classified as Group 6. There were five replicates for each group (genotype) giving a total of 30 samples (Table 1). Both the farmers’ fields and the CRIN plantations are in the Southwestern part of Nigeria [26]. They are located at altitudes of 433–522 m and 133 m above sea level respectively, as recorded by ArcGIS software (Redlands, California). Their specific location i.e. longitude and latitude coordinates, is available at Anagbogu et al. [26].

Table 1
Coffee genotypes used in this study and their symbols.
Variety‘Niaouli’‘Kouillou’‘Java Robusta’
Group/GenotypeGroup1 (Nia_1)Group 2 (Nia_2)Group 3 (Nia_3)Group 4 (C111)Group 5 (C36)Group 6 (T1049)
Samples’ SymbolsNia_11Nia_21Nia_31C111_1C36_1T1049_1
Nia_12Nia_22Nia_32C111_2C36_2T1049_2
Nia_13Nia_23Nia_33C111_3C36_3T1049_3
Nia_14Nia_24Nia_34C111_4C36_4T1049_4
Nia_15Nia_25Nia_35C111_5C36_5T1049_5

Sample preparation

Reddish mature (ripened), coffee beans of these genotypes were collected in ice bags and immediately transferred to -80°C. The endosperms of the coffee bean were excised using sterile blade and re-transferred to -80°C. These endosperms were lyophilized, ground into powder with Udy mill (Udy Corporation) and sealed prior to lipidomic analysis. The lipid was extracted following the protocols according to Matyash et al., [27]. Dried extracts containing an internal standard [12-[(cyclohexylamino)carbonyl] amino]-dodecanoic acid (CUDA)] used as a quality control were resuspended with a mixture of methanol/toluene (9:1, v/v) (60 μL) [28, 29].

Data acquisition

All data processing was done at the West Coast Metabolomic Center, University of California, Davis. Extracted lipids were separated on an Acquity UPLC CSH C18 column (100 × 2.1 mm; 1.7 μm) maintained at 65°C. The mobile phases for positive mode consisted of 60:40 ACN:H2O with 10 mM ammonium formate and 0.1% (v/v) formic acid (A) and 90:10 IPA:ACN with 10 mM ammonium formate and 0.1% (v/v) formic acid (B). For negative mode, the mobile phase modifier was 10 mM ammonium acetate instead. The gradient was as follows: 0 min 85% (A); 0–2 min 70% (A); 2–2.5 min 52% (A); 2.5–11 min 18% (A); 11–11.5 min 1% (A); 11.5–12 min 1% (A); 12–12.1 min 85% (A); and 12.1–15 min 85% (A). Sample temperature was maintained at 4°C in the autosampler. Two μL of sample was injected. Vanquish UHPLC system (ThermoFisher Scientific) was used. Thermo Q-Exactive HF Orbitrap MS instrument was operated in both positive and negative ESI modes respectively with the following parameters: mass range 120−1700 m/z; spray voltage 3.6kV (ESI+) and −3kV (ESI−), sheath gas (nitrogen) flow rate 60 units; auxiliary gas (nitrogen) flow rate 25 units, capillary temperature 320°C, full scan MS1 mass resolving power 60,000, data-dependent MS/MS (dd-MS/MS) 4 scans per cycle, normalized collision energy at 20%, 30%, and 40%, dd-MS/MS mass resolving power 15,000. Thermo Xcalibur 4.0.27.19 was used for data acquisition and analysis. The instrument was tuned and calibrated according to the manufacturer’s recommendations.

Data processing

Raw data files were converted to the mzML format using the ProteoWizard MSConvert utility. For each m/z values ion chromatogram was extracted with m/z thresholds of 0.005 Da and retention time threshold of 0.10 min. Apex of the extracted ion chromatograph was used as peak height value and exported to a .txt file. Peak height files for all the samples were merged together to generate a data matrix. Targeted peak height signal extraction was performed using an R script that is available at https://github.com/barupal. Extracted ion chromatograms for each peak were saved as pictures. CSH-POS and CSH-NEG data matrices were generated. No normalization was applied as minimum signal drift was observed during analysis.

Statistical analysis

Chroma TOF 4.3X software of LECO Corporation and LECO-Fiehn Rtx5 database were used for raw peaks exacting, the data baselines filtering and calibration of the baseline, peak alignment, deconvolution analysis, peak identification and integration of the peak area [30]. The RI (retention time index) method was used in the peak identification, and the RI tolerance was 5000. Metabolite data were normalized by dividing each peak area value by the area of internal standard (Ribitol). Data were log10 transformed, mean-centered and divided by the standard deviation of each variable before performing statistical analysis. The raw data from the UHPLC has been submitted to Dryad repository and can be found at the following location: https://doi.org/10.5061/dryad.j6q573nc4. The statistical analyses, such as ANOVA, PCA, PLS-DA were performed by using MetaboAnalyst 3.0 [31]. Univariate and multivariate statistical approaches were performed with MetaboAnalystR [31]. One-way Analysis of Variance (ANOVA) test was performed to ascertain the significant variables, and they were expressed as f—and p-values. The level of statistical significance (Log10(p)) was determined, followed by post-hoc analyses to correct the p-value and thus generate the False Discovery Rate (FDR). Fisher’s least significant difference method (LSD) was used to identify groups that differ in their lipid profiles [31]. The Hierarchical Clustering Analysis (HCA) plots for SNPs and lipids were generated under the R environment. Pearson correlative analysis was performed using lipid and metabolism parameters. Two packages in R were used: “psych” [32] and “reshape2” [33] for calculating Pearson’s correlation coefficient (PCC), p-value and qFDR. The correlation network was drawn using MetScape 3.1.3 [34] in the Cytoscape environment [35].

Results and discussion

The primary aim of this work was to determine the lipid profile and potential lipid markers differentiating C. canephora conserved and cultivated in southwestern Nigeria. The main classes of lipids, their relative abundance among the genotypes and their ability to discriminate among the genotypes were determined. Next, how the genotypes associated with each other based on their primary, secondary and lipid profiles. Finally, the relationship among genotypes based on lipidomics was compared to genotyping-by-sequencing SNP data, to determine if there was good accordance between genotypic and functional data i.e. lipids.

Identification of lipids

A total of 1824 lipid species consisted of identified (96) and unidentified (1728) lipids were detected with CSH-ESI QTOF MS/MS. The positive and negative ESI operating modes separated the lipids into 1171 positive and 653 negative lipids. Six lipid subclasses are typically recognized in living systems [36], five were found in the C. canephora genotypes studied (Fig 1A).

Lipid classes and their interrelationship.
Fig 1

Lipid classes and their interrelationship.

A) Lipid classes and subclasses detected in C. canephora beans. Classification is based on the LIPID MAPS (Metabolites and Pathways Strategy) system, which separates lipids based on their two fundamental “building blocks”—the ketoacyl group and the isoprene group [36]. Abbreviations: FA- Fatty acyls; DG—Diacylglycerol; TG- Triacylglycerols; PE—Phosphatidylethanolamine; PC -Phosphatidylcholine; LPC Lysophosphatidylcholine; SM-Sphingomyelins; CE- Cholesterol esters. B) A simplified view of a proposed biochemical interrelationship among the lipid classes according to data in [37, 38], as well as in the MetaCyc database [39] derived from plants. The red dashed arrow indicates data obtained from human cells [40].

Each class of lipid contains multiple molecules of varying lengths and degree of saturation. Although these lipid classes are diverse, we constructed a simplistic figure of how they may be related to each other in a metabolic network (Fig 1B). We identified a cholesterol ester (20:3) in this work but these compounds have not been well studied in plants.

Relative amount of lipids variability among genotypes studied

Each lipid group i.e. FA, TG, DG, PC, LPC, is made up of multiple chemicals of varying molecular mass, degree of unsaturation etc. The triacylglycerol (TG) group contained the most chemicals, with 45 different species identified (S1A–S1F Fig). We also identified 13 different fatty acyls (FA), nine phosphatidylcholines (PC), four lysophosphatidylcholine (LPC), three diacylglycerol (DG) species and one each of a cholesterol ether, sphingomyelin and phosphotidylethanolamine.

The relative abundance of the chemicals within each lipid class was examined across genotypes. The ratio of the compounds with the maximum and minimum concentration was determined and was highest for the TG, FA and DG classes i.e. 2145-, 2923- and 20-fold respectively and lowest for PC, LPC, CE, SM, and PE (Fig 2). Triacylglycerol (52:2) and was the most abundant lipid molecule, while Sphingomyelin (SM) was the lowest (Fig 2). the TG and FA were the most abundant lipids in these genotypes consistent with previous reports [2, 3]. Genotype C36_1 (‘Kouillou’) had the highest amount of TG while Nia_11 (‘Niaouli’) had the highest amount of FA. The ‘Niaouli’ (Nia) genotypes are cultivated by farmers, while ‘Kouillou’ (C111 and C36) and ‘Java Robusta’ (T1049) are conserved in the CRIN germplasm. To better understand the lipid composition in coffee beans, we examined the specific lipids identified within eight of the nine classes (the unknown compounds were not discussed) and compared them across genotypes. Each class was considered in turn, and all data referenced from 3.2.1 to 3.2.3 is contained in S1 Fig.

Relative lipid abundance in the C. canephora species studied.
Fig 2

Relative lipid abundance in the C. canephora species studied.

We identified nine subclasses of lipids within the five broader lipid classes described in LIPIDS MAPS. For each subclass, the relative abundance of compounds among each genotype was assessed. Values shown are the maximum and minimum values within each compound class, and the label indicates the genotype in which the highest value was found.

Fatty-acyl group (FA)

The predominant fatty acyls among our C. canephora germplasm were linoleic acid (18:2) followed by palmitic acid (16:0) (S1A Fig), which together encompassed 50% of the total fatty acids. Linoleic and palmitic were also the predominant FAs in a range of C. arabica and C. canephora genotypes studied by Speer and Kölling-Speer [2]. The number of carbons in the fatty acyls in this study ranged from C15 to C28, also similar to the range i.e. C14 to C24, found by Speer and Kölling-Speer [2].

The proportion of the major FA among genotypes was also examined, because of their influence on coffee quality. These FAs are also members of highly related biosynthetic pathways (Fig 3). FA (16:0) i.e. palmitic acid, was higher in ‘Kouillou’ and ‘Java Robusta’ compared to ‘Niaouli’ (S1A Fig). This is of interest because palmitic acid was positively associated with the high hedonistic values in some specialty Brazilian C. arabica coffees [9]. Oleic, linoleic, and linolenic acid levels have also been identified as good markers for Brazilian C. arabica coffees which had lower acidity, fragrance, body and flavor [9]. Oleic acid (18:1) was 3-fold lower than linoleic acid (18:2) especially in Nia_11 (S1A Fig). Further, linolenic acid (18:3) was low (10-fold) relative to linoleic acid (18:2). It is tempting to speculate that there is differential regulation of the enzymes that catalyze these steps (Fig 3). In Arabidopsis these two fatty acid desaturases (FAD2 and FAD3) operate independently [41].

Desaturation of C18 fatty acids.
Fig 3

Desaturation of C18 fatty acids.

This pathway is found in plastids (catalyzed by fatty acid desaturases—FAD6 and FAD7/8) and in the endoplasmic reticulum (catalyzed by FAD2 and FAD3). Arrows indicate the relative amount of each compound based on data derived from this study.

Triacylglycerol (TG)

This group can comprise up to 75% of the lipids in coffee beans [42], and during coffee roasting, they become the carriers of the emerging flavor volatiles [4, 12]. There were 45 classes of TGs identified, varying in acyl carbons from C48 to C60 in this study. The farmer cultivated accessions i.e. the ‘Niaouli’ genotypes had a greater proportion of high-carbon triacylglycerols i.e. C56, C58 lipids compared to ‘Kouillou’ which had more C48-C52 compounds (S1B Fig). The C52 triacylglycerols showed great structural diversity containing up to six double chain triacylglycerols i.e. (52:1) to (52:6).

Cholesterol Ester (CE)

Cholesterol esters are cholesterol molecules with a long-chain fatty acid linked to the hydroxyl group [43]. Cholesterol was thought to only be important in animal metabolism with low amounts found in plants, however, this molecule has now been identified as a key precursor for thousands of bioactive plant metabolites [44]. Cholesterol ester (20:3) was very low in abundance in C. canephora relative to the other lipid classes (Fig 2), however, levels were notably higher in ‘Niaouli’ when compared to ‘Kouillou’ and ‘Java Robusta’ (S1B Fig).

Lysophosphatidylcholine (LPC), Phosphatidylcholine (PC) and Phosphotidylethanolamine (PE)

Phospholipids are potentially important to food sensory perception [42, 45]. PCs accumulated to higher levels than LPCs which makes sense given their relative position the lipid biosynthetic pathway (Fig 1B). Generally, the relative levels of these lipids did not vary among genotypes. PEs are usually found in plant membranes [46]. Their levels were low among the genotypes studied, but there was a significant difference found between ‘Java Robusta’ which did not accumulate as much PE as the other genotypes.

Discriminatory lipid markers

Univariate analysis

Several studies indicate that lipids can be used as a marker to discriminate among coffee types [13, 15, 23, 47]. One-way analysis of variance (p<0.05; Table 2), was used to identify lipid species that could broadly discriminate among coffees genotypes. Twenty-four lipids separated metabolites in the farmer’s genotypes from those conserved at CRIN, since the data for ‘Kouillou’ and ‘Java Robusta’ were identical. ‘Kouillou’ and ‘Java Robusta’ were treated as a single group i.e. ‘Kouillou/Java Robusta’ because of the similarities detected. Fourteen of these lipid species were higher in ‘Niaouli’ and ten were higher in ‘Kouillou/Java Robusta’ (Table 2).

Table 2
One-way analysis of variance and Fisher’s Least Square Difference (LSD) for discriminating genotypes based on lipid content.
Lipid species more abundant in ‘Niaouli’f.valuep.value-LOG10(p)FDR
TG (56:2)15.4547.86E-076.10443.20E-05
TG (58:4)A15.2608.79E-076.05623.20E-05
PE (34:2)15.0371.00E-066.00003.20E-05
CSH_negESI (19:2)11.9886.84E-065.16500.000131
TG (58:3)11.9876.84E-065.16480.000131
CE (20:3)11.2121.18E-054.92990.000188
CSH_posESI (11:6)9.94783.00E-054.52290.000411
FA (28:0)9.08515.94E-054.22630.000713
TG (58:2)6.41470.0006453.19010.006197
TG (56:3)5.28520.0020602.68620.014124
TG (56:1)5.13180.0024312.61420.014586
FA (22:0)5.07360.0025902.58670.014627
FA (20:1)4.43420.0053052.27540.025462
TG (56:4)3.98020.0090302.04430.037692
Lipid species more abundant in ‘Kouillou’ /‘Java Robusta’
FA (16:0)7.39430.0002553.59330.002721
FA (20:0)6.24890.0007603.11890.006637
TG (50:1)_15.94830.0010292.98740.008235
CSH_posESI (17:6)5.77490.001232.91020.009082
CSH_posESI (29:8)5.16570.0023432.63020.014586
CSH_posESI (04:1)4.65490.0041242.38470.021996
CSH_posESI (28:2)4.51200.0048522.31410.024513
TG (52:2)4.29780.0062112.20690.028392
TG (48:1)4.11610.0076852.11440.033535
CSH_posESI (14:1)3.85400.0105071.97850.042028

The Farmer’s cultivated accessions (‘Niaouli’ i.e. Nia_1, Nia_2, Nia_3) and those from the conserved genotypes (‘Kouillou’ i.e. C111, C36 and ‘Java Robusta’ i.e. T1049) were compared. The f-value is derived from the F. statistic test for significance, the p-value tests variability between two groups,–LOG10(p) determines the significant levels, and FDR means False Discovery Rate.

The lipid with the highest discriminatory power between the two groups of genotypes were TG (56:2), TG (58:4)A, and PE (34:2) which accumulated to high levels in ‘Niaouli’ relative to other genotypes. The lipid species which differentiates ‘Kouillou/Java Robusta’ from ‘Niaouli’ were two saturated fatty acids, palmitic (16:0) arachidic acids (20:0) and TG (50:1). Based on our data analysis we show in Table 3, the lipids that have a high discriminatory power in the coffees in our study, were compared with information from data that was published by others. Fatty acids were the most discriminatory of all lipids, although the types varied with respect to the species examined (Table 3).

Table 3
Fatty acids with high discriminatory power among coffee genotypes.
‘Niaouli’‘Kouillou’C. arabicaC. canephoraSpecialty coffees
▪ FA (22:0) Behenic acid.▪ FA (16:0) Palmitic acid▪ FA (18:2) Linoleic acid [18]▪ FA (18:1) Oleic acid [18]▪ FA (14:0) Myristic acid [6]1
▪ FA (20:1) Arachidic acid.

▪ FA (28:0) Montanic acid
▪ FA (20:0) Arachidic acid▪ Polyunsaturated Fatty acids [16]▪ Monounsaturated fatty acids [18]▪ FA (20:0) Arachidic acid.

▪ FA (18:0) Stearic acid

▪ FA (16:0) Palmitic acid [9] 2

1 C. canephora;

2: C. arabica.

Included are the two main varieties of C. canephora (‘Niaouli’ and ‘Kouillou’) used in this study, two cultivated Coffea species (C. arabica and C. canephora) and specialty or high-quality coffee. Citations to the referenced data are included.

Multivariate analysis–partial least square projections

Variable influence (or importance) of projection (VIP) scores were used as another tool to identify the lipids that contribute most to differences among genotypes (Fig 4). This analysis predicted that PE 34:2, a negative unknown lipid, CSH_negESI (19:2), and FA (28:0) i.e. dodecanoic acid could distinguish ‘Niaouli’ from the conserved genotypes. The ANOVA results in Table 2, also indicated that these three compounds accumulated to higher levels in the ‘Niaouli’ genotypes.

Potential lipidomic markers differentiating ‘Niaouli’ (genotypes groups 1–3) from ‘Kouillou’ (groups 4–5) and ‘Java’ (group 6).
Fig 4

Potential lipidomic markers differentiating ‘Niaouli’ (genotypes groups 1–3) from ‘Kouillou’ (groups 4–5) and ‘Java’ (group 6).

The variable importance in projection (VIP) scores on the x-axis provide an estimate of the contribution of a given predictor (lipid species shown on the y-axis) to the Partial Least Square (PLS) regression above. The higher the VIP score, the better the lipid species is as a predictor of the discrimination among the genotypes.

Understanding the lipid networks in C. canephora coffee beans

We examined the lipid-to-lipid correlations among C. canephora beans. Many compounds in related pathways tend to occur within defined stoichiometries and kinetics parameters [48, 49]. Neighbouring compounds may be correlated with each other. This is also true of metabolites that are in chemical equilibrium, show mass conservation, or are highly sensitive to a common parameter [48, 50]. Using Pearson’s correlation coefficient at a cut-off of ≥ | ± 0.75|, we identified those lipids that co-occurred at similar relative levels across the C. canephora genotypes studied. Highly specific interactions between lipids are essential for regulation of cell physiology [51, 52] (Fig 5 and detailed in S2 Table).

Pearson’s correlation network illustrating correlation patterns among lipids.
Fig 5

Pearson’s correlation network illustrating correlation patterns among lipids.

Correlations found among lipid data are shown as attribute circular layout. The blue lines indicate lipids whose occurrence negatively correlate, while the pink lines denote lipids that show a positive correlation. Line thickness indicates the strength of correlation. A permissive threshold level of ≥ | ± 0.75| was set for the Pearson’s correlation coefficient and a qFDR of <0.05 was chosen for the identification of significant correlations within the lipid data.

The most notable observation was the very strong negative correlations between CE (20:3) and other lipids, many of which were uncharacterized (Fig 5). Negative correlations were also found with CE (20:3) and TG (54:1), TG (52:1), but most notably with TG (50.1). Based on the number of carbons in their acyl chains, it seems likely that these three TGs are closely connected in the TG biosynthetic pathway. Negative correlations among metabolites are sometimes due to regulatory mechanisms controlling metabolites that have conserved moieties, or metabolites connected by enzymes with high variance which can cause negative correlations between its substrate and product metabolites [48, 53]. Based on the simplified pathway in Fig 1B, a relationship between the CE and TGs can be envisaged. Not all of the relationships between CE (20:3) and other lipids were negative. There was a positive correlation between CE (20:3) and TG (58:2), TG (58:4), and FA (22:0) respectively. Most triacylglycerols were positively associated with other lipids except, TG (52:0) and TG (50:1), which showed a negative correlation (Fig 5).

Comparing the primary, secondary metabolites and lipid profiles among C. canephora

Metabolomics and lipidomics are powerful strategies to qualitatively and quantitatively analyze a wide range of small molecules in a biological sample, which represent endpoints of genome expression [54]. The metabolomics data from our recent work [24] was subjected to PLS-DA analysis for both primary and secondary metabolites and compared to the lipidomics profile of these coffees. Our aim was to determine the diversity of metabolism across genotypes. We expected greater similarity in primary metabolism across genotypes, and that specialized metabolites would be more divergent [55]. Lipids are primary metabolites but in coffee beans, it is not known the extent to which they are part of the fundamental or specialized metabolic machinery.

We found that there was clearer differentiation between ‘Niaouli’ genotypes which clustered together and away from the ‘Kouillou’ and ‘Java Robusta’ when secondary metabolites were examined (Fig 6B) compared to primary metabolites (Fig 6A). No clear difference was observed among the groups with PLS-DA on lipid (Fig 6C) but SPLS-DA—(Fig 6D) was able to reveal more similarity between ‘Java’ and the ‘Niaouli’ types, while the ‘Kouillou’ genotypes were separate from the rest. This is noteworthy because of the many studies that show 1) that lipids are valuable components of the sensory attributes of coffee and 2) that they are excellent at discriminating genotypes and growing environment [8, 9, 14].

Differentiating among the C. canephora genotypes using multivariate analyses.
Fig 6

Differentiating among the C. canephora genotypes using multivariate analyses.

A) Primary metabolites analyzed by PLS-DA, B) Secondary metabolites analyzed by PLS-DA, C) lipids analyzed by PLS and D) lipids analyzed by Sparse-PLS. (PCA = Principal Component Analysis, PLS-DA = Partial Least Square–discriminate analysis. Genotypes 1–3 belong to the ‘Niaouli’ group i.e. Nia_1, Nia_2 and Nia_3, Genotypes 4–5 are members of the ‘Kouillou’ group i.e. C111 and C36, and Genotype 6 is ‘Java Robusta’ i.e. T1049. The primary and secondary metabolite data were derived from Anagbogu et al., 2019b [24].

Correlating lipidomic profiles to genotypic variation of coffee

In spite of the importance of lipids in coffee cup quality improvement, as far as we know, it has not been fully integrated into coffee breeding strategies except in the study by Sant’Ana et al. [23]. The extracted DNA of the coffee genotypes in this study was analyzed using Genotype-by-Sequencing and their relatedness compared using hierarchical clustering. The lipidomics data was also analyzed similarly. Overall, there was good agreement between the DNA- and metabolite-based clustering (Fig 7A). From the genetic perspective, we observed three clusters I, II and III representing ‘Niaouli’, ‘Kouillou’, and ‘Java’ respectively (Fig 7A). The lipidomic analysis of the genotypes based on the mean values (Fig 7B) revealed two main clusters, separating the ‘Niaouli’ from ‘Kouillou’/‘Java Robusta’ respectively (Fig 7B). However there were notable exceptions: SNP analysis indicated that C36 and C111 (‘Kouillou’) were highly related, while T1049 (‘Java Robusta’) was divergent, in contrast, lipidomics indicted that T1049 and C111 formed a distinguishable subgroup (cluster IIa) while C36 was more distant (Fig 7C).

Comparative analysis of the genomics and lipidomic analysis based on their hierarchical structure.
Fig 7

Comparative analysis of the genomics and lipidomic analysis based on their hierarchical structure.

A) genetic analysis based on ~100,000 SNP, B) lipidomic analysis indicating the mean value of each genotype from five biological replicates and C) lipidomic analysis including all replicates. The genotypes and replicates names were described in Table 1 as follows: Nia_11- Nia_15 were Genotype 1, Nia_21-Nia_25 were genotype 2, Nia_31-Nia35 were Genotype 3 and belong to the ‘Niaouli’ group; Members of the ‘Kouilou’ group have been referred to as Genotypes 4 and 5 in this study and included C111_1-C111_5 and C36_1 to C36_5. ‘Java Robusta’ coffee has been described as Genotype 6 included T1049_1-T1049_5.

All lipid analyses (Fig 7B and 7C) revealed a clear disparity between the ‘Niaouli’ and ‘Kouilou/ Java Robusta’ genotypes. We suggest that there are two lipid diversity structures within the population studied which can be used for future coffee improvement in Nigeria. From this study, FA(16:0) and FA(20:0) were higher (p<0.05) in the ‘Kouillou’ genotypes (Table 2), and these fatty acids are among the precursors of good quality coffee (Table 2). A previous study identified the fatty acid desaturase (FAD2) encoded gene as one lipid-associated SNP in coffee [23]. Further study of our SNP data to detect the sources or causes of the distinct lipid difference between ‘Kouilou’ and ‘Niaouli’ is needed. For example, incorporating alleles for higher accumulation of FA(16:0), which is associated with high sensory attributes into the genome of cultivated variety (‘Niaouli’) (Table 2) would be of great help towards its quality improvement.

Conclusion

There is much interest in using lipid profiles to identify biomarkers for coffee cup quality improvement. In this study, we used various data mining tools to better understand variations in lipid profiles of Southwestern Nigerian coffee genotypes, and how these lipids relate to other metabolites, and to other studies done on coffee. Insight into the gene expression of coffee bean lipids would help to elucidate if there was an environmental effect on relative lipid abundance and in the proper identification of candidate genes. Despite this limitation, our findings can be summarized as following: 1) Lipids that could discriminate among the genotypes studied were identified, even though there was less variability for lipids among genotypes compared with other types of metabolites. 2) ‘Java Robusta’ and ‘Kouillou’ especially ‘Kouillou’ genotype C36, contained more saturated fatty acyls species such as FA(26:0), FA(16:0), FA(20:0), while ‘Niaouli’ contained more unsaturated FAs such as FA(16:1), FA(20:1). It should be interesting to study the SNPs of gene such as FADs which are responsible for desaturation in coffee bean. 3) Unexpected observations were made with respect to cholesterol ester (20:3) which was one of the least abundant lipids in our analysis. The levels of CE(20:3) and some TG compounds, showed some of the strongest negative correlations. Further, CE(20:3) can discriminate between ‘Niaouli’ and ‘‘Kouillou’ and Java Robusta’ indicating its value as a genotypic marker. 4) Lipids as a class of compounds in this analysis did not discriminate among the genotype studied, as well as secondary metabolites but grouped genotypes into two lipid profile structures. 5) There was a good correlation between the classification of genotypes based on SNP variability, and the lipidomic profile. In addition, the two divergent lipid profiles correlating with the C. canephora genotypes will be tapped and included in coffee cup quality improvement program. The lipid species TG (56:2), TG (58:4), and PE (34:2) showed the highest discriminatory power in distinguishing ‘Niaouli’ from ‘Kouillou/Java Robusta’. However, the saturated fatty acids, palmitic (16:0) and arachidic acids (20:0), were higher in ‘Kouillou/Java Robusta’. Hybridization between the two groups of coffee would be highly recommended to generate genotypes with a more favorable lipid profile for coffee cup quality improvement. This study was an important starting point for coffee quality improvement in Nigeria. Although we found some divergent lipids, generally, lipid diversity was low among the genotypes studied, and there will be a need to initiate genetic acquisition from other coffee producing countries through collaboration.

Acknowledgements

We appreciate the International Institute of Tropical Agriculture and the University of California Davis, USA, for providing free access to their facilities. We also thank Drs. May Thitisaksakul, Karin Albornoz, Shaoyun Dong, and Jorge Dubcovsky at UC Davis, USA, for their advice and technical support.

References

Oxford Dictionary of Biochemistry and Molecular Biology. T.Atwood; P.Campbell; H.Parish; A.Smith; F.Vella; J.Stirling, Eds.; Oxford University Press: Oxford, England, 2006.

K.Speer; I.Kölling-Speer The lipid fraction of the coffee bean. Brazilian Journal of Plant Physiology 2006, 18, 201216.

P.Folstar Lipids In Clarke, R.J.Coffee, R.Macrae, Eds. Elsevier Applied Science: Essex, England, 1989; Vol. Vol I: Chemistry, pp. 203222.

L.Navarini; M.Ferrari; F.S.Liverani; L.Liggieri; F.Ravera Dynamic tensiometric characterization of espresso coffee beverage. Food Hydrocolloids 2004, 18, 387393.

L.Piazza; J.Gigli; A.Bulbarello Interfacial rheology study of espresso coffee foam structure and properties. Journal of Food Engineering 2008, 84, 420429.

R.Gamboa-Becerra; M.C.Hernandez-Hernandez; O.Gonzalez-Rios; M.L.Suarez-Quiroz; E.Galvez-Ponce; J.J.Ordaz-Ortiz et al Metabolomic Markers for the Early Selection of Coffea canephora Plants with Desirable Cup Quality Traits. Metabolites 2019, 9 10.3390/metabo9100214

A.C.L.Amorim; A.M.C.Hovell; A.C.Pinto; M.N.Eberlin; N.P.Arruda; E.J.Pereira; et al Green and Roasted Arabica Coffees Differentiated by Ripeness, Process and Cup Quality via Electrospray Ionization Mass Spectrometry Fingerprinting. Journal of the Brazilian Chemical Society 2009, 20, 313321.

B.Bertrand; D.Villarreal; A.Laffargue; H.Posada; P.Lashermes; S.Dussert Comparison of the Effectiveness of Fatty Acids, Chlorogenic Acids, and Elements for the Chemometric Discrimination of Coffee (Coffea arabica L.) Varieties and Growing Origins. Journal of Agricultural and Food Chemistry 2008, 56, 22732280, 10.1021/jf073314f

L.P.Figueiredo; F.M.Borém; F.C.Ribeiro; G.S.Giomo; J.Henrique da Silva Taveira; M.RMalta. Fatty acid profiles and parameters of quality of specialty coffees produced in different Brazilian regions. African Journal of Agricultural Research 2015, 10, 34843493, 10.5897/AJAR2015.9697.

10 

A.S.Franca; J.C.F.Mendonca; S.D.Oliveira Composition of green and roasted coffees of different cup qualities. Lwt-Food Science and Technology 2005, 38, 709715.

11 

M.J.Martin; F.Pablos; A.G.Gonzalez; M.S.Valdenebro; M.Leon-Camacho Fatty acid profiles as discriminant parameters for coffee varieties differentiation. Talanta 2001, 54, 291297. 10.1016/s0039-9140(00)00647-0

12 

E.Nebesny; G.Budryn Evaluation of sensory attributes of coffee brews from robusta coffee roasted under different conditions. European Food Research and Technology 2006, 224, 159165.

13 

A.Farah; M.C.Monteiro; V.Calado; A.S.Franca; L.C.Trugo Correlation between cup quality and chemical attributes of Brazilian coffee. Food Chemistry 2006, 98, 373380.

14 

B.Mehari; M.Redi-Abshiro; B.S.Chandravanshi; S.Combrinck; R.McCrindle; M.Atlabachew GC-MS profiling of fatty acids in green coffee (Coffea arabica L.) beans and chemometric modeling for tracing geographical origins from Ethiopia. Journal of the Science of Food and Agriculture 2019, 99, 38113823, 10.1002/jsfa.9603

15 

F.J.M.Novaes; S.S.Oigman; R.O.M.A.de Souza; C.M.Rezende; F.R.D.Neto New approaches on the analyses of thermolabile coffee diterpenes by gas chromatography and its relationship with cup quality. Talanta 2015, 139, 159166. 10.1016/j.talanta.2014.12.025

16 

T.Joet; A.Laffargue; F.Descroix; S.Doulbeau; B.Bertrand; A.de Kochko; S.Dussert Influence of environmental factors, wet processing and their interactions on the biochemical composition of green Arabica coffee beans. Food Chemistry 2010, 118, 693701.

17 

S.Agatonovic-Kustrin; D.W.Morton; V.Smirnov; A.Petukhov; V.Gegechkori; V.Kuzina; et al Analytical Strategies in Lipidomics for Discovery of Functional Biomarkers from Human Saliva. Disease Markers 2019, 2019 10.1155/2019/6741518

18 

L.Cossignani; D.Montesano; M.S.Simonetti; F.Blasi Authentication of Coffea arabica according to Triacylglycerol Stereospecific Composition. Journal of Analytical Methods in Chemistry 2016 10.1155/2016/7482620

19 

R.Romano; A.Santini; L.Le Grottaglie; N.Manzo; A.Visconti; A.Ritieni Identification markers based on fatty acid composition to differentiate between roasted Arabica and Canephora (Robusta) coffee varieties in mixtures. Journal of Food Composition and Analysis 2014, 35, 19.

20 

G.G.Harrigan; S.Martino-Catt; K.C.Glenn Metabolomics, metabolic diversity and genetic variation in crops. Metabolomics 2007, 3, 259272, 10.1007/s11306-007-0076-0

21 

D.M.Beckles; U.Roessner Plant metabolomics: Applications and opportunities for agricultural biotechnology. Plant Biotechnology and Agriculture: Prospects for the 21st Century 2012, 10.1016/B978-0-12-381466-1.00005–5, 67–81, 10.1016/B978-0-12-381466-1.00005-5

22 

O.Fiehn; J.Kopka; P.Dormann; T.Altmann; R.N.Trethewey; L.Willmitzer Metabolite profiling for plant functional genomics. Nature Biotechnology 2000, 18, 11571161. 10.1038/81137

23 

G.C.Sant’Ana; L.F.P.Pereira; D.Pot; S.T.Ivamoto; D.S.Domingues; R.V.Ferreira; et al Genome-wide association study reveals candidate genes influencing lipids and diterpenes contents in Coffea arabica L. Scientific Reports 2018, 8 10.1038/s41598-017-18800-1

24 

C.F.Anagbogu; C.O.Ilori; R.Bhattacharjee; O.O.Olaniyi; D.M.Beckles Gas Chromatography-Mass Spectrometry and Single Nucleotide Polymorphism-Genotype-By-Sequencing Analyses Reveal the Bean Chemical Profiles and Relatedness of Coffea canephora Genotypes in Nigeria. Plants-Basel 2019, 8 10.3390/plants8100425

25 

S.Scalabrin; L.Toniutti; G.Di Gaspero; D.Scaglione; G.Magris; M.Vidotto; et al A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm. Scientific Reports 2020, 10, 4642, 10.1038/s41598-020-61216-7

26 

C.F.Anagbogu; R.Bhattacharjee; C.Ilori; P.Tongyoo; K.E.Dada; A.A.Muyiwa; et al Genetic diversity and re-classification of coffee (Coffea canephora Pierre ex A. Froehner) from South Western Nigeria through genotyping-by-sequencing-single nucleotide polymorphism analysis. Genetic Resources and Crop Evolution 2019, 66, 685696.

27 

V.Matyash; G.Liebisch; T.V.Kurzchalia; A.Shevchenko; D.Schwudke Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics. Journal of Lipid Research 2008, 49, 11371146. 10.1194/jlr.D700041-JLR200

28 

T.Cajka; O.Fiehn Increasing lipidomic coverage by selecting optimal mobile-phase modifiers in LC-MS of blood plasma. Metabolomics 2016, 12, doi:ARTN 34 10.1007/s11306-015-0929-x

29 

T.Cajka; J.T.Smilowitz; O.Fiehn Validating Quantitative Untargeted Lipidomics Across Nine Liquid Chromatography-High-Resolution Mass Spectrometry Platforms. Analytical Chemistry 2017, 89, 1236012368, 10.1021/acs.analchem.7b03404

30 

T.Kind; G.Wohlgemuth; D.Y.Lee; Y.Lu; M.Palazoglu; S.Shahbaz; et al FiehnLib: Mass Spectral and Retention Index Libraries for Metabolomics Based on Quadrupole and Time-of-Flight Gas Chromatography/Mass Spectrometry. Analytical Chemistry 2009, 81, 1003810048. 10.1021/ac9019522

31 

J.G.Xia; I.V.Sinelnikov; B.Han; D.S.Wishart MetaboAnalyst 3.0-making metabolomics more meaningful. Nucleic Acids Research 2015, 43, W251W257. 10.1093/nar/gkv380

32 

W.R.Revelle psych: Procedures for Personality and Psychological Research., 2017.

33 

H.Wickham Reshaping data with the reshape package. Journal of Statistical Software 2007, 21, 120.

34 

S.Basu; W.Duren; C.R.Evans; C.F.Burant; G.Michailidis; A.Karnovsky Sparse network modeling and metscape-based visualization methods for the analysis of large-scale metabolomics data. Bioinformatics 2017, 33, 15451553. 10.1093/bioinformatics/btx012

35 

P.Shannon; A.Markiel; O.Ozier; N.S.Baliga; J.T.Wang; D.Ramage; et al Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Research 2003, 13, 24982504. 10.1101/gr.1239303

36 

E.Fahy; D.Cotter; M.Sud; S.Subramaniam Lipid classification, structures and tools. Biochimica Et Biophysica Acta-Molecular and Cell Biology of Lipids 2011, 1811, 637647. 10.1016/j.bbalip.2011.06.009

37 

Z.-Y.Du; C.Benning Triacylglycerol Accumulation in Photosynthetic Cells in Plants and Algae In Lipids in Plant and Algae Development, Y.Nakamura, Y.Li-Beisson, Eds. Springer International Publishing: Cham, 2016; 10.1007/978-3-319-25979-6_8 pp. 179205

38 

X.-R.Zhou; D.L.Callahan; P.Shrestha; Q.Liu; J.R.Petrie; S.P.Singh Lipidomic analysis of Arabidopsis seed genetically engineered to contain DHA. Frontiers in plant science 2014, 5, 10.3389/fpls.2014.00419

39 

R.Caspi; R.Billington; C.A.Fulcher; I.M.Keseler; A.Kothari; M.Krummenacker; et al The MetaCyc database of metabolic pathways and enzymes. Nucleic Acids Research 2017, 46, D633D639, 10.1093/nar/gkx935

40 

H.Yang; D.H.Suh; D.H.Kim; E.S.Jung; K.-H.Liu; C.H.Lee; et al Metabolomic and lipidomic analysis of the effect of pioglitazone on hepatic steatosis in a rat model of obese Type 2 diabetes. British Journal of Pharmacology 2018, 175, 36103625, 10.1111/bph.14434

41 

M.J.Chen; J.J.Thelen ACYL-LIPID DESATURASE2 Is Required for Chilling and Freezing Tolerance in Arabidopsis. Plant Cell 2013, 25, 14301444. 10.1105/tpc.113.111179

42 

L.Zhou; A.Khalil; F.Bindler; M.J.Zhao; C.Marcic; S.Ennahar; et al Effect of heat treatment on the content of individual phospholipids in coffee beans. Food Chemistry 2013, 141, 38463850. 10.1016/j.foodchem.2013.06.056

43 

M.Korber; I.Klein; G.Daum Steryl ester synthesis, storage and hydrolysis: A contribution to sterol homeostasis. Biochimica et Biophysica Acta (BBA)—Molecular and Cell Biology of Lipids 2017, 1862, 15341545, 10.1016/j.bbalip.2017.09.002

44 

P.D.Sonawane; J.Pollier; S.Panda; J.Szymanski; H.Massalha; M.Yona; et al Plant cholesterol biosynthetic pathway overlaps with phytosterol metabolism. Nature Plants 2017, 3, 10.1038/nplants.2016.1205

45 

M.S.A.de Lima; L.A.Rocha; E.F.Molina; B.L.Caetano; L.Marcal; C.Mello; et al Thermoanalysis of soybean oil extracted by two methods. Quimica Nova 2008, 31, 527529.

46 

Y.Liu; G.L.Wang; X.M.Wang Role of Aminoalcoholphosphotransferases 1 and 2 in Phospholipid Homeostasis in Arabidopsis. Plant Cell 2015, 27, 15121528. 10.1105/tpc.15.00180

47 

M.B.D.Scholz; C.S.G.Kitzberger; N.F.Pagiatto; L.F.P.Pereira; F.Davrieux; D.Pot; et al Chemical composition in wild ethiopian Arabica coffee accessions. Euphytica 2016, 209, 429438.

48 

D.Camacho; A.de la Fuente; P.Mendes The origin of correlations in metabolomics data. Metabolomics 2005, 1, 5363, 10.1007/s11306-005-1107-3

49 

B.S.Stamova; U.Roessner; S.Suren; D.Laudencia-Chingcuanco; A.Bacic; D.M.Beckles Metabolic profiling of transgenic wheat over-expressing the high-molecular-weight Dx5 glutenin subunit. Metabolomics 2009, 5, 239252, 10.1007/s11306-008-0146-y

50 

R.Steuer Review: On the analysis and interpretation of correlations in metabolomic data. Briefings in Bioinformatics 2006, 7, 151158, 10.1093/bib/bbl009

51 

X.L.Guan; C.M.Souza; H.Pichler; G.Dewhurst; O.Schaad; K.Kajiwara; et al Functional Interactions between Sphingolipids and Sterols in Biological Membranes Regulating Cell Physiology. Molecular Biology of the Cell 2009, 20, 20832095. 10.1091/mbc.e08-11-1126

52 

A.Shevchenko; K.Simons Lipidomics: coming to grips with lipid diversity. Nature Reviews Molecular Cell Biology 2010, 11, 593598. 10.1038/nrm2934

53 

K.Morgenthal; S.Wienkoop; M.Scholz; J.Selbig; W.Weckwerth Correlative GC-TOF-MS-based metabolite profiling and LC-MS-based protein profiling reveal time-related systemic regulation of metabolite-protein networks and improve pattern recognition for multiple biomarker selection. Metabolomics 2005, 1, 109121.

54 

J.K.Nicholson; J.Connelly; J.C.Lindon; E.Holmes Metabonomics: a platform for studying drug toxicity and gene function. Nature Reviews Drug Discovery 2002, 1, 153161. 10.1038/nrd728

55 

D.M.Beckles; U.Roessner 5—Plant metabolomics: Applications and opportunities for agricultural biotechnology In Plant Biotechnology and Agriculture, A.Altman, P.M.Hasegawa, Eds. Academic Press: San Diego, 2012; 10.1016/B978-0-12-381466-1.00005-5pp. 67–81.