PLoS ONE
Landscape of epigenetically regulated lncRNAs and DNA methylation in smokers with lung adenocarcinoma

Competing Interests: The authors have declared that no competing interests exist.

Article Type: Research Article
Abstract

In this study, we identified long non-coding RNAs (lncRNAs) associated with DNA methylation in lung adenocarcinoma (LUAD) using clinical and methylation/expression data from 184 qualified LUAD tissue samples and 21 normal lung-tissue samples from The Cancer Genome Atlas (TCGA). We identified 1865 differentially expressed genes that correlated negatively with the methylation profiles of normal lung tissues, never-smoker LUAD tissues and smoker LUAD tissues, while 1079 differentially expressed lncRNAs were identified using the same criteria. These transcripts were integrated using ingenuity pathway analysis to determine significant pathways directly related to cancer, suggesting that lncRNAs play a crucial role in carcinogenesis. When comparing normal lung tissues and smoker LUAD tissues, 86 candidate genes were identified, including six lncRNAs. Of the 43 candidate genes revealed by comparing never-smoker LUAD tissues and smoker LUAD tissues, 13 were also different when compared to normal lung tissues. We then investigated the expression of these genes using the Gene Expression of Normal and Tumor Tissues (GENT) and Methylation and Expression Database of Normal and Tumor Tissues (MENT) databases. We observed an inverse correlation between the expression of 13 genes in normal lung tissues and smoker LUAD tissues, and the expression of five genes between the never-smoker and smoker LUAD tissues. These findings were further validated in clinical specimens using bisulfite sequencing, revealing that AGR2, AURKB, FOXP3, and HMGA1 displayed borderline differences in methylation. Finally, we explored the functional connections between DNA methylation, lncRNAs, and gene expression to identify possible targets that may contribute toward the pathogenesis of cigarette smoking-associated LUAD. Together, our findings suggested that differentially expressed lncRNAs and their target transcripts could serve as potential biomarkers for LUAD.

Jung, Lee, Kim, Ahn, and Tao: Landscape of epigenetically regulated lncRNAs and DNA methylation in smokers with lung adenocarcinoma

Introduction

Lung cancer is the leading cause of cancer-related deaths worldwide [1]. Cigarette smoke (CS) exposure is known to affect epigenetic regulation [2,3] and has been established as a critical factor in the development of lung cancer [4–8]. Epigenetic modifications include DNA methylation, histone modifications, and the modulation of non-coding RNAs [9–11], among which CS is widely known to alter DNA methylation and thereby cause lung cancer [12–15].

Long non-coding RNAs (lncRNAs) are transcripts of over 200 nucleotides in length that lack or have a limited protein-coding potential [16,17]. Recently, lncRNAs have received considerable attention as epigenetic regulators [18,19], with an increasing number of studies implicating lncRNAs in several functions related to carcinogenesis and the progression of lung cancer [20–24]. Until now, only a limited number of CS-associated lncRNAs have been reported in lung cancer, namely, Smoke and Cancer-associated LncRNA–1 (SCAL1), HOX Transcript Antisense RNA (HOTAIR), H19, and Metastasis-Associated Lung Adenocarcinoma Transcript 1 (MALAT1) [25–30]. Since these lncRNAs have not been explored thoroughly in the context of CS-related epigenetic regulation, further studies are required to identify additional lncRNAs and their possible roles in epigenetic regulation.

In this study, we aimed to investigate the relationship between DNA methylation and lncRNA expression in lung adenocarcinoma (LUAD) and thereby elucidate the landscape of lncRNAs associated with DNA methylation-mediated regulation in smokers.

Materials and methods

Datasets

Level 3 expression and matched DNA methylation data for LUAD were downloaded from The Cancer Genome Atlas (TCGA) data portal (https://portal.gdc.cancer.gov/) in January 2017. Only patients whose clinical information included a smoking history were retained, yielding 184 LUAD and 21 normal lung tissue samples with fully characterized expression data and matched DNA methylation data assayed using the Illumina Infinium HumanMethylation450 (450K) BeadChip.

For the validation cohort, 76 samples were collected from patients with LUAD who had undergone surgery at Korea University Medical Center (Seoul, Korea) between 2010 and 2013. Samples were fixed and processed according to clinical standard operating procedures. The specimens and data used in this study were provided by Korea University Anam Hospital and approved by the appropriate Institutional Review Board (2014AN0393).

Differential lncRNA expression and DNA methylation

Our analysis strategy is depicted in Fig 1. Differentially expressed genes (DEGs), differentially methylated regions (DMRs), and differentially expressed lncRNAs (DE-lncRNAs) were identified between normal lung, smoker LUAD, and never-smoker LUAD tissues. Ingenuity pathway analysis (IPA) was used to map candidate lncRNAs from the DE-lncRNAs. The matched DEGs and DMRs included those whose change in DNA methylation was inversely correlated with DEG expression (p < 0.05).
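The matching of DEGs to DMRs described above can be sketched in Python. This is a minimal illustration, not the authors' pipeline: the record fields, the toy values, and the use of opposite signs as the inverse-correlation criterion are all assumptions made for the example, and the fold-change cut-off is taken from the Fig 2 legend.

```python
from typing import NamedTuple

class GeneStats(NamedTuple):
    gene: str
    log2_fc: float     # expression change between the two groups
    expr_p: float      # P value for the expression change
    delta_beta: float  # methylation change between the same two groups
    meth_p: float      # P value for the methylation change

def inversely_matched(stats, alpha=0.05, min_abs_log2_fc=1.0):
    """Keep genes that are significant in both layers and whose
    expression and methylation change in opposite directions."""
    selected = []
    for s in stats:
        significant = s.expr_p < alpha and s.meth_p < alpha
        large_change = abs(s.log2_fc) >= min_abs_log2_fc  # |fold change| >= 2
        opposite = s.log2_fc * s.delta_beta < 0
        if significant and large_change and opposite:
            selected.append(s.gene)
    return selected

# Hypothetical toy records (not taken from the study)
toy = [
    GeneStats("GENE_A", 2.3, 0.001, -0.25, 0.01),  # up, hypomethylated: kept
    GeneStats("GENE_B", -1.8, 0.002, 0.30, 0.03),  # down, hypermethylated: kept
    GeneStats("GENE_C", 1.5, 0.010, 0.20, 0.02),   # same direction: dropped
    GeneStats("GENE_D", 2.0, 0.200, -0.40, 0.01),  # expression not significant: dropped
]
print(inversely_matched(toy))
```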

Fig 1. Schematic diagram of the analysis strategy used in this study.

Integrated analysis of DE-lncRNAs associated with DMRs

Lists of significant DEGs generated from TCGA data were subjected to IPA using web-based software from Ingenuity Systems® (Qiagen, Redwood City, CA, USA) to produce a gene interaction network. DE-lncRNAs were subjected to biological process enrichment analyses. Functional enrichment analysis was also performed on these networks to understand the significance of the biological functions and/or disease phenotypes of the genes.
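To illustrate what a functional enrichment analysis computes, the sketch below implements a generic one-sided over-representation (hypergeometric) test in Python. It is not a reproduction of the IPA software, whose internals are not described here; the function name and all counts are hypothetical.

```python
from math import comb

def enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """Probability of seeing at least n_overlap pathway genes when
    n_selected genes are drawn without replacement from n_background
    genes, n_pathway of which belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 200 background genes, 10 in the pathway,
# 20 selected genes, 5 of which fall in the pathway.
print(enrichment_p(200, 10, 20, 5))
```

A small P value indicates that the selected gene list contains more members of the pathway than expected by chance under this simple model.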

Validation analysis using MENT and GENT

To validate the target DMRs and DE-lncRNAs identified by TCGA data analysis, we utilized the web-accessible public gene expression datasets, Methylation and Expression Database of Normal and Tumor Tissues (MENT; http://mgrc.kribb.re.kr:8080/MENT/) and Gene Expression of Normal and Tumor Tissues (GENT; http://medicalgenome.kribb.re.kr/GENT/). GENT contains the gene expression profiles of 32 types of human cancer tissues and normal tissues generated using an Affymetrix U133A or U133Plus2 microarray platform with consistent data processing [31]. MENT contains DNA methylation and gene expression patterns obtained using an Illumina HumanMethylation27 BeadChip or GoldenGate Methylation Cancer Panel I [32].

Validation analysis using bisulfite sequencing

Putative genes were validated using bisulfite sequencing in a validation cohort consisting of 76 samples. DNA was quantified using PicoGreen (Invitrogen, California, USA) according to the manufacturer’s protocol. Briefly, 1 μg of genomic DNA was bisulfite-converted using the EZ DNA Methylation kit (Zymo Research, California, USA) according to the manufacturer’s protocol. The regions of interest were amplified by PCR using KOD-Multi & EPi (Toyobo, Osaka, Japan), purified using QIAquick PCR columns (Qiagen, Venlo, Netherlands), quantified using PicoGreen (Invitrogen), and verified using agarose gel electrophoresis. Libraries were prepared using an Illumina TruSeq Nano DNA sample prep kit (Illumina) according to the manufacturer’s instructions and then quantified by qPCR using a CFX96 Real-Time System (Bio-Rad, California, USA). After normalization, the prepared library was sequenced using a MiSeq system (Illumina) with 300 bp paired-end reads.

Potential sequencing adapters and low-quality bases in the raw reads were trimmed using Skewer [33] and the remaining high-quality reads were mapped to the reference genome using BS-seeker2 software [34] with a 10% mis-mapping rate. To compare the CpG methylation profiles of different sample groups, only the CpG site values were selected and the Kruskal-Wallis test was performed.
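For a single CpG site compared between two groups, the Kruskal-Wallis test can be sketched with the Python standard library alone. The function below is an illustrative implementation with tie correction, restricted to two groups so that the P value follows from a chi-square distribution with one degree of freedom; the function name and the beta values are hypothetical, and the authors' actual software is not specified.

```python
from math import erfc, sqrt

def kruskal_wallis_two_groups(a, b):
    """Return (H, p) for two independent samples, with tie correction."""
    # Pool the values, remembering which group (0 or 1) each came from.
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:  # all values identical: no evidence of a difference
        return 0.0, 1.0
    h = 12 / (n * (n + 1)) * (
        rank_sums[0] ** 2 / len(a) + rank_sums[1] ** 2 / len(b)
    ) - 3 * (n + 1)
    h /= correction
    p = erfc(sqrt(h / 2))  # chi-square survival function, 1 degree of freedom
    return h, p

# Hypothetical beta values at one CpG site
never_smoker = [0.28, 0.31, 0.30, 0.27, 0.33]
smoker = [0.36, 0.41, 0.38, 0.35]
print(kruskal_wallis_two_groups(never_smoker, smoker))
```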

Statistical analysis

To identify methylation markers for detecting CS-associated LUAD, we evaluated the distribution of mRNA expression and DNA methylation levels for each CpG site in normal lung, never-smoker LUAD, and smoker LUAD tissues. For candidate DMRs, pairwise comparisons were conducted to identify the genes that best distinguished each group.

Results

Identification of DMRs and DE-lncRNAs

To investigate the DNA methylation patterns in LUAD related to CS history, we analyzed publicly available Human Methylation 450k TCGA data that measured methylation levels in normal lung and LUAD tissues. The data sets used in this study are summarized in Table 1. Three comparisons were made: 1) normal lung vs. smoker LUAD tissues, 2) normal lung vs. never-smoker LUAD tissues, and 3) never-smoker LUAD vs. smoker LUAD, identifying 8,513 DEGs, 24,783 DMRs, and 2,798 DE-lncRNAs (Fig 2A–2C). Among the 2,798 DE-lncRNAs, 1,079 were mapped by IPA (Fig 2C), while 1,865 differentially methylated candidate genes with negative correlation were identified (Fig 2D) and annotated (S1 Fig) from the DEGs and DMRs.

Fig 2. Venn diagrams illustrating the number of differentially expressed transcripts for different pairwise comparisons.

(A) Venn diagram of 8,513 differentially expressed genes (DEGs; p < 0.05; |(fold change)| ≥ 2). (B) Venn diagram of 24,783 differentially methylated regions (DMRs; p < 0.05). (C) Venn diagram of 2,798 differentially expressed lncRNAs (DE-lncRNAs; p < 0.05; |(fold change)| ≥ 2) and 1,079 IPA-mapped lncRNAs. (D) Final 1,865 candidate genes displaying inverse correlation.
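The region counts of a three-way Venn diagram such as those in Fig 2 can be tallied from the three per-comparison gene lists. The following Python sketch shows one way to do this; the comparison labels and gene identifiers are hypothetical placeholders.

```python
from collections import Counter

def venn_regions(sets_by_name):
    """Count items by the exact combination of sets they belong to."""
    names = list(sets_by_name)
    counts = Counter()
    for item in set().union(*sets_by_name.values()):
        membership = tuple(n for n in names if item in sets_by_name[n])
        counts[membership] += 1
    return counts

# Hypothetical gene lists for the three pairwise comparisons
comparisons = {
    "normal_vs_smoker": {"g1", "g2", "g3", "g4"},
    "normal_vs_never": {"g2", "g3", "g5"},
    "never_vs_smoker": {"g3", "g4", "g6"},
}
regions = venn_regions(comparisons)
for membership, count in sorted(regions.items()):
    print(membership, count)
```

Each key lists the comparisons in which a transcript was significant, so the counts sum to the size of the union, which corresponds to the total reported for each panel.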

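Table 1
Characteristics of TCGA and validation cohorts.

| Parameter | Category | TCGA: Smoker (n = 159) | TCGA: Never-smoker (n = 25) | TCGA: Total (n = 184) | Validation: Smoker (n = 24) | Validation: Never-smoker (n = 52) | Validation: Total (n = 76) |
|---|---|---|---|---|---|---|---|
| Gender | Male | 70 (89.74%) | 8 (10.26%) | 78 | 22 (78.57%) | 6 (21.43%) | 28 |
| | Female | 89 (83.96%) | 17 (16.04%) | 106 | 2 (4.17%) | 46 (95.83%) | 48 |
| Age | Median | 67 | 69 | 67 | 69 | 60 | 64 |
| Race | White | 126 (88.11%) | 17 (11.89%) | 143 | 0 | 0 | 0 |
| | Asian | 1 (100%) | 0 (0%) | 1 | 24 (31.58%) | 52 (68.42%) | 76 |
| | Black or African American | 7 (100%) | 0 (0%) | 7 | 0 | 0 | 0 |
| | Unknown | 25 (75.76%) | 8 (24.24%) | 33 | 0 | 0 | 0 |
| Vital status | Alive | 94 (87.85%) | 13 (12.15%) | 107 | 22 (32.84%) | 45 (67.16%) | 67 |
| | Dead | 65 (84.42%) | 12 (15.58%) | 77 | 2 (22.22%) | 7 (77.78%) | 9 |
| Tumor stage | I | 87 (87%) | 13 (13%) | 100 | 3 (27.27%) | 8 (72.73%) | 11 |
| | II | 31 (79.49%) | 8 (20.51%) | 39 | 14 (27.45%) | 37 (72.55%) | 51 |
| | III | 34 (89.47%) | 4 (10.53%) | 38 | 7 (53.85%) | 6 (46.15%) | 13 |
| | IV | 6 (100%) | 0 | 6 | 0 | 1 (100%) | 1 |
| | NA | 1 (100%) | 0 | 1 | 0 | 0 | 0 |
| KRAS | Wild | 101 (82.79%) | 21 (17.21%) | 122 | 12 (24%) | 38 (76%) | 50 |
| | Mutant | 58 (93.55%) | 4 (6.45%) | 62 | 4 (66.67%) | 2 (33.33%) | 6 |
| | NA | 0 | 0 | 0 | 8 (40%) | 12 (60%) | 20 |
| EGFR | Wild | 138 (87.34%) | 20 (12.66%) | 158 | 18 (48.65%) | 19 (51.35%) | 37 |
| | Mutant | 21 (80.77%) | 5 (19.23%) | 26 | 6 (15.38%) | 33 (84.62%) | 39 |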

TCGA, The Cancer Genome Atlas; NA, not available; KRAS, Kirsten rat sarcoma 2 viral oncogene homolog; EGFR, Epidermal Growth Factor Receptor.
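The percentages in Table 1 appear to be row percentages, that is, each count divided by the total for that category within its cohort rather than by the column total. The short Python check below reproduces the gender rows of the TCGA cohort under that reading; the helper name is hypothetical.

```python
def row_percent(count, row_total):
    """Share of a row total, as a percentage rounded to two decimals."""
    return round(100 * count / row_total, 2)

# (smoker, never-smoker) counts copied from Table 1, TCGA cohort
tcga_gender = {"Male": (70, 8), "Female": (89, 17)}

for label, (smoker, never) in tcga_gender.items():
    total = smoker + never
    print(label, total, row_percent(smoker, total), row_percent(never, total))
```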

Pathway analysis and epigenetically regulated lncRNAs

A total of 1,865 DMRs were selected as candidate targets for the DE-lncRNAs. To determine the functions of these target genes and their potential network connections, we used IPA to identify the gene networks that may have been affected by these DE-lncRNA target genes (Fig 3). The top ten significant canonical pathways based on the DMRs and DE-lncRNAs are shown in S2 Fig and S1–S3 Tables. Interactions between the DMRs and DE-lncRNAs were predicted using molecular networks based on the IPA molecular database. The most noticeable functional category between never-smoker and smoker LUAD tissues was the lipopolysaccharide (LPS)/IL-1 mediated inhibition of retinoid X receptor (RXR) function.

Fig 3. Network of epigenetically regulated genes identified using ingenuity pathway analysis.

Each network was displayed graphically with genes or gene products as nodes (different shapes represent different functional classes of gene products) and lines indicating the biological relationships between nodes. The molecular network in normal lung vs. smoker LUAD tissues (A), normal lung vs. never-smoker LUAD tissues (B), and never-smoker LUAD vs. smoker LUAD tissues (C).

A total of 86 candidate genes including six lncRNAs were identified by comparing smoker LUAD and normal tissues. Of the 43 candidate genes identified by comparing never-smoker LUAD and smoker LUAD tissues, 13 also displayed differences when compared to normal tissues. Although the majority of top functional pathways and related molecules overlapped when comparing 1) normal lung vs. smoker LUAD tissues and 2) normal lung vs. never-smoker LUAD tissues, notable differences were observed when comparing smoker LUAD and never-smoker LUAD tissues, including the LPS/IL-1-mediated inhibition of RXR function and nicotine degradation III.

We identified six lncRNAs that were significantly differentially expressed in normal lung vs. smoker LUAD tissues (Table 2). Among these, HOTAIR, Synapsin II (SYN2), MALAT1, and H19 were uniquely expressed in smoker LUAD tissues, while CYP4A22 antisense RNA 1 (CYP4A22-AS1) and Lnc-MUC2-1 expression overlapped in normal lung vs. never-smoker LUAD tissues. In addition, four lncRNAs were significantly differentially expressed in never-smoker LUAD vs. smoker LUAD tissues (Table 3). These findings suggest that the lncRNAs may be involved in CS-induced epigenetic alterations in patients with LUAD.

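Table 2
Epigenetically regulated lncRNAs in normal lung vs. smoker LUAD tissues.

| Gene ID | Regulation | FDR | Chr. | Class | Strand |
|---|---|---|---|---|---|
| CYP4A22-AS1* | Up | 7.E-09 | 1 | Intergenic, antisense | - |
| Lnc-MUC2-1* | Up | 2.E-04 | 11 | Intergenic | + |
| HOTAIR | Up | 2.E-07 | 12 | Intergenic, antisense | - |
| SYN2 | Down | 9.E-14 | 3 | Sense-overlapping | + |
| MALAT1 | Up | 9.E-04 | 11 | Antisense | + |
| H19 | Up | 1.E-03 | 11 | Intergenic | - |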

FDR, false discovery rate; Chr., chromosome.

* LncRNAs whose expression overlapped with normal lung vs. never-smoker LUAD tissues.

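Table 3
Gene validation between never-smoker LUAD and smoker LUAD tissues.

| Gene | Log fold change (expression) | P value (methylation) | LncRNA | GENT and MENT matching |
|---|---|---|---|---|
| SBSN | 6.4 | 9.E-14 | | O |
| RP11-474D1.3 | 6.0 | 4.E-02 | O | |
| MSLNL | 2.5 | 9.E-03 | | |
| SLC3A1 | 2.8 | 5.E-02 | | O |
| MSMB | 3.8 | 1.E-04 | | O |
| ATP11AUN | 3.6 | 1.E-03 | O | |
| TCP11 | 2.3 | 6.E-06 | | |
| KRTDAP | 3.4 | 8.E-07 | | |
| ALDH3A1 | 2.1 | 4.E-02 | | |
| ADAM6 | 4.3 | 2.E-04 | O | O |
| CYP4F3 | 2.1 | 2.E-09 | | |
| CTC-518B2.9 | 2.0 | 9.E-07 | O | |
| ADIPOQ | 2.6 | 2.E-03 | | O |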

SBSN, suprabasin; MSLNL, mesothelin like; SLC3A1, solute carrier family 3 member 1; MSMB, microseminoprotein beta; ATP11AUN, ATP11A upstream neighbor; TCP11, t-complex 11; KRTDAP, keratinocyte differentiation associated protein; ALDH3A1, aldehyde dehydrogenase 3 family member A1; ADAM6, ADAM metallopeptidase domain 6; CYP4F3, cytochrome P450 family 4 subfamily F member 3; ADIPOQ, adiponectin, C1Q and collagen domain containing.

Validation of gene expression profiles using MENT and GENT

First, we investigated the expression and methylation levels of 86 genes in normal lung vs. smoker LUAD tissues and 13 genes in never-smoker LUAD vs. smoker LUAD tissues using the GENT and MENT databases. When comparing the 86 genes in smoker LUAD and normal lung tissues, seven up-regulated and six down-regulated genes were inversely correlated with methylation (Table 4), while five of the 13 genes were inversely correlated with methylation in smoker LUAD compared to never-smoker LUAD tissues (Table 3).

Table 4
DEGs between normal lung and smoker LUAD tissues with inverse correlation in the GENT and MENT databases.

| Gene | Regulation | Chr. | Start (bp) | End (bp) | Size (bases) |
|---|---|---|---|---|---|
| AURKB | Up | 17 | 8,204,731 | 8,210,767 | 6,046 |
| CAV1 | Down | 7 | 116,524,785 | 116,561,185 | 36,401 |
| EGF | Up | 4 | 109,912,883 | 110,012,962 | 100,080 |
| KCNA5 | Down | 12 | 5,043,919 | 5,046,788 | 2,870 |
| MMP13 | Up | 11 | 102,942,992 | 102,955,734 | 12,743 |
| TEK | Down | 9 | 27,109,141 | 27,230,178 | 121,038 |
| AGR2 | Up | 7 | 16,791,811 | 16,833,433 | 41,623 |
| CCNB1 | Up | 5 | 69,167,010 | 69,178,245 | 11,236 |
| FGF2 | Down | 4 | 122,826,708 | 122,898,236 | 71,529 |
| HMGA1 | Up | 6 | 34,236,800 | 34,246,231 | 9,432 |
| HNF4A | Up | 20 | 44,355,700 | 44,434,596 | 78,897 |
| SOX17 | Down | 8 | 54,457,935 | 54,460,896 | 2,962 |
| TAL1 | Down | 1 | 47,216,290 | 47,232,373 | 16,084 |

Chr., chromosome; bp, base pair; AURKB, aurora kinase B; CAV1, caveolin 1; EGF, epidermal growth factor; KCNA5, potassium voltage-gated channel subfamily A member 5; MMP13, matrix metallopeptidase 13; TEK, TEK receptor tyrosine kinase; AGR2, anterior gradient 2; CCNB1, cyclin B1; FGF2, fibroblast growth factor 2; HMGA1, high mobility group AT-hook 1; HNF4A, hepatocyte nuclear factor 4 alpha; SOX17, SRY-box transcription factor 17; TAL1, TAL bHLH transcription factor 1.
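For the rows checked below, the Size column of Table 4 matches an inclusive base count, that is, end minus start plus one. The Python snippet illustrates this convention using coordinates copied from three rows of the table; the helper name is hypothetical.

```python
def inclusive_length(start, end):
    """Number of bases in a closed interval [start, end]."""
    return end - start + 1

# (start, end) coordinates copied from Table 4
rows = {
    "CAV1": (116_524_785, 116_561_185),
    "KCNA5": (5_043_919, 5_046_788),
    "SOX17": (54_457_935, 54_460_896),
}

for gene, (start, end) in rows.items():
    print(gene, inclusive_length(start, end))
```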

Validation of gene expression profiles using bisulfite sequencing

Finally, we performed bisulfite sequencing on ADAM metallopeptidase domain 6 (ADAM6), anterior gradient 2 (AGR2), aurora kinase B (AURKB), budding uninhibited by benzimidazoles 1 homolog beta (BUB1B), caveolin 1 (CAV1), cyclin B1 (CCNB1), forkhead box P3 (FOXP3), high mobility group AT-hook 1 (HMGA1), matrix metallopeptidase 13 (MMP13), and suprabasin (SBSN) using LUAD samples (S3 and S4 Figs, and Table 1). Four CpG sites in AGR2, AURKB, FOXP3, and HMGA1 displayed borderline significance (Fig 4 and Table 5).

Fig 4. Comparison of methylation levels in never-smoker and smoker LUAD tissues.

(A) Box plot showing the average methylation score. (B) Methylation profile plot showing differentially methylated CpG sites (DMCpG) by absolute position (red, never-smoker LUAD; blue, smoker LUAD).

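Table 5
DNA methylation differences for candidate genes in 52 never-smoker and 24 smoker LUAD tissues.

| Gene | Chromosome | CpG site | Mean beta value (never-smoker) | Mean beta value (smoker) | P value |
|---|---|---|---|---|---|
| ADAM6 | 14 | 106,438,118 | 0.28 | 0.25 | 0.38 |
| | | 106,438,144 | 0.30 | 0.26 | 0.33 |
| | | 106,438,159 | 0.88 | 0.84 | 0.14 |
| | | 106,438,164 | 0.34 | 0.33 | 0.75 |
| | | 106,438,176 | 0.38 | 0.37 | 0.67 |
| | | 106,438,219 | 0.79 | 0.81 | 0.68 |
| | | 106,438,221 | 0.80 | 0.78 | 0.48 |
| | | 106,438,231 | 0.35 | 0.32 | 0.55 |
| | | 106,438,251 | 0.82 | 0.81 | 0.94 |
| AGR2 | 7 | 16,845,220 | 0.30 | 0.39 | 0.07 |
| | | 16,845,331 | 0.61 | 0.62 | 0.83 |
| AURKB | 17 | 8,113,702 | 0.04 | 0.05 | 0.22 |
| | | 8,113,714 | 0.05 | 0.05 | 0.69 |
| | | 8,113,720 | 0.06 | 0.07 | 0.88 |
| | | 8,113,731 | 0.11 | 0.12 | 0.51 |
| | | 8,113,755 | 0.11 | 0.16 | 0.14 |
| | | 8,113,758 | 0.10 | 0.15 | 0.06 |
| | | 8,113,762 | 0.12 | 0.14 | 0.70 |
| | | 8,113,764 | 0.12 | 0.13 | 0.85 |
| | | 8,113,779 | 0.11 | 0.09 | 0.37 |
| | | 8,113,815 | 0.07 | 0.06 | 0.75 |
| | | 8,113,818 | 0.07 | 0.07 | 0.76 |
| | | 8,113,829 | 0.08 | 0.05 | 0.45 |
| BUB1B | 15 | 40,453,006 | 0.08 | 0.09 | 0.50 |
| | | 40,453,010 | 0.10 | 0.08 | 0.62 |
| | | 40,453,023 | 0.08 | 0.08 | 0.96 |
| | | 40,453,029 | 0.13 | 0.08 | 0.80 |
| | | 40,453,034 | 0.14 | 0.10 | 0.73 |
| | | 40,453,036 | 0.12 | 0.10 | 0.84 |
| | | 40,453,091 | 0.13 | 0.14 | 0.72 |
| | | 40,453,131 | 0.09 | 0.12 | 0.26 |
| | | 40,453,133 | 0.08 | 0.13 | 1.00 |
| | | 40,453,139 | 0.04 | 0.05 | 0.46 |
| | | 40,453,141 | 0.01 | 0.04 | 0.24 |
| CAV1 | 7 | 116,164,422 | 0.26 | 0.23 | 0.62 |
| | | 116,164,436 | 0.26 | 0.24 | 0.55 |
| | | 116,164,533 | 0.23 | 0.16 | 0.24 |
| CCNB1 | 5 | 68,462,676 | 0.07 | 0.06 | 0.83 |
| | | 68,462,690 | 0.08 | 0.08 | 0.20 |
| | | 68,462,711 | 0.10 | 0.10 | 0.36 |
| | | 68,462,714 | 0.10 | 0.12 | 0.09 |
| | | 68,462,743 | 0.12 | 0.12 | 0.57 |
| | | 68,462,756 | 0.10 | 0.09 | 0.93 |
| | | 68,462,762 | 0.09 | 0.10 | 0.16 |
| | | 68,462,768 | 0.09 | 0.08 | 0.99 |
| | | 68,462,783 | 0.05 | 0.06 | 0.33 |
| FOXP3 | X | 49,121,865 | 0.87 | 0.91 | 0.08 |
| | | 49,121,910 | 0.94 | 0.93 | 0.49 |
| HMGA1 | 6 | 34,206,134 | 0.05 | 0.05 | 0.95 |
| | | 34,206,152 | 0.28 | 0.28 | 0.36 |
| | | 34,206,159 | 0.25 | 0.27 | 0.96 |
| | | 34,206,175 | 0.29 | 0.30 | 0.80 |
| | | 34,206,192 | 0.28 | 0.26 | 0.09 |
| | | 34,206,214 | 0.38 | 0.36 | 0.30 |
| | | 34,206,270 | 0.19 | 0.16 | 0.13 |
| | | 34,206,277 | 0.30 | 0.27 | 0.06 |
| MMP13 | 11 | 102,826,680 | 0.59 | 0.58 | 0.90 |
| SBSN | 19 | 36,019,108 | 0.93 | 0.93 | 0.84 |
| | | 36,019,115 | 0.89 | 0.86 | 0.36 |
| | | 36,019,162 | 0.76 | 0.73 | 0.47 |
| | | 36,019,170 | 0.76 | 0.74 | 0.61 |
| | | 36,019,198 | 0.53 | 0.52 | 0.67 |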

ADAM6, ADAM metallopeptidase domain 6; AGR2, anterior gradient 2; AURKB, aurora kinase B; BUB1B, budding uninhibited by benzimidazoles 1 homolog beta; CAV1, caveolin 1; CCNB1, Cyclin B1; FOXP3, forkhead box P3; HMGA1, high mobility group AT-hook 1; MMP13, matrix metallopeptidase 13; SBSN, suprabasin.
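For the four genes highlighted in the text (AGR2, AURKB, FOXP3, and HMGA1), the CpG site with the lowest P value and its difference in mean beta value can be read off Table 5 programmatically. The Python sketch below uses values copied from Table 5; selecting the lowest-P site per gene and reporting smoker minus never-smoker beta are choices made for this illustration, not criteria stated by the authors.

```python
# (gene, CpG position, mean beta never-smoker, mean beta smoker, P value)
rows = [
    ("AGR2", 16_845_220, 0.30, 0.39, 0.07),
    ("AGR2", 16_845_331, 0.61, 0.62, 0.83),
    ("AURKB", 8_113_702, 0.04, 0.05, 0.22),
    ("AURKB", 8_113_714, 0.05, 0.05, 0.69),
    ("AURKB", 8_113_720, 0.06, 0.07, 0.88),
    ("AURKB", 8_113_731, 0.11, 0.12, 0.51),
    ("AURKB", 8_113_755, 0.11, 0.16, 0.14),
    ("AURKB", 8_113_758, 0.10, 0.15, 0.06),
    ("AURKB", 8_113_762, 0.12, 0.14, 0.70),
    ("AURKB", 8_113_764, 0.12, 0.13, 0.85),
    ("AURKB", 8_113_779, 0.11, 0.09, 0.37),
    ("AURKB", 8_113_815, 0.07, 0.06, 0.75),
    ("AURKB", 8_113_818, 0.07, 0.07, 0.76),
    ("AURKB", 8_113_829, 0.08, 0.05, 0.45),
    ("FOXP3", 49_121_865, 0.87, 0.91, 0.08),
    ("FOXP3", 49_121_910, 0.94, 0.93, 0.49),
    ("HMGA1", 34_206_134, 0.05, 0.05, 0.95),
    ("HMGA1", 34_206_152, 0.28, 0.28, 0.36),
    ("HMGA1", 34_206_159, 0.25, 0.27, 0.96),
    ("HMGA1", 34_206_175, 0.29, 0.30, 0.80),
    ("HMGA1", 34_206_192, 0.28, 0.26, 0.09),
    ("HMGA1", 34_206_214, 0.38, 0.36, 0.30),
    ("HMGA1", 34_206_270, 0.19, 0.16, 0.13),
    ("HMGA1", 34_206_277, 0.30, 0.27, 0.06),
]

def lowest_p_site(rows):
    """Map each gene to (position, smoker - never-smoker beta, P value)
    for its CpG site with the smallest P value."""
    best = {}
    for gene, pos, never, smoker, p in rows:
        if gene not in best or p < best[gene][2]:
            best[gene] = (pos, round(smoker - never, 2), p)
    return best

for gene, (pos, delta, p) in lowest_p_site(rows).items():
    print(gene, pos, delta, p)
```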

Discussion

In this study, we integrated DNA methylation, lncRNA expression, and mRNA expression profiles from TCGA, identified biomarker candidates, and validated our findings using public datasets from the GENT and MENT databases as well as an external cohort. Together, our findings contribute toward our understanding of the interplay between lncRNAs and DNA methylation and provide a map of the epigenetic landscape of lung cancer. In addition, this study is the first to reveal the potential role of lncRNAs in CS-associated epigenetic regulation in LUAD.

Based on the findings of previous reports, we expected to find a significant difference in epigenetic alterations between smoker and never-smoker LUAD tissues. Consistently, we found differences in regulatory genes and identified ten lncRNAs: HOTAIR, SYN2, MALAT1, and H19 in smoker LUAD vs. normal lung tissues, CYP4A22-AS1 and Lnc-MUC2-1 in both smoker and never-smoker LUAD vs. normal lung tissues, and RP11-474D1.3, ATP11AUN, ADAM6, and CTC-518B2.9 in smoker vs. never-smoker LUAD tissues. The main biochemical functions revealed by our analyses also differed between the two groups. For the differentially expressed transcripts in smoker LUAD tissues, the major enriched pathways were the coagulation system and granulocyte adhesion and diapedesis, whereas the primary pathways for transcripts in never-smoker LUAD tissues were axonal guidance signaling and atherosclerosis signaling. GENT and MENT analysis in these two tissue types revealed five genes, including ADAM6 and SBSN, that displayed an inverse correlation between gene expression and methylation levels.

Until now, only a small number of lncRNAs have been identified in CS-associated lung cancer, several of which have been suggested as possible diagnostic and prognostic biomarkers. For instance, the novel lncRNA SCAL1 was reported to be overexpressed in lung cancer cell lines as a result of CS-induced oxidative stress [28]. Moreover, other studies have suggested that SCAL1 expression may be regulated by Nuclear Factor Erythroid 2-Related Factor 2 (NRF2) and that it may mediate cytoprotective functions against CS-induced toxicity [28,35,36]. HOTAIR expression is also significantly up-regulated in lung cancer and correlates with metastasis and poor prognosis [37–42]; furthermore, Liu et al. found that HOTAIR up-regulation contributes toward CS-induced malignant transformation mediated by STAT3 signaling [27]. Elevated H19 expression has also been detected in lung cancer [43–45] and its overexpression has been observed in smokers compared to never-smokers [46]. One in vitro study investigated CS-induced increases in H19 expression and attributed the increase to the mono-allelic up-regulation of normally expressed alleles [29]. In addition, high MALAT1 expression has been identified in metastatic lung cancer and was shown to be an independent prognostic indicator of early-stage tumors [47], and further studies have reported MALAT1 to be involved in CS-induced epithelial-mesenchymal transition and malignant transformation via Enhancer of Zeste Homolog 2 (EZH2), a well-known epigenetic regulator [30,48–50].

The majority of previous studies have investigated possible molecular mechanisms and novel biomarkers associated with epigenetic changes using wet laboratory experiments. Integrated analyses based on bioinformatics methods and prediction may be more efficient for translational research, but such studies are currently lacking. Since a single lncRNA targets numerous transcripts and a single transcript is also regulated by numerous lncRNAs, lncRNAs can influence various functional pathways and form complicated regulatory networks. Consequently, it is difficult to rank candidate lncRNAs during experimental design and validation when exploring their functions. Considering this complexity, integrating datasets could be an effective and promising approach to infer functional networks and verify potential targets. Indeed, utilizing datasets and developing computational models to predict lncRNA associations and functional annotations are currently emerging fields [51,52].

In this study, we used bioinformatics methods to identify potential targets and their functions that may play critical roles in the control of lung cancer. The most significantly different functional category between never-smoker and smoker LUAD tissues was the LPS/IL-1 mediated inhibition of RXR function. RXRs are retinoid receptors that play a crucial role in regulating the growth and differentiation of normal and tumor cells [53], while retinoids are known for their role as epigenetic modifiers [54]. Su Man et al. previously observed that the effect of RXR gene methylation on prognosis differed significantly between never-smokers and smokers, and suggested that methylation-associated RXR gene down-regulation may play different roles in lung carcinogenesis depending on smoking status [55,56]. To some extent, our findings are consistent with those of this previous study and emphasize the importance of the identified molecules.

Besides the well-known lncRNAs mentioned earlier, we identified other significant DE-lncRNAs in this study, including SYN2, RP11-474D1.3, ATP11AUN, ADAM6, and CTC-518B2.9; however, their molecular mechanisms in CS-induced LUAD remain largely unknown. Since our findings suggest possible associations between these lncRNAs and lung cancer, we believe that their specific functions should be characterized experimentally.

Despite the important findings we have described, this study had several limitations. Firstly, the patients included in the TCGA database were mostly white, whereas the samples used for bisulfite sequencing validation were derived from Korean patients. Since genomic alterations, including epigenetic changes, can differ between races [57–59], these racial disparities may have affected our results. Secondly, the mechanisms of epigenetic regulation by lncRNAs in CS-induced lung cancer development were not confirmed, as this can be challenging; however, experimental strategies such as genetically manipulating the lncRNA locus or deleting the full-length lncRNA locus or its promoter sequence in vivo could provide further functional information. Thirdly, we assumed a negative correlation between DEGs and DMRs when searching for candidate genes, since methylation levels are generally negatively correlated with the expression levels of nearby genes [60]. However, when gene expression is tightly regulated or 5-hydroxymethylcytosine (5hmC) activates transcription, this trend is reversed [61,62]. Since we did not account for this effect in this study, more comprehensive algorithms will be required to determine the diversity of crosstalk between methylation, expression, and regulatory elements. Lastly, we did not analyze any environmental factor other than smoking due to a lack of information regarding the occupation or dwelling of the patients whose samples were used in this study. Recent reports have strongly associated exposure to outdoor particulate matter (PM10) [63,64] or indoor high-temperature cooking oil fumes [65] with lung cancer; therefore, controlling for such factors that affect epigenetic alterations would provide more accurate results.

In summary, we identified dysregulated lncRNAs that mediate DNA methylation in CS-associated LUAD using integrated analyses. Although the roles of these lncRNAs in LUAD are currently unclear, our findings suggest that their molecular mechanisms warrant further investigation. Therefore, the continued investigation of the lncRNAs identified in this study will aid the development of guidelines to assess individual risk for lung cancer and its prevention.

Acknowledgements

The specimens and data used in this study were provided by Korea University Anam Hospital.

References

FBray, JFerlay, ISoerjomataram, RLSiegel, LATorre, AJemal. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394424. 10.3322/caac.21492

DZong, XLiu, JLi, ROuyang, PChen. The role of cigarette smoke-induced epigenetic alterations in inflammation. Epigenetics Chromatin. 2019;12(1):65. 10.1186/s13072-019-0311-8

LJBuro-Auriemma, JSalit, NRHackett, MSWalters, YStrulovici-Barel, MRStaudt, et al. Cigarette smoking induces small airway epithelial epigenetic changes with corresponding modulation of gene expression. Human Mol Genet. 2013;22(23):47264738. 10.1093/hmg/ddt326

ABryant, RJCerfolio. Differences in epidemiology, histology, and survival between cigarette smokers and never-smokers who develop non-small cell lung cancer. Chest. 2007;132(1):185192. 10.1378/chest.07-0442

SSHecht. Tobacco smoke carcinogens and lung cancer. J Natl Cancer Inst. 1999;91(14):11941210. 10.1093/jnci/91.14.1194

B-QLiu, RPeto, Z-MChen, JBoreham, Y-PWu, J-YLi, et al. Emerging tobacco hazards in China: 1. Retrospective proportional mortality study of one million deaths. BMJ. 1998;317(7170):14111422. 10.1136/bmj.317.7170.1411

RNProctor. Tobacco and the global lung cancer epidemic. Nat Rev Cancer. 2001;1(1):8286. 10.1038/35094091

RSaba, OHalytskyy, NSaleem, IAOliff. Buccal epithelium, cigarette smoking, and lung cancer: review of the literature. Oncology. 2017;93(6):347353. 10.1159/000479796

SSharma, TKKelly, PAJones. Epigenetics in cancer. Carcinogenesis. 2009;31(1):2736. 10.1093/carcin/bgp220

10 

PAJones, SBBaylin. The epigenomics of cancer. Cell. 2007;128(4):683692. 10.1016/j.cell.2007.01.029

11 

PAJones, SBBaylin. The fundamental role of epigenetic events in cancer. Nat Rev Genet. 2002;3(6):415428. 10.1038/nrg816

12 

RDammann, MStrunnikova, USchagdarsurengin, MRastetter, MPapritz, UEHattenhorst, et al. CpG island methylation and expression of tumour-associated genes in lung carcinoma. Eur J Cancer. 2005;41(8):12231236. 10.1016/j.ejca.2005.02.020

13 

IKSundar, QYin, BSBaier, LYan, WMazur, DLi, et al. DNA methylation profiling in peripheral lung tissues of smokers and patients with COPD. Clin Epigenetics. 2017;9:38. 10.1186/s13148-017-0335-5

14 

RZhang, LLai, XDong, JHe, DYou, CChen, et al. SIPA1L3 methylation modifies the benefit of smoking cessation on lung adenocarcinoma survival: an epigenomic-smoking interaction analysis. Mol Oncol. 2019;13(5):12351248. 10.1002/1878-0261.12482

15 

THuang, XChen, QHong, ZDeng, HMa, YXin, et al. Meta-analyses of gene methylation and smoking behavior in non-small cell lung cancer patients. Sci Rep. 2015;5:8897. 10.1038/srep08897

16 

S-WChoi, H-WKim, J-WNam. The small peptide world in long noncoding RNAs. Brief Bioinform. 2019;20(5):18531864. 10.1093/bib/bby055

17 

JLRinn, HYChang. Genome Regulation by Long Noncoding RNAs. Annu Rev Biochem. 2012;81(1):145166. 10.1146/annurev-biochem-051410-092902

18 

TRMercer, MEDinger, JSMattick. Long non-coding RNAs: insights into functions. Nat Rev Genet. 2009;10(3):155159. 10.1038/nrg2521

19 

FJSlack, AMChinnaiyan. The role of non-coding RNAs in oncology. Cell. 2019;179(5):10331055. 10.1016/j.cell.2019.10.017

20 

JChen, RWang, KZhang, LBChen. Long non-coding RNAs in non-small cell lung cancer as biomarkers and therapeutic targets. J Cell Mol Med. 2014;18(12):24252436. 10.1111/jcmm.12431

21 

BSComer, MBa, CASinger, WTGerthoffer. Epigenetic targets for novel therapies of lung diseases. Pharmacol Ther. 2015;147:91110. 10.1016/j.pharmthera.2014.11.006

22 

MSun, XHLiu, KHLu, FQNie, RXia, RKong, et al. EZH2-mediated epigenetic suppression of long noncoding RNA SPRY4-IT1 promotes NSCLC cell proliferation and metastasis by affecting the epithelial-mesenchymal transition. Cell Death Dis. 2014;5:e1298. 10.1038/cddis.2014.256

23 

ZHou, WZhao, JZhou, LShen, PZhan, CXu, et al. A long noncoding RNA Sox2ot regulates lung cancer cell proliferation and is a prognostic indicator of poor survival. Int J Biochem Cell Biol. 2014;53:380388. 10.1016/j.biocel.2014.06.004

24 

EBZhang, DDYin, MSun, RKong, XHLiu, LHYou, et al. P53-regulated long non-coding RNA TUG1 affects cell proliferation in human non-small cell lung cancer, partly through epigenetically regulating HOXB7 expression. Cell Death Dis. 2014;5:e1243. 10.1038/cddis.2014.201

25 

RAGupta, NShah, KCWang, JKim, HMHorlings, DJWong, et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010;464(7291):10711076. 10.1038/nature08975

26 

Y Liu, B Wang, X Liu, L Lu, F Luo, X Lu, et al. Epigenetic silencing of p21 by long non-coding RNA HOTAIR is involved in the cell cycle disorder induced by cigarette smoke extract. Toxicol Lett. 2016;240(1):60–67. 10.1016/j.toxlet.2015.10.016

27 

Y Liu, F Luo, Y Xu, B Wang, Y Zhao, W Xu, et al. Epithelial-mesenchymal transition and cancer stem cells, mediated by a long non-coding RNA, HOTAIR, are involved in cell malignant transformation induced by cigarette smoke extract. Toxicol Appl Pharmacol. 2015;282(1):9–19. 10.1016/j.taap.2014.10.022

28 

P Thai, S Statt, CH Chen, E Liang, C Campbell, R Wu. Characterization of a novel long noncoding RNA, SCAL1, induced by cigarette smoke and elevated in lung cancer cell lines. Am J Respir Cell Mol Biol. 2013;49(2):204–211. 10.1165/rcmb.2013-0159RC

29 

F Liu, JK Killian, M Yang, RL Walker, JA Hong, M Zhang, et al. Epigenomic alterations and gene expression profiles in respiratory epithelia exposed to cigarette smoke condensate. Oncogene. 2010;29(25):3650–3664. 10.1038/onc.2010.129

30 

L Lu, F Luo, Y Liu, X Liu, L Shi, X Lu, et al. Posttranscriptional silencing of the lncRNA MALAT1 by miR-217 inhibits the epithelial-mesenchymal transition via enhancer of zeste homolog 2 in the malignant transformation of HBE cells induced by cigarette smoke extract. Toxicol Appl Pharmacol. 2015;289(2):276–285. 10.1016/j.taap.2015.09.016

31 

G Shin, TW Kang, S Yang, SJ Baek, YS Jeong, SY Kim. GENT: gene expression database of normal and tumor tissues. Cancer Inform. 2011;10:149–157. 10.4137/CIN.S7226

32 

SJ Baek, S Yang, TW Kang, SM Park, YS Kim, SY Kim. MENT: methylation and expression database of normal and tumor tissues. Gene. 2013;518(1):194–200. 10.1016/j.gene.2012.11.032

33 

H Jiang, R Lei, SW Ding, S Zhu. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics. 2014;15:182. 10.1186/1471-2105-15-182

34 

W Guo, P Fiziev, W Yan, S Cokus, X Sun, MQ Zhang, et al. BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data. BMC Genomics. 2013;14:774. 10.1186/1471-2164-14-774

35 

H-Y Cho, AE Jedlicka, SPM Reddy, TW Kensler, M Yamamoto, L-Y Zhang, et al. Role of NRF2 in protection against hyperoxic lung injury in mice. Am J Respir Cell Mol Biol. 2002;26(2):175–182. 10.1165/ajrcmb.26.2.4501

36 

R-H Hübner, JD Schwartz, P De Bishnu, B Ferris, L Omberg, JG Mezey, et al. Coordinate control of expression of Nrf2-modulated genes in the human small airway epithelium is highly responsive to cigarette smoking. Mol Med. 2009;15(7–8):203–219. 10.2119/molmed.2008.00130

37 

Y Zhuang, X Wang, HT Nguyen, Y Zhuo, X Cui, C Fewell, et al. Induction of long intergenic non-coding RNA HOTAIR in lung cancer cells by type I collagen. J Hematol Oncol. 2013;6:35. 10.1186/1756-8722-6-35

38 

W Zhao, Y An, Y Liang, XW Xie. Role of HOTAIR long noncoding RNA in metastatic progression of lung cancer. Eur Rev Med Pharmacol Sci. 2014;18(13):1930–1936.

39 

H Ono, N Motoi, H Nagano, E Miyauchi, M Ushijima, M Matsuura, et al. Long noncoding RNA HOTAIR is relevant to cellular proliferation, invasiveness, and clinical relapse in small-cell lung cancer. Cancer Med. 2014;3(3):632–642. 10.1002/cam4.220

40 

Z Liu, M Sun, K Lu, J Liu, M Zhang, W Wu, et al. The long noncoding RNA HOTAIR contributes to cisplatin resistance of human lung adenocarcinoma cells via downregualtion of p21(WAF1/CIP1) expression. PLoS One. 2013;8(10):e77293. 10.1371/journal.pone.0077293

41 

XH Liu, ZL Liu, M Sun, J Liu, ZX Wang, W De. The long non-coding RNA HOTAIR indicates a poor prognosis and promotes metastasis in non-small cell lung cancer. BMC Cancer. 2013;13:464. 10.1186/1471-2407-13-464

42 

T Nakagawa, H Endo, M Yokoyama, J Abe, K Tamai, N Tanaka, et al. Large noncoding RNA HOTAIR enhances aggressive biological behavior and is associated with short disease-free survival in human non-small cell lung cancer. Biochem Biophys Res Commun. 2013;436(2):319–324. 10.1016/j.bbrc.2013.05.101

43 

J Ren, J Fu, T Ma, B Yan, R Gao, Z An, et al. LncRNA H19-elevated LIN28B promotes lung cancer progression through sequestering miR-196b. Cell Cycle. 2018;17(11):1372–1380. 10.1080/15384101.2018.1482137

44 

Y Zhou, B Sheng, Q Xia, X Guan, Y Zhang. Association of long non-coding RNA H19 and microRNA-21 expression with the biological features and prognosis of non-small cell lung cancer. Cancer Gene Ther. 2017;24(8):317–324. 10.1038/cgt.2017.20

45 

E Zhang, W Li, D Yin, W De, L Zhu, S Sun, et al. c-Myc-regulated long non-coding RNA H19 indicates a poor prognosis and affects cell proliferation in non-small-cell lung cancer. Tumour Biol. 2016;37(3):4007–4015. 10.1007/s13277-015-4185-5

46 

R Kaplan, K Luettich, A Heguy, NR Hackett, BG Harvey, RG Crystal. Monoallelic up-regulation of the imprinted H19 gene in airway epithelium of phenotypically normal cigarette smokers. Cancer Res. 2003;63(7):1475–1482.

47 

P Ji, S Diederichs, W Wang, S Böing, R Metzger, PM Schneider, et al. MALAT-1, a novel noncoding RNA, and thymosin β4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene. 2003;22(39):8031–8041. 10.1038/sj.onc.1206928

48 

P Völkel, B Dupret, X Le Bourhis, P-O Angrand. Diverse involvement of EZH2 in cancer epigenetics. Am J Transl Res. 2015;7(2):175–193.

49 

L Gan, Y Yang, Q Li, Y Feng, T Liu, W Guo. Epigenetic regulation of cancer progression by EZH2: from biological insights to therapeutic potential. Biomark Res. 2018;6:10. 10.1186/s40364-018-0122-2

50 

LH Schmidt, T Spieker, S Koschmieder, J Humberg, D Jungen, E Bulk, et al. The long noncoding MALAT-1 RNA indicates a poor prognosis in non-small cell lung cancer and induces migration and tumor growth. J Thorac Oncol. 2011;6(12):1984–1992. 10.1097/JTO.0b013e3182307eac

51 

X Chen. Predicting lncRNA-disease associations and constructing lncRNA functional similarity network based on the information of miRNA. Sci Rep. 2015;5(1):13186. 10.1038/srep13186

52 

MM Ali, VS Akhade, ST Kosalai, S Subhash, L Statello, M Meryet-Figuiere, et al. PAN-cancer analysis of S-phase enriched lncRNAs identifies oncogenic drivers and biomarkers. Nat Commun. 2018;9(1):883. 10.1038/s41467-018-03265-1

53 

K Swisshelm, K Ryan, X Lee, HC Tsou, M Peacocke, R Sager. Down-regulation of retinoic acid receptor beta in mammary carcinoma cell lines and its up-regulation in senescing normal mammary epithelial cells. Cell Growth Differ. 1994;5(2):133–141.

54 

LJ Gudas. Retinoids induce stem cell differentiation via epigenetic changes. Semin Cell Dev Biol. 2013;24(10–12):701–705. 10.1016/j.semcdb.2013.08.002

55 

M-A Song, JL Freudenheim, TM Brasky, EA Mathé, JP McElroy, QA Nickerson, et al. Biomarkers of exposure and effect in the lungs of smokers, non-smokers and electronic cigarette users. Cancer Epidemiol Biomarkers Prev. 2019;29(2):443–451. 10.1158/1055-9965.EPI-19-1245

56 

SM Lee, JY Lee, JE Choi, SY Lee, JY Park, DS Kim. Epigenetic inactivation of retinoid X receptor genes in non-small cell lung cancer and the relationship with clinicopathologic features. Cancer Genet Cytogenet. 2010;197(1):39–45. 10.1016/j.cancergencyto.2009.10.008

57 

OD Lara, Y Wang, A Asare, T Xu, H-S Chiu, Y Liu, et al. Pan-cancer clinical and molecular analysis of racial disparities. Cancer. 2020;126(4):800–807. 10.1002/cncr.32598

58 

J Sui, YH Li, YQ Zhang, CY Li, X Shen, WZ Yao, et al. Integrated analysis of long non-coding RNA associated ceRNA network reveals potential lncRNA biomarkers in human lung adenocarcinoma. Int J Oncol. 2016;49(5):2023–2036. 10.3892/ijo.2016.3716

59 

MB Terry, JS Ferris, R Pilsner, JD Flom, P Tehranifar, RM Santella, et al. Genomic DNA methylation among women in a multiethnic New York City birth cohort. Cancer Epidemiol Biomarkers Prev. 2008;17(9):2306–2310. 10.1158/1055-9965.EPI-08-0312

60 

JT Bell, AA Pai, JK Pickrell, DJ Gaffney, R Pique-Regi, JF Degner, et al. DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines. Genome Biol. 2011;12(1):R10. 10.1186/gb-2011-12-1-r10

61 

NE Banovich, X Lan, G McVicker, B van de Geijn, JF Degner, JD Blischak, et al. Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet. 2014;10(9):e1004663. 10.1371/journal.pgen.1004663

62 

H Wu, Y Zhang. Mechanisms and functions of Tet protein-mediated 5-methylcytosine oxidation. Genes Dev. 2011;25(23):2436–2452. 10.1101/gad.179184.111

63 

DK Lamichhane, HC Kim, CM Choi, MH Shin, YM Shim, JH Leem, et al. Lung cancer risk and residential exposure to air pollution: a Korean population-based case-control study. Yonsei Med J. 2017;58(6):1111–1118. 10.3349/ymj.2017.58.6.1111

64 

D Consonni, M Carugno, S De Matteis, F Nordio, G Randi, M Bazzano, et al. Outdoor particulate matter (PM10) exposure and lung cancer risk in the EAGLE study. PLoS One. 2018;13(9):e0203539. 10.1371/journal.pone.0203539

65 

Z-Y Jin, M Wu, R-Q Han, X-F Zhang, X-S Wang, A-M Liu, et al. Household ventilation may reduce effects of indoor air pollutants for prevention of lung cancer: a case-control study in a Chinese population. PLoS One. 2014;9(7):e102685. 10.1371/journal.pone.0102685