PLoS ONE
MiR-30a and miR-200c differentiate cholangiocarcinomas from gastrointestinal cancer liver metastases

Competing Interests: The authors have declared that no competing interests exist.

Article Type: Research Article
Abstract

Prior studies have demonstrated the utility of microRNA assays for predicting some cancer tissue origins, but these assays need to be further optimized for predicting the tissue origins of adenocarcinomas of the liver. We performed microRNA profiling on 195 frozen primary tumor samples using 14 types of tumors that were either adenocarcinomas or differentiated from adenocarcinomas. The 1-nearest neighbor method predicted tissue-of-origin in 33 samples of a test set, with an accuracy of 93.9% at feature selection p values ranging from 10⁻⁴ to 10⁻¹⁰. According to binary decision tree analyses, the overexpression of miR-30a and the underexpression of miR-200 family members (miR-200c and miR-141) differentiated intrahepatic cholangiocarcinomas from extrahepatic adenocarcinomas. When binary decision tree analyses were performed using the test set, the prediction accuracy was 84.8%. The overexpression of miR-30a and the reduced expressions of miR-200c, miR-141, and miR-425 could distinguish intrahepatic cholangiocarcinomas from liver metastases from the gastrointestinal tract.

Park, Jeong, Cho, Cho, Cheon, Choi, Park, Kim, and Sureban: MiR-30a and miR-200c differentiate cholangiocarcinomas from gastrointestinal cancer liver metastases

Introduction

Predicting where a metastatic tissue originated is important for the clinical management of patients with metastatic cancers, and microRNA profiling has been used successfully to predict the tissue-of-origin for metastases [1–5]. Rosenfeld et al. reported a prediction accuracy of 89% using the first generation of Rosetta Genomics microRNA assays [2]. Using the second generation of these assays, Meiri et al. reported an 85% overall accuracy, and a 90% sensitivity for single-answer cases in an independent sample set [3]. Using assays based on 47 microRNAs, Ferracin et al. reported prediction accuracies of 100% and 78% for primary cancers and metastases, respectively [4]. Sokilde et al. reported that their 132 microRNA-based assays correctly predicted tissue-of-origin in 88% of metastases [5]. While these prior microarray studies have demonstrated the utility of microRNA assays for predicting cancer tissue-of-origin, these assays need to be further optimized to predict tissue-of-origin for liver metastases.

Being able to identify liver as the source, and to identify liver cancer by type, is important because the liver is a common site for cancer metastasis. It is therefore clinically important to distinguish cancer metastases from primary liver cancer to plan for optimal patient care. Whereas hepatocellular carcinoma (HCC) is easily distinguished from liver metastases by histology, intrahepatic cholangiocarcinoma, which represents 4–6% of primary liver cancers, is often difficult to differentiate from metastatic adenocarcinoma [6]. Histologically, primary intrahepatic cholangiocarcinomas are adenocarcinomas that resemble the metastases of common solid tumors such as colorectal or pancreatic adenocarcinomas [7, 8]. Being able to distinguish between these two disease entities is important because the treatment plans and prognoses are different, and there are no specific immunohistochemistry markers for intrahepatic cholangiocarcinomas. According to Chiu et al., CK7-positive/CK20-negative staining was seen in 9 out of 12 (75%) cholangiocarcinomas, but in none of the 25 colorectal cancer metastases examined, whereas CK7-negative/CK20-positive staining was seen in 1 out of 12 (8%) cholangiocarcinomas, and in 20 out of 25 (80%) colorectal cancer metastases [8]. The sensitivity of CDX2 for colorectal cancer is 99%, but CDX2 is also expressed in up to 21% of intrahepatic cholangiocarcinomas [9, 10]. Even microRNA-based assays have demonstrated only relatively weak performance for predicting tissue-of-origin for digestive system cancers, especially for cholangiocarcinomas. The microRNA classifiers of Sokilde et al. failed to predict the correct tissue-of-origin for most liver metastases, classifying them instead as cholangiocarcinomas [5]. Therefore, these authors had to add a rule to their classifier model that metastasis sites cannot be classified as primary tumors [5].
Although cholangiocarcinomas have been included in the second-generation Rosetta Genomics microRNA assay, 4 of 13 biliary tract cancers (30.8%) were either misclassified, or ambiguously predicted to be pancreatobiliary cancers using it [3]. In addition, 5 of 9 pancreatic adenocarcinomas (55.5%) were either misclassified, or ambiguously predicted to be pancreatobiliary cancers [3]. There has been one microRNA study focused on comparing pancreatic cancers with intrahepatic cholangiocarcinomas, but it was limited by a small sample size (n = 9) of pancreatic cancer cases, according to the authors [11]. Moreover, the authors did not directly compare pancreatic cancers with cholangiocarcinomas, only each type of cancer with adjacent normal tissue [11]. The Cancer Genome Atlas (TCGA) project performed integrative genomic analyses including small RNA sequencing analyses of 36 cholangiocarcinoma samples [12]. Integrative clustering from TCGA data revealed the dominant role of cell-of-origin patterns [12]. These data from bulk primary tumors, however, cannot be directly used to determine the tissue-of-origin of liver metastases because of the confounding signals from the microenvironment. Hepatocyte-specific microRNAs, for example, may prevent bulk primary tumor analyses from capturing real microRNA signatures discriminating cholangiocarcinoma cells from gastrointestinal adenocarcinoma cells.

To fully address these issues, the current analyses incorporated data from in situ hybridization and cell line analyses, focusing on the identification of cholangiocarcinoma-specific microRNA profiles. More importantly, the availability of metastasectomy samples provided us with a unique opportunity to validate the predictive value of discriminatory microRNAs identified in the primary tumors. Using our genomics expertise [13, 14], we performed single-protocol microRNA profiling analyses using frozen primary tumors originating from the lung, pancreas, hepatobiliary tree, kidney, bowel, genital system, and stomach; the most common sites for carcinomas of unknown primary-origin according to autopsy studies [15]. We enriched with intrahepatic cholangiocarcinomas, which are relatively common in Korea [16]. As a result, we report microRNA signatures that could differentiate adenocarcinomas in the liver according to their tissues-of-origin.

Materials and methods

Tissue samples

Samples were collected at the time of surgery from patients at the National Cancer Center and the Soonchunhyang University Hospital in Korea, between 2001 and 2013. Specimens were kept frozen in liquid nitrogen until analysis. The training-sample set was composed of 195 frozen primary tumor samples comprising 14 tumor types that were mostly adenocarcinomas (Table 1). Primary tumors included 23 intrahepatic cholangiocarcinomas (procured between 2001 and 2007), 29 colorectal adenocarcinomas, six gastric adenocarcinomas, 13 pancreatic ductal adenocarcinomas, ten HCCs, 26 lung adenocarcinomas, six small-cell lung cancers (SCLCs), 23 breast adenocarcinomas, 12 endometrial endometrioid adenocarcinomas, 11 ovarian serous adenocarcinomas, nine renal-cell carcinomas (RCCs), eight prostate adenocarcinomas, 11 thyroid papillary adenocarcinomas, and eight acute leukemias (Table 1). The test set was composed of two intrahepatic cholangiocarcinomas (procured in 2011) and 31 liver metastases originating from colon (n = 29) and ovaries (n = 2).

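Table 1
Clinicopathological characteristics of tumor samples.

| Primary tumor | No. | Male | Median age (yr) | Subtype |
| --- | --- | --- | --- | --- |
| Training set | | | | |
| Cholangiocarcinoma | 23 | 15 (65%) | 60 | |
| Colorectal | 29 | 19 (66%) | 63 | |
| Gastric | 6 | 3 (50%) | 70 | intestinal 3, diffuse 3 |
| Pancreatic | 13 | 9 (69%) | 59 | ductal 13 |
| HCC | 10 | 9 (90%) | 59 | |
| Lung (adenocarcinoma) | 26 | 12 (46%) | 62 | |
| Lung (SCLC) | 6 | 5 (83%) | 59 | |
| Breast | 23 | 0 | 44 | HER2 16, TN 2, luminal 5 |
| Endometrial | 12 | 0 | 58 | endometrioid 12 |
| Ovarian | 11 | 0 | 54 | serous 11 |
| Renal | 9 | 4 (44%) | 61 | clear cell 9 |
| Prostate | 8 | 8 (100%) | 67 | |
| Thyroid | 11 | 0 | 47 | papillary 11 |
| Leukemia | 8 | 5 (63%) | 36 | ALL 5, AML 3 |
| Test set | | | | |
| Cholangiocarcinoma | 2 | 2 (100%) | 58 | |
| Liver metastases | 31 | 20 (65%) | 61 | colorectal 29, ovarian 2 |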

HCC, hepatocellular carcinoma; SCLC, small-cell lung cancer; TN, triple-negative; ALL, acute lymphocytic leukemia; AML, acute myelogenous leukemia.

MicroRNA microarrays

A 10 μm-thick top slide from each tissue sample was stained with hematoxylin and eosin. Guided by this top slide, the remaining tissue was macrodissected to trim non-tumorous stromal components. The macrodissected frozen tissue sample (>50% tumor content) was then mechanically crushed in liquid nitrogen, homogenized, and subjected to RNA isolation using TRI reagent (Thermo Fisher Scientific, Waltham, MA), according to the manufacturer’s instructions. Total RNA was then subjected to DNase I treatment. After confirming that ribosomal RNA bands were intact, we performed poly-A tailing on 500 ng of total RNA. The FlashTag Biotin HSR Labeling Kit (Genisphere LLC, Hatfield, PA) was used to join a proprietary biotin-labeled dendrimer molecule to the 3′ ends using DNA ligase. Labeled samples were then hybridized to GeneChip miRNA 2.0 microarrays (Affymetrix, Santa Clara, CA) at 48°C for 16 h, washed, stained with a Streptavidin-PE solution, and scanned. GeneChip miRNA 2.0 microarrays are based on miRBase (version 15) and contain 15,644 mature microRNA probe sets from 131 organisms. All CEL files were robust multi-array average (RMA)-normalized. After filtering out star-form microRNAs, we subjected 914 human microRNAs to further analyses for this study.

Immunohistochemistry

All cases in the test set were subjected to the ImmPRESS peroxidase detection system (Vector Laboratories, MP-7401 and MP-7402) to detect CDX2, CK20, CK7, and CA125 proteins. The following antibodies were used in this study: mouse monoclonal anti-CK7 antibody (1:100; Thermo Scientific, MA1-06316), rabbit monoclonal anti-CK20 antibody (1:100; Abcam, ab76126), mouse monoclonal anti-CDX2 antibody (1:100; BioGenex, MU392-UC), and mouse monoclonal anti-CA125 antibody (1:50; Thermo Scientific, MA5-11579). Briefly, frozen tissue sections were fixed with acetone for 10 min and immersed for 10 min in 3% hydrogen peroxide to block endogenous peroxidase activity. After washing in PBS, the sections were incubated in the normal blocking serum provided in the kit. The sections were then incubated for 30 min at room temperature with the diluted primary antibodies. Negative controls were prepared by omitting the primary antibody and substituting diluent. The sections were then incubated with the appropriate secondary antibody conjugated with horseradish peroxidase for 30 min at room temperature. Subsequently, the sections were subjected to colorimetric detection with ImmPact DAB substrate (Vector Laboratories, SK-4105). The slides were counterstained with Mayer’s hematoxylin for 10 s. Immunohistochemical evaluation was performed by two pathologists who were blinded to any clinical information. Nuclear staining for CDX2 and cytoplasmic staining for CK20, CK7, and CA125 were assessed in the tumor cells, and scored according to the percentage of positively stained tumor cells: negative, less than 5%; equivocal, 5–50%; positive, more than 50%.
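The three-level scoring rule described above can be written out as a short function. This is only an illustrative sketch: the function name `score_staining` is hypothetical, and the handling of the exact boundary values (5% and 50% counted as equivocal) is our reading of the stated cutoffs rather than something the protocol spells out.

```python
def score_staining(percent_positive: float) -> str:
    """Map the percentage of positively stained tumor cells to a category.

    Cutoffs follow the text: negative < 5%, equivocal 5-50%, positive > 50%.
    Treating exactly 5% and exactly 50% as equivocal is an assumption.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 5:
        return "negative"
    if percent_positive <= 50:
        return "equivocal"
    return "positive"


# Hypothetical readings, not study data.
for pct in (2, 30, 80):
    print(pct, score_staining(pct))
```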

Statistical analyses

RMA-summarized microarray data were analyzed using BRB-Arraytools software (NCI, Bethesda, MD) [17]. Principal component analyses (PCA) were performed using 1-correlation as a distance metric. The 1-nearest neighbor (1-NN) algorithm and differentially expressed microRNAs were used for class prediction. We predicted the primary tumor tissue-of-origin by applying the 1-NN classifier to the test set composed of metastasectomy samples and cholangiocarcinomas. MicroRNAs that were differentially expressed in the training subset were employed to predict the tissue-of-origin in the test set. Binary decision tree analyses were also used to build a microRNA model for predicting tissue-of-origin. Branches were selected at each node of the decision tree using the 1-NN classifiers and microRNAs that were differentially expressed between two tumor types (Fig 1B).
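For readers unfamiliar with the 1-NN approach, the idea can be sketched in a few lines. The sketch below is not the BRB-Arraytools implementation. It assumes that the 1 − correlation distance (stated in the text only for PCA) is also used for the nearest-neighbor step, that feature selection has already been done, and it uses made-up log2 signal values; the names `pearson`, `predict_1nn`, and `TRAINING` are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def predict_1nn(training: List[Tuple[str, Sequence[float]]],
                sample: Sequence[float]) -> str:
    """Return the label of the training profile closest to the sample.

    Distance is 1 - Pearson correlation (an assumption for this sketch).
    """
    nearest_label, _ = min(
        training, key=lambda item: 1.0 - pearson(item[1], sample)
    )
    return nearest_label


# Hypothetical log2 signals for (miR-30a, miR-200c, miR-141); not study data.
TRAINING = [
    ("CHOL", [11.1, 10.9, 4.8]),
    ("COAD", [9.5, 13.8, 9.1]),
]

print(predict_1nn(TRAINING, [11.0, 11.2, 5.0]))
print(predict_1nn(TRAINING, [9.6, 13.5, 9.0]))
```

With a single neighbor, the prediction depends entirely on the closest training profile, which is one reason the composition of the training set matters for this kind of classifier.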

MicroRNAs for tissue origin.
Fig 1

(A) Prediction accuracies of 1-NN analyses on the test set at feature selection p < 10⁻¹⁰, after allocating different numbers of cholangiocarcinomas to the training set according to chronological order. (B) Decision tree analysis. MicroRNAs that were differentially expressed at a feature selection p cutoff of 0.00005 were used to predict the tissue of origin at each node of the decision tree. Samples for node no. 7 (surrounded by black broken lines) were further evaluated in Fig 1C and 1D. (C) PCA plot for samples in node no. 7 of (B), based on 914 microRNAs. Each sphere represents one sample and ‘1-correlation’ was used as a distance metric. Cholangiocarcinomas (shown in red), gastrointestinal/pancreatic cancers (shown in green), and non-digestive system cancers (shown in blue) clustered separately. (D) Expression profiles for microRNAs in node no. 7 of (B) in the training set. Expression profiles of 3 microRNAs underexpressed in cholangiocarcinomas compared with extrahepatic cancers in digestive and non-digestive systems (upper panel) and 1 overexpressed microRNA (lower panel) at feature selection p < 0.00005. Since the apparent overexpression of miR-122 in cholangiocarcinomas is presumably due to contaminating hepatocytes in the sample, we decided to exclude miR-122 from the set of discriminatory microRNAs comprising node no. 7 of the decision tree. The heatmap was generated using a log2-pseudocolor image with microRNA centering. Red and blue colors denote high and low expression of microRNAs, respectively. A scale bar for the log2-expression is shown at the bottom. (E) Expression profiles of miR-30a and miR-200c between 25 intrahepatic cholangiocarcinomas (in the training and test sets) and 29 colorectal cancer metastases in the test set. Scatter plots display median microarray signal values (***p < 10⁻¹⁴ and p < 10⁻⁵; t-test). (F) Real-time RT-PCR profiles of miR-30a and miR-200c in cell lines. Scatter plots display median values of RNU6-normalized −log2 Ct values (p = 0.22 and p = 0.30, respectively; t-test). (G) Expression profiles of the cholangiocarcinoma signature in TCGA small RNA sequencing data. Scatter plots display median values of normalized RPKM (*p < 10⁻¹⁰; t-test).

The Cancer Genome Atlas (TCGA) data analysis

TCGA small RNA sequencing data were downloaded from the Genomic Data Commons Data Portal (GDC, http://portal.gdc.cancer.gov) and normalized by z-scores. The cholangiocarcinoma signature score was calculated based on the weighted average of the normalized reads per kilobase of transcript, per million mapped reads (RPKM) of five microRNAs for each tumor.
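A signature score of this kind can be illustrated as follows. The weights are not given in the text, so the sketch assumes a weight of +1 for microRNAs increased in cholangiocarcinoma and −1 for those decreased, and it standardizes each microRNA across tumors using the population standard deviation. Both choices, the function names, and the toy expression values are assumptions made for illustration only.

```python
from statistics import mean, pstdev
from typing import Dict, List


def z_scores(values: List[float]) -> List[float]:
    """Standardize one microRNA across tumors (population SD; an assumption)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant expression vector")
    return [(v - mu) / sd for v in values]


def signature_scores(expression: Dict[str, List[float]],
                     weights: Dict[str, float]) -> List[float]:
    """Weighted average of z-scored expression, one score per tumor."""
    normalized = {mirna: z_scores(expression[mirna]) for mirna in weights}
    total_weight = sum(abs(w) for w in weights.values())
    n_tumors = len(next(iter(normalized.values())))
    return [
        sum(weights[m] * normalized[m][i] for m in weights) / total_weight
        for i in range(n_tumors)
    ]


# Hypothetical expression values for three tumors; not TCGA data.
EXPRESSION = {
    "miR-30a": [12.0, 9.0, 9.0],
    "miR-200c": [10.0, 14.0, 14.0],
}
# Hypothetical weights: +1 if increased in cholangiocarcinoma, -1 if decreased.
WEIGHTS = {"miR-30a": 1.0, "miR-200c": -1.0}

SCORES = signature_scores(EXPRESSION, WEIGHTS)
print([round(s, 3) for s in SCORES])
```

In this toy example the first tumor, with high miR-30a and low miR-200c, receives the highest score, which is the direction expected for a cholangiocarcinoma-like profile.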

Ethics statement

National Cancer Center institutional review board waived the requirement for informed consent to participate in this study (NCCNCS12633). All data/samples were fully anonymized and the medical records were accessed from Sep 2012 to Sep 2014.

Results

Nearest neighbor predictions based on 195 primary tumors

When microRNAs that were differentially expressed among the 195 primary tumors in the training set were applied to the test set of liver metastasectomy samples, the prediction accuracy was consistently 93.9% at p values ranging from 10⁻⁴ to 10⁻¹⁰. Using 229 microRNAs that were differentially expressed among tumor types at p < 10⁻¹⁰ (S1 Table), 93.9% of the test set samples (31 of 33) were correctly identified for tissue-of-origin. There were two misclassified samples: an ovarian cancer (OV) metastasis sample (predicted to be a breast cancer metastasis) and a colorectal cancer metastasis (predicted to be a gastric cancer metastasis).

These results were obtained when 23 cholangiocarcinoma samples (procured between 2001 and 2007) were allocated to the training set, and two cholangiocarcinoma samples (procured in 2011) were allocated to the test set. To rule out the possibility of overfitting, we tested the performance of our discriminatory microRNAs using various class labeling schemes. First, we allocated different numbers of cholangiocarcinomas to the training set according to chronological order and performed the same 1-NN predictions for the test set (at feature selection p < 10⁻¹⁰). The overall prediction accuracies were 91.4% or higher for various training-to-test allocation schemes (Fig 1A). Second, we conducted the same 1-NN class prediction analyses by randomly dividing the whole set of 25 cholangiocarcinoma samples into two (training/test) subsets at a 2-to-1 ratio. When we evaluated the prediction accuracy of the 1-NN prediction at feature selection p < 10⁻¹⁰ for each random test set, the median prediction accuracy was 90% (range, 81.0–94.7%) in 100 random datasets. Thus, high (≥ 90%) prediction accuracies throughout various class labeling schemes indicate the robustness of our tissue-specific discriminatory microRNAs in predicting the tissue-of-origin for adenocarcinomas in the liver.

Decision tree analyses

Leukemia, thyroid, prostate, renal, neuroendocrine, and hepatocellular cancers (nodes no. 1–6)

To enhance the potential clinical utility of our microRNA profiles, we also employed a binary decision tree-based classification with some modification of similar schemes. In this approach, the tissue-of-origin was assigned by selecting one of the two branches at each node using the 1-NN algorithm, in order to predict the primary tissue origins of the metastases, especially the liver metastases. Branches were selected at each node of the decision tree using microRNAs that were differentially expressed between two tumor types.
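The node-by-node logic can be pictured as a cascade in which each node asks whether a sample belongs to one tumor type or to "the rest", using only that node's discriminatory microRNAs. The sketch below is a simplified illustration and not the actual tree in Fig 1B: it uses squared Euclidean distance for the 1-NN step, two hypothetical nodes, and invented expression values; `Node`, `takes_branch`, and `classify` are names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Profile = Dict[str, float]  # microRNA name -> log2 expression


@dataclass
class Node:
    label: str                            # tumor type split off at this node
    features: List[str]                   # discriminatory microRNAs for this node
    training: List[Tuple[bool, Profile]]  # (belongs_to_label, training profile)


def distance(a: Profile, b: Profile, features: List[str]) -> float:
    """Squared Euclidean distance over the node's microRNAs (an assumption)."""
    return sum((a[f] - b[f]) ** 2 for f in features)


def takes_branch(node: Node, sample: Profile) -> bool:
    """1-NN decision: does the nearest training profile belong to node.label?"""
    nearest_is_label, _ = min(
        node.training,
        key=lambda item: distance(item[1], sample, node.features),
    )
    return nearest_is_label


def classify(nodes: List[Node], sample: Profile,
             fallback: str = "unclassified") -> str:
    """Walk the nodes in order and stop at the first branch the sample takes."""
    for node in nodes:
        if takes_branch(node, sample):
            return node.label
    return fallback


# Two hypothetical nodes with invented values; not the study's tree or data.
NODES = [
    Node("HCC", ["miR-122"],
         [(True, {"miR-122": 15.0}), (False, {"miR-122": 8.0})]),
    Node("CHOL", ["miR-30a", "miR-200c"],
         [(True, {"miR-30a": 11.1, "miR-200c": 10.9}),
          (False, {"miR-30a": 9.6, "miR-200c": 13.8})]),
]

print(classify(NODES, {"miR-122": 9.0, "miR-30a": 11.0, "miR-200c": 11.2}))
print(classify(NODES, {"miR-122": 8.5, "miR-30a": 9.4, "miR-200c": 13.5}))
```

One consequence of this design is that an error at an early node cannot be corrected later, so the order of the nodes matters.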

According to our unsupervised PCA analysis, leukemias, thyroid cancers, prostate cancers, RCCs, SCLCs, and HCCs formed their own distinct clusters. As initial steps in the decision tree scheme, each of these six tumor types, with distinct microRNA profiles, was differentiated from the rest of the samples using differentially expressed microRNAs at feature selection of p < 0.00005 between the two groups diverging from each node (Fig 1B). The discriminatory microRNAs at each node of the decision tree are summarized in S2 Table. The miR-181 family, which was much more abundant in leukemia than in solid tumors [18], was the most characteristic microRNA at node no. 1 (leukemia vs. non-leukemia) of the decision tree (Fig 2). Thyroid-specific miR-138 and miR-146b-5p were most characteristic at node no. 2 (thyroid vs. non-thyroid) [19]. At node no. 3 (prostate vs. non-prostate), prostate cancer was distinguished from the other tumors by the overexpression of miR-133a and miR-133b (Fig 2) [20]. Overexpressions of miR-204 and miR-122 in RCCs and HCCs were most characteristic at nodes no. 4 (RCC vs. non-RCC) and no. 6 (HCC vs. non-HCC), respectively (Fig 2) [21, 22]. Of note, miR-216 and miR-217 were overexpressed in SCLC (Fig 3).

Expression profiles of selective discriminatory microRNAs of each node in the training set.
Fig 2

Discriminatory microRNAs were defined as microRNAs differentially expressed between two branches at each node of the decision tree at p < 0.00005. The tissue of origin was assigned by selecting one of the two branches at each node using these discriminatory microRNAs. Acute leukemia (AL), thyroid cancer (THCA), prostate adenocarcinoma (PRAD), renal cell carcinoma (KIRC), small cell lung carcinoma (SCLC), hepatocellular carcinoma (LIHC), cholangiocarcinoma (CHOL), colorectal adenocarcinoma (COAD), gastric adenocarcinoma (STAD), pancreatic adenocarcinoma (PAAD), lung adenocarcinoma (LUAD), breast adenocarcinoma (BRCA), uterine endometrial carcinoma (UCEC), ovarian cancer (OV).

Discriminatory microRNAs for small cell lung cancer.
Fig 3

Discriminatory microRNAs comprising the node no. 6 of the decision tree that differentiate small cell lung cancer (SCLC) from primary lung adenocarcinoma (LUAD) and extrapulmonary cancers (*p < 10⁻¹⁰; #p = 2 × 10⁻⁹; t-test).

Intrahepatic cholangiocarcinoma (node no. 7)

At node no. 7 (cholangiocarcinoma vs. non-cholangiocarcinoma) of the decision tree, there were five differentially expressed microRNAs between cholangiocarcinomas and extrahepatic cancers of digestive and non-digestive systems at p < 0.00005. When solitary lesions are found in patients with known primary tumors, a liver biopsy is usually performed to distinguish between primary and metastatic liver cancers, given that metastasectomy is indicated only in selected clinical settings [23–25]. However, unlike HCC, intrahepatic cholangiocarcinoma is an adenocarcinoma that histologically resembles tumors originating from the pancreas, stomach, or colon, and currently lacks any validated, tissue-specific immunohistochemistry markers (Fig 1B). Given the differences in treatment strategy (local versus systemic) for these two diseases, there is an important unmet clinical need to develop a method for distinguishing between them.

At node no. 7, the overexpression of miR-122 and miR-30a, and the underexpression of the miR-200 family (miR-200c and miR-141), were most characteristic for cholangiocarcinoma. However, miR-122 is expressed at an extremely high level in both normal liver and in HCCs [22, 26], but at a relatively low level in cholangiocarcinoma cell lines [27]. We therefore reasoned that the apparent overexpression of miR-122 in cholangiocarcinomas might be primarily due to contaminating hepatocytes in bulk tumor samples, and tested for this possibility using in situ hybridization experiments. Indeed, in situ assessment demonstrated very low miR-122 expression in cholangiocarcinomas and moderate-to-strong expression in HCCs (S1 Fig). Since the level of miR-122 expression in bulk tumor cannot differentiate between cholangiocarcinoma and liver metastasis, we decided to exclude miR-122 from the set of discriminatory microRNAs characterizing node no. 7 and defined the remaining four microRNAs as a cholangiocarcinoma signature (Table 2 and Fig 1D).

Table 2
The cholangiocarcinoma signature for the differential expression of microRNAs between intrahepatic cholangiocarcinomas and extrahepatic adenocarcinomas originating from the colon, stomach, pancreas, lung, breast, uterus, and ovary, at p < 0.00005 (node no. 7 of the decision tree; miR-122 was omitted from the list).

| Probe set | p | t-value | CHOL¹ | Extrahepatic | FC² |
| --- | --- | --- | --- | --- | --- |
| Increased in cholangiocarcinoma | | | | | |
| miR-30a | 3.7 × 10⁻⁵ | 4.26 | 11.12 | 9.55 | 1.57 |
| Decreased in cholangiocarcinoma | | | | | |
| miR-200c | <1 × 10⁻⁷ | -9.101 | 10.87 | 13.84 | -2.97 |
| miR-141 | <1 × 10⁻⁷ | -6.434 | 4.79 | 9.12 | -4.33 |
| miR-425 | 2 × 10⁻⁶ | -4.937 | 10.5 | 11.31 | -0.81 |

¹CHOL, cholangiocarcinoma

²FC, expression fold change of cholangiocarcinoma to extrahepatic adenocarcinomas.
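The FC values in Table 2 equal the cholangiocarcinoma value minus the extrahepatic value, which suggests that they are differences on the log2 scale (RMA signals are log2-transformed) rather than linear ratios. The short sketch below reproduces that arithmetic from the tabulated values and converts one difference to a linear ratio; the function names are ours, and the log2 interpretation is an inference from the numbers, not a statement made in the table.

```python
from typing import Dict, Tuple

# (cholangiocarcinoma, extrahepatic) signal values as listed in Table 2.
TABLE2: Dict[str, Tuple[float, float]] = {
    "miR-30a": (11.12, 9.55),
    "miR-200c": (10.87, 13.84),
    "miR-141": (4.79, 9.12),
    "miR-425": (10.5, 11.31),
}


def log2_difference(chol: float, extrahepatic: float) -> float:
    """Difference of the two signals, rounded to two decimals as in the table."""
    return round(chol - extrahepatic, 2)


def linear_ratio(chol: float, extrahepatic: float) -> float:
    """Linear-scale ratio, assuming both signals are log2 values."""
    return 2 ** (chol - extrahepatic)


for mirna, (chol, extra) in TABLE2.items():
    print(mirna, log2_difference(chol, extra), round(linear_ratio(chol, extra), 2))
```

Under this reading, the miR-30a difference of 1.57 corresponds to roughly a threefold higher signal in cholangiocarcinomas on the linear scale.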

Most prominent in this cholangiocarcinoma signature was the overexpression of miR-30a in cholangiocarcinomas (Table 2). The expression level of miR-30a in cholangiocarcinomas was higher than that in extrahepatic primary adenocarcinomas from the gastrointestinal tract, the pancreas, the lung, the breast, the uterus, and the ovary (at node no. 7), although it was lower than those in RCCs and thyroid cancers. This finding is consistent with a report that miR-30a knockdown in zebrafish larvae results in defective biliary morphogenesis [28]. Primary intrahepatic cholangiocarcinomas expressed stronger miR-30a and weaker miR-200c than did colorectal cancer metastases (p < 10⁻¹⁴ and p < 10⁻⁵, respectively; Fig 1E). According to quantitative real-time polymerase chain reaction (qRT-PCR), cholangiocarcinoma cell lines also exhibited a trend toward miR-30a overexpression and miR-200c underexpression, as compared with colorectal and gastric cancer cell lines (Fig 1F).

Digestive (nodes no. 8–10) and non-digestive (nodes no. 11–13) extrahepatic primary adenocarcinomas

The cholangiocarcinoma signature enabled accurate distinctions between primary and metastatic adenocarcinomas of the liver as shown above. Clinically, once the possibility of a primary liver adenocarcinoma is ruled out, determining the tissue-of-origin for metastatic adenocarcinoma is the next step for planning management. When a tissue-of-origin was assigned by selecting one branch at each node using the 1-NN algorithm and differentially expressed microRNAs (p < 0.00005), miR-1281 overexpression in colorectal cancers was the most characteristic microRNA to differentiate them from other tumors, including gastric cancers (node no. 8 (colorectal vs. non-colorectal)). MiR-215 was a relatively stomach-specific microRNA and characterized node no. 9 (gastric vs. non-gastric). While accumulating data suggest a role for miR-215 in the development and progression of gastric cancer [29], our study is the first to suggest stomach-specificity for miR-215 in the digestive system. As previously reported, miR-194 and miR-192 were relatively abundant in both colorectal and gastric cancers compared to non-gastrointestinal tumors [30]. At node no. 12 (breast vs. non-breast), miR-196a was abundant in breast cancers, whereas miR-449a and miR-449b were relatively abundant in endometrial and ovarian cancers [31] (Fig 2).

Application of the decision tree to the test set

When the present decision tree was applied to the test set, the prediction accuracy was 84.8% (28 out of 33 samples). Four colorectal cancer metastases were misclassified as cholangiocarcinomas (n = 2), a gastric carcinoma (n = 1), and a lung adenocarcinoma (n = 1). An ovarian cancer metastasis was misclassified as a cholangiocarcinoma.
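The reported accuracy can be reproduced by tallying the outcomes described above. In the sketch below, the five misclassifications are taken from the text, while the remaining 28 samples are assumed to have been assigned to their true class (which follows from the 28-of-33 figure); the variable and function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# (true origin, predicted origin) for the 33 test samples.
# The 5 errors are those listed in the text; the rest are assumed correct.
OUTCOMES: List[Tuple[str, str]] = (
    [("CHOL", "CHOL")] * 2
    + [("COAD", "COAD")] * 25
    + [("COAD", "CHOL")] * 2
    + [("COAD", "STAD")]
    + [("COAD", "LUAD")]
    + [("OV", "OV")]
    + [("OV", "CHOL")]
)


def accuracy(outcomes: List[Tuple[str, str]]) -> float:
    """Fraction of samples whose predicted origin matches the true origin."""
    correct = sum(1 for truth, predicted in outcomes if truth == predicted)
    return correct / len(outcomes)


# Count each kind of misclassification, e.g. COAD predicted as CHOL.
ERRORS = Counter((t, p) for t, p in OUTCOMES if t != p)

print(len(OUTCOMES), round(100 * accuracy(OUTCOMES), 1))
print(dict(ERRORS))
```

Three of the five errors assign a metastasis to the cholangiocarcinoma class, which is the distinction of greatest clinical interest in this setting.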

External validation of node no. 7

Given its important clinical relevance, the cholangiocarcinoma signature (node no. 7; Table 2) was additionally validated using small RNA sequencing data from TCGA. As metastasectomy samples were not available in the dataset from TCGA, TCGA primary tumors were compared for their expression of the cholangiocarcinoma signature corresponding to node no. 7 (cholangiocarcinoma vs. non-cholangiocarcinoma) of our decision tree. As in the training set, the cholangiocarcinoma signatures of TCGA colorectal (n = 221), lung (n = 245), breast (n = 748), and endometrial (n = 406) adenocarcinomas were weaker than that of intra- and extra-hepatic cholangiocarcinomas in the dataset from TCGA (n = 36) (Fig 1G). Thus, the cholangiocarcinoma signature, which was identified in our training set, was validated in the external dataset from TCGA.

Immunohistochemistry

Finally, we performed in-parallel immunohistochemical staining analyses for the test set, to evaluate the potential utility of the present microRNA profiling methods as complementary assays for conventional immunohistochemistry (Table 3). A typical colorectal immunophenotype of CDX2+/CK20+/CK7- [32] was observed in 26 of 29 samples; the remaining three were CK20-, CK7+, or had equivocal CDX2 staining, which meant that these cases could not be confidently diagnosed as being of colorectal origin (S2 Fig). These three equivocal cases were correctly predicted as having a colorectal origin by our 1-NN prediction based on microRNA profiles of 195 primary tumors, again suggesting the potential usefulness of the present microRNA profiling methods. Both primary cholangiocarcinomas in the test set showed a CK20-/CK7+/CDX2-/CA125- immunophenotype, indicating a low probability of colorectal, ovarian, and pancreatic origin [32, 33]. While these two cases could have been considered cholangiocarcinomas using immunohistochemical exclusion of other tissue origins, our microRNA assessment could also have been used to clarify the immunohistochemistry diagnosis by correctly assigning them to the cholangiocarcinoma category. These results indicate that microRNA profiles may also supplement immunohistochemistry assessment in determining tissue origin for liver metastases.

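Table 3
CDX2, CK20, CK7, and CA125 immunostaining in the test set.

| No. | Primary tumor | CDX2 | CK20 | CK7 | CA125 |
| --- | --- | --- | --- | --- | --- |
| 1 | CHOL | - | - | + | - |
| 2 | CHOL | - | - | + | - |
| 3 | COAD | + | + | - | - |
| 4 | COAD | + | + | - | - |
| 5 | COAD | + | +/- | - | - |
| 6 | COAD | + | - | - | - |
| 7 | COAD | + | + | - | - |
| 8 | COAD | + | + | - | - |
| 9 | COAD | + | + | - | - |
| 10 | COAD | + | + | - | - |
| 11 | COAD | + | + | - | - |
| 12 | COAD | + | + | - | - |
| 13 | COAD | + | + | - | - |
| 14 | COAD | + | + | - | - |
| 15 | COAD | + | + | - | - |
| 16 | COAD | + | + | - | - |
| 17 | COAD | + | + | - | - |
| 18 | COAD | + | + | - | +/- |
| 19 | COAD | + | + | - | - |
| 20 | COAD | + | + | - | - |
| 21 | COAD | + | - | - | - |
| 22 | COAD | + | + | - | - |
| 23 | COAD | + | + | - | - |
| 24 | COAD | + | + | - | - |
| 25 | COAD | + | + | - | - |
| 26 | COAD | + | + | - | - |
| 27 | COAD | +/- | + | + | - |
| 28 | COAD | + | + | - | - |
| 29 | COAD | + | + | - | - |
| 30 | COAD | + | + | - | N/T |
| 31 | COAD | + | + | - | +/- |
| 32 | OV | N/T | - | + | + |
| 33 | OV | - | - | + | +/- |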

+, positive; -, negative; +/-, equivocal; N/T, not tested; CHOL, cholangiocarcinoma; COAD, colorectal carcinoma; OV, ovarian cancer.

Discussion

This study suggests that microRNA profiles can distinguish between intrahepatic cholangiocarcinomas and extrahepatic cancers of the digestive system, which is often difficult due to the lack of validated tissue-specific biomarkers for cholangiocarcinoma. Cytokeratin 19, for example, is useful to differentiate cholangiocarcinoma from HCC, but its positivity is similar between cholangiocarcinoma and gastrointestinal adenocarcinoma [34]. Reports in the literature have suggested that microRNA profiling may be useful in determining tumor origin, but this profiling has not been successful in accurately distinguishing cholangiocarcinomas from metastatic liver cancers. A microRNA assay developed by Sokilde et al. predicted most liver metastases to be cholangiocarcinomas, and was therefore not able to differentiate between primary and metastatic adenocarcinomas in the liver. An appreciable fraction of biliary tract and pancreatic cancers were also misclassified or ambiguously predicted to be pancreatobiliary cancers in another study [3]. Whereas most previous microRNA-based tissue-origin assays were developed based on formalin-fixed, paraffin-embedded common tumors, we chose to enrich our datasets using frozen intrahepatic cholangiocarcinomas and liver metastases to avoid possible artefacts. With possible clinical applications in mind, we also minimized the number of discriminatory microRNAs in the cholangiocarcinoma signature. It should be noted that our cholangiocarcinoma signature was optimized using additional in situ hybridization experiments to increase the specificity that is often lacking in bulk tumor profiling studies.

To our knowledge, this is the first study to demonstrate that the overexpression of miR-30a and the reduced expressions of miR-200c, miR-141, and miR-425 could accurately distinguish intrahepatic cholangiocarcinomas from extrahepatic adenocarcinomas, especially in gastrointestinal cancer metastases. MiR-30a is crucial in biliary development [28], and promotes the proliferation of cholangiocarcinoma cells [35]. While miR-30a plays an oncogenic role in biliary epithelium, it acts as a tumor suppressor in prostate and gastric tissue [36, 37]. The carcinogenic role of miR-30a is therefore dependent on the tissue context. The current study is the first to leverage the tissue-specific roles of miR-30a for the purpose of distinguishing the tissue-of-origin of liver neoplasms. During the progression of colorectal cancer, miR-200c is overexpressed [38]. In contrast, miR-200c is downregulated in intrahepatic cholangiocarcinoma [39], consistent with the current paper. No previous studies, however, have suggested that miR-200c could be used to differentiate intrahepatic cholangiocarcinomas from liver metastases.

In addition to the potential diagnostic role of the overexpression of miR-30a and the reduced expressions of miR-200c, miR-141, and miR-425, our study reveals several novel discoveries for the tissue-dependent microRNA expression. For example, our study is the first to reveal the overexpression of miR-216/miR-217, miR-1281, and miR-215 in SCLC, colorectal, and gastric cancers, respectively. Given the potential roles in clinical tissue-of-origin diagnosis, these novel findings warrant validation studies in the future.

Although our study is limited by the relatively small sample size of cholangiocarcinomas, our findings were validated across analysis platforms using TCGA datasets and cell line data. We were not able to evaluate whether these microRNAs are differentially expressed in serum because serum samples were not available for this study. While the diagnosis of colorectal cancer benefits from relatively sensitive and specific biomarkers, such as CDX2 and the CK7-/CK20+ phenotype, some colorectal cancers lack these typical molecular characteristics, as exemplified by our samples from colorectal cancer metastases. Notably, the present microRNA profiles proved useful for the tissue-origin diagnosis of these equivocal colorectal metastases, suggesting their supplementary diagnostic value. Using a large number of frozen tissue samples, the current study demonstrates that gastrointestinal cancers can be differentiated from cholangiocarcinomas by the 4-microRNA cholangiocarcinoma signature comprising miR-30a, miR-200c, miR-141, and miR-425. Although the present data require further validation using broader and larger datasets, our results provide important clues for differentiating between adenocarcinoma tissue origins in the liver, especially for the diagnosis of gastrointestinal cancer metastases.

References

1. J Lu, G Getz, EA Miska, E Alvarez-Saavedra, J Lamb, D Peck, et al. MicroRNA expression profiles classify human cancers. Nature. 2005;435: 834-838. 10.1038/nature03702

2. N Rosenfeld, R Aharonov, E Meiri, S Rosenwald, Y Spector, M Zepeniuk, et al. MicroRNAs accurately identify cancer tissue origin. Nat Biotechnol. 2008;26: 462-469. 10.1038/nbt1392

3. E Meiri, WC Mueller, S Rosenwald, M Zepeniuk, E Klinke, TB Edmonston, et al. A Second-Generation MicroRNA-Based Assay for Diagnosing Tumor Tissue Origin. Oncologist. 2012;17: 801-812. 10.1634/theoncologist.2011-0466

4. M Ferracin, M Pedriali, A Veronese, B Zagatti, R Gafà, E Magri, et al. MicroRNA profiling for the identification of cancers with unknown primary tissue-of-origin. J Pathol. 2011;225: 43-53. 10.1002/path.2915

5. R Søkilde, M Vincent, AK Møller, A Hansen, PE Høiby, T Blondal, et al. Efficient identification of miRNAs for classification of tumor origin. J Mol Diagnostics. 2014;16: 106-115. 10.1016/j.jmoldx.2013.10.001

6. DM Parkin, F Bray, J Ferlay, P Pisani. Global Cancer Statistics, 2002. CA Cancer J Clin. 2005;55: 74-108. 10.3322/canjclin.55.2.74

7. A Sasaki, K Kawano, M Aramaki, K Nakashima, T Yoshida, S Kitano. Immunohistochemical expression of cytokeratins in intrahepatic cholangiocarcinoma and metastatic adenocarcinoma of the liver. J Surg Oncol. 1999;70: 103-108. 10.1002/(sici)1096-9098(199902)70:2<103::aid-jso8>3.0.co;2-h

8. CT Chiu, JM Chiang, TS Yeh, JH Tseng, TC Chen, YY Jan, et al. Clinicopathological analysis of colorectal cancer liver metastasis and intrahepatic cholangiocarcinoma: Are they just apples and oranges? Dig Liver Dis. 2008;40: 749-754. 10.1016/j.dld.2008.01.018

9. NC Panarelli, RK Yantiss, MM Yeh, Y Liu, YT Chen. Tissue-specific cadherin CDH17 is a useful marker of gastrointestinal adenocarcinomas with higher sensitivity than CDX2. Am J Clin Pathol. 2012;138: 211-222. 10.1309/AJCPKSHXI3XEHW1J

10. PG Chu, RE Schwarz, SK Lau, Y Yen, LM Weiss. Immunohistochemical staining in the diagnosis of pancreatobiliary and ampulla of Vater adenocarcinoma: Application of CDX2, CK17, MUC1, and MUC2. Am J Surg Pathol. 2005;29: 359-367. 10.1097/01.pas.0000149708.12335.6a

11. AL Collins, S Wojcik, J Liu, WL Frankel, H Alder, L Yu, et al. A Differential MicroRNA Profile Distinguishes Cholangiocarcinoma from Pancreatic Adenocarcinoma. Ann Surg Oncol. 2014;21: 133-138. 10.1245/s10434-013-3240-y

12. KA Hoadley, C Yau, T Hinoue, DM Wolf, AJ Lazar, E Drill, et al. Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer. Cell. 2018;173: 291-304.e6. 10.1016/j.cell.2018.03.022

13. H Yang, D Hong, SY Cho, YS Park, WR Ko, JH Kim, et al. RhoGAP domain-containing fusions and PPAPDC1A fusions are recurrent and prognostic in diffuse gastric cancer. Nat Commun. 2018;9: 4439. 10.1038/s41467-018-06747-4

14. DG Mun, J Bhin, S Kim, H Kim, JH Jung, Y Jung, et al. Proteogenomic Characterization of Human Early-Onset Gastric Cancer. Cancer Cell. 2019;35: 111-124.e10. 10.1016/j.ccell.2018.12.003

15. G Pentheroudakis, V Golfinopoulos, N Pavlidis. Switching benchmarks in cancer of unknown primary: From autopsy to microarray. Eur J Cancer. 2007;43: 2026-2036. 10.1016/j.ejca.2007.06.023

16. HR Shin, JK Oh, MK Lim, A Shin, HJ Kong, KW Jung, et al. Descriptive epidemiology of cholangiocarcinoma and clonorchiasis in Korea. J Korean Med Sci. 2010;25: 1011-1016. 10.3346/jkms.2010.25.7.1011

17. R Simon, A Lam, MC Li, M Ngan, S Menenzes, Y Zhao. Analysis of gene expression data using BRB-array tools. Cancer Inform. 2007;3: 11-17. 10.1177/117693510700300022

18. F Cichocki, M Felices, V McCullar, SR Presnell, A Al-Attar, CT Lutz, et al. Cutting Edge: MicroRNA-181 Promotes Human NK Cell Development by Regulating Notch Signaling. J Immunol. 2011;187: 6171-6175. 10.4049/jimmunol.1100835

19. J Zhang, Y Liu, Z Liu, XM Wang, DT Yin, LL Zheng, et al. Differential expression profiling and functional analysis of microRNAs through stage I-III papillary thyroid carcinoma. Int J Med Sci. 2013;10: 585-592. 10.7150/ijms.5794

20. J Tao, D Wu, B Xu, W Qian, P Li, Q Lu, et al. microRNA-133 inhibits cell proliferation, migration and invasion in prostate cancer cells by targeting the epidermal growth factor receptor. Oncol Rep. 2012;27: 1967-1975. 10.3892/or.2012.1711

21. O Mikhaylova, Y Stratton, D Hall, E Kellner, B Ehmer, AF Drew, et al. VHL-Regulated MiR-204 Suppresses Tumor Growth through Inhibition of LC3B-Mediated Autophagy in Renal Clear Cell Carcinoma. Cancer Cell. 2012;21: 532-546. 10.1016/j.ccr.2012.02.019

22. Y Saito, H Suzuki, M Matsuura, A Sato, Y Kasai, K Yamada, et al. MicroRNAs in Hepatobiliary and Pancreatic Cancers. Front Genet. 2011;2: 15. 10.3389/fgene.2011.00001

23. P Rubin, R Brasacchio, A Katz. Solitary Metastases: Illusion Versus Reality. Semin Radiat Oncol. 2006;16: 120-130. 10.1016/j.semradonc.2005.12.007

24. P Mordant, A Arame, F De Dominicis, C Pricopi, C Foucault, A Dujon, et al. Which metastasis management allows long-term survival of synchronous solitary M1b non-small cell lung cancer? Eur J Cardio-Thoracic Surg. 2012;41: 617-622. 10.1093/ejcts/ezr042

25. E Oki, S Tokunaga, Y Emi, T Kusumoto, M Yamamoto, K Fukuzawa, et al. Surgical treatment of liver metastasis of gastric cancer: a retrospective multicenter cohort study (KSCC1302). Gastric Cancer. 2016;19: 968-976. 10.1007/s10120-015-0530-z

26. O Govaere, M Komuta, J Berkers, B Spee, C Janssen, F De Luca, et al. Keratin 19: A key role player in the invasion of human hepatocellular carcinomas. Gut. 2014;63: 674-685. 10.1136/gutjnl-2012-304351

27. F Meng, R Henson, M Lang, H Wehbe, S Maheshwari, JT Mendell, et al. Involvement of Human Micro-RNA in Growth and Response to Chemotherapy in Human Cholangiocarcinoma Cell Lines. Gastroenterology. 2006;130: 2113-2129. 10.1053/j.gastro.2006.02.057

28. NJ Hand, ZR Master, SF Eauclaire, DE Weinblatt, RP Matthews, JR Friedman. The microRNA-30 Family Is Required for Vertebrate Hepatobiliary Development. Gastroenterology. 2009;136: 1081-1090. 10.1053/j.gastro.2008.12.006

29. Y Zang, T Wang, J Pan, F Gao. miR-215 promotes cell migration and invasion of gastric cancer cell lines by targeting FOXO1. Neoplasma. 2017;64: 579-587. 10.4149/neo_2017_412

30. K Schee, S Lorenz, MM Worren, C-C Günther, M Holden, E Hovig, et al. Deep Sequencing the MicroRNA Transcriptome in Colorectal Cancer. GL Hold, editor. PLoS One. 2013;8: e66165. 10.1371/journal.pone.0066165

31. Y Li, M Zhang, H Chen, Z Dong, V Ganapathy, M Thangaraju, et al. Ratio of miR-196s to HOXC8 messenger RNA correlates with breast cancer cell migration and metastasis. Cancer Res. 2010;70: 7894-7904. 10.1158/0008-5472.CAN-10-1675

32. R Bayrak, H Haltas, S Yenidunya. The value of CDX2 and cytokeratins 7 and 20 expression in differentiating colorectal adenocarcinomas from extraintestinal gastrointestinal adenocarcinomas: Cytokeratin 7-/20+ phenotype is more specific than CDX2 antibody. Diagn Pathol. 2012;7: 9. 10.1186/1746-1596-7-9

33. JL Dennis, TR Hvidsten, EC Wit, J Komorowski, AK Bell, I Downie, et al. Markers of adenocarcinoma characteristic of the site of origin: Development of a diagnostic algorithm. Clin Cancer Res. 2005;11: 3766-3772. 10.1158/1078-0432.CCR-04-2236

34. PG Chu, LM Weiss. Keratin expression in human tissues and neoplasms. Histopathology. 2002;40: 403-439. 10.1046/j.1365-2559.2002.01387.x

35. JW Zhang, X Wang, GC Li, D Wang, S Han, YD Zhang, et al. MiR-30a-5p promotes cholangiocarcinoma cell proliferation through targeting SOCS3. J Cancer. 2020;11: 3604-3614. 10.7150/jca.41437

36. Q Zhu, H Li, Y Li, L Jiang. MicroRNA-30a functions as tumor suppressor and inhibits the proliferation and invasion of prostate cancer cells by down-regulation of SIX1. Hum Cell. 2017;30: 290-299. 10.1007/s13577-017-0170-1

37. X Liu, Q Ji, C Zhang, X Liu, Y Liu, N Liu, et al. miR-30a acts as a tumor suppressor by double-targeting COX-2 and BCL9 in H. pylori gastric cancer models. Sci Rep. 2017;7: 7113. 10.1038/s41598-017-07193-w

38. Y Toiyama, K Hur, K Tanaka, Y Inoue, M Kusunoki, CR Boland, et al. Serum miR-200c Is a Novel Prognostic and Metastasis-Predictive Biomarker in Patients With Colorectal Cancer. Ann Surg. 2014;259: 735-743. 10.1097/SLA.0b013e3182a6909d

39. A Karakatsanis, I Papaconstantinou, M Gazouli, A Lyberopoulou, G Polymeneas, D Voros. Expression of microRNAs, miR-21, miR-31, miR-122, miR-145, miR-146a, miR-200c, miR-221, miR-222, and miR-223 in patients with hepatocellular carcinoma or intrahepatic cholangiocarcinoma and its prognostic significance. Mol Carcinog. 2013;52: 297-303. 10.1002/mc.21864