PLoS ONE
Position preference of essential genes in prokaryotic operons

Competing Interests: The authors have declared that no competing interests exist.

Article Type: Research Article
Abstract

Essential genes, which form the basis of life activities, are crucial for the survival of organisms. Essential genes tend to be located in operons, but how they are distributed in operons is still unclear for most prokaryotes. In order to clarify the general rule of position preference of essential genes in operons, an index of the average position of genes in an operon was proposed, and the distributions of essential and non-essential genes in operons in 51 bacterial genomes and two archaeal genomes were analyzed based on this new index. Consequently, essential genes were found to preferentially occupy the front positions of the operons, which tend to be expressed at higher levels.

Liu, Luo, Gao, and Ning: Position preference of essential genes in prokaryotic operons

Introduction

Essential genes usually refer to genes whose inactivation or loss causes severe growth impairment, irreversible growth arrest, or cell death [1]. They are necessary for cells or organisms to survive under specific conditions [2, 3] and constitute the minimal gene set required for living cells. The functions encoded by this gene set are therefore considered the basis of life [4, 5]. The study of essential genes has attracted wide interest because it helps to explore the origin and evolution of life and provides an important basis for the discovery of drug targets [6, 7], the treatment of diseases [1, 8], and the design of minimal genomes [9, 10]. Currently, essential genes can be identified through experimental methods including transposon mutagenesis [11], antisense RNA silencing [12], and single-gene knockout [13], among other methods. An increasing number of essential genes have been identified genome-wide, which facilitates the study of characteristic differences between essential and non-essential genes. For example, in prokaryotes, essential genes are preferentially located on the leading strand of chromosomes [14, 15], and further studies have shown that only those in certain COG functional subclasses show this preference [16, 17]. Proteins encoded by essential genes are enriched in the cytoplasm, whereas the proportions of non-essential gene products in the plasma membrane, periplasm, outer membrane, cell wall, and extracellular space are significantly higher than those of essential gene products [18]. Significantly fewer essential genes are found inside genomic islands than outside them [19]. Compared with non-essential genes, bacterial essential genes tend to encode core functions related to transcription, translation, and replication [4, 20], and include a higher proportion of enzymes [21]. In addition, essential genes have higher expression levels than non-essential genes [22, 23] and are more evolutionarily conserved [24, 25].

An operon is a set of one or several genes, together with their associated regulatory elements, that is transcribed as a polycistronic unit [26, 27]. Operons are widely used as basic transcriptional and functional units [28]. Regarding operon formation, the most widely accepted theory is the co-regulation hypothesis, which holds that operons form when two or more genes are brought together by rearrangement, and that the structure is maintained by selection for coordinated transcriptional regulation and translation of functionally related proteins [29, 30]. Regarding operon evolution, the regulatory model and the selfish model are the two generally accepted models [31]. The former emphasizes the advantage of co-transcription for regulatory purposes, while the latter emphasizes the advantage of genomic proximity for the co-transfer of adjacent functions [32]. Other proposed models of operon evolution have received less attention, mainly because they do not fit the existing evidence [33]. According to the co-regulation hypothesis, essential genes are preferentially located in operons, which has been confirmed in Escherichia coli [29, 30, 34]. In addition, a previous study found that essential genes are not only preferentially located in operons but also often occupy the first position in operons [35]. However, that study has certain limitations: it analyzed a relatively small number of prokaryotic genomes, and it drew its conclusions without considering how the proportion of essential genes in an operon influences which gene occupies the first position. In particular, focusing only on the first operon position does not lead to a general conclusion about the position preference of essential genes in operons.

With the wide application of high-throughput experimental technologies to the identification of essential genes, essential gene data have increased rapidly, and the Database of Essential Genes (DEG) is constantly updated to include them. However, the distribution of essential genes within operons remains unclear for most of the prokaryotes listed in DEG 15. As reliable operon annotations become available for more prokaryotic genomes, a systematic study of the distribution of essential genes in operons is now possible.

In the present work, the preferences of essential and non-essential genes for specific positions in operons were studied in 53 prokaryotic genomes, comprising 51 bacteria and 2 archaea. Analysis of the distribution of essential genes in operons showed that essential genes preferentially occupy the first position of operons, as reported in a previous study. However, after removing operons in which all genes are essential, this pattern no longer holds. Here, an index of the average position of genes in an operon is proposed to measure the position preference of essential genes in operons. Comparing the average positions of essential and non-essential genes in operons showed that essential genes tend to occupy the front positions of operons, a result also confirmed by analyzing the proportion of essential genes located in the first half of operons.

Materials and methods

Data source

The essential gene data for the 53 prokaryotic genomes studied here were downloaded from the DEG database (version 15) [36] (http://essentialgene.org/). For some genomes, essential genes have been identified through several experimental methods. In such cases, only one essential gene set was retained, chosen by considering the reliability of the method used or of the results. The corresponding operon data were obtained from the DOOR database [28] (http://161.117.81.224/DOOR3). For prokaryotic genomes with multiple chromosomes, only the essential genes on the main chromosome were studied. Among the entries in the DOOR database, only multi-gene operons were regarded as operons.

Determination of DNA strands

The replication origins and termini were derived from the DoriC database [37, 38] (http://tubic.tju.edu.cn/doric/), based on which the leading and lagging strands for each genome can be determined.

Index of average position of genes in an operon

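Assume that an operon contains n genes, of which n1 are essential and n2 are non-essential (n1 + n2 = n, 1≤n1<n, 1≤n2<n), and let x_i denote the position occupied by the i-th gene. The average position of genes in an operon is defined as

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1}$$

Similarly, the average position of essential genes in an operon is

$$\bar{x}_{EG} = \frac{1}{n_1}\sum_{i \in EG} x_i \tag{2}$$

and the average position of non-essential genes in an operon is

$$\bar{x}_{NEG} = \frac{1}{n_2}\sum_{i \in NEG} x_i \tag{3}$$

The relative position of essential genes in an operon is calculated as follows:

$$D_{EG} = \bar{x}_{EG} - \bar{x} \tag{4}$$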

Only operons containing at least one essential gene were considered. It should be noted that if all the genes in an operon are essential, every position is necessarily occupied by an essential gene. Therefore, only operons containing both essential and non-essential genes were analyzed.
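As a minimal sketch of the index described above, the following Python function computes the average positions for a single hybrid operon. The function name `position_indices`, the representation of an operon as an ordered list of essentiality flags, and the use of 1-based positions counted from the first gene are illustrative assumptions. The relative position is taken here as the essential-gene average minus the all-gene average, which is what the values reported in Table 1 imply.

```python
def position_indices(essential_flags):
    """Return (mean_all, mean_eg, mean_neg, d_eg) for one hybrid operon.

    essential_flags: list of bools in operon order (True = essential gene).
    Positions are assumed to be 1-based, counted from the first gene.
    """
    n = len(essential_flags)
    positions = list(range(1, n + 1))
    eg = [x for x, is_eg in zip(positions, essential_flags) if is_eg]
    neg = [x for x, is_eg in zip(positions, essential_flags) if not is_eg]
    if not eg or not neg:
        # Operons with only one gene class carry no positional information.
        raise ValueError("operon must contain both essential and non-essential genes")
    mean_all = sum(positions) / n
    mean_eg = sum(eg) / len(eg)
    mean_neg = sum(neg) / len(neg)
    # Negative values mean essential genes sit ahead of the operon average.
    return mean_all, mean_eg, mean_neg, mean_eg - mean_all


# Hypothetical three-gene operon whose first gene is essential.
print(position_indices([True, False, False]))
```

How per-operon values are aggregated into one value per genome is not specified here, so any averaging across operons would be a further assumption.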

Results and discussion

Position preference of essential genes in operons

Position preference of essential and non-essential genes in special positions of operons

Essential genes in E. coli have been found to be enriched in operons [39], but whether this is a common feature of other bacteria and archaea needs to be verified. In 44 of the prokaryotic genomes, essential genes were significantly enriched in operons (P ≤ 0.05, Fisher’s exact test) (S1 Table in S1 File). The statistical significance was very high in 33 of these cases (P < 2.0 × 10^−4, Fisher’s exact test) (S1 Table in S1 File).

It was also found that essential genes preferentially occupied the first position of the operons in which they were located (Fig 1). In 44 genomes, essential genes occupy the first position in more than 50% of operons (S2 Table in S1 File), consistent with previous results. In 39 genomes, essential genes are significantly more likely than non-essential genes to occupy the first position of the operon (P ≤ 0.05, Fisher’s exact test) (S2 Table in S1 File). We also examined the distribution of essential genes in operons containing two and three genes using a chi-squared test, which confirmed that essential genes preferentially occupy the first position in operons of most species (P ≤ 0.05; S3 Table in S1 File). In addition, the distribution of non-essential genes in operons was analyzed. Across the 53 prokaryotic genomes, non-essential genes were found to frequently occupy the last position of the operon (Fig 1). In 51 genomes, non-essential genes occupy the last position in more than 50% of operons (S2 Table in S1 File). In 37 genomes, non-essential genes are significantly more likely than essential genes to occupy the last position of the operon (P ≤ 0.05, Fisher’s exact test) (S2 Table in S1 File).
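The first-position comparison above amounts to a test on a 2×2 contingency table (essential versus non-essential, first position versus other positions). The sketch below implements a two-sided Fisher's exact test with only the Python standard library. The function name and the example counts are hypothetical and are not taken from S2 Table, and the software actually used for the analysis is not stated in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: rows = essential / non-essential genes,
# columns = first position / any other position.
p_value = fisher_exact_two_sided(60, 40, 90, 160)
print(f"P = {p_value:.3g}")
```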

Fig 1. The relationship between distribution of essential and non-essential genes and proportion of essential genes.

The heatmap was plotted using the heatmap function in the R package. The cells in the heatmap correspond to the proportion of genes under different conditions, and the value range is displayed in different colors. The color bar on the left side of the heatmap corresponds to the phylum classification of the species. Hierarchical clustering of analysis results in two dimensions is represented by a tree diagram. Species whose distribution of essential genes occupies the first position in less than 50% of operons are shown in red square boxes b-d, and species whose distribution of non-essential genes occupies the last position in less than 50% of operons are shown in red square box a.

We found that the positions occupied by essential and non-essential genes were related to the proportion of essential genes out of all the genes in operons (Fig 1). As can be seen from Fig 1, the essential genes of Mycoplasma genitalium G37 and Mycoplasma pneumoniae M129 account for a higher proportion of the genes in operons, resulting in a lower proportion of non-essential genes occupying the last position of the operon (box a in Fig 1). The essential genes of Staphylococcus aureus N315, Bacteroides thetaiotaomicron VPI-5482, Streptococcus pneumoniae TIGR4, Pseudomonas aeruginosa UCBPP-PA14, Campylobacter jejuni NCTC 11168, Bacillus thuringiensis BMB171, Helicobacter pylori 26695, Salmonella enterica serovar Typhimurium 14028S, and Salmonella Typhimurium LT2 account for a low proportion of the genes in operons, resulting in a low proportion of essential genes occupying the first position of operons (boxes b-d in Fig 1). The Pearson correlation coefficient [40] between the proportion of essential genes occupying the first position of operons and the proportion of essential genes in operons was 0.88, while the Pearson correlation coefficient between the proportion of non-essential genes occupying the last position of operons and the proportion of essential genes in operons was −0.52. From these 53 prokaryotic genomes, the rule can be summarized as follows: the higher the proportion of essential genes in the genes in operons, the higher the proportion of essential genes occupying the first position of operons, and the lower the proportion of non-essential genes occupying the last position of operons. Conversely, the lower the proportion of essential genes in the genes in operons, the lower the proportion of essential genes occupying the first position of operons, and the higher the proportion of non-essential genes occupying the last position of operons.

Position preference of essential genes in general positions of operons

It should be noted that if all the genes in an operon are essential, the first position is necessarily occupied by an essential gene. Therefore, operons consisting exclusively of essential genes were removed, and the distribution of essential genes in hybrid operons (operons containing both essential and non-essential genes) was analyzed again (S2 Table in S1 File). Among the 53 prokaryotic genomes, the number of genomes in which essential genes occupy the first position in more than 50% of operons fell from 44 to 19 under this analysis (S2 Table in S1 File). The average position of essential genes in hybrid operons and the proportion of essential genes in the first half of hybrid operons were then studied (Table 1). Comparing the average positions of essential and non-essential genes in hybrid operons of the 53 prokaryotic genomes showed that essential genes preferentially occupy the front positions of operons (P = 0.004257, Student’s t-test). We also calculated D_EG, the relative position of essential genes in operons, which is defined in Eq (4). A negative D_EG means that the average position of essential genes is in front of the average position of all genes, whereas a positive D_EG means that it is behind. As shown in Fig 2, D_EG was negative in most genomes, indicating that essential genes are biased toward the front positions of operons. Compared with the value of zero expected under random arrangement, D_EG differs significantly from zero, and essential genes tend to be located in the front positions of operons (P = 9.772 × 10^−7, Student’s t-test).
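To make the comparison with zero concrete, the sketch below computes a one-sample t statistic from the 53 relative-position values listed in Table 1, in table order. It assumes that the reported test was a one-sample Student's t-test of these genome-level values against zero, which the text does not state explicitly. Only the statistic is computed, because the t-distribution needed for a P value is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev

# Relative positions of essential genes, copied from Table 1 (53 genomes).
D_EG = [
    -0.13, -0.20, 0.00, -0.03, -0.07, -0.26, -0.12, -0.03, 0.12, -0.29,
    -0.16, -0.14, -0.01, -0.05, 0.07, -0.16, -0.19, -0.38, -0.15, -0.08,
    0.48, -0.06, -0.14, -0.43, -0.06, 0.09, -0.29, -0.19, -0.10, -0.18,
    -0.17, -0.14, -0.07, -0.23, -0.10, -0.19, -0.05, -0.09, -0.37, -0.30,
    -0.18, -0.10, -0.18, -0.07, -0.19, -0.06, 0.07, -0.01, -0.21, -0.02,
    -0.07, 0.06, 0.00,
]


def one_sample_t(values, mu0=0.0):
    """t statistic for H0: the population mean of `values` equals mu0."""
    n = len(values)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))


t_stat = one_sample_t(D_EG)
print(f"mean = {mean(D_EG):.4f}, t = {t_stat:.2f}, df = {len(D_EG) - 1}")
```

A negative statistic corresponds to essential genes lying, on average, ahead of the all-gene average position.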

Fig 2. Bubblechart of essential genes proportion and the relative positions.

In the left part of the figure, the size of the dot represents the number of essential genes occupying the first half of operons, and the color of the dot represents the proportion of essential genes occupying the first half of operons. The part on the left is sorted according to the proportion of essential genes in the first half of operons from high to low. In the right part of the figure, the size of the dot represents the number of operons, and the color of the dot represents the relative positions of essential genes.

Table 1
The average position distribution of essential and non-essential genes in operons and the proportion of essential genes in the first half of operons.
| Organism | Condition | RefSeq | Avg. position of EG | Avg. position of NEG | Avg. position of all genes | D_EG | No. EG in the first half of operons | No. EG in operons | Proportion | No. Operons |
|---|---|---|---|---|---|---|---|---|---|---|
| Bacillus subtilis 168 | Rich | NC_000964 | 2.18 | 2.33 | 2.31 | -0.13 | 68 | 117 | 58.12% | 72 |
| Staphylococcus aureus N315 | Rich | NC_002745 | 2.24 | 2.41 | 2.44 | -0.20 | 80 | 140 | 57.14% | 105 |
| Haemophilus influenzae Rd KW20 | Rich | NC_000907 | 2.30 | 2.30 | 2.30 | 0.00 | 185 | 332 | 55.72% | 189 |
| Mycoplasma genitalium G37 | Rich | NC_000908 | 3.52 | 4.15 | 3.55 | -0.03 | 110 | 203 | 54.19% | 44 |
| Streptococcus pneumoniae TIGR4 | Rich | NC_003028 | 2.44 | 2.59 | 2.51 | -0.07 | 51 | 85 | 60.00% | 60 |
| Streptococcus pneumoniae R6 | Rich | NC_003098 | 2.19 | 2.52 | 2.45 | -0.26 | 54 | 86 | 62.79% | 68 |
| Helicobacter pylori 26695 | Rich | NC_000915 | 2.90 | 3.07 | 3.02 | -0.12 | 157 | 260 | 60.38% | 129 |
| Mycobacterium tuberculosis H37Rv | Rich | NC_000962 | 2.25 | 2.33 | 2.28 | -0.03 | 207 | 364 | 56.87% | 229 |
| Salmonella Typhimurium LT2 | Rich | NC_003197 | 2.44 | 2.27 | 2.32 | 0.12 | 77 | 145 | 53.10% | 116 |
| Francisella novicida U112 | Rich | NC_008601 | 2.23 | 2.72 | 2.52 | -0.29 | 130 | 210 | 61.90% | 125 |
| Acinetobacter baylyi ADP1 | Rich | NC_005966 | 2.01 | 2.34 | 2.17 | -0.16 | 141 | 227 | 62.11% | 140 |
| Mycoplasma pulmonis UAB CTIP | Rich | NC_002771 | 2.17 | 2.58 | 2.31 | -0.14 | 82 | 131 | 62.60% | 70 |
| Pseudomonas aeruginosa UCBPP-PA14 | Rich | NC_008463 | 2.53 | 2.54 | 2.54 | -0.01 | 131 | 238 | 55.04% | 156 |
| Staphylococcus aureus NCTC 8325 | Rich | NC_007795 | 2.12 | 2.15 | 2.17 | -0.05 | 79 | 139 | 56.83% | 93 |
| Escherichia coli MG1655 | Rich | NC_000913 | 2.41 | 2.41 | 2.34 | 0.07 | 104 | 179 | 58.10% | 108 |
| Caulobacter crescentus NA1000 | Rich | NC_011916 | 2.07 | 2.49 | 2.23 | -0.16 | 163 | 254 | 64.17% | 149 |
| Streptococcus sanguinis SK36 | Rich | NC_009009 | 2.10 | 2.44 | 2.29 | -0.19 | 63 | 106 | 59.43% | 70 |
| Porphyromonas gingivalis ATCC 33277 | Rich | NC_010729 | 1.96 | 2.78 | 2.34 | -0.38 | 169 | 239 | 70.71% | 121 |
| Bacteroides thetaiotaomicron VPI-5482 | Rich | NC_004663 | 2.23 | 2.39 | 2.38 | -0.15 | 113 | 192 | 58.85% | 143 |
| Burkholderia thailandensis E264 | Rich | NC_007651 | 2.15 | 2.33 | 2.23 | -0.08 | 118 | 189 | 62.43% | 113 |
| Salmonella enterica serovar Typhimurium 14028S | Rich | NC_016856 | 2.90 | 2.34 | 2.42 | 0.48 | 23 | 54 | 42.59% | 44 |
| Sphingomonas wittichii RW1 | Rich | NC_009511 | 2.16 | 2.33 | 2.22 | -0.06 | 185 | 297 | 62.29% | 208 |
| Shewanella oneidensis MR-1 | Rich | NC_004347 | 2.29 | 2.67 | 2.43 | -0.14 | 111 | 186 | 59.68% | 100 |
| Campylobacter jejuni NCTC 11168 | Rich | NC_002163 | 2.95 | 3.55 | 3.38 | -0.43 | 131 | 203 | 64.53% | 117 |
| Salmonella enterica serovar SL1344 | Rich | NC_016810 | 2.20 | 2.30 | 2.26 | -0.06 | 97 | 174 | 55.75% | 106 |
| Salmonella enterica serovar Typhi Ty2 | Rich | NC_004631 | 2.30 | 2.19 | 2.21 | 0.09 | 81 | 154 | 52.60% | 104 |
| Bacteroides fragilis 638R | Rich | NC_016776 | 2.00 | 2.54 | 2.29 | -0.29 | 187 | 276 | 67.75% | 176 |
| Burkholderia pseudomallei K96243 | Rich | NC_006350 | 2.36 | 2.69 | 2.55 | -0.19 | 163 | 268 | 60.82% | 150 |
| Pseudomonas aeruginosa PAO1 | Rich | NC_002516 | 2.40 | 2.59 | 2.50 | -0.10 | 133 | 217 | 61.29% | 121 |
| Streptococcus pyogenes MGAS5005 | Todd-Hewitt | NC_007297 | 2.26 | 2.45 | 2.44 | -0.18 | 73 | 131 | 55.73% | 81 |
| Streptococcus pyogenes NZ131 | Todd-Hewitt | NC_011375 | 2.09 | 2.24 | 2.26 | -0.17 | 74 | 132 | 56.06% | 88 |
| Synechococcus elongatus PCC 7942 | Rich | NC_007604 | 1.85 | 2.12 | 1.99 | -0.14 | 182 | 291 | 62.54% | 205 |
| Rhodopseudomonas palustris CGA009 | Rich | NC_005296 | 1.93 | 2.02 | 2.00 | -0.07 | 130 | 221 | 58.82% | 162 |
| Streptococcus agalactiae A909 | Rich | NC_007432 | 2.09 | 2.38 | 2.32 | -0.23 | 88 | 150 | 58.67% | 95 |
| Acinetobacter baumannii ATCC 17978 | Murine model of pneumonia | NC_009085 | 1.61 | 1.86 | 1.71 | -0.10 | 10 | 15 | 66.67% | 14 |
| Agrobacterium fabrum str. C58 | Rich | NC_003062 | 1.87 | 2.26 | 2.06 | -0.19 | 96 | 144 | 66.67% | 93 |
| Brevundimonas subvibrioides ATCC 15264 | Rich | NC_014375 | 2.28 | 2.52 | 2.33 | -0.05 | 141 | 235 | 60.00% | 142 |
| Bacillus thuringiensis BMB171 | Rich | NC_014171 | 2.13 | 2.22 | 2.22 | -0.09 | 132 | 232 | 56.90% | 207 |
| Campylobacter jejuni 81–176 | Rich | NC_008787 | 2.78 | 3.26 | 3.15 | -0.37 | 161 | 268 | 60.07% | 127 |
| Francisella tularensis Schu 4 | Rich | NC_006570 | 2.23 | 2.78 | 2.53 | -0.30 | 133 | 212 | 62.74% | 115 |
| Streptococcus mutans UA159 | Rich | NC_004350 | 2.31 | 2.49 | 2.49 | -0.18 | 65 | 114 | 57.02% | 70 |
| Escherichia coli O157:H7 EDL933 | Rich | NC_002655 | 2.23 | 2.43 | 2.33 | -0.10 | 227 | 377 | 60.21% | 239 |
| Ralstonia solanacearum GMI1000 | Rich | NC_003295 | 2.18 | 2.47 | 2.36 | -0.18 | 136 | 213 | 63.85% | 151 |
| Streptococcus suis P1/7 | Columbia blood base agar | NC_012925 | 2.06 | 2.17 | 2.13 | -0.07 | 110 | 181 | 60.77% | 131 |
| Staphylococcus aureus USA300_TCH1516 | Rich | NC_010079 | 2.22 | 2.41 | 2.41 | -0.19 | 76 | 137 | 55.47% | 87 |
| Staphylococcus aureus MW2 | Rich | NC_003923 | 2.30 | 2.36 | 2.36 | -0.06 | 74 | 134 | 55.22% | 84 |
| Staphylococcus aureus MSSA476 | Rich | NC_002953 | 2.50 | 2.43 | 2.43 | 0.07 | 81 | 154 | 52.60% | 88 |
| Staphylococcus aureus MRSA252 | Rich | NC_002952 | 2.44 | 2.53 | 2.45 | -0.01 | 82 | 149 | 55.03% | 87 |
| Burkholderia cenocepacia J2315 | Rich | NC_011000 | 2.01 | 2.47 | 2.22 | -0.21 | 125 | 191 | 65.45% | 118 |
| Vibrio cholerae O1 biovar eltor N16961 | Rich | NC_002505 | 2.88 | 3.00 | 2.90 | -0.02 | 97 | 171 | 56.73% | 77 |
| Mycoplasma pneumoniae M129 | Rich | NC_000912 | 3.30 | 3.85 | 3.37 | -0.07 | 122 | 214 | 57.01% | 53 |
| Methanococcus maripaludis S2 | Rich | NC_005791 | 2.21 | 2.16 | 2.15 | 0.06 | 92 | 176 | 52.27% | 106 |
| Sulfolobus islandicus M.16.4 | Rich | NC_012726 | 2.46 | 2.48 | 2.46 | 0.00 | 131 | 241 | 54.36% | 130 |

We also studied the proportion of essential genes in the first half of hybrid operons. If the number of genes in an operon is odd, the middle gene is considered to be in the first half. The bubble chart of the relative position of essential genes in operons and the proportion of essential genes occupying the first half of operons is shown in Fig 2. Genomes with a lower proportion of essential genes in the first half of operons tended to have positive relative positions; the Pearson correlation coefficient between the two quantities was −0.78. Together, the relative position of essential genes and the proportion of essential genes occupying the first half of operons in the 53 prokaryotic genomes confirm that essential genes tend to occupy the front positions of operons. Moreover, the Pearson correlation coefficient between D_EG and the proportion of essential genes in operons was only 0.02, and that between the proportion of essential genes occupying the first half of operons and the proportion of essential genes in operons was −0.12. This indicates that these results are largely independent of the proportion of essential genes in operons. Therefore, compared with the previous result that essential genes tend to occupy the first position of operons [35], the present conclusion on the position preference of essential genes in operons is more general and reliable.
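The first-half counting rule and the correlation can be sketched as follows. The function names are hypothetical, and an operon is again represented as an ordered list of essentiality flags, which is an illustrative assumption. The correlation is computed only on the first ten rows of Table 1, so its value is not expected to match the reported −0.78 for all 53 genomes.

```python
from math import ceil, sqrt


def first_half_counts(essential_flags):
    """Return (essential genes in the first half, essential genes in total)."""
    n = len(essential_flags)
    half = ceil(n / 2)  # for odd n the middle gene counts as first half
    return sum(essential_flags[:half]), sum(essential_flags)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Three-gene operon: positions 1 and 2 form the first half.
print(first_half_counts([True, False, True]))

# First ten rows of Table 1: relative position and first-half proportion (%).
d_eg = [-0.13, -0.20, 0.00, -0.03, -0.07, -0.26, -0.12, -0.03, 0.12, -0.29]
proportion = [58.12, 57.14, 55.72, 54.19, 60.00, 62.79, 60.38, 56.87, 53.10, 61.90]
r = pearson(d_eg, proportion)
print(f"r = {r:.2f}")
```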

The possible reason for position preference of essential genes in operons

Depending on whether they contain essential genes, operons can be divided into three categories: operons containing only essential genes, operons containing both essential and non-essential genes, and operons containing only non-essential genes. Analysis of these three types of operons in the 53 prokaryotic genomes showed that the presence of essential genes is associated with both the number of genes in an operon and the strand on which it is located. Operons containing essential genes were more likely to be on the leading strand, and operons containing both essential and non-essential genes had a higher average number of genes (S4 Table in S1 File).

Previous studies have shown a strong relationship between gene expression and the number, length, and order of genes in operons [41]. In an operon, the distance from the start of a gene to the end of the operon is defined as the transcription distance. Gene expression increases with transcription distance; consequently, expression also increases with the length of the operon [42, 43]. Changes in the order of genes in operons also affect gene expression. The gene farthest from the end of the operon (that is, the gene closest to the promoter) was always the most highly expressed; in other words, a gene is expressed at a higher level in the first position than at other positions [41]. In 46 prokaryotic genomes, the average position of essential genes is in front of the average position of non-essential genes, which suggests that essential genes tend to have a higher expression level than non-essential genes (Table 1). Operons containing both essential and non-essential genes have more genes, which increases the expression of the genes in them. This is consistent with the fact that essential genes are crucial genes with higher expression levels that encode proteins performing important functions. It also helps explain why essential genes tend to be located in operons rather than alone. This work may contribute to understanding the functional basis of genome organization and to practical applications in synthetic biology.

Conclusion

In the present study, the position preference of essential genes in prokaryotic operons was explored systematically. The previously reported tendency of essential genes to occupy the first position of operons was found to depend on the proportion of essential genes in operons. To address this problem, a new index, the average position of genes in an operon, was proposed, which better reflects the position preference of essential genes in operons. In this way, the shortcomings of the earlier analysis were avoided, and more general and reliable conclusions were reached. Our work provides new insights for research in synthetic biology, such as the construction of cell factories and the design of artificial genomes.

Acknowledgements

The authors would like to thank Prof. Chun-Ting Zhang for the invaluable assistance and inspiring discussion.

Abbreviations

COG: cluster of orthologous groups
EG: essential gene
NEG: non-essential gene

References

1. Rancati G, Moffat J, Typas A, Pavelka N. Emerging and evolving concepts in gene essentiality. Nature Reviews Genetics. 2018;19(1):34. doi:10.1038/nrg.2017.74

2. Koonin EV. How many genes can make a cell: the minimal-gene-set concept. Annual Review of Genomics and Human Genetics. 2000;1(1):99–116. doi:10.1146/annurev.genom.1.1.99

3. Bartha I, di Iulio J, Venter JC, Telenti A. Human gene essentiality. Nature Reviews Genetics. 2018;19(1):51. doi:10.1038/nrg.2017.75

4. Kobayashi K, Ehrlich SD, Albertini A, Amati G, Andersen K, Arnaud M, et al. Essential Bacillus subtilis genes. Proceedings of the National Academy of Sciences. 2003;100(8):4678–83. doi:10.1073/pnas.0730515100

5. Itaya M. An estimation of minimal genome size required for life. FEBS Letters. 1995;362(3):257–60. doi:10.1016/0014-5793(95)00233-y

6. Galperin MY, Koonin EV. Searching for drug targets in microbial genomes. Current Opinion in Biotechnology. 1999;10(6):571–8. doi:10.1016/s0958-1669(99)00035-x

7. Yan F, Gao F. A systematic strategy for the investigation of vaccines and drugs targeting bacteria. Computational and Structural Biotechnology Journal. 2020;18:1525–38. doi:10.1016/j.csbj.2020.06.008

8. Chen P, Wang D, Chen H, Zhou Z, He X. The nonessentiality of essential genes in yeast provides therapeutic insights into a human disease. Genome Research. 2016;26(10):1355–62. doi:10.1101/gr.205955.116

9. Juhas M, Eberl L, Glass JI. Essence of life: essential genes of minimal genomes. Trends in Cell Biology. 2011;21(10):562–8. doi:10.1016/j.tcb.2011.07.005

10. Hutchison CA, Chuang R-Y, Noskov VN, Assad-Garcia N, Deerinck TJ, Ellisman MH, et al. Design and synthesis of a minimal bacterial genome. Science. 2016;351:aad6253. doi:10.1126/science.aad6253

11. Hutchison CA, Peterson SN, Gill SR, Cline RT, White O, Fraser CM, et al. Global transposon mutagenesis and a minimal Mycoplasma genome. Science. 1999;286(5447):2165–9. doi:10.1126/science.286.5447.2165

12. Forsyth RA, Haselbeck RJ, Ohlsen KL, Yamamoto RT, Xu H, Trawick JD, et al. A genome-wide strategy for the identification of essential genes in Staphylococcus aureus. Molecular Microbiology. 2002;43(6):1387–400. doi:10.1046/j.1365-2958.2002.02832.x

13. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Molecular Systems Biology. 2006;2(1):2006.0008. doi:10.1038/msb4100050

14. Rocha EP, Danchin A. Essentiality, not expressiveness, drives gene-strand bias in bacteria. Nature Genetics. 2003;34(4):377–8. doi:10.1038/ng1209

15. Repar J, Warnecke T. Non-random inversion landscapes in prokaryotic genomes are shaped by heterogeneous selection pressures. Molecular Biology and Evolution. 2017;34(8):1902–11. doi:10.1093/molbev/msx127

16. Lin Y, Gao F, Zhang C-T. Functionality of essential genes drives gene strand-bias in bacterial genomes. Biochemical and Biophysical Research Communications. 2010;396(2):472–6. doi:10.1016/j.bbrc.2010.04.119

17. Price MN, Alm EJ, Arkin AP. Interruptions in gene expression drive highly expressed operons to the leading strand of DNA replication. Nucleic Acids Research. 2005;33(10):3224–34. doi:10.1093/nar/gki638

18. Peng C, Gao F. Protein localization analysis of essential genes in prokaryotes. Scientific Reports. 2014;4:6001. doi:10.1038/srep06001

19. Zhang X, Peng C, Zhang G, Gao F. Comparative analysis of essential genes in prokaryotic genomic islands. Scientific Reports. 2015;5(1):12561. doi:10.1038/srep12561

20. Mushegian AR, Koonin EV. A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proceedings of the National Academy of Sciences. 1996;93(19):10268–73. doi:10.1073/pnas.93.19.10268

21. Gao F, Zhang RR. Enzymes are enriched in bacterial essential genes. PLoS ONE. 2011;6(6):e21683. doi:10.1371/journal.pone.0021683

22. Chen H, Zhang Z, Jiang S, Li R, Li W, Zhao C, et al. New insights on human essential genes based on integrated analysis and the construction of the HEGIAP web-based platform. Briefings in Bioinformatics. 2020;21(4):1397–410. doi:10.1093/bib/bbz072

23. Wang T, Birsoy K, Hughes NW, Krupczak KM, Post Y, Wei JJ, et al. Identification and characterization of essential genes in the human genome. Science. 2015;350(6264):1096–101. doi:10.1126/science.aac7041

24. Luo H, Gao F, Lin Y. Evolutionary conservation analysis between the essential and nonessential genes in bacterial genomes. Scientific Reports. 2015;5(1):13210. doi:10.1038/srep13210

25. Jordan IK, Rogozin IB, Wolf YI, Koonin EV. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Research. 2002;12(6):962–8. doi:10.1101/gr.87702

26. Jacob F, Monod J. Genetic regulatory mechanisms in synthesis of proteins. Journal of Molecular Biology. 1961;3(3):318–56. doi:10.1016/s0022-2836(61)80072-7

27. Huerta AM, Salgado H, Thieffry D, Collado-Vides J. RegulonDB: a database on transcriptional regulation in Escherichia coli. Nucleic Acids Research. 1998;26(1):55–9. doi:10.1093/nar/26.1.55

28. Mao X, Ma Q, Zhou C, Chen X, Zhang H, Yang J, et al. DOOR 2.0: presenting operons and their functions through dynamic and integrated views. Nucleic Acids Research. 2014;42(D1):D654–9. doi:10.1093/nar/gkt1048

29. Pál C, Hurst LD. Evidence against the selfish operon theory. Trends in Genetics. 2004;20(6):232–4. doi:10.1016/j.tig.2004.04.001

30. Price MN, Huang KH, Arkin AP, Alm EJ. Operon formation is driven by co-regulation and not by horizontal gene transfer. Genome Research. 2005;15(6):809–19. doi:10.1101/gr.3368805

31. Lawrence JG, Roth JR. Selfish operons: horizontal transfer may drive the evolution of gene clusters. Genetics. 1996;143(4):1843–60.

32. Rocha EP. The organization of the bacterial genome. Annual Review of Genetics. 2008;42:211–33. doi:10.1146/annurev.genet.42.110807.091653

33. Lawrence JG. Gene organization: selection, selfishness, and serendipity. Annual Reviews in Microbiology. 2003;57(1):419–40.

34. Okuda S, Kawashima S, Kobayashi K, Ogasawara N, Kanehisa M, Goto S. Characterization of relationships between transcriptional units and operon structures in Bacillus subtilis and Escherichia coli. BMC Genomics. 2007;8(1):48. doi:10.1186/1471-2164-8-48

35. Grazziotin AL, Vidal NM, Venancio TM. Uncovering major genomic features of essential genes in Bacteria and a methanogenic Archaea. The FEBS Journal. 2015;282(17):3395–411. doi:10.1111/febs.13350

36. Luo H, Lin Y, Liu T, Lai F-L, Zhang C-T, Gao F, et al. DEG 15, an update of the Database of Essential Genes that includes built-in analysis tools. Nucleic Acids Research. 2021;49(D1):D677–86. doi:10.1093/nar/gkaa917

37. Luo H, Gao F. DoriC 10.0: an updated database of replication origins in prokaryotic genomes including chromosomes and plasmids. Nucleic Acids Research. 2019;47(D1):D74–7. doi:10.1093/nar/gky1014

38. Gao F, Luo H, Zhang C-T. DoriC 5.0: an updated database of oriC regions in both bacterial and archaeal genomes. Nucleic Acids Research. 2012;41(D1):D90–3. doi:10.1093/nar/gks990

39. Price MN, Arkin AP, Alm EJ. The life-cycle of operons. PLoS Genetics. 2006;2(6):e96. doi:10.1371/journal.pgen.0020096

40. Cohen I, Huang Y, Chen J, Benesty J. Pearson correlation coefficient. In: Benesty J, Chen J, Huang Y, Cohen I, editors. Noise Reduction in Speech Processing. Berlin, Heidelberg: Springer Berlin Heidelberg; 2009. Chapter 5, pp. 1–4.

41. Lim HN, Lee Y, Hussein R. Fundamental relationship between operon organization and gene expression. Proceedings of the National Academy of Sciences of the United States of America. 2011;108(26):10626–31. doi:10.1073/pnas.1105692108

42. Nishizaki T, Tsuge K, Itaya M, Doi N, Yanagawa H. Metabolic engineering of carotenoid biosynthesis in Escherichia coli by ordered gene assembly in Bacillus subtilis. Applied and Environmental Microbiology. 2007;73(4):1355–61. doi:10.1128/AEM.02268-06

43. Kovács K, Hurst LD, Papp B. Stochasticity in protein levels drives colinearity of gene order in metabolic operons of Escherichia coli. PLoS Biology. 2009;7(5):e1000115. doi:10.1371/journal.pbio.1000115