cancer lncRNA gene database (PCdb)

Data collection of lncRNACancer and how to use lncRNACancer:

1. Data generation of lncRNACancer database

    Cancer gene collection

    Calculating co-expression between cancer genes and lncRNA with matched samples for 11 TCGA cancer types

2. Information for cancer genes and lncRNAs

    General information and literature evidence

    Gene expression profile

    Co-expressing lncRNAs

    Gene regulation

    Mutation information


    Protein-protein interaction

    lncRNA information and expression

3. Query and search database

    Text search of cancer lncRNA gene

    Quick access information in database

    BLAST all cancer genes

4. Browse database

    By chromosome and cancer types

    By cancer gene set and KEGG pathway

5. Data download

6. Reference

Data generation of lncRNACancer database

Long non-coding RNAs (lncRNAs) are non-coding transcripts longer than 200 bases that are abundant in various normal and cancer tissues (Kung, Colognori et al. 2013). In general, lncRNAs are expressed at lower levels than protein-coding genes but show greater tissue specificity (Derrien, Johnson et al. 2012). Rather than being regarded as transcriptional noise, lncRNAs are increasingly recognized as playing important roles in the regulation of gene expression by influencing chromatin modification, targeting of transcriptional complexes, mRNA splicing, and protein translation (Rinn and Chang 2012, Roberts, Morris et al. 2014). In addition, accumulating evidence of lncRNA dysregulation indicates that lncRNAs play critical roles in activating different hallmarks of human diseases such as cancer by influencing gene expression at the epigenetic, transcriptional, and post-transcriptional levels (Prensner and Chinnaiyan 2011).

The primary aim of the database is to support cancer lncRNA research by maintaining a high-quality cancer lncRNA/coding-gene co-expression network that serves as a comprehensive and accurate gene regulatory network for cancer lncRNAs, with extensive cross-references and querying interfaces freely accessible to the scientific community.

cancer gene collection

Intuitively, protein-coding genes that neighbor or overlap expressed lncRNAs are thought to be closely related to lncRNA function and to show concordant expression patterns (Cabili, Trapnell et al. 2011, Spurlock, Tossberg et al. 2015). However, in various model organisms, the expression levels of most lncRNAs are more discordant with those of their protein-coding neighbors than expected (Guttman and Rinn 2012, Nam and Bartel 2012, Pauli, Valen et al. 2012, Zhang, Liao et al. 2014). Therefore, the mere physical proximity of genes to a specific lncRNA is not a reliable indicator of cellular function. More importantly, thorough searches and analyses of the interactions between lncRNAs and non-neighboring genes may help to infer potential biological function. However, few data resources provide comprehensive co-expression patterns of lncRNAs and well-known cancer genes across multiple cancer types. This lncRNANet database summarizes billions of pre-computed co-expression patterns for 11 cancers from 2922 matched TCGA samples with both lncRNA and coding-gene expression. This resource will enable researchers to explore lncRNA expression patterns, the cancer genes and pathways they affect, their biological significance in the context of specific cancer types, and other useful annotations related to particular lncRNA-cancer gene interactions.

Exhaustive cancer gene collection:
The first step in exploring the co-expression between lncRNAs and cancer genes is to collect a reliable cancer gene list. To this end, we performed extensive data collection. The human oncogenes (Zhao, Sun et al. 2013) and experimentally characterized tumor suppressors (Zhao, Sun et al. 2013) were collected from our previous network analyses. A non-redundant list of 2102 human cancer genes was also downloaded from the allOnco database. This gene list is integrated from 8 previous studies (Akagi, Suzuki et al. 2004, Futreal, Coin et al. 2004, Sjoblom, Jones et al. 2006, Huret, Ahmad et al. 2013, Vogelstein, Papadopoulos et al. 2013). We reconciled all cancer genes with the current NCBI Entrez Gene database, updating outdated synonyms. In total, we have 10 cancer gene sets, as listed below:

  • tumor suppressor, human tumor suppressors from TSGene database
  • human oncogenes collected from M Zhao, J Sun, Z Zhao. PloS one 7 (8), e44175
  • Atlas, the atlas of genetics and cytogenetics in oncology and haematology, an interactive database. Nucleic Acids Res. 2000 Jan 1;28(1):349-51
  • CANgenes, the consensus coding sequences of human breast and colorectal cancers. Science. 2006 Oct 13;314(5797):268-74
  • CIS, RTCGD: retroviral tagged cancer gene database. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D523-7
  • Human Lymphoma, a list of lymphoid-specific oncogenes compiled by Marina Cavazzana-Calvo and colleagues from bushman lab site
  • Cancer census gene list, a census of human cancer genes. Nature Reviews Cancer 4, 177-183 (March 2004)
  • Waldman, the cancer genes sorted by chromosomal locus, with links to OMIM
  • Vogelstein, the cancer genes related to chromosomal breakpoints; Science. 2013 Mar 29;339(6127):1546-58
  • Other collection, from Cold Spring Harbor Retroviruses Chapter on Oncogenes, an early version of the CIS database, a list from Dr. Tony Hunter, and misc. additions from the literature, see bushman lab site
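The collection step described above (merging several source lists into one non-redundant list and replacing outdated synonyms with current symbols) can be sketched as follows. This is only an illustrative sketch: the function names, the toy gene sets, and the small synonym table are hypothetical and stand in for the real source lists and an NCBI Entrez Gene lookup.

```python
def reconcile(symbol, synonym_to_official):
    """Map an outdated synonym to its current symbol (hypothetical lookup table)."""
    cleaned = symbol.strip()
    return synonym_to_official.get(cleaned, cleaned)


def merge_gene_sets(gene_sets, synonym_to_official):
    """Merge source lists into {official_symbol: sorted list of source set names}."""
    merged = {}
    for source, symbols in gene_sets.items():
        for symbol in symbols:
            official = reconcile(symbol, synonym_to_official)
            merged.setdefault(official, set()).add(source)
    return {gene: sorted(sources) for gene, sources in merged.items()}


# Toy input: three source lists that overlap once synonyms are resolved.
toy_sets = {
    "tumor_suppressor": ["TP53", "RB1"],
    "oncogene": ["HER2", "MYC"],
    "census": ["P53", "ERBB2", "MYC"],
}
toy_synonyms = {"P53": "TP53", "HER2": "ERBB2"}
non_redundant = merge_gene_sets(toy_sets, toy_synonyms)
```

Keeping the source set names alongside each gene makes it possible to report which collections support a given cancer gene.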

Calculating co-expression between cancer genes and lncRNAs with matched samples for 11 TCGA cancer types

    The protein-coding gene expression data for 11 cancer types were downloaded from the pan-cancer gene expression analysis. The eleven cancer types are BLCA (bladder urothelial carcinoma), BRCA (breast invasive carcinoma), COAD (colon adenocarcinoma), HNSC (head and neck squamous cell carcinoma), KIRC (kidney renal clear cell carcinoma), LUAD (lung adenocarcinoma), LAML (acute myeloid leukemia), LUSC (lung squamous cell carcinoma), OV (high-grade serous ovarian cancer), READ (rectum adenocarcinoma), and UCEC (uterine corpus endometrial carcinoma). To explore the co-expression network of lncRNAs, we required matched samples between the lncRNA and mRNA expression profiles from TCGA. Expression data for over 9,000 lncRNAs were downloaded from MiTranscriptome.
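Matching samples between the two expression profiles amounts to keeping only the sample identifiers present in both tables and reading both profiles in the same sample order. A minimal sketch, assuming each table is a dictionary keyed by sample ID (the table layout, sample IDs, and feature names below are hypothetical):

```python
def shared_samples(mrna_table, lncrna_table):
    """Return the sorted sample IDs present in both expression tables."""
    return sorted(set(mrna_table) & set(lncrna_table))


def expression_vector(table, feature, samples):
    """Expression values of one gene or lncRNA, ordered by the given samples."""
    return [table[sample][feature] for sample in samples]


# Toy tables: {sample_id: {feature: expression value}}
toy_mrna = {
    "s1": {"TP53": 5.0},
    "s2": {"TP53": 6.0},
    "s3": {"TP53": 7.0},
}
toy_lncrna = {
    "s2": {"lncX": 1.0},
    "s3": {"lncX": 2.0},
    "s4": {"lncX": 3.0},
}
matched = shared_samples(toy_mrna, toy_lncrna)
gene_values = expression_vector(toy_mrna, "TP53", matched)
lncrna_values = expression_vector(toy_lncrna, "lncX", matched)
```

Only the aligned vectors should be passed to a correlation test, since a correlation between values from different samples is meaningless.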

    Based on the expression profiles of cancer genes and lncRNAs, we first computed expression correlation scores, which represent the potential regulation between a gene and an lncRNA. We estimated the expression correlation between all genes in each collected cancer gene set and all lncRNA transcripts using Spearman's correlation method as implemented in R (version 3.0.2), obtaining expression correlation scores and corresponding P-values. For the P-values in each type of cancer gene set, a false discovery rate (FDR) correction was applied to account for multiple testing. For all pairs between cancer genes and lncRNAs, we required absolute expression correlation scores greater than 0.3 and FDR-adjusted P-values less than 0.01. These criteria yield an appropriate number of pairs in each type of regulatory relationship.
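The filtering procedure can be illustrated with a small, dependency-free sketch. It is not the R code used to build the database. Two assumptions are made here: the FDR correction is taken to be the Benjamini-Hochberg procedure (the text only says "FDR"), and the raw P-values are assumed to come from a separate significance test such as R's `cor.test`. All function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either vector is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P-values (assumed FDR method), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # 1-based rank of this P-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_pairs(pairs, min_abs_rho=0.3, max_fdr=0.01):
    """pairs: list of (gene, lncrna, rho, raw_p). Keep pairs passing both cut-offs."""
    fdr = bh_adjust([pair[3] for pair in pairs])
    return [
        (gene, lncrna, rho, raw_p, q)
        for (gene, lncrna, rho, raw_p), q in zip(pairs, fdr)
        if abs(rho) > min_abs_rho and q < max_fdr
    ]
```

Using the absolute correlation keeps both positively and negatively co-expressed pairs, which matches the "absolute expression correlation scores greater than 0.3" criterion.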

    Information for cancer genes and lncRNAs

    For cancer genes, the annotations are presented on seven different types of pages: general information view, lncRNA view, gene expression view, gene regulation view, gene mutation view, gene homologs view, and gene interaction view.

    The general information page is like the following:

    On this page, users can find general information about the collected cancer genes from various data sources. It is easy to switch to other annotations by clicking the hyperlinks at the top of the page.

    Users can find the details of the co-expressed lncRNAs on the lncRNA highlight page, as shown below. The lncRNA transcript ID, cancer network, correlation coefficient, raw P-value, and FDR-corrected P-value are listed on the page.

    The gene expression page is as below:

    On this page, users can find gene expression profiles from 184 human tumor samples and 84 normal tissue samples from BioGPS. Some genes have multiple probes; to provide an unbiased view for users, we present the expression values from all probes without any modification.

    Users can obtain all the sample information by clicking on the expression images.

    The gene regulation page appears as follows:

    The transcription factor regulation and post-translational modification information was integrated from the TRANSFAC and dbPTM databases. In addition, methylation in promoter regions was annotated based on data from the DiseaseMeth database.

    The gene mutation page appears as follows:

    All the cancer related mutations were collected from the COSMIC database.

    The gene homolog page appears as follows:

    All homologs were collected from the public data portal of the NCBI HomoloGene database.

    The gene interaction page appears as follows:

    All the related protein-protein interactions were collected from the PathwayCommon database; we further divided the interactions into three main types, including "Physical Interaction," "Metabolic Interaction," and "Signaling Interaction."

    lncRNA information and expression:

    Users can access all the lncRNA information on the lncRNA annotation page. We also downloaded the gene expression summary across multiple cancer and normal tissues from MiTranscriptome.

    Query and sequence search against database

    All the cancer genes and their annotations in our database are searchable. Both text search (Query) and sequence-based search (BLAST) are provided.

    Text search of various annotations in our database

    Users can search lnCaNet by gene name, accession ID, or other characteristics, including genomic location, interaction partner, mutation, biological pathway, and genetic disease. In total, we provide four search forms, "Gene General Information Search", "Literature Search", "Mutation Search", and "Other Annotation Search", which give access to general information, literature-based information, mutation, and other annotation information, respectively.

    A search is performed by typing keywords into any single field or into several fields simultaneously in the query forms. In general, a text search in each form involves three steps. Take the basic information query as an example:

  • Select a specific annotation or field from the dropdown menu in the basic gene information and mutation query forms.

  • Enter your keyword of interest.

  • In addition, the basic gene information and mutation query forms support the logical 'And,' 'Or,' and 'Not' operators to combine multiple keywords.

    The search result lists the matched cancer lncRNA genes, each linked to its detailed gene information page, as shown below.
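To illustrate how 'And', 'Or', and 'Not' can combine several keywords, the sketch below filters a toy set of annotation records. It is not the query engine used by the website; the function names, the case-insensitive substring matching, and the toy records are all assumptions made for illustration.

```python
def matches(text, must=(), any_of=(), must_not=()):
    """Case-insensitive keyword test combining And / Or / Not conditions."""
    lowered = text.lower()
    if any(keyword.lower() not in lowered for keyword in must):
        return False  # 'And': every keyword must be present
    if any_of and not any(keyword.lower() in lowered for keyword in any_of):
        return False  # 'Or': at least one keyword must be present
    if any(keyword.lower() in lowered for keyword in must_not):
        return False  # 'Not': none of these keywords may be present
    return True


def search(records, must=(), any_of=(), must_not=()):
    """Return the sorted names of records whose annotation text matches."""
    return sorted(
        name for name, text in records.items()
        if matches(text, must, any_of, must_not)
    )


# Toy annotation records keyed by gene symbol.
toy_records = {
    "TP53": "tumor suppressor; pathway: p53 signaling; location 17p13.1",
    "MYC": "oncogene; location 8q24.21",
    "BRCA1": "tumor suppressor; location 17q21.31",
}
hits = search(toy_records, must=["tumor suppressor"], must_not=["17q21"])
```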

    Quick search a list of genes in database:

    To quickly access the information in the database, a quick search form is provided at the top of each page.

    Blast all sequences of genes in our database

    In the BLAST menu, users can search the lnCaNet database with their own input sequences. Cancer lncRNA genes with high similarity to the input sequences are listed on the BLAST result page. On the input page, users can set sequence alignment options such as E-value and identity. The matched sequence regions are visualized on the query sequence.
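The effect of the E-value and identity options can be illustrated by filtering hits after the search. The sketch below assumes hits are available in the standard 12-column tab-separated BLAST+ tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score); whether the website exposes results in this format is not stated, and the thresholds and sequence names are hypothetical.

```python
def parse_hits(tabular_text):
    """Parse 12-column tab-separated BLAST hits into dictionaries."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append({
            "query": fields[0],
            "subject": fields[1],
            "identity": float(fields[2]),
            "length": int(fields[3]),
            "evalue": float(fields[10]),
            "bitscore": float(fields[11]),
        })
    return hits


def filter_hits(hits, max_evalue=1e-5, min_identity=90.0):
    """Keep hits passing both thresholds, most significant first."""
    kept = [
        hit for hit in hits
        if hit["evalue"] <= max_evalue and hit["identity"] >= min_identity
    ]
    return sorted(kept, key=lambda hit: hit["evalue"])


# Toy result with three hits for one query sequence.
toy_output = (
    "query1\tlncA\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-100\t350\n"
    "query1\tlncB\t85.0\t150\t20\t2\t5\t154\t30\t179\t1e-20\t120\n"
    "query1\tlncC\t99.0\t30\t0\t0\t1\t30\t1\t30\t0.5\t20\n"
)
best_hits = filter_hits(parse_hits(toy_output))
```

A lower E-value cut-off removes short chance matches (such as the third toy hit), while the identity cut-off removes long but divergent alignments (such as the second).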

    To run a sequence-based search against all the cancer lncRNA genes, please access the BLAST page.

    The output of BLAST is shown below.

    By clicking the hyperlinks on the BLAST result page, users can access the matching cancer lncRNA genes in our database.

    Browse database

    The lnCaNet database supports browsing cancer lncRNA genes by cancer type and by curated organ and tissue type. On the cancer type page, users can explore the 288 cancer lncRNA types. In addition, classified organ and tissue types are provided to give users a bird's-eye view of specific groups of cancer lncRNA genes.

    In addition, lnCaNet also supports annotation-based browsing, including browsing by chromosome.

    Using different chromosomes

    From the Browser page, users can browse the genes in lnCaNet by chromosome location. Moreover, users can obtain cancer lncRNA gene lists for different cancer types and organ or tissue categories.
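Browsing by chromosome is essentially a grouping of genes by their chromosome label, with chromosomes listed in natural order (numbered chromosomes first, then X and Y). A minimal sketch with hypothetical function names and a toy gene list:

```python
def chromosome_key(chrom):
    """Sort key: numbered chromosomes in numeric order, then named ones (X, Y)."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    return (1, 0, name)


def group_by_chromosome(genes):
    """genes: iterable of (symbol, chromosome). Return {chromosome: sorted symbols}."""
    groups = {}
    for symbol, chrom in genes:
        groups.setdefault(chrom, []).append(symbol)
    return {
        chrom: sorted(groups[chrom])
        for chrom in sorted(groups, key=chromosome_key)
    }


toy_genes = [("TP53", "17"), ("MYC", "8"), ("AR", "X"), ("BRCA1", "17")]
by_chromosome = group_by_chromosome(toy_genes)
```

Sorting with a numeric key avoids the usual pitfall of plain string sorting, which would place chromosome 17 before chromosome 8.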

    Data download

    All 110 pre-computed cancer gene-lncRNA co-expression networks in lnCaNet can be freely downloaded for academic research, but not for commercial purposes. Please visit the Download page.

    If you would like to suggest a new comment on a record in lnCaNet or to correct wrong information, please email us directly.

    Reference

    Akagi, K., T. Suzuki, R. M. Stephens, N. A. Jenkins and N. G. Copeland (2004). "RTCGD: retroviral tagged cancer gene database." Nucleic Acids Res 32(Database issue): D523-527.
    Cabili, M. N., C. Trapnell, L. Goff, M. Koziol, B. Tazon-Vega, A. Regev and J. L. Rinn (2011). "Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses." Genes Dev 25(18): 1915-1927.
    Derrien, T., R. Johnson, G. Bussotti, A. Tanzer, S. Djebali, H. Tilgner, G. Guernec, D. Martin, A. Merkel, D. G. Knowles, J. Lagarde, L. Veeravalli, X. Ruan, Y. Ruan, T. Lassmann, P. Carninci, J. B. Brown, L. Lipovich, J. M. Gonzalez, M. Thomas, C. A. Davis, R. Shiekhattar, T. R. Gingeras, T. J. Hubbard, C. Notredame, J. Harrow and R. Guigo (2012). "The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression." Genome Res 22(9): 1775-1789.
    Futreal, P. A., L. Coin, M. Marshall, T. Down, T. Hubbard, R. Wooster, N. Rahman and M. R. Stratton (2004). "A census of human cancer genes." Nat Rev Cancer 4(3): 177-183.
    Guttman, M. and J. L. Rinn (2012). "Modular regulatory principles of large non-coding RNAs." Nature 482(7385): 339-346.
    Huret, J. L., M. Ahmad, M. Arsaban, A. Bernheim, J. Cigna, F. Desangles, J. C. Guignard, M. C. Jacquemot-Perbal, M. Labarussias, V. Leberre, A. Malo, C. Morel-Pair, H. Mossafa, J. C. Potier, G. Texier, F. Viguie, S. Yau Chun Wan-Senon, A. Zasadzinski and P. Dessen (2013). "Atlas of genetics and cytogenetics in oncology and haematology in 2013." Nucleic Acids Res 41(Database issue): D920-924.
    Kung, J. T., D. Colognori and J. T. Lee (2013). "Long noncoding RNAs: past, present, and future." Genetics 193(3): 651-669.
    Nam, J. W. and D. P. Bartel (2012). "Long noncoding RNAs in C. elegans." Genome Res 22(12): 2529-2540.
    Pauli, A., E. Valen, M. F. Lin, M. Garber, N. L. Vastenhouw, J. Z. Levin, L. Fan, A. Sandelin, J. L. Rinn, A. Regev and A. F. Schier (2012). "Systematic identification of long noncoding RNAs expressed during zebrafish embryogenesis." Genome Res 22(3): 577-591.
    Prensner, J. R. and A. M. Chinnaiyan (2011). "The emergence of lncRNAs in cancer biology." Cancer Discov 1(5): 391-407.
    Rinn, J. L. and H. Y. Chang (2012). "Genome regulation by long noncoding RNAs." Annu Rev Biochem 81: 145-166.
    Roberts, T. C., K. V. Morris and M. S. Weinberg (2014). "Perspectives on the mechanism of transcriptional regulation by long non-coding RNAs." Epigenetics 9(1): 13-20.
    Sjoblom, T., S. Jones, L. D. Wood, D. W. Parsons, J. Lin, T. D. Barber, D. Mandelker, R. J. Leary, J. Ptak, N. Silliman, S. Szabo, P. Buckhaults, C. Farrell, P. Meeh, S. D. Markowitz, J. Willis, D. Dawson, J. K. Willson, A. F. Gazdar, J. Hartigan, L. Wu, C. Liu, G. Parmigiani, B. H. Park, K. E. Bachman, N. Papadopoulos, B. Vogelstein, K. W. Kinzler and V. E. Velculescu (2006). "The consensus coding sequences of human breast and colorectal cancers." Science 314(5797): 268-274.
    Spurlock, C. F., 3rd, J. T. Tossberg, Y. Guo, S. P. Collier, P. S. Crooke, 3rd and T. M. Aune (2015). "Expression and functions of long noncoding RNAs during human T helper cell differentiation." Nat Commun 6: 6932.
    Vogelstein, B., N. Papadopoulos, V. E. Velculescu, S. Zhou, L. A. Diaz, Jr. and K. W. Kinzler (2013). "Cancer genome landscapes." Science 339(6127): 1546-1558.
    Volders, P. J., K. Verheggen, G. Menschaert, K. Vandepoele, L. Martens, J. Vandesompele and P. Mestdagh (2015). "An update on LNCipedia: a database for annotated human lncRNA sequences." Nucleic Acids Res 43(8): 4363-4364.
    Zhang, Y. C., J. Y. Liao, Z. Y. Li, Y. Yu, J. P. Zhang, Q. F. Li, L. H. Qu, W. S. Shu and Y. Q. Chen (2014). "Genome-wide screening and functional analysis identify a large number of long noncoding RNAs involved in the sexual reproduction of rice." Genome Biol 15(12): 512.
    Zhao, M., J. Sun and Z. Zhao (2012). "Distinct and competitive regulatory patterns of tumor suppressor genes and oncogenes in ovarian cancer." PLoS One 7(8): e44175.
    Zhao, M., J. Sun and Z. Zhao (2013). "Synergetic regulatory networks mediated by oncogene-driven microRNAs and transcription factors in serous ovarian cancer." Mol Biosyst 9(12): 3187-3198.
    Zhao, M., J. Sun and Z. Zhao (2013). "TSGene: a web resource for tumor suppressor genes." Nucleic Acids Res 41(Database issue): D970-976.