Poster. DOI: 10.1145/2665970.2665992

TILD: A Strategy to Identify Cancer-related Genes Using Title Information in Literature Data

Published: 07 November 2014

Abstract

Since the genome projects of the 1990s, research on genes has advanced rapidly. These studies revealed that genes can cause disease, which makes the relations between genes and diseases important to understand. For this reason, we propose a strategy called TILD that identifies cancer-related genes using title information in literature data. To implement our method, we selected cancer-specific literature data from an online database and extracted genes from it using text mining. Next, we classified the extracted genes into two kinds using title information: genes that appear in the title are classified as hub genes, whereas genes that appear in the body are classified as sub genes, which are connected to the hub genes. We repeated this process for each paper to construct cancer-specific local gene networks. In the last step, we constructed a global cancer-specific gene network by integrating all local gene networks, and calculated a score for each gene based on an analysis of the global network. We assume that genes in the title have meaningful relations with cancer, and that the other genes in the body are related to the title genes. For validation, we compared the top 20 genes inferred by our approach with those inferred by other methods. Our approach found more cancer-related genes than the comparable methods.
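The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how genes are recognized or how the score is computed. The sketch assumes a simple lexicon lookup for gene extraction and a hypothetical score equal to the number of titles a gene appears in plus the total weight of its edges in the global network. All function names (`extract_genes`, `local_network`, `global_network`, `score_genes`) are illustrative.

```python
import re
from collections import defaultdict
from itertools import product


def extract_genes(text, lexicon):
    """Return the gene symbols from `lexicon` that occur in `text`.

    Assumption: exact, case-sensitive token matching stands in for the
    text-mining step, which the abstract does not describe in detail.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return {t for t in tokens if t in lexicon}


def local_network(title, body, lexicon):
    """Build one paper's local network: title genes are hubs, body genes are subs."""
    hubs = extract_genes(title, lexicon)
    subs = extract_genes(body, lexicon) - hubs
    # Every hub gene is connected to every sub gene of the same paper.
    edges = set(product(hubs, subs))
    return hubs, edges


def global_network(papers, lexicon):
    """Integrate all local networks by counting hub occurrences and edge occurrences."""
    hub_counts = defaultdict(int)
    weights = defaultdict(int)
    for title, body in papers:
        hubs, edges = local_network(title, body, lexicon)
        for hub in hubs:
            hub_counts[hub] += 1
        for edge in edges:
            weights[edge] += 1
    return hub_counts, weights


def score_genes(hub_counts, weights):
    """Rank genes by a hypothetical score: title count plus weighted degree."""
    scores = defaultdict(int)
    for gene, count in hub_counts.items():
        scores[gene] += count
    for (hub, sub), weight in weights.items():
        scores[hub] += weight
        scores[sub] += weight
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Calling `global_network` on a list of `(title, body)` pairs and passing its result to `score_genes` gives a ranked gene list, from which a top-20 cut could be taken for the kind of comparison the abstract mentions. A real system would replace the lexicon lookup with proper gene-name recognition and normalization.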

        Published In

        DTMBIO '14: Proceedings of the ACM 8th International Workshop on Data and Text Mining in Bioinformatics
        November 2014
        60 pages
        ISBN:9781450312752
        DOI:10.1145/2665970
        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. analysis
        2. cancer
        3. gene
        4. network
        5. relation
        6. text-mining

        Qualifiers

        • Poster

        Conference

        CIKM '14

        Acceptance Rates

        DTMBIO '14 Paper Acceptance Rate 22 of 211 submissions, 10%;
        Overall Acceptance Rate 41 of 247 submissions, 17%
