DOI: 10.1145/2851613.2851866
research-article

A method for obtaining rich data from PubMed using SVM

Published: 04 April 2016

Abstract

As text mining advances rapidly in the biomedical field, the importance of text data is increasing. Most text data is obtained through a Medical Subject Headings (MeSH) term search; in this process, a large amount of valuable data is missed because it has not yet been indexed with MeSH terms. In this paper, we propose a method for obtaining text data beyond what a conventional MeSH term search returns.
To obtain the additional data, we used a Support Vector Machine (SVM) to classify documents as related or unrelated. We evaluated the results in a lung cancer study, using a frequency-based text mining approach to measure data quality. The evaluation confirmed that the data extracted with our method provided as much valuable information as the data found by searching with MeSH terms. Further, adding the extracted data increased the amount of information found by 40%.
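
The frequency-based evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not say how the amount of information was measured, so this sketch assumes it can be approximated by counting mentions of terms from a fixed vocabulary (here, gene symbols) in each document set. The function names, the vocabulary, and the toy documents are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def term_frequencies(documents, vocabulary):
    """Count how often each vocabulary term occurs across the documents."""
    vocab = {term.lower() for term in vocabulary}
    counts = Counter()
    for text in documents:
        # Lowercase and split on non-alphanumeric characters.
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            if token in vocab:
                counts[token] += 1
    return counts


def relative_increase(baseline, extended):
    """Relative growth in total term mentions from baseline to extended."""
    base_total = sum(baseline.values())
    if base_total == 0:
        raise ValueError("baseline contains no vocabulary terms")
    return (sum(extended.values()) - base_total) / base_total


# Hypothetical toy data (assumption): not the paper's corpus or vocabulary.
gene_vocabulary = ["EGFR", "KRAS", "ALK", "TP53"]
mesh_documents = [
    "EGFR mutations are frequent in lung adenocarcinoma.",
    "KRAS and EGFR mutations are mutually exclusive.",
    "TP53 loss is common.",
]
extra_documents = [
    "ALK rearrangement defines a distinct subgroup.",
    "EGFR inhibitors improved survival.",
]

mesh_counts = term_frequencies(mesh_documents, gene_vocabulary)
combined_counts = term_frequencies(mesh_documents + extra_documents, gene_vocabulary)
increase = relative_increase(mesh_counts, combined_counts)

print(f"term mentions: {sum(mesh_counts.values())} -> {sum(combined_counts.values())}")
print(f"relative increase: {increase:.0%}")
```

In this sketch, `mesh_documents` stands in for abstracts returned by a MeSH term search and `extra_documents` for abstracts a classifier labeled as related. A real evaluation would likely need a curated vocabulary and proper handling of synonyms and multi-word terms.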

Cited By

  • GRiD: Gathering rich data from PubMed using one-class SVM. 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 4325-4331 (2016). DOI: 10.1109/SMC.2016.7844911. Online publication date: 9-Oct-2016

Published In

SAC '16: Proceedings of the 31st Annual ACM Symposium on Applied Computing
April 2016
2360 pages
ISBN:9781450337397
DOI:10.1145/2851613
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bioinformatics
  2. document classification
  3. text mining

Qualifiers

  • Research-article

Conference

SAC 2016
SAC 2016: Symposium on Applied Computing
April 4 - 8, 2016
Pisa, Italy

Acceptance Rates

SAC '16 Paper Acceptance Rate 252 of 1,047 submissions, 24%;
Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
