Abstract
Lung cancer is a malignant tumor, and its morbidity and mortality rates have been among the fastest growing of all tumor types in recent years. Lung cancer research is therefore of great importance in biology and medicine, and many countries have invested hundreds of millions of dollars in it. In the era of big data, advanced facilities and technology are no longer out of researchers' reach, but a new question has emerged: how can the trends of research be identified across millions of papers? In this paper, TextRank is first used to extract feature words from the abstracts of lung cancer research papers, and Word2Vec is then used as a word representation method to obtain vectors for those feature words. A cosine similarity measure over the Word2Vec vectors is used to compare the lung cancer papers published in different years. A similarity description model based on Word2Vec is then built, and further analysis based on this model is applied to explore the trends of lung cancer research papers. Researchers can use the model to understand these trends, identify a clear research direction, and carry out further research accordingly.
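The year-to-year comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each year is represented by the mean of its feature words' vectors; the abstract does not say how vectors are aggregated, so this is an illustrative choice and not necessarily the authors' model. The embeddings, feature words, and function names below are hypothetical toy values standing in for Word2Vec and TextRank output.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either is a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings; real Word2Vec vectors would be
# learned from the abstracts and typically have 100 or more dimensions.
toy_embeddings = {
    "chemotherapy": [0.9, 0.1, 0.0],
    "radiotherapy": [0.8, 0.2, 0.1],
    "immunotherapy": [0.1, 0.9, 0.2],
    "biomarker": [0.2, 0.7, 0.6],
}

# Hypothetical feature words per year, standing in for TextRank output.
feature_words_by_year = {
    2000: ["chemotherapy", "radiotherapy"],
    2001: ["chemotherapy", "radiotherapy", "biomarker"],
    2017: ["immunotherapy", "biomarker"],
}

def year_vector(year):
    """Assumed aggregation: the mean of the year's feature-word vectors."""
    return mean_vector([toy_embeddings[w] for w in feature_words_by_year[year]])

def year_similarity(year_a, year_b):
    """Cosine similarity between the aggregated vectors of two years."""
    return cosine_similarity(year_vector(year_a), year_vector(year_b))

if __name__ == "__main__":
    for a, b in [(2000, 2001), (2000, 2017)]:
        print(a, b, round(year_similarity(a, b), 3))
```

With these toy values, adjacent years that share feature words score higher than distant years with different vocabularies, which is the kind of signal a trend analysis would look for.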
Acknowledgement
The authors are grateful for the support of the National Natural Science Foundation of China (61602207, 61572228, and 61472158), the Zhuhai Premier Discipline Enhancement Scheme, and the Guangdong Premier Key-Discipline Enhancement Scheme.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Wu, J., Liang, Y., Feng, X., Song, G. (2018). Exploring Trends of Lung Cancer Research Based on Word Representation. In: Aiello, M., Yang, Y., Zou, Y., Zhang, LJ. (eds) Artificial Intelligence and Mobile Services – AIMS 2018. AIMS 2018. Lecture Notes in Computer Science, vol 10970. Springer, Cham. https://doi.org/10.1007/978-3-319-94361-9_15
DOI: https://doi.org/10.1007/978-3-319-94361-9_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94360-2
Online ISBN: 978-3-319-94361-9
eBook Packages: Computer Science, Computer Science (R0)