History-Driven Entity Categorization

Duan, Yijun; Jatowt, Adam; Tanaka, Katsumi

doi:10.1007/978-3-030-26075-0_27

Conference paper
First Online: 17 July 2019

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11642)

Abstract

Knowledge of entity histories is often necessary for a comprehensive understanding and characterization of entities. In this paper, we introduce the novel task of history-based entity categorization. Taking a set of entity-related documents as input, we detect latent entity categories whose members share similar histories, effectively grouping entities by the similarity of their historical developments. Next, we generate a comparative timeline for each detected group, allowing users to spot similarities and differences in entity histories. We evaluate our approach on several datasets covering different entity types and demonstrate its effectiveness against competitive baselines.

Notes

1. We experimentally set the value of \(\lambda\) to be 0.4.
2. For example, https://en.wikipedia.org/wiki/1939.
3. Note that the standard deviations of event occurrence times are 0 here as the total number of used events is quite small.

Acknowledgements

This research has been supported by JSPS KAKENHI grants (#17H01828, #18K19841, #18H03243).

Author information

Authors and Affiliations

Graduate School of Informatics, Kyoto University, Kyoto, Japan
Yijun Duan, Adam Jatowt & Katsumi Tanaka

Corresponding author

Correspondence to Yijun Duan.

Editor information

Editors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Jie Shao
Hong Kong Polytechnic University, Hong Kong, China
Man Lung Yiu
The University of Tokyo, Tokyo, Japan
Masashi Toyoda
Zhejiang University, Hangzhou, China
Dongxiang Zhang
National University of Singapore, Singapore, Singapore
Wei Wang
Peking University, Beijing, China
Bin Cui

About this paper

Cite this paper

Duan, Y., Jatowt, A., Tanaka, K. (2019). History-Driven Entity Categorization. In: Shao, J., Yiu, M., Toyoda, M., Zhang, D., Wang, W., Cui, B. (eds) Web and Big Data. APWeb-WAIM 2019. Lecture Notes in Computer Science, vol 11642. Springer, Cham. https://doi.org/10.1007/978-3-030-26075-0_27

DOI: https://doi.org/10.1007/978-3-030-26075-0_27
Published: 17 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26074-3
Online ISBN: 978-3-030-26075-0
eBook Packages: Computer Science, Computer Science (R0)
