Abstract
Vertical Search Engines (VSEs), which usually work on specific domains, are designed to answer complex queries from professional users. VSEs usually have large repositories of structured instances. Traditional instance ranking methods do not consider the categories that instances belong to. However, users with different interests usually care only about the ranking list within their own communities. In this paper we design a ranking algorithm, ZRank, to rank classified instances according to their importance in specific categories. To test our idea, we develop a scientific paper search engine, CPaper. By employing instance classification and ranking algorithms, we discover some facts that are helpful to users with different interests.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, H., Zhang, J., Zhou, L. (2007). Classifying and Ranking: The First Step Towards Mining Inside Vertical Search Engines. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_23
DOI: https://doi.org/10.1007/978-3-540-74469-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74467-2
Online ISBN: 978-3-540-74469-6
eBook Packages: Computer Science, Computer Science (R0)