Collecting University Rankings for Comparison Using Web Extraction and Entity Linking Techniques

  • Conference paper
Information and Communication Technologies in Education, Research, and Industrial Applications (ICTERI 2014)

Abstract

University rankings order higher education institutions by combinations of factors. They are compiled by various organizations, such as news media, websites, governments, academics and private corporations. Because of the large financial and other interests involved, worldwide university rankings have recently received increasing attention. The rankings are based on different criteria and collect their data in different ways. As a result, the position of a given institution can diverge widely from one ranking to another. To compare rankings and draw safe conclusions about their reliability, data must first be collected from the sites of the different ranking lists. In this paper we present this first step towards university ranking comparison. We discuss in detail how we developed a Prolog application, called URank, that collects the data by (a) extracting them from the various ranking list web sites using web data extraction techniques, (b) uniquely identifying the university entities within these lists by linking them to the DBpedia linked open data set, and (c) constructing a combined data set by merging the individual ranking list data sets, using the DBpedia URI of each university as a primary key.
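
Steps (b) and (c) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the URank Prolog code. The lookup table `NAME_TO_URI`, the helper names `link_entity` and `merge_rankings`, the list names and the rank values are all hypothetical, chosen only for illustration; a real system would resolve names against DBpedia itself rather than a hard-coded dictionary.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for step (b): maps name variants, as they might
# appear in different ranking lists, to a single DBpedia resource URI.
NAME_TO_URI: Dict[str, str] = {
    "imperial college london": "http://dbpedia.org/resource/Imperial_College_London",
    "imperial college": "http://dbpedia.org/resource/Imperial_College_London",
    "university of sydney": "http://dbpedia.org/resource/University_of_Sydney",
    "the university of sydney": "http://dbpedia.org/resource/University_of_Sydney",
}


def link_entity(name: str) -> Optional[str]:
    """Return the URI for a university name, or None if it cannot be linked."""
    normalized = " ".join(name.lower().split())  # lower-case, collapse spaces
    return NAME_TO_URI.get(normalized)


def merge_rankings(
    lists: Dict[str, List[Tuple[str, int]]]
) -> Tuple[Dict[str, Dict[str, int]], List[Tuple[str, str]]]:
    """Step (c): merge per-list (name, rank) rows, keyed by DBpedia URI.

    Returns the merged table and the (list name, university name) pairs
    that could not be linked, so they can be inspected manually.
    """
    merged: Dict[str, Dict[str, int]] = {}
    unlinked: List[Tuple[str, str]] = []
    for list_name, rows in lists.items():
        for name, rank in rows:
            uri = link_entity(name)
            if uri is None:
                unlinked.append((list_name, name))
                continue
            merged.setdefault(uri, {})[list_name] = rank
    return merged, unlinked


# Illustrative input only: the ranks below are made up, not real data.
sample = {
    "list_a": [("Imperial College London", 1), ("University of Sydney", 2)],
    "list_b": [("The University of  Sydney", 1), ("Imperial College", 2),
               ("Unknown Institute", 3)],
}
merged, unlinked = merge_rankings(sample)
for uri, ranks in merged.items():
    print(uri, ranks)
print("unlinked:", unlinked)
```

Keying on the URI rather than on the name string is what lets differently spelled names from different lists land in the same row, while names that fail to link are reported instead of being silently dropped.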

Author information

Corresponding author

Correspondence to Nick Bassiliades.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Bassiliades, N. (2014). Collecting University Rankings for Comparison Using Web Extraction and Entity Linking Techniques. In: Ermolayev, V., Mayr, H., Nikitchenko, M., Spivakovsky, A., Zholtkevych, G. (eds) Information and Communication Technologies in Education, Research, and Industrial Applications. ICTERI 2014. Communications in Computer and Information Science, vol 469. Springer, Cham. https://doi.org/10.1007/978-3-319-13206-8_2

  • DOI: https://doi.org/10.1007/978-3-319-13206-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13205-1

  • Online ISBN: 978-3-319-13206-8

  • eBook Packages: Computer Science (R0)
