DOI: 10.1145/2479787.2479805
research-article

Towards automatic assessment of government web sites

Published: 12 June 2013

ABSTRACT

This paper presents an approach for the automatic assessment of web sites in large-scale e-Government surveys. The approach aims to supplement, and to some extent replace, the human evaluation that is typically the core of these surveys.

The heart of the solution is a colony-inspired algorithm, called the lost sheep, which automatically locates targeted governmental material online. The algorithm centers on classifying link texts to determine whether a web page should be downloaded for further analysis.

The proposed algorithm is designed to work with minimal human interaction and to make the best possible use of the available resources. With the lost sheep, the people carrying out a survey only provide sample data from a few web sites for each type of material sought. The algorithm then automatically locates the same type of material in the other web sites that are part of the survey. In this way, it significantly reduces the need for manual work in large-scale e-Government surveys.
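
The idea of deciding from a link text alone whether a page is worth downloading can be illustrated with a small sketch. This is not the lost sheep algorithm itself, and the abstract does not say which classifier the paper uses; the bag-of-words cosine similarity, the threshold value, and all names (`tokenize`, `cosine`, `should_download`, the sample link texts) are assumptions made purely for illustration.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase a link text and split it into alphanumeric word tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def should_download(link_text, sample_link_texts, threshold=0.4):
    """Return True if link_text resembles any sample link text closely enough.

    The threshold of 0.4 is an arbitrary illustrative value, not from the paper.
    """
    candidate = Counter(tokenize(link_text))
    return any(
        cosine(candidate, Counter(tokenize(sample))) >= threshold
        for sample in sample_link_texts
    )


# Hypothetical sample data for one type of material ("contact information"),
# as a surveyor might provide it for a few web sites.
samples = ["Contact us", "Contact information"]

# Hypothetical link texts found on another web site in the survey.
links = ["Contact the municipality", "News archive", "Contact"]

# Only links whose text resembles the samples would be followed and downloaded.
to_fetch = [text for text in links if should_download(text, samples)]
```

In this sketch the sample link texts play the role of the survey data supplied for a few web sites, and the filter decides which links on the remaining sites are followed.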

      • Published in

        WIMS '13: Proceedings of the 3rd International Conference on Web Intelligence, Mining and Semantics
        June 2013
        408 pages
        ISBN:9781450318501
        DOI:10.1145/2479787

        Copyright © 2013 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        WIMS '13 Paper Acceptance Rate: 28 of 72 submissions, 39%. Overall Acceptance Rate: 140 of 278 submissions, 50%.