Research Note
Liberation of public data: Exploring central themes in open government data and freedom of information research

https://doi.org/10.1016/j.ijinfomgt.2017.05.009

Highlights

  • Central themes (topics) in open government data and freedom of information research are explored.

  • With the aid of topic modelling, topics were extracted and labelled to obtain relevant semantic information.

  • The major theme in FOI research borders on issues relating to disclosure, publishing, access and cost of requests.

  • The results also indicate that research in OGD has for the most part focused on technology and related topics.

  • The approach also helped in determining key similarities and differences in the two campaigns as reported in research.

Abstract

This paper conducts a comparative literature survey of Open Government Data (OGD) and Freedom of Information (FOI), with a view to tracking the central themes in the two civil society campaigns. Despite seeming similarities and a growing popularity in research, the major themes framing research on the two movements have not clearly emerged. Topic modelling, text mining and document analysis methods are used to extract the themes as well as key named entities. The topics are subsequently labelled and, with expert guidance, their semantic meanings are provided. The results indicate that the major theme in FOI research borders on issues relating to disclosure, publishing, access and cost of requests. On the other hand, themes in OGD research have largely centered on technology and related concepts. The approach also helped in determining key similarities and differences in the two campaigns as reported in research.

Introduction

Freedom of Information (FOI) and Open Government Data (OGD) are two prominent civil society campaigns championing the cause of liberating government-controlled data for the people. Both FOI and OGD primarily seek to make government data progressively free and easily accessible (Ubaldi, 2013). Largely, the twin but independent campaigns of FOI and OGD have been driven by (1) a global call on nations to offer more accountable and transparent governance, (2) a growing trend of sophistication in citizens’ preferences and choices of government services (Holler, 2012; Lau, Patel, Fahmy, & Kaufman, 2014; Van Dooren, Bouckaert, & Halligan, 2015; Weisberg & Nawara, 2010), and (3) an opportunity to amend past policies, where data collected by government agencies tended to be the exclusive reserve of the state (Yiu, 2012). The two movements have thus become a global mouthpiece of advocacy for more open and transparent governance. Over the years, the two campaigns have received considerable traction in the media as well as in academia (Charalabidis, Alexopoulos, & Loukis, 2016). For instance, basic statistics regarding the yearly number of Freedom of Information Act (FOIA) requests, downloads, appraisal reports, number of workshops and conferences held, together with other specific country initiatives, point to a growing interest among stakeholders (Whitmore, 2012). A similar trend is seen in yearly OGD reports, where accounts by the Independent Reporting Mechanism (IRM) of the Open Government Partnership (the body responsible for the launch of OGD in 2011) indicate steady progress by most member states (Frey, 2014). Other independent accounts in the literature also show significant progress in OGD in the US (Krishnamurthy & Awazu, 2016), UK (Tinati, Carr, Halford, & Pope, 2012), Taiwan (Wang & Lo, 2016), Spain (Carrasco & Sobrepere, 2015), and many other countries, as well as local government authorities such as cities and federal states.
In addition, many global experiences have been shared of how FOI and OGD are impacting governance, particularly in the fight against corruption, economic empowerment and the quest for greater citizen engagement (Birkinshaw, 2010; Halstuk & Chamberlin, 2006; Jetzek, Avital, & Bjørn-Andersen, 2012; Shepherd, Stevenson, & Flinn, 2009; US. Senate, 2007; Zeleti et al., 2016).

Amid this relative progress, there have also been reports of misconceptions, myths, definitional challenges and general obstacles besetting real-world practice by the two campaign groups (Camaj, 2016; Evans & Campos, 2013; Gigler, Custer, & Rahemtulla, 2011; Hubbard, 2008; Janssen, Charalabidis, & Zuiderwijk, 2012; Schartum, 1998; Zuiderwijk & Janssen, 2014). Such wide-ranging experiences of “the good, the bad and the ugly” of FOI and OGD have occasioned numerous research publications covering a range of topics. However, as the fields of FOI and OGD continue to evolve, the central themes of these research publications have not clearly emerged. Furthermore, given that the two campaigns share not only similarities but also differences (Ubaldi, 2013), it is imperative to understand how key concepts are unconsciously being framed in publications relating to the two notions. In view of this, this paper seeks to determine what the major themes have so far been in relation to the two campaigns. It is our view that determining the central message shaping the two campaigns would not only shed light on what key topics define each movement, but would also help establish whether the similarities and the seeming differences between the two concepts emerge naturally, especially in research.

In the years before and after the launch of FOI and OGD, a considerable number of academic publications have been authored spanning various topics and issues. However, none so far has attempted to comparatively explore the ‘running’ themes in the two concepts. The few literature reviews available were conducted separately, each focusing on only one of the two notions. For instance, Attard, Orlandi, Scerri, and Auer (2015) and Novais, de Albuquerque, and da Silva Craveiro (2013) conducted separate reviews of literature on open government data, while Halstuk and Chamberlin (2006) conducted a retrospective analysis of the Freedom of Information Act from 1966 to 2006. Mendel (2008) conducted a comparative legal survey but centered only on FOI. Furthermore, the methodological approaches adopted in review articles on FOI or OGD are different from what this paper proposes. This study is therefore uniquely positioned to contribute to theory and fill research gaps in the following ways. First, key topics and associative terms ‘running’ in scientific discourses on the two civil movements are identified and compared. Secondly, the paper identifies location-based named entities of interest to frame how FOI and OGD are being implemented around the world. The results provide a means to understand the central themes shaping each campaign and some potential future research directions on OGD and FOI.

The rest of the paper is organized as follows. First, a brief overview of FOI and OGD, covering history and key concepts, is given. Next, differences and similarities between the two notions, as captured in the literature, are presented. This is followed by the methodology, which explains the approach to data collection and presents a brief introduction to text analysis concepts and their relevance to the study. The research questions guiding the study are subsequently presented, followed by the results of the study, the discussion and the conclusion.

Section snippets

FOI Vs OGD

The idea of open government data (OGD) is generally viewed as an offshoot of Freedom of Information (FOI), also sometimes known as Right to Information (Ubaldi, 2013). However, the two movements come under the broader concept of open government, which seeks transparency and greater rights of information access for citizens (Tauberer, 2012). It must be noted that while civil resistance movements are not completely new, the quest for openness and access to government controlled information

Methodology

The main part of the research design used topic modelling. Text mining and document analysis methods were used mainly to clean and transform the textual data. The methodology is conveniently segmented into three phases of text pre-processing, processing and information extraction as shown in Fig. 2. The three stages were however preceded by a data collection phase which primarily employed document analysis (Owen, 2014) techniques to gather the data. Particularly, the inclusion and exclusion
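The three phases described above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: it assumes the topic model is latent Dirichlet allocation fitted by collapsed Gibbs sampling (the reference list cites Blei et al.'s latent Dirichlet allocation, but this snippet does not name the algorithm), and the stop-word list, the toy abstracts, the hyperparameters and all function names are hypothetical.

```python
import random
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOPWORDS = {"the", "of", "and", "to", "a", "in", "for", "is", "on", "by"}

def preprocess(text):
    """Phase 1 (pre-processing): lowercase, tokenize, drop stop words and short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Phase 2 (processing): collapsed Gibbs sampling for LDA on tokenized documents."""
    rng = random.Random(seed)
    n_vocab = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d assigned to topic k
    topic_word = [Counter() for _ in range(n_topics)]  # occurrences of word w in topic k
    topic_total = [0] * n_topics                       # total tokens assigned to topic k
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + beta * n_vocab)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word, doc_topic

def top_terms(topic_word, n=5):
    """Phase 3 (information extraction): highest-count terms per topic, for human labelling."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

# Made-up abstracts standing in for the collected publications.
abstracts = [
    "Freedom of information requests concern disclosure of public records and the cost of access",
    "Agencies publishing records must handle disclosure requests and access fees",
    "Open government data portals rely on linked data technology and web platforms",
    "Data portals publish open datasets using web technology and linked data standards",
]
docs = [preprocess(a) for a in abstracts]
topic_word, doc_topic = fit_lda(docs, n_topics=2)
labels_input = top_terms(topic_word)
for k, terms in enumerate(labels_input):
    print(f"topic {k}: {terms}")
```

The term lists printed at the end are what an expert would inspect in order to assign a label to each topic. On a corpus this small the topics are not meaningful, and a real study would more likely rely on an established topic-modelling library than on a hand-written sampler.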

Data collection

The selection of research publications was guided by the following criteria: (i) the availability of automated search, (ii) the quality and prominence of publications and (iii) the reputation of the bibliographic database. In line with these considerations, we settled on the Web of Science and Scopus, widely recognized as the two most prominent bibliographic databases (Aghaei Chadegani et al., 2013; Wang & Waltman, 2016).

Journal articles, conference proceedings and book chapters were the only kinds of

Research questions

To get the most out of the comparative literature survey, a set of pre-defined questions was used to guide the study. This approach was particularly useful as it helped to map the results generated to the research questions. In all, the questions were designed to aid in ‘framing’ trending concepts in FOI and OGD as captured in research publications. The following questions guided the research:

RQ1. What are the central themes in FOI and OGD research publications?

RQ2. How do similarities and

Topic interpretation

FOI

The topic labelling or classification was done by interpreting what a body of topics appears to convey. Guided by expert knowledge and the literature, it was found that a number of the topics fall under relevant issues in the two campaigns. In row 1 of Table 4, for instance, the topic label apparently frames issues relating to some FOI guiding principles and key operational terms. This is because most authoritative texts on FOI, particularly those that focus on Article 19 (

Discussion and conclusion

Several decades have passed since the Freedom of Information Act (FOIA) was conceived as a means of providing access to public data, with a view to entrenching the values of democracy. After many years of global successes and challenges in implementation, a similar movement in the form of open government data (OGD) was launched to help support the idea of greater openness and accountability in governance. Though run independently, the two campaigns continue to draw the world’s attention to

References (64)

  • Q. Wang et al.

    Large-scale analysis of the accuracy of the journal classification systems of Web of Science and Scopus

    Journal of Informetrics

    (2016)
  • A. Whitmore

    Extracting knowledge from US Department of Defense Freedom of Information Act requests with social media

    Government Information Quarterly

    (2012)
  • D. Xing et al.

    Employing latent Dirichlet allocation for fraud detection in telecommunications

    Pattern Recognition Letters

    (2007)
  • E. Afful-Dadzie et al.

    Framing media coverage of the 2014 Sony Pictures Entertainment hack: A topic modelling approach

    Proceedings of the 11th international conference on cyber warfare and security: ICCWS2016

    (2016)
  • C.C. Aggarwal et al.

    Mining text data

    (2012)
  • A. Aghaei Chadegani et al.

    A comparison between two main academic literature collections: Web of Science and Scopus databases

    Asian Social Science

    (2013)
  • I. Bíró et al.

    Latent Dirichlet allocation in web spam filtering

    Proceedings of the 4th ACM international workshop on adversarial information retrieval on the web

    (2008)
  • T. Berners-Lee

    Linked data-design issues

    (2006)
  • D.M. Blei et al.

    Latent Dirichlet allocation

    Journal of Machine Learning Research

    (2003)
  • D.M. Blei

    Probabilistic topic models

    Communications of the ACM

    (2012)
  • C. Carrasco et al.

    Open government data: An assessment of the Spanish municipal situation

    Social Science Computer Review

    (2015)
  • J. Chang et al.

    Reading tea leaves: How humans interpret topic models

    Proceedings in Advances in Neural Information Processing Systems

    (2009)
  • Y. Charalabidis et al.

    A taxonomy of open government data research areas and topics

    Journal of Organizational Computing and Electronic Commerce

    (2016)
  • S. Chignard

    A brief history of open data

    (2013)
  • J. Donnelly

    Universal human rights in theory and practice

    (2013)
  • D. Downey et al.

    Locating complex named entities in web text

    Proceedings of International Joint Conference on Artificial Intelligence (IJCAI)

    (2007)
  • A.M. Evans et al.

    Open government initiatives: Challenges of citizen participation

    Journal of Policy Analysis and Management

    (2013)
  • H.N. Foerstel

    Freedom of information and the right to know: The origins and applications of the Freedom of Information Act

    (1999)
  • L. Frey

    Open government partnership four-year strategy 2015–2018

    (2014)
  • C.P. Geiger et al.

    Open government and (linked)(open)(government)(data)

    JeDEM-eJournal of eDemocracy and Open Government

    (2012)
  • B.S. Gigler et al.

    Realizing the vision of open government data: Opportunities, challenges, and pitfalls

    (2011)
  • M.E. Halstuk et al.

    The Freedom of Information Act 1966–2006: A retrospective on the rise of privacy protection over the public interest in knowing what the government’s up to

    Communication Law and Policy

    (2006)