
An Open Data Platform to Advance Gender Equality in STEM in Latin America

Published: 16 July 2024
Expanding the involvement of women in Science, Technology, Engineering, and Mathematics (STEM) across Latin America is crucial for economic advancement, social equity, and global competitiveness; however, these efforts have proven challenging. Women in the region are underrepresented in STEM [10] and even more so in leadership positions [17, 18]. The limited availability of current information and the difficulty of obtaining reliable data hinder the implementation of policies to reduce the gender gap in STEM. Researchers, organizations, and policymakers working to reduce the gender gap need access to dependable data to understand the root causes of gender disparities, promote evidence-based interventions, and increase accountability and transparency.
In the quest for solutions to these challenges, an international research network spanning Bolivia, Brazil, and Peru, “Equality in Leadership for Latin America STEM” (ELLAS), emerged in 2022 [6]. This network, formed by eight Latin American universities and one from the U.S., runs the research project entitled “Latin American Open Data for Gender Equality Policies Focusing on Leadership in STEM,” funded by the International Development Research Centre (Project ID #109798).
The project’s objective is to generate and promote the use of a cross-country comparable open data platform on gender disparity within STEM in the involved countries [13], with a focus on leadership [14]. To this end, it is essential to define an architecture that can handle the complete process of data curation.
In this article, we present an innovative architecture that supports the curation of different data sources, from raw data to data consumption by individual users such as researchers, policymakers, and decision makers working on STEM and gender issues. This architecture makes it easier for users to locate and access trustworthy information concerning gender policies, initiatives, and contextual factors by consolidating it into a single source. This contrasts with the current scattering of such information across various formats, vocabularies, and sources.
The Open Data ELLAS Platform Architecture is composed of three layers, as presented in the accompanying figure. Starting from the bottom, the data layer organizes two different types of data sources: “primary data,” which comprises mostly unstructured data in PDF format (that is, academic papers), data from social media, and data collected via a survey, for which data fields have been identified about contextual factors, initiatives, and policies related to gender representation and leadership; and “secondary data,” which comprises semi-structured data about women in STEM in Latin America from various websites of national and international organizations [3, 12, 15, 16]. This layer relies on the collaboration of multidisciplinary teams to curate the data, ensuring its readiness for integration into the subsequent layer.
Figure. Open Data ELLAS platform architecture.
The processing layer involves collecting structured comma-separated values (CSV) files for the process of ontology modeling, which represents the knowledge around policies, factors, and initiatives in three languages (Portuguese, English, and Spanish). The tool Protégé is used to model the ontology, which is created in Web Ontology Language (OWL). The next process is semantic mapping, which materializes the knowledge graph [7]: primary and secondary data structured in CSV files are instantiated into the OWL ontology and become Resource Description Framework (RDF) data through mapping technologies like the Ontotext Refine tool. This process generates a mapping file in JavaScript Object Notation (JSON) format that can be reused to update the graph as new data is generated. These three processes form one complex pipeline orchestrated and integrated by Pentaho and Python technologies. This layer depends on the work of platform developers such as app and ontology developers. The processing layer also includes knowledge graph integration, which involves triplification, where specific knowledge graphs from different data sources come together and are stored in the GraphDB triplestore.
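The semantic mapping and integration steps can be illustrated with a minimal sketch in plain Python. It is not the ELLAS pipeline, which relies on Ontotext Refine, Pentaho, and GraphDB; it only shows the general idea of turning CSV rows into RDF triples with a reusable JSON mapping and then merging two graphs. The namespace `http://example.org/ellas/`, the class and property names, the mapping layout, and the sample rows are all hypothetical placeholders.

```python
import csv
import io
import json
from urllib.parse import quote

# Hypothetical namespace: the article does not give the real ontology IRIs.
BASE = "http://example.org/ellas/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical, simplified stand-in for a reusable JSON mapping file:
# it names the CSV column that identifies each resource, the ontology
# class to instantiate, and the property assigned to each CSV column.
MAPPING_JSON = """
{
  "subject_column": "id",
  "class": "Initiative",
  "properties": {"name": "name", "country": "country"}
}
"""


def escape_literal(value):
    """Escape a string so it can be written as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def rows_to_triples(csv_text, mapping):
    """Instantiate each CSV row as a set of N-Triples statements."""
    triples = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(row[mapping['subject_column']])}>"
        triples.add(f"{subject} <{RDF_TYPE}> <{BASE}{mapping['class']}> .")
        for column, prop in mapping["properties"].items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty cells instead of emitting empty literals
                triples.add(
                    f'{subject} <{BASE}{prop}> "{escape_literal(value)}" .'
                )
    return triples


mapping = json.loads(MAPPING_JSON)

# Made-up sample rows standing in for curated primary and secondary data.
PRIMARY_CSV = "id,name,country\ninit-1,Example Initiative A,Brazil\n"
SECONDARY_CSV = (
    "id,name,country\n"
    "init-1,Example Initiative A,Brazil\n"
    "init-2,Example Initiative B,Peru\n"
)

primary = rows_to_triples(PRIMARY_CSV, mapping)
secondary = rows_to_triples(SECONDARY_CSV, mapping)

# Integration: an RDF graph is a set of triples, so merging two graphs is a
# set union, and statements found in both sources are stored only once.
merged = primary | secondary

for triple in sorted(merged):
    print(triple)
```

Because the mapping is data rather than code, the same function can be rerun on new CSV exports without changes, which is the property that makes a reusable mapping file convenient for updates.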
Finally, the application layer allows users to search, understand, and use data. This layer mediates access to data through an interface aimed at end users with no technical knowledge but with an interest in gender equality in STEM. Technical users can also access the knowledge graph in GraphDB and query the data with SPARQL through an application programming interface (API) or with a non-specific language. The development of this layer follows human-centered design approaches, such as value-sensitive design [8] and feminist theories [1]. All processes in the ELLAS platform use cloud services.
We actively engage stakeholders such as policymakers and researchers to identify requirements for our platform and to participate in potential interaction scenarios via quantitative and qualitative user studies [4].
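As an illustration of how a technical user might query a triplestore over HTTP, the sketch below builds a SPARQL SELECT request following the SPARQL 1.1 Protocol (a GET request with a `query` parameter). The server URL, the repository name `ellas`, and the `ellas:` vocabulary are assumptions made for the example, not details of the actual platform. The request is constructed but not sent.

```python
import urllib.parse
import urllib.request

# Hypothetical server location and repository name.
GRAPHDB_URL = "http://localhost:7200"
REPOSITORY = "ellas"

# Hypothetical vocabulary: the class and property names are placeholders.
QUERY = """
PREFIX ellas: <http://example.org/ellas/>
SELECT ?initiative ?name
WHERE {
  ?initiative a ellas:Initiative ;
              ellas:name ?name ;
              ellas:country "Peru" .
}
LIMIT 10
"""


def build_sparql_request(base_url, repository, query):
    """Build (without sending) a GET request for a SPARQL SELECT query."""
    params = urllib.parse.urlencode({"query": query})
    url = f"{base_url}/repositories/{repository}?{params}"
    # Ask for results in the standard SPARQL JSON results format.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = build_sparql_request(GRAPHDB_URL, REPOSITORY, QUERY)
print(request.get_method(), request.full_url)
```

Against a running endpoint, passing `request` to `urllib.request.urlopen` would return a JSON document whose `results.bindings` list holds one entry per matching row.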

Data Layer Curation

To ensure the right amount of data is integrated in the processing layer, we defined a rigorous and replicable methodology for data curation, which includes identifying, collecting, and organizing primary and secondary data [2]. Here, we present the resulting instantiation of the data layer.
As shown in the accompanying table, for each kind of data, data sources were defined, as well as the appropriate collection techniques. Each collection of data was analyzed to select reliable and relevant data for our context. In addition, the table shows the number of instances in each data source.
All the selected data about policies [11], initiatives [9], and contextual factors [5] was transformed into a knowledge graph with more than 295,000 triples by the end of 2023.
Table. Data Layer Curation Results

| Kind of data | Data source | Collection techniques | Analyzed data |
| --- | --- | --- | --- |
| Primary Data | Survey Data | Survey Design | 10,000+ responses |
| Primary Data | Academic Papers | Systematic Literature Review | 352 about Latin American policies, 231 about international policies, 259 about contextual factors, 775 about initiatives, 74 about women leadership in STEM |
| Primary Data | Social Media | Systematic Gray Literature Review | 300+ profiles |
| Primary Data | Gray literature (Governmental websites, official reports, and more) | Systematic Gray Literature Review | 26 |
| Secondary Data | Open Data websites | Web scraping | 8 |
For access to the ELLAS platform and to learn more about the project, visit the ELLAS website [6].

Final Remarks

In this article, we described the three-layer architecture of the open data platform and the resulting instantiation of the data layer. An open data platform focused on women in STEM and curated from different data sources gives users such as researchers, policymakers, and decision makers access to reliable information. Once the platform is finalized and published on the ELLAS website, a significant challenge lies in effectively engaging stakeholders to use it. While scientific contributions from the project have been disseminated in more than 30 academic papers and conference presentations [6], this outreach is insufficient. Hence, we have initiated efforts to secure public endorsements from interested groups such as universities and international organizations. This strategy aims to enhance awareness of the platform and encourage its use. Ultimately, the use of the platform has the potential to promote informed decision-making, transparency, and active public engagement in the development of gender equality policies for leadership in STEM. While this project began with three countries in Latin America, our aim is to expand to other countries in the region.
Cristiano Maciel is an associate professor at the Institute of Computing and the Graduate Program in Education at the Universidade Federal de Mato Grosso (UFMT), Cuiabá, Brazil. He is a postdoctoral researcher at California State Polytechnic University Pomona, USA, and general coordinator of ELLAS.
Indira R. Guzman is an assistant professor of Computer Information Systems and director of the MHC Center for Digital Innovation at the College of Business Administration at California State Polytechnic University Pomona, USA. She is a research consultant for ELLAS.
Rita Cristina Galarraga Berardi is an adjunct professor at the Universidade Tecnológica Federal do Paraná (UTFPR), Curitiba, Brazil. She leads the ELLAS project at UTFPR.
Nadia Rodriguez-Rodriguez is a principal professor on the Faculty of Engineering of Universidad de Lima (ULima), in Peru, and Dean for the 2023-2026 term. She leads the ELLAS project at ULima.
Luciana Salgado is an assistant professor at the Computer Science Department (DCC) of Universidade Federal Fluminense (UFF), Niterói, Brazil. She leads the ELLAS project at UFF.
Luciana Bolan Frigo is an associate professor at the Universidade Federal de Santa Catarina (UFSC), Brazil. She leads the ELLAS project at UFSC.
Boris Branisa is a professor at the Universidad Católica Boliviana San Pablo (UCB) in Bolivia. He leads the ELLAS project at UCB.
Elizabeth Jiménez is a professor at the Center for Interdisciplinary Development Studies (CIDES) at the Universidad Mayor de San Andrés (UMSA) in Bolivia. She leads the ELLAS project at UMSA.

References

[1] Bardzell, S. and Bardzell, J. Towards a feminist HCI methodology: Social science, feminism, and HCI. In Proceedings of the SIGCHI Conf. Human Factors in Computing Systems. ACM, New York, NY, USA, 675–684.
[2] Berardi, R. et al. ELLAS: Uma plataforma de dados abertos com foco em lideranças femininas em STEM no contexto da América Latina. Anais do XVII Women in Information Technology. Sociedade Brasileira de Computação, 2023, 124–135; https://sol.sbc.org.br/index.php/wit/article/view/25016
[3] CEUB. Comité Ejecutivo de la Universidad Boliviana, 2023; https://ceub.edu.bo/
[4] Creswell, J. and Creswell, J.D. Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. Sage Publications, 2022; https://us.sagepub.com/en-us/nam/research-design/book270550
[5] Drummond, B. et al. Mapping Contextual Aspects that Influences Women in Computing in Latin America. Interfases 018, (2023), 1930.
[6] ELLAS NETWORK. Equality in Leadership for Latin America STEM, 2023; https://ellas.ufmt.br/
[7] Fensel, D. et al. Introduction: What is a knowledge graph? Knowledge Graphs: Methodology, Tools and Selected Use Cases, 2020, 1–10.
[8] Friedman, B., Kahn, P., Borning, A., and Huldtgren, A. Value sensitive design and information systems. Early engagement and new technologies: Opening up the laboratory. Philosophy of Engineering and Technology 16, 2013. N. Doorn, D. Schuurbiers, I. van de Poel, and M. Gorman, (eds.) Springer, Dordrecht.
[9] Frigo, L.B. et al. Mapping women STEM initiatives in Latin American countries: Bolivia, Brazil, and Peru. Information Technology and Systems. A. Rocha, C. Ferrás, J. Hochstetter Diez, and M. Diéguez Rebolledo, eds. Springer Nature Switzerland, Cham, 2024, 401–409.
[10] Guzman, I.R. et al. Gender gap in IT in Latin America. AMCIS 2020 Proceedings; https://aisel.aisnet.org/amcis2020/panels/panels/4
[11] Guzman, I.R. et al. Gender equality policies in STEM in Latin America—A systematic literature review. Information Technology and Systems. A. Rocha, C. Ferrás, J. Hochstetter Diez, and M. Diéguez Rebolledo, eds. Springer Nature Switzerland, Cham, 2024, 410–419; https://link.springer.com/chapter/10.1007/978-3-031-54256-5_39
[12] INEP. Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira, 2023; www.inep.gov.br
[13] Keserű, J. and Kin-Sing Chan, J. The social impact of open data. In Proceedings of the 3rd Intern. Open Data Conf., (Ottawa, Canada, May 28–29, 2015); https://www.researchgate.net/publication/298646716_The_Social_Impact_of_Open_Data
[14] Maciel, C. et al. Open data platform to promote gender equality policies in STEM. In Proceedings of the Western Decision Sciences Institute. (Portland, OR, USA, Apr. 2023); https://wdsinet.org/Annual_Meetings/2023_Proceedings/papers/198.pdf
[15] SIES. Sistema de Información de Educación Superior, 2023; https://www.gob.pe/minedu
[16] UNESCO. Core Data Portal, 2023; https://core.unesco.org/en/home
[17] Wang, M.-T. and Degol, J.L. Gender gap in science, technology, engineering, and mathematics (STEM): Current knowledge, implications for practice, policy, and future directions. Educ Psychol Rev 29, (2017), 119–140.
[18] World Economic Forum. Global Gender Gap Report 2023; https://www.weforum.org/publications/global-gender-gap-report-2023

Published In

Communications of the ACM, Volume 67, Issue 8
August 2024
134 pages
EISSN: 1557-7317
DOI: 10.1145/3686019
Editor: James Larus

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 2024
Online First: 16 July 2024
Published in CACM Volume 67, Issue 8

Funding Sources

  • International Development Research Centre - IDRC
