Kaitiaki: closing the door on open Indigenous data

International Journal on Digital Libraries

Abstract

The mainstream narrative about open data excludes any consideration of Indigenous voices. Instead, there is primarily Western rhetoric reiterating the benefits and advantages it provides for society. The discussion fails to recognise the negative impact open data can have on Indigenous peoples. This paper explores the problem of open data in digital repositories and discusses why open data is harmful in the development of Indigenous natural language processing tools. It begins with a brief introduction to open data before examining the negative impact that open data has on Indigenous peoples by describing what data is collected, how data is collected, who has access to the data and what the data is used for. The paper then offers an alternative solution to reach an ideal state for good data sharing by drawing on the experiences of Te Reo Irirangi o Te Hiku o te Ika (Te Hiku Media), a tribal media hub based in Aotearoa New Zealand. It provides an example of how a small Indigenous community-based organisation collects, stores, and protects its data.

Notes

  1. Social security number, national insurance number.

Acknowledgements

The work completed by Te Hiku Media that has led to this paper was funded by the Ministry of Business Innovation and Employment through the Strategic Science Investment Fund and by Te Puni Kōkiri through the Ka Hao fund.

Author information

Contributions

All authors contributed, read and approved the final manuscript.

Corresponding author

Correspondence to Gianna Leoni.

Ethics declarations

Conflict of interest

All authors are currently employed by Te Reo Irirangi o Te Hiku o te Ika.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jones, PL., Mahelona, K., Duncan, S. et al. Kaitiaki: closing the door on open Indigenous data. Int J Digit Libr 26, 1 (2025). https://doi.org/10.1007/s00799-025-00410-2

  • DOI: https://doi.org/10.1007/s00799-025-00410-2
