COCO: Semantic-Enriched Collection of Online Courses at Scale with Experimental Use Cases

  • Conference paper
Trends and Advances in Information Systems and Technologies (WorldCIST'18 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 746)

Abstract

With the proliferation in number and scale of online courses, several challenges have emerged in supporting stakeholders during course delivery and use. Machine Learning and Semantic Analysis can add value to the underlying online environments and help overcome a subset of these challenges (e.g., classification, retrieval, and recommendation). However, conducting reproducible experiments in these applications remains an open problem because few datasets are available in Technology-Enhanced Learning (TEL), and those that exist are mostly small and local. In this paper, we propose COCO, a novel semantic-enriched collection of over 43K online courses, 16K instructors, and 2.5M learners who provided 4.5M ratings and 1.2M comments in total. COCO surpasses existing TEL datasets in scale, completeness, and comprehensiveness. Besides describing the collection procedure and the dataset structure, we present and analyze two potential use cases as meaningful examples of the wide variety of multi-disciplinary studies that COCO makes possible.

Notes

  1. Please contact the authors by e-mail to obtain a copy of the dataset.

  2. https://www.udemy.com/.

  3. https://www.udemy.com/developers/.

  4. http://www.seleniumhq.org/.

  5. https://www.udemy.com/spark-and-python-for-big-data-with-pyspark/.

  6. https://pypi.python.org/pypi/langdetect?.

  7. http://scikit-learn.org.

  8. http://www.nltk.org/.

  9. https://www.ibm.com/watson/services/natural-language-understanding/.

  10. http://surpriselib.com/.

Acknowledgments

Danilo Dessì and Mirko Marras acknowledge Sardinia Regional Government for the financial support of their PhD scholarship (P.O.R. Sardegna F.S.E. Operational Programme of the Autonomous Region of Sardinia, European Social Fund 2014–2020, Axis III “Education and Training”, Specific Goal 10.5).

Corresponding author

Correspondence to Mirko Marras.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Dessì, D., Fenu, G., Marras, M., Reforgiato Recupero, D. (2018). COCO: Semantic-Enriched Collection of Online Courses at Scale with Experimental Use Cases. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S. (eds) Trends and Advances in Information Systems and Technologies. WorldCIST'18 2018. Advances in Intelligent Systems and Computing, vol 746. Springer, Cham. https://doi.org/10.1007/978-3-319-77712-2_133

  • DOI: https://doi.org/10.1007/978-3-319-77712-2_133

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77711-5

  • Online ISBN: 978-3-319-77712-2

  • eBook Packages: Engineering, Engineering (R0)
