Abstract
Almost all software, open or closed, builds on open source software and therefore needs to comply with the license obligations of that open source code. Not knowing which licenses to comply with poses a legal danger to anyone using open source software. This article investigates the extent of inconsistencies between the licenses declared by an open source project at the top level of its repository and the licenses found in the code. We analyzed a sample of 1,000 open source GitHub repositories. We find that about half of the repositories did not fully declare all licenses found in the code. Of these, approximately 10% represented a permissive vs. copyleft license mismatch. Furthermore, existing tools cannot fully identify licenses. We conclude that users of open source code should not just look at the declared licenses of the code they intend to use, but should examine the software itself to understand its actual licenses.
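The comparison described in the abstract, declared licenses versus licenses found in the code, can be illustrated with a minimal Python sketch. This is not the authors' tooling. The function names, the input shape (a mapping from file path to detected SPDX identifiers), the small license tables, and the rule used to flag a permissive vs. copyleft mismatch are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set

# Hypothetical and deliberately incomplete classification of SPDX identifiers.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only", "LGPL-2.1-only"}
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}


def category(license_id: str) -> str:
    """Map a license identifier to a coarse category."""
    if license_id in COPYLEFT:
        return "copyleft"
    if license_id in PERMISSIVE:
        return "permissive"
    return "unknown"


def compare_licenses(
    declared: Iterable[str], found_by_file: Dict[str, Set[str]]
) -> dict:
    """Compare top-level declared licenses with licenses detected per file.

    Assumed mismatch rule: an undeclared license belongs to a known category
    (permissive or copyleft) that none of the declared licenses belongs to.
    """
    declared_set = set(declared)
    # Union of every license detected in any file (empty set if no files).
    found = set().union(*found_by_file.values())
    undeclared = found - declared_set

    known = {"permissive", "copyleft"}
    declared_cats = {category(lic) for lic in declared_set}
    undeclared_cats = {category(lic) for lic in undeclared}
    category_mismatch = bool(declared_cats & known) and bool(
        (undeclared_cats & known) - declared_cats
    )

    return {
        "fully_declared": not undeclared,
        "undeclared": undeclared,
        "category_mismatch": category_mismatch,
    }


# Example: a repository declaring MIT that vendors a GPL-licensed file.
report = compare_licenses(
    ["MIT"],
    {"src/main.c": {"MIT"}, "vendor/lib.c": {"GPL-3.0-only"}},
)
```

In this example the report marks the repository as not fully declared and flags a category mismatch, because a copyleft license appears in the code while only a permissive license is declared. A real analysis would additionally need a license scanner to produce the per-file results, and the abstract notes that existing tools cannot fully identify licenses.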