Open Source Initiative Seeks Crowd Input on Defining Open Source AI
The Open Source Initiative (OSI), the nonprofit that maintains the Open Source Definition, which sets out the criteria a software license must meet to count as open source, is asking the public for input on its effort to define Open Source AI.
The organization is launching a series of global workshops to gather feedback from stakeholders regarding its Open Source AI Definition, a topic that has been under deliberation for the past two years.
The difficulty is that there is no widely accepted way to determine whether an AI system qualifies as open source, even though many machine learning models are already available under open source licenses such as MIT, GPL 3.0, GPL 2.0, and AFL 3.0.
One concern is that the legal terminology in existing OSI-approved licenses may not adequately cover how machine learning models and datasets are used. For instance, a term like “program” has to encompass more than just source code and binary files when it is applied to a machine learning model.
Stefano Maffulli, executive director of the OSI, said that AI differs from traditional software and that all stakeholders need to reassess how Open Source principles apply in this domain.
The OSI believes in empowering individuals and organizations to maintain control and oversight of technology while promoting transparency, collaboration, and innovation without restrictive permissions.
To gather feedback on its latest draft, currently version 0.0.8, the OSI will hold workshops at upcoming conferences in the US, Europe, Africa, Asia, the Pacific, and Latin America, continuing through September.
Bruce Perens, the original drafter of the Open Source Definition, expressed skepticism about the necessity of addressing AI separately, arguing that the fundamental issue lies in the broader software industry mislabeling non-open source software as open source.
Perens cautioned that introducing a separate definition for AI could dilute the open source brand, given that the existing Open Source Definition already applies to all software.
He argued that an inherent problem with AI is that its output often involves elements of plagiarism, because large language models are trained on material from many sources, including websites and open source software, without adequate regard for copyright. Perens suggested that legal proceedings may ultimately settle these concerns, much as they resolved the issues raised by platforms like Napster in the past.