# The Paradox of Leading AI Models: Transparency
A study published on the 18th by Stanford University researchers reveals how deep, and potentially dangerous, the secrecy surrounding GPT-4 and other cutting-edge AI systems has become.
*The Foundation Model Transparency Index, Stanford University*
The researchers examined ten AI systems in total, most of them large language models of the kind behind ChatGPT and other chatbots, including widely used commercial models such as OpenAI's GPT-4, Google's PaLM 2, and Amazon's Titan Text. They assessed openness against 13 criteria, including how transparent the developers were about the data used to train each model (how the data was collected and annotated, whether it included copyrighted material, and so on). They also checked whether the companies disclosed the hardware used to train and run the models, the software frameworks involved, and the projects' energy consumption.
The results showed that no model scored higher than 54% on the transparency scale across all criteria. Amazon's Titan Text ranked as the least transparent overall, while Meta's Llama 2 ranked as the most open. Interestingly, even Llama 2, the flagship open-source model in the recent standoff between open and closed approaches, did not disclose the data used in its training or how that data was collected and curated. In other words, despite AI's growing influence on our society, opacity remains a widespread and persistent phenomenon across the industry.
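As an illustration only, a percentage-based transparency score of the kind the index reports can be sketched as the fraction of disclosure criteria a developer satisfies. The criterion names below are hypothetical placeholders, not the study's actual indicators:

```python
# Hypothetical sketch of a transparency score: grade a model on a set of
# binary disclosure criteria and report the percentage met.
# Criterion names are illustrative, not the study's real indicators.

CRITERIA = [
    "training_data_sources",   # where the training data came from
    "data_annotation",         # how the data was labeled/curated
    "copyright_status",        # whether copyrighted material was included
    "hardware_disclosed",      # compute used to train/run the model
    "software_frameworks",     # frameworks used in training
    "energy_consumption",      # energy used by the project
]

def transparency_score(disclosures: dict) -> float:
    """Return the percentage (0-100) of criteria the developer discloses."""
    met = sum(1 for c in CRITERIA if disclosures.get(c, False))
    return 100 * met / len(CRITERIA)

# Example: a model disclosing 3 of the 6 placeholder criteria scores 50%.
example = {
    "hardware_disclosed": True,
    "software_frameworks": True,
    "energy_consumption": True,
}
print(transparency_score(example))  # 50.0
```

Under this kind of scheme, the study's finding that no model exceeded 54% simply means no developer disclosed much more than half of what was asked.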
This suggests that the AI industry risks becoming a profit-driven business rather than a field of scientific advancement, and that it could slide toward a monopolistic future dominated by a handful of companies.
OpenAI CEO Sam Altman has already met with policymakers around the world, openly expressing his willingness to explain this unfamiliar new intelligence to them and to help shape related regulation. Yet while he supports the idea of an international body to oversee AI in principle, he believes certain narrower rules, such as banning all copyrighted material from training datasets, could become unfair obstacles. It is a clear sign that the "openness" embedded in the name OpenAI has drifted far from the radical transparency the company proclaimed at its launch.
As the Stanford report's results show, however, there is little need to keep each model so secret for the sake of competition, because the results also expose shortcomings shared by nearly all the companies. For example, reportedly no company provides statistics on how many users rely on its models or on the regions and market segments where those models are used.
Among organizations that follow open-source principles, there is a saying known as Linus's law: "Given enough eyeballs, all bugs are shallow." Sheer numbers of observers help surface problems that can then be solved and fixed.
However, open-source practice has gradually been losing standing and recognition both inside and outside public companies, so there is little point in championing it unconditionally. Rather than clinging to the frame of whether a model is open or closed, it is a better choice to focus the discussion on gradually expanding external access to the "data" that underpins powerful AI models.
Scientific progress requires reproducibility: the ability to confirm that specific research results can be obtained again. Unless concrete plans are put in place to ensure transparency about the key components of each model's creation, the industry is likely to remain closed, stagnant, and monopolistic. This deserves high priority now and in the future, as AI technology rapidly permeates every industry.
Understanding the data has become essential for journalists and scientists, and transparency is a prerequisite for any deliberate policy effort by policymakers. Transparency matters for the public as well, since end users of AI systems can be either perpetrators or victims of potential problems involving intellectual property, energy consumption, and bias. Sam Altman argues that the risk of human extinction from AI should be a global priority, on par with societal-scale risks such as pandemics or nuclear war. But we should not forget that our society's survival, maintaining a healthy relationship with developing AI, is a prerequisite for ever reaching the dangerous situation he describes.
*This article was originally published as a signed column in the e-newspaper on October 23, 2023.