Variant Investment Partner: The Dilemmas and Breakthroughs of Open Source AI, and Why Crypto Is the Last Piece of the Puzzle
Reprinted from ChainCatcher
01/19/2025 · Author: Daniel Barabander
Compiled by: Deep Wave TechFlow
Brief summary
- The development of foundational AI is currently dominated by a handful of technology companies and is characterized by closedness and a lack of competition.
- Although open source software development is a potential solution, foundational AI cannot operate like traditional open source projects (such as Linux) because it faces a "resource problem": open source contributors must not only donate their time, but also bear computing and data costs far beyond their personal means.
- Crypto is well positioned to solve this resource problem by incentivizing resource providers to participate in foundational open source AI projects.
- Combining open source AI with crypto can support larger-scale model development and drive more innovation, creating more advanced AI systems.
Introduction
According to a 2024 Pew Research Center survey, 64% of Americans believe social media does more harm than good to the country; 78% say social media companies have too much power and influence in politics; and 83% believe these platforms are likely to intentionally censor political views they disagree with. Dissatisfaction with social media has become one of the rare points of consensus in American society.
Looking back at the development of social media over the past 20 years, this outcome seems to have been inevitable. The story isn't complicated: a handful of big tech companies captured users' attention and, more importantly, their data. Despite initial hopes for open data, these companies quickly changed tactics, using the data to create unbreakable network effects and closing off access to outsiders. The result is today's situation: fewer than ten large technology companies dominate the social media industry in an oligopoly. Because the status quo is so favorable to them, these companies have little incentive to change. The model is closed and lacks competition.
Today, the trajectory of AI seems to be repeating this story, but this time the stakes are higher. A few technology companies have built foundation models by controlling GPU and data resources, and have closed off outside access to those models. For new entrants without billions of dollars in capital, developing a competitive model is nearly impossible: the computational cost of training a foundation model alone runs into the billions of dollars, and the social media companies that benefited from the last wave of technology are leveraging their control of proprietary user data to develop models that competitors cannot match. We are going the way of social media, heading toward a closed and non-competitive AI world. If this trend continues, a handful of technology companies will have unfettered control over access to information and opportunity.
Open source AI and the “resource problem”
If we don't want a closed AI world, what are our options? The obvious answer is to develop foundation models as open source software projects. We have countless examples of open source projects successfully building the foundational software we rely on every day. The success of Linux, for instance, proves that software as core as an operating system can be developed in the open. So why not LLMs (large language models)?
However, foundation models face special constraints that distinguish them from traditional software and greatly weaken their viability as traditional open source projects. Specifically, foundation models require enormous computing and data resources that are far beyond the means of any individual. Unlike traditional open source projects, which rely only on people donating their time, open source AI also requires people to donate computing power and data. This is the so-called "resource problem."
Meta's LLaMa model illustrates this resource problem well. Unlike competitors such as OpenAI and Google, Meta does not hide the model behind a paid API; instead, it publicly releases LLaMa's weights for anyone to use for free (with certain restrictions). The weights encode what the model learned during Meta's training run and are required to run the model. With them, a user can fine-tune the model or use its output as input to a new model.
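To make concrete what open weights enable, here is a minimal sketch of loading a released checkpoint and fine-tuning it locally with low-rank adapters. It assumes the Hugging Face transformers and peft libraries; the model name and hyperparameters are illustrative choices, not a prescription.

```python
# A minimal sketch of what open weights enable: loading a released
# checkpoint and fine-tuning it locally with low-rank adapters (LoRA).
# Assumes the Hugging Face `transformers` and `peft` libraries; the
# model name and hyperparameters below are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # an open-weight checkpoint (illustrative)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA trains small adapter matrices instead of the full weight set,
# so a single GPU can adapt a model whose original training run cost
# tens of millions of dollars.
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights are trainable
```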
Although Meta's release of the LLaMa weights is commendable, it is not a true open source software project. Meta controls the training process behind closed doors, relying on its own computing resources, data, and decisions, and unilaterally decides when to make the model available to the public. Meta does not invite independent researchers or developers to collaborate, because the resources required to train or retrain the model are far beyond the means of any individual: tens of thousands of high-performance GPUs, data centers to house them, elaborate cooling infrastructure, and trillions of tokens (the units of text data used in training). As Stanford University's 2024 Artificial Intelligence Index report points out, "the sharp rise in training costs has effectively excluded universities, traditionally the centers of AI research, from the development of top foundation models." For example, Sam Altman has mentioned that training GPT-4 cost upwards of $100 million, and that does not even include capital expenditures on hardware. Meta's capital expenditures increased by $2.1 billion in the second quarter of 2024 compared with the same period in 2023, mainly for servers, data centers, and network infrastructure related to AI model training. So although LLaMa's community contributors may have the technical ability to improve the model architecture, they lack the resources to implement those improvements.
In summary, unlike traditional open source software projects, open source AI projects require contributors not only to invest time but also to bear high computing and data costs. Relying solely on goodwill and volunteerism to motivate enough resource providers is unrealistic; they need further incentives. Take the open source large language model BLOOM as an example. This 176-billion-parameter model combined the efforts of 1,000 volunteer researchers from more than 250 institutions across over 70 countries. While BLOOM's success is admirable (and I fully support it), it took a year to coordinate a single training run and relied on a €3 million grant from a French research agency (not counting the capital expenditure on the supercomputer used to train the model). Coordinating another BLOOM iteration and raising a new round of funding is too cumbersome a process to match the pace of development in large tech labs. More than two years have passed since BLOOM's release, and the team has not announced any follow-up models.
To make open source AI possible, we need to find a way to incentivize resource providers to contribute their computing power and data resources, rather than having open source contributors bear these costs themselves.
Why crypto can solve the "resource problem" of foundational open source AI
The core breakthrough of crypto is that it makes high-resource-cost open source software projects possible through ownership. Crypto solves the resource problem of open source AI by incentivizing prospective resource providers to participate in the network, rather than requiring open source contributors to bear those resource costs up front.
Bitcoin is a good example. As the oldest crypto project, Bitcoin is a completely open source software project whose code has been public from the start. However, the code itself is not what matters; simply downloading and running the Bitcoin node software to create a blockchain locally has no real significance. The software's true value, maintaining a decentralized, uncontrolled ledger, is only realized when the computational effort required to mine blocks exceeds the capacity of any single contributor. Like foundational open source AI, Bitcoin is an open source project that requires resources beyond any individual's means. The two need computing resources for different reasons, Bitcoin to make the network tamper-proof and foundational AI to optimize and iterate on models, but both depend on resources beyond personal capability.
The "secret" by which Bitcoin, and indeed every other crypto network, incentivizes participants to contribute resources to an open source software project is to offer ownership of the network through tokens. As Jesse described in the founding thesis he wrote for Variant in 2020, ownership gives resource providers a strong incentive to contribute in exchange for potential upside in the network. The mechanism is similar to how startups solve the problem of insufficient early capital through "sweat equity": by paying early employees (such as founders) primarily in ownership of the company, startups can attract labor they could not otherwise afford. Crypto extends the concept of sweat equity from contributors of time to providers of resources. Accordingly, Variant focuses on investing in projects that use ownership mechanisms to build network effects, such as Uniswap, Morpho, and World.
If we want open source AI to become a reality, ownership mechanisms enabled by crypto are the key to solving the resource problem. They allow researchers to freely contribute their model design ideas to open source projects, because the computing and data resources required to realize those ideas are supplied by resource providers, who are compensated with partial ownership of the project, rather than requiring researchers to bear high upfront costs themselves. In open source AI, ownership can take many forms, but one of the most anticipated is ownership of the model itself, which is the solution proposed by Pluralis.
Pluralis's approach is called Protocol Models. Compute providers train a specific open source model by contributing computing power, and in return receive partial ownership of that model's future inference revenue. Because this ownership is tied to a specific model and its value depends on the model's inference revenue, compute providers are incentivized to select the best model to train and not to falsify training (doing useless work would directly reduce the expected value of future inference revenue). A key question follows: if training requires sending model weights to compute providers, how does Pluralis protect ownership? The answer is Model Parallelism, which distributes shards of the model across different workers. A useful property of neural networks is that a worker can participate in training while holding only a tiny fraction of the model's weights, so the complete weight set can never be extracted by any single party. And because many different models are trained on the Pluralis platform at the same time, a trainer is confronted with a large number of distinct weight sets, making it extremely difficult to reconstruct any complete model.
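To illustrate the model-parallel idea, here is a minimal sketch in PyTorch of splitting a network into shards held by different workers, so that no single worker ever sees the full weight set. This is only an illustration of the general technique, not Pluralis's actual implementation; the class and variable names are hypothetical.

```python
# A minimal sketch of model parallelism: the network is split into shards,
# each held by a different worker, so no single participant ever holds the
# full weight set. Plain PyTorch illustration of the general technique,
# not Pluralis's actual implementation; all names are hypothetical.
import torch
import torch.nn as nn

class WorkerShard(nn.Module):
    """One worker's slice of the model: a small stack of layers."""
    def __init__(self, dim: int, num_layers: int):
        super().__init__()
        self.layers = nn.Sequential(
            *[nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(num_layers)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)

dim = 512
# Four workers, each holding roughly a quarter of the parameters. In a real
# deployment each shard would live on a different machine; only activations,
# never weights, would cross the network boundary between them.
workers = [WorkerShard(dim, num_layers=2) for _ in range(4)]

x = torch.randn(8, dim)   # a batch of activations entering the pipeline
for shard in workers:
    x = shard(x)          # each worker computes its stage and hands off
print(x.shape)            # forward pass completes with no worker ever
                          # holding more than its own shard's weights
```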
The core idea of Protocol Models is that the models can be trained and used, but their full weights can never be extracted from the protocol (unless an attacker spends more compute than it would take to train the model from scratch). This addresses an objection often raised by critics of open source AI: that closed AI competitors could appropriate the fruits of open source projects' efforts.
Why crypto + open source = better AI
At the beginning of this article, I framed the problem with closed AI in ethical and normative terms, through the lens of big tech's control of AI. But in an internet age suffused with a sense of powerlessness, I worry that such an argument may not resonate with most readers. So let me offer two practical reasons why open source AI powered by crypto leads to genuinely better AI.
First, combining crypto with open source AI can coordinate far more resources toward the next generation of foundation models. Research shows that more computing power and more data both improve model performance, which is why foundation models keep growing in size. Bitcoin demonstrates what open source software plus crypto can do for computing power: it has become the world's largest and most powerful computing network, far larger than the cloud computing resources owned by big tech companies. What makes crypto unique is its ability to turn isolated competition into collaborative competition. Crypto networks use resources efficiently by incentivizing providers to contribute toward a common problem rather than working in isolation and duplicating effort. Open source AI powered by crypto would be able to draw on global computing and data resources to build models far larger than closed AI can. The company Hyperbolic, for example, has already demonstrated the potential of this model with an open marketplace through which anyone can rent GPUs at low cost, making full use of distributed computing resources.
Second, combining crypto with open source AI will accelerate innovation. Once the resource problem is solved, machine learning research can return to its highly iterative, innovative, open source roots. Before the arrival of foundational large language models (LLMs), machine learning researchers routinely released their models together with reproducible design blueprints. These models typically used open datasets and had modest computational requirements, so other researchers could keep optimizing and innovating on top of them. It was this open, iterative process that produced many of the breakthroughs in sequence modeling, from recurrent neural networks (RNNs) and long short-term memory networks (LSTMs) to attention mechanisms, that ultimately made the Transformer architecture possible. But this open approach to research changed after the launch of GPT-3. With the success of GPT-3 and ChatGPT, OpenAI proved that with enough computing resources and data, one can train a large language model capable of understanding language. This trend sharply raised the resource threshold, gradually shutting out academia, and large technology companies stopped disclosing their model architectures to preserve their competitive advantage. The current state of affairs limits our ability to push the frontier of AI.
Open source AI enabled by crypto can change this. It lets researchers iterate on frontier models again, in search of the "next Transformer." The combination not only solves the resource problem, but also reawakens the innovative vitality of machine learning research, opening a broader path for the future of AI.