Variant: Why Better AI Needs Crypto


Reprinted from jinse

01/17/2025

Author: Daniel Barabander, General Counsel and Partner at Variant Fund; translated by 0xjs@金财经

Key points of this article

  • Foundational AI development is currently dominated by a handful of technology companies and is closed and anti-competitive.

  • Open source software development is an alternative, but foundational AI cannot be built as a traditional open source software project (like Linux) because it faces a "resource problem": open source contributors would also need to donate computing and data resources far beyond any individual's means.

  • Crypto solves the resource problem by using ownership to incentivize resource providers to contribute to open source foundational AI projects.

  • Open source AI combined with crypto can support much larger models and drive more innovation, leading to better AI.

Introduction

A 2024 Pew Research Center poll found that 64% of Americans believe social media has had a more negative than positive impact on the United States; 78% say social media companies have too much power and influence in today's politics; and 83% say it is likely or very likely that these platforms deliberately censor political views they disagree with. Distaste for social media platforms is one of the few issues that unites Americans.

Looking back at how the social media experiment has unfolded over the past 20 years, it seems we were destined to end up where we are. You all know the story. A handful of big tech companies initially attracted attention and, most importantly, user data. Although there was early hope that this data would remain open, the companies quickly reversed course, shutting down access once they had used the data to build insurmountable network effects. This led to today's situation: fewer than a dozen large social media companies sit like small feudal fiefdoms in an oligopoly, with no incentive to change because the status quo is extremely profitable. It is closed and anti-competitive.

Looking at where the AI experiment is heading, I feel like I'm watching the same movie again, but this time the stakes are far higher. A handful of large tech companies have amassed the GPUs and data needed to build foundational AI models and have blocked access to those models. New entrants who have not raised billions of dollars can no longer build competing versions because the barriers to entry are so high: the compute capex for pre-training a foundation model alone runs into the billions of dollars, and the social media companies that benefited from the last tech boom are using their control of proprietary user data to build models that rivals cannot. We are well on our way to recreating in AI what happened in social media: a closed and anti-competitive world. If we continue down this path of closed AI, a handful of tech companies will have unfettered control over access to information and opportunity.

Open source AI and the “resource problem”

If we don’t want a closed AI world, what are the alternatives? The obvious answer is to build foundation models as open source software projects. We have countless examples of open source projects producing the foundational software we rely on every day. If Linux shows that something as fundamental as an operating system can be built open source, why should LLMs be any different?

Unfortunately, foundation models have constraints that distinguish them from traditional software and severely hinder their viability as traditional open source projects. Specifically, they require computing and data resources beyond the means of any individual. As a result, unlike traditional open source projects that rely only on people donating their time (already a hard problem), open source AI also requires people to donate resources in the form of compute and data. This is open source AI's "resource problem".

To better understand the resource problem, consider Meta's LLaMa models. Meta differs from competitors such as OpenAI and Google in that it does not hide its models behind a paid API; instead, it makes LLaMa's weights openly available for anyone to use for free (with some restrictions). These weights encode what the model learned during Meta's training run and are required to run the model. With the weights in hand, anyone can fine-tune the model or use its outputs as inputs to a new model.
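To make this concrete, here is a minimal sketch of what open weights enable in practice. It is my own illustration, assuming the Hugging Face transformers library; the checkpoint id is illustrative, and gated LLaMa repositories require accepting Meta's license:

```python
# Minimal sketch: open weights let anyone run, inspect, and fine-tune a model
# locally, which a closed, API-only model does not allow.
# Assumes `pip install transformers torch`; the checkpoint id is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative; gated behind Meta's license

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Run inference with locally held weights...
inputs = tokenizer("Open source AI is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))

# ...or fine-tune them on your own data with an ordinary training loop,
# since every parameter is exposed via model.parameters().
```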

While Meta deserves credit for releasing LLaMa's weights, this is not a true open source software project. Meta trains the model privately, using its own compute, its own data, and its own judgment, and unilaterally decides when to release it to the world. Meta does not invite independent researchers or developers into the process, because no individual community member can afford the compute or data required to train or retrain the models: tens of thousands of high-memory GPUs, the data centers that house them, massive cooling infrastructure, and trillions of tokens of training data. As Stanford University's 2024 AI Index report puts it, "the rising cost of training has effectively excluded universities, traditionally centers of AI research, from developing their own cutting-edge foundation models." To get a sense of the costs: Sam Altman has said that GPT-4 cost on the order of US$100 million to train, a figure that likely excludes capital expenditures; and Meta's capital expenditures rose by US$2.1 billion year-over-year (Q2 2024 vs. Q2 2023), driven largely by investments in the servers, data centers, and network infrastructure associated with training AI models. So while LLaMa's community contributors may have the technical ability to iterate on the base model architecture, they lack the means to do so.

In summary, unlike traditional open source software projects, which ask contributors only for their time, open source AI projects ask contributors for time plus substantial costs in the form of compute and data. Relying on goodwill and volunteerism to motivate enough parties to provide these resources is unrealistic; further incentives are needed. Perhaps the best counter to this claim is the success of BLOOM, the open source 176B-parameter LLM built by roughly 1,000 volunteer researchers from over 70 countries and more than 250 institutions. While this was certainly an impressive achievement (and one I fully support), coordinating a single training run took a year and required €3 million in funding from a French research agency, a figure that excludes the capital expenditure for the supercomputer the model was trained on, which a French institution already had access to. The process of coordinating and securing new grants to iterate on BLOOM is too cumbersome and bureaucratic to match the pace of big tech labs. More than two years after BLOOM's release, I am not aware of any follow-up models from the collective.

To make open source AI possible, we need a way to incentivize resource providers to contribute their compute and data without requiring open source contributors to bear those costs themselves.

Why Crypto Can Solve Open Source AI’s Resource Problem

Crypto’s breakthrough is using ownership to make resource-intensive open source software projects possible. It solves open source AI's resource problem by giving resource providers speculative upside in the network, rather than requiring open source contributors to front the costs of those resources.

For proof, look no further than the original crypto project, Bitcoin. Bitcoin is an open source software project; the code that runs it has been completely open since day one. But the code itself is not the secret sauce: there is little value in downloading and running the Bitcoin node software to create a blockchain that exists only on your own computer. The software is only useful when enough compute is devoted to mining blocks that it exceeds the power of any single contributor. Only then is the software's added value realized: maintaining a ledger that no one controls. Like open source foundational AI, Bitcoin is an open source software project that needs resources beyond the means of any single contributor. The two need this compute for different reasons, Bitcoin to make the network tamper-proof and foundational AI to iterate on models, but the broader point is the same: each requires resources beyond any single contributor in order to function as a viable open source software project.

The magic trick Bitcoin, and indeed every crypto network, uses to incentivize participants to contribute resources to an open source software project is to grant ownership of the network in the form of tokens. As Jesse wrote in Variant's founding thesis back in 2020, ownership incentivizes resource providers to contribute to a project in exchange for potential upside in the network. It is similar to how sweat equity is used to get a fledgling company off the ground: by paying early employees (such as the founders) primarily in ownership of the business, a startup can overcome its cold-start problem by accessing labor it could not otherwise afford. Crypto extends the concept of sweat equity beyond those who donate their time to those who provide resources. That is why Variant focuses on investing in projects that use ownership to build network effects, such as Uniswap, Morpho, and World.

If we want open source AI to be possible, ownership through crypto is the solution to its resource problem. Researchers would be free to contribute their model design ideas to open source projects, because the resources needed to realize those ideas would be supplied by compute and data providers in exchange for ownership in the project, rather than requiring researchers to pay exorbitant upfront costs. Ownership can take many forms in open source AI, but the one I am most excited about is ownership of the models themselves, like the approach proposed by Pluralis.

Pluralis calls this approach Protocol Models: compute providers contribute compute to train a specific open source model and in return gain ownership of that model's future inference revenue. Because ownership is tied to a specific model and its value derives from inference revenue, compute providers are incentivized to pick the best model and to train it honestly (supplying useless training would reduce the expected value of future inference revenue). The question then becomes: how can Pluralis enforce ownership if the weights must be sent to compute providers for training? The answer is that model parallelism is used to distribute shards of the model among workers, exploiting a key property of neural networks: a worker can contribute to training a larger model while seeing only a small fraction of the total weights, which keeps the complete weight set unextractable. And because many different models are trained on Pluralis, a trainer ends up holding many different sets of shards, making it extremely difficult to reconstruct any one model. This is the core idea of Protocol Models: they can be trained and used, but they cannot be extracted from the protocol (without expending more compute than it would take to train the model from scratch). This addresses a concern often raised by critics of open source AI, namely that closed AI competitors will appropriate the fruits of open projects' labor.
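To illustrate the mechanics, here is a toy sketch of that model-parallel setup (my own illustration in PyTorch, not Pluralis's actual implementation; the Worker class is hypothetical). The model is split into shards, each worker holds only its own shard's parameters, and only activations, never weights, cross worker boundaries:

```python
# Toy sketch of shard-per-worker model parallelism: no worker ever holds
# the full weight set, yet together they train one model end to end.
import torch
import torch.nn as nn

class Worker:
    """Holds exactly one shard of the model and its optimizer state."""
    def __init__(self, shard: nn.Module, lr: float = 1e-2):
        self.shard = shard
        self.opt = torch.optim.SGD(self.shard.parameters(), lr=lr)

    def forward(self, activations: torch.Tensor) -> torch.Tensor:
        # Only activations cross the worker boundary, never parameters.
        return self.shard(activations)

# Split a toy 3-layer network into three shards, one per worker.
workers = [
    Worker(nn.Sequential(nn.Linear(16, 32), nn.ReLU())),
    Worker(nn.Sequential(nn.Linear(32, 32), nn.ReLU())),
    Worker(nn.Linear(32, 1)),
]

x, y = torch.randn(8, 16), torch.randn(8, 1)

# Forward pass: each worker transforms the previous worker's activations.
h = x
for w in workers:
    h = w.forward(h)
loss = nn.functional.mse_loss(h, y)

# Backward pass: gradients flow back through every shard, and each worker
# updates only the parameters it actually holds.
for w in workers:
    w.opt.zero_grad()
loss.backward()
for w in workers:
    w.opt.step()

print(f"training loss: {loss.item():.4f}")
```

In a real deployment each worker would sit on a separate machine, so only activation tensors and their gradients would travel over the network; combined with each trainer holding shards of many unrelated models, that is what makes reassembling a complete weight set impractical.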

Why Crypto + Open Source = Better AI

I began this article by describing the problem of big tech control to explain why closed AI is bad from a normative perspective. But in a world where many of us feel fatalistic about our online experiences, I worry that may mean little to most readers. So I want to close with two reasons why open source AI powered by crypto will actually lead to better AI.

First, combining crypto with open source AI lets us reach the next tier of foundation models, because it can coordinate more resources than closed AI. Current research shows that more resources, in the form of compute and data, mean better models, which is why foundation models keep getting bigger. Bitcoin shows what open source software plus crypto can unlock in terms of compute: it is the largest and most powerful computing network in the world, orders of magnitude larger than the clouds of the big tech companies. Crypto turns isolated competition into cooperative competition. Resource providers are incentivized to contribute their resources to solving a collective problem rather than hoarding them to solve that problem individually (and redundantly). Open source AI using crypto will be able to harness the world's collective compute and data to build models far larger than anything closed AI can achieve. Companies like Hyperbolic have already demonstrated the power of pooling compute, letting anyone rent GPUs at lower prices on their open marketplace.
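As background for the claim that more compute and data yield better models (my addition, not a citation from the original article), the empirical scaling law fitted by Hoffmann et al. in the 2022 "Chinchilla" paper gives the relationship a concrete form:

```latex
% Chinchilla-style scaling law (Hoffmann et al., 2022): expected
% pre-training loss as a function of parameter count N and training
% token count D, where E is the irreducible loss and A, B, alpha,
% beta are empirically fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Because both correction terms shrink only as N and D grow together, loss improves predictably with more parameters and more training tokens, which is exactly why pooling the world's compute and data could push model quality past what any single closed lab can afford.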

Second, combining crypto with open source AI will drive more innovation. If we can overcome the resource problem, we can return to the highly iterative, innovative open source tradition of machine learning research. Before the recent arrival of foundational LLMs, machine learning researchers spent decades publicly releasing their models and the blueprints for reproducing them. Those models typically used more limited open datasets and had manageable compute requirements, which meant anyone could iterate on them. It was exactly this iteration that produced progress in sequence modeling, from RNNs to LSTMs to attention mechanisms, culminating in the Transformer architecture on which today's foundational LLMs rely. But all of this changed with the launch of GPT-3 (which reversed the open source trend of GPT-2) and the runaway success of ChatGPT, because OpenAI proved that if you throw enough compute and data at huge models, you can build LLMs that appear to understand human language. That created the resource problem: it priced academics out of the field, and big tech labs largely stopped releasing their model architectures publicly in order to preserve their competitive advantage. The current state, in which we depend on a handful of individual labs, limits our ability to push the boundaries of the state of the art. Open source AI enabled by crypto will let researchers resume this iterative process on cutting-edge models and discover the "next Transformer".
