These startups build advanced AI models without data centers

by admin

Researchers have trained a new kind of large language model (LLM) using GPUs dotted around the world and fed with private as well as public data, a move that suggests the dominant way of building artificial intelligence could be disrupted.

Flower AI and Vana, two startups pursuing unconventional approaches to building AI, worked together to create the new model, called Collective-1.

Flower created techniques that allow training to be spread across hundreds of computers connected over the internet. The company's technology is already used by some firms to train AI models without needing to pool compute resources or data. Vana provided sources of data including private messages from X, Reddit, and Telegram.

Collective-1 is small by modern standards, with 7 billion parameters (values that combine to give the model its abilities), compared with hundreds of billions for today's most advanced models, such as those that power programs like ChatGPT, Claude, and Gemini.

Nic Lane, a computer scientist at the University of Cambridge and cofounder of Flower AI, says that the distributed approach promises to scale far beyond the size of Collective-1. Lane adds that Flower AI is halfway through training a model with 30 billion parameters using conventional data, and plans to train another model with 100 billion parameters, the size offered by industry leaders, this year. “It could really change the way everyone thinks about AI, so we are pursuing this quite hard,” Lane says. He says the startup is also incorporating images and audio into training to create multimodal models.

Distributed model-building could also unsettle the power dynamics that have shaped the AI industry.

AI companies currently build their models by combining vast amounts of training data with huge quantities of compute concentrated inside data centers filled with advanced GPUs that are networked together using super-fast fiber-optic cables. They also rely heavily on datasets created by scraping publicly accessible, although sometimes copyrighted, material, including websites and books.

The approach means that only the richest companies, and nations with access to large quantities of the most powerful chips, can develop the most powerful and valuable models. Even open source models, such as Meta's Llama and DeepSeek's R1, are built by companies with access to large data centers. Distributed approaches could allow smaller companies and universities to build advanced AI by pooling disparate resources. Or they could allow countries that lack conventional infrastructure to network together several data centers to build a more powerful model.

Lane believes that the AI industry will increasingly look toward new methods that allow training to break out of individual data centers. The distributed approach “allows you to scale compute much more elegantly than the data center model,” he says.

Helen Toner, an expert on AI governance at the Center for Security and Emerging Technology, says Flower AI's approach is “interesting and potentially very relevant” to AI competition and governance. “It will probably continue to struggle to keep up with the frontier, but could be an interesting fast-follower approach,” Toner says.

Divide and conquer

Distributed AI training involves rethinking the way the calculations used to build powerful AI systems are divided up. Creating an LLM involves feeding huge amounts of text into a model that adjusts its parameters in order to produce useful responses to a prompt. Inside a data center, the training process is divided up so that parts can be run on different GPUs, and then periodically consolidated into a single master model.
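To make the divide-and-consolidate idea concrete, here is a minimal sketch in Python. It is not Flower AI's method or code: it assumes a toy two-parameter model, made-up data (y = 3x + 1) in place of text, and plain averaging as the consolidation step. All function names are hypothetical.

```python
import random

def make_shards(num_workers, samples_per_worker, seed=0):
    """Create toy data y = 3x + 1, split across workers (a stand-in for text shards)."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_workers):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(samples_per_worker)]
        shards.append([(x, 3.0 * x + 1.0) for x in xs])
    return shards

def local_steps(params, shard, steps, lr, rng):
    """Run a few SGD steps on one worker's shard, starting from the shared parameters."""
    w, b = params
    for _ in range(steps):
        x, y = rng.choice(shard)
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return (w, b)

def average(param_list):
    """Consolidate the workers' copies into a single master model by averaging."""
    n = len(param_list)
    return tuple(sum(p[i] for p in param_list) / n for i in range(2))

def train(num_workers=4, rounds=50, steps_per_round=20, lr=0.1, seed=0):
    rng = random.Random(seed)
    shards = make_shards(num_workers, 100, seed)
    master = (0.0, 0.0)
    for _ in range(rounds):
        # Each worker trains independently on its own shard...
        copies = [local_steps(master, shard, steps_per_round, lr, rng) for shard in shards]
        # ...and the copies are periodically merged into the master model.
        master = average(copies)
    return master

if __name__ == "__main__":
    w, b = train()
    print(f"w={w:.3f}, b={b:.3f}")
```

Running it should recover values close to w = 3 and b = 1. Raising `steps_per_round` means the workers exchange parameters less often, which is the knob that matters most when they are linked by something slower than data center networking.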

The new approach allows the work normally done inside a large data center to be performed on hardware that may be many kilometers apart and connected over a relatively slow or variable internet connection.
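A rough back-of-the-envelope calculation shows why connection speed matters. The sketch below uses the 7 billion parameter count reported for Collective-1. Everything else is an assumption made for illustration: 16-bit parameters, a 100 Mbit/s internet link, a 400 Gbit/s data center link, and no compression or latency.

```python
def sync_seconds(num_params, bytes_per_param, link_bits_per_second):
    """Time to send one full copy of the parameters over a link (no latency, no compression)."""
    total_bits = num_params * bytes_per_param * 8
    return total_bits / link_bits_per_second

PARAMS = 7_000_000_000   # Collective-1's parameter count, from the article
BYTES_PER_PARAM = 2      # assumption: 16-bit parameters

internet_link = sync_seconds(PARAMS, BYTES_PER_PARAM, 100e6)  # assumed 100 Mbit/s link
datacenter_link = sync_seconds(PARAMS, BYTES_PER_PARAM, 400e9)  # assumed 400 Gbit/s link

print(f"internet: {internet_link:.0f} s, data center: {datacenter_link:.2f} s")
```

Under these assumptions a single full exchange takes about 1,120 seconds over the internet link, against roughly 0.28 seconds inside the data center, which is why a distributed scheme has to synchronize far less often or send far less data each time.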
