Microsoft releases its largest 1-bit LLM, bringing powerful AI to older hardware

by admin

Microsoft researchers claim to have developed the first 1-bit large language model with 2 billion parameters. The model, BitNet b1.58 2B4T, can run on commercial CPUs such as Apple's M2.

“Trained on a corpus of 4 trillion tokens, this model demonstrates how native 1-bit LLMs can achieve performance comparable to leading open-weight, full-precision models of similar size,” the researchers wrote. The project is hosted in a Hugging Face repository.

What makes a BitNet model different?

BitNets, or 1-bit LLMs, are compressed versions of large language models. The original 2-billion-parameter model, trained on a corpus of 4 trillion tokens, has been reduced to a version with dramatically lower memory requirements. All weights are expressed as one of three values: -1, 0, and 1. Other LLMs typically use 32-bit or 16-bit floating-point formats.
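To illustrate, the ternary weights described above can be produced with an "absmean" rounding scheme as described in Microsoft's BitNet b1.58 papers: each weight is divided by the mean absolute weight, rounded, and clipped to [-1, 1]. The sketch below is a simplified pure-Python illustration of that idea, not Microsoft's actual implementation.

```python
def absmean_ternary_quantize(weights):
    """Quantize a list of float weights to ternary values {-1, 0, 1}.

    Illustrative sketch of the "absmean" scheme: scale each weight by
    the mean absolute value of the whole group, round to the nearest
    integer, then clip to the range [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1e-8

    def round_clip(x):
        # Round to nearest integer, then clamp into {-1, 0, 1}.
        return max(-1, min(1, round(x / scale)))

    return [round_clip(w) for w in weights], scale


# Example: six float weights collapse to three possible values.
weights = [0.9, -1.3, 0.05, -0.02, 2.1, -0.7]
ternary, scale = absmean_ternary_quantize(weights)
print(ternary)  # -> [1, -1, 0, 0, 1, -1]
```

Small weights collapse to 0, which lets inference kernels skip those multiplications entirely; the remaining -1/+1 weights reduce multiplication to addition and subtraction.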

SEE: Threat actors can inject malicious packages into AI models that resurface during “vibe coding.”

In the research paper, published on arXiv as a work in progress, the researchers detail how they created BitNet. Other groups have built bitnets before, but, according to the researchers, most prior efforts were either post-training quantization (PTQ) methods applied to pre-trained full-precision models or native 1-bit models trained from scratch at a smaller scale. BitNet b1.58 2B4T is a natively trained, large-scale 1-bit LLM; it takes up only 400 MB, compared with other “small models” that can reach up to 4.8 GB.
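The memory figures above follow from simple arithmetic: a ternary weight needs only about log2(3) ≈ 1.58 bits, versus 16 bits for a half-precision float. The back-of-envelope calculation below (which ignores embeddings, activations, and runtime overhead) shows where a figure around 400 MB for 2 billion parameters comes from.

```python
def model_memory_gb(num_params, bits_per_param):
    """Back-of-envelope memory footprint (in GB) for storing model weights."""
    return num_params * bits_per_param / 8 / 1e9


params = 2e9           # 2 billion parameters
ternary_bits = 1.58    # ~log2(3) bits per ternary weight {-1, 0, 1}

print(f"1.58-bit weights: {model_memory_gb(params, ternary_bits):.2f} GB")  # ~0.40 GB
print(f"16-bit weights:   {model_memory_gb(params, 16):.2f} GB")            # 4.00 GB
```

The ternary model's weights come out near 0.4 GB, consistent with the roughly 400 MB footprint reported for BitNet b1.58 2B4T, while the same parameter count in 16-bit floats needs about 4 GB.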

BitNet b1.58 2B4T performance, purpose, and limitations

Performance compared to other AI models

BitNet b1.58 2B4T outperforms other 1-bit models, according to Microsoft. It has a maximum sequence length of 4,096 tokens, and Microsoft claims it surpasses small models such as Meta's Llama 3.2 1B and Google's Gemma 3 1B.

The researchers’ goal for this BitNet

Microsoft’s goal is to make LLMs accessible to more people by creating versions that run on edge devices, in resource-constrained environments, or in real-time applications.

However, BitNet b1.58 2B4T is still not simple to run; it requires hardware compatible with Microsoft's bitnet.cpp framework. Running it with the standard transformers library will not deliver any of its advantages in speed, latency, or energy consumption. And BitNet b1.58 2B4T does not run on GPUs, as the majority of AI models do.

What is the next step?

Microsoft's researchers plan to explore training larger native 1-bit models (7B, 13B parameters and beyond). They note that most of today's AI infrastructure lacks hardware suited to 1-bit models, so they plan to explore “co-designing future hardware accelerators” built specifically for compressed AI. The researchers also aim to:

  • Increase the context length.
  • Improve performance on long chain-of-thought reasoning tasks.
  • Add support for languages other than English.
  • Integrate 1-bit models into multimodal architectures.
  • Develop a deeper theoretical understanding of why 1-bit training produces efficiency gains.
