Microsoft bets on a compact and powerful LLM with Phi-3-mini, designed to run on iPhone

At Microsoft Ignite 2023, Satya Nadella unveiled Phi-2, a 2.7-billion-parameter language model with notable reasoning and language-understanding capabilities, positioning it among the best base language models with fewer than 13 billion parameters.

Developed by the Machine Learning Foundations team at Microsoft Research, Phi-2 matches or outperforms models up to 25 times larger, thanks to innovations in model scaling and training data curation, Satya Nadella said. Today, the Redmond giant unveiled a third iteration of its Phi family, called “Phi-3-mini.”


A compact LLM that puts Google, Meta and Mistral AI to shame


“We introduce Phi-3-mini, a 3.8-billion-parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5,” write the Microsoft Research researchers in their paper. In detail, Phi-3-mini scores 68.8% on the MMLU benchmark, compared with 61.7% for Mistral 7B, 63.6% for Gemma 7B, 66% for Llama-3-Instruct and 68.4% for Mixtral 8x7B.

And while GPT-3.5 remains ahead on this benchmark, with a score of 71.4%, OpenAI's model falls behind on the GSM-8K test, scoring 78.1% against 82.5% for Microsoft's compact model. The models from Google, Meta and Mistral AI also lag behind on this point of comparison. The HumanEval test likewise separates the field: Phi-3-mini scores 58.5% while GPT-3.5 reaches 62.2%, and the other LLMs obtain scores between 28% and 38.4%, far off the mark.

Phi-3-mini can be deployed on an iPhone 14

The LLM developed by Microsoft stands out on another point: it is small enough to be deployed on a phone. To achieve this, the researchers explain, “the innovation lies entirely in our training dataset, a scaled-up version of the one used for Phi-2, composed of heavily filtered web data and synthetic data.”

Thanks to its small size, the researchers say, Phi-3-mini can be quantized to 4 bits so that it occupies only about 1.8 GB of memory. “We tested the quantized model by deploying phi-3-mini on an iPhone 14 equipped with an A16 Bionic chip, running natively on the device and fully offline, achieving more than 12 tokens per second.”
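The reported figure is easy to sanity-check: 3.8 billion parameters at 4 bits each works out to just under 2 GB of raw weight storage, consistent with the roughly 1.8 GB cited (real quantization schemes add small overheads such as per-group scales, and may keep some layers at higher precision). A minimal back-of-the-envelope sketch, using only the numbers from the article:

```python
# Back-of-the-envelope memory estimate for a 4-bit quantized LLM.
# Figures from the article: Phi-3-mini has 3.8B parameters and,
# quantized to 4 bits, occupies about 1.8 GB on device.

def quantized_weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Raw weight storage in bytes for n_params weights at bits_per_weight."""
    return n_params * bits_per_weight / 8

n_params = 3.8e9

raw = quantized_weight_bytes(n_params, 4)         # 4-bit weights
print(f"4-bit weights: {raw / 1024**3:.2f} GiB")  # ~1.77 GiB

# For comparison, the same weights at fp16 would need four times as much:
fp16 = quantized_weight_bytes(n_params, 16)
print(f"fp16 weights:  {fp16 / 1024**3:.2f} GiB")
```

The estimate covers weights only; activation memory and the KV cache during generation come on top, which is why the on-device footprint ends up slightly above the raw weight size.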


Pushing the limits of “compact” language models

However, although Phi-3-mini achieves a level of language understanding and reasoning comparable to much larger models, it remains fundamentally limited by its size for certain tasks. The model simply does not have the capacity to store extensive “factual knowledge”, as the Microsoft researchers put it, which results, for example, in weak performance on benchmarks such as TriviaQA.

However, this weakness can be mitigated by pairing the model with a search engine. The Microsoft Research teams explored the idea and demonstrated an example using the default HuggingFace chat interface with Phi-3-mini. Going forward, the teams plan to tackle another challenge: multilingual capabilities for small language models.

Two more versions with 7B and 14B parameters released

At the same time, the teams released two larger models, with 7 and 14 billion parameters respectively, called “Phi-3-small” and “Phi-3-medium”. Both perform significantly better than Phi-3-mini: they score 75% and 78% respectively on MMLU, compared with 69% for Phi-3-mini, and 8.7 and 8.9 on MT-Bench against 8.38 for Phi-3-mini. The Phi-3-small model has a vocabulary size of 100,352 and a default context length of 8K tokens, double that of Phi-3-mini.

In their paper, the Microsoft Research researchers specify that, in order to “best benefit the open source community”, Phi-3-mini is built on a block structure similar to Llama-2 and uses the same tokenizer, with a vocabulary size of 32,064. This means that all packages developed for the Llama-2 model family can be adapted directly to Phi-3-mini.

