Microsoft Unveils Phi-2: A Compact Yet Powerful Language Model

  • December 18, 2023
  • Shahala VP

In a field dominated by expansive language models such as GPT-4 and Bard, Microsoft has introduced its latest creation, Phi-2: a small language model with 2.7 billion parameters that builds on its predecessor, Phi-1.5. Recently made accessible through the Azure AI Studio model catalogue, Phi-2 has garnered attention by purportedly matching or surpassing larger models such as Llama-2, Mistral, and Google's Gemini Nano 2 on various generative AI benchmarks.

Satya Nadella first announced Phi-2 at Ignite 2023, and it hit the virtual shelves earlier this week. Crafted by the Microsoft research team, this generative AI model is touted to possess attributes such as "common sense," "language understanding," and "logical reasoning." Microsoft asserts that, despite its modest size, Phi-2 can match or outperform models up to 25 times larger on specific tasks, a claim that has stirred considerable interest in the tech community.

Phi-2's training methodology involves exposure to "textbook-quality" data, encompassing synthetic datasets, general knowledge, theory of mind, and daily activities. A transformer-based model with a next-word prediction objective, Phi-2 was trained on 96 A100 GPUs in a mere 14 days, a notable departure from the reportedly 90-100 day training runs for GPT-4, which is said to employ tens of thousands of A100 Tensor Core GPUs.
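The next-word prediction objective mentioned above can be illustrated with a small numerical sketch. This is an illustrative toy, not Phi-2's actual training code: given a model's raw scores (logits) over a vocabulary at each position in a sequence, the training loss is the average cross-entropy against the true next token.

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicted next-token distributions.

    logits  : (seq_len, vocab_size) raw scores for the next token at each position
    targets : (seq_len,) indices of the actual next tokens
    """
    # Softmax with max-subtraction for numerical stability
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    # Negative log-likelihood of each true next token, averaged over positions
    return -np.log(probs[np.arange(len(targets)), targets]).mean()

# Toy example: a 4-token vocabulary and a 3-position sequence,
# where the model assigns the highest score to the correct token each time
logits = np.array([[2.0, 0.1, 0.1, 0.1],
                   [0.1, 3.0, 0.1, 0.1],
                   [0.1, 0.1, 0.1, 2.5]])
targets = np.array([0, 1, 3])  # the true next token at each position
print(next_token_loss(logits, targets))
```

Training amounts to adjusting the model's parameters to drive this loss down across billions of such positions; the "textbook-quality" data claim is about what those target sequences contain, not about the objective itself.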

Remarkably, Microsoft's Phi-2 extends its capabilities beyond language-related tasks. It demonstrates proficiency in solving complex mathematical equations and physics problems, and it can even detect errors in student calculations. This multifaceted functionality positions Phi-2 as a versatile tool for a range of applications. In benchmark assessments encompassing commonsense reasoning, language understanding, math, and coding, Phi-2 outperforms the 13B Llama-2 and the 7B Mistral. Impressively, it also matches or outperforms the far larger 70B Llama-2 on multi-step reasoning tasks such as coding and math, and it surpasses Google's Gemini Nano 2, a 3.25B model designed to run natively on the Google Pixel 8 Pro.

The significance of a smaller model outperforming its larger counterparts lies in cost-effectiveness: lower power consumption and reduced computing requirements. These advantages make smaller models like Phi-2 appealing for specific tasks, as they can be trained more efficiently and deployed directly on devices, minimizing output latency. Developers keen on exploring Phi-2's capabilities can access the model on Azure AI Studio, marking a noteworthy advance in the landscape of compact yet powerful language models.