Highlights:
- Microsoft described in a research study how it used two post-training optimization strategies to improve the output quality of Phi-4.
- Phi-4 is the latest in a series of small language models that large tech companies have made openly available over the past year.
Microsoft Corp. has released the code for Phi-4, a small language model that can solve mathematical problems and generate text.
The company first detailed the model last month. Initially, Phi-4 was accessible only through Azure AI Foundry, Microsoft’s artificial intelligence development service. It is now available for download on Hugging Face, a popular platform for hosting open-source AI projects.
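For readers who want to try the model, a minimal sketch of loading it from Hugging Face with the transformers library might look like the following. The repository id "microsoft/phi-4", the prompt and the generation settings are assumptions based on the standard Hugging Face workflow, not details from the article, so check the model card before running it.

```python
# Minimal sketch: downloading and running Phi-4 from Hugging Face with transformers.
# The repository id "microsoft/phi-4" is assumed; consult the model card for the
# exact id and recommended generation settings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Solve: what is the derivative of x**3 + 2*x?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```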
Phi-4 is the fourth iteration of a small language model series that Microsoft introduced in 2023. It has 14 billion parameters, the configuration settings that determine how a neural network processes information. Microsoft researchers trained it over the course of 21 days on a cluster of 1,920 H100 graphics processing units from Nvidia Corp.
The model is built on the industry-standard Transformer architecture, which underpins most large language models. Transformer models interpret user prompts by breaking the input into individual words and determining each word’s meaning from the surrounding text. They also give priority to the parts of that surrounding text deemed most relevant.
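As a rough illustration of that weighting step, the sketch below computes scaled dot-product attention over a handful of toy token vectors with NumPy. The tiny dimensions and random vectors are purely illustrative and are not taken from Phi-4.

```python
# Illustrative sketch of the attention weighting a Transformer applies:
# each token scores every other token and attends most to the highest-scoring ones.
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 5, 8                      # 5 toy tokens, 8-dimensional vectors (illustrative only)
queries = rng.standard_normal((seq_len, dim))
keys = rng.standard_normal((seq_len, dim))
values = rng.standard_normal((seq_len, dim))

scores = queries @ keys.T / np.sqrt(dim)                           # similarity of each token to the others
weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)   # softmax: each row sums to 1
context = weights @ values                                         # weighted mix of the surrounding tokens
print(weights.round(2))                                            # rows show how strongly each token attends to the rest
```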
Phi-4 implements a so-called decoder-only version of the Transformer architecture. To determine a word’s meaning, a standard Transformer model examines the text both before and after that word. Decoder-only models consider only the text that precedes the word, so they process less data, which lowers inference costs.
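To show what “only the text that precedes the word” means in practice, the snippet below applies a causal mask so each position can attend only to itself and earlier positions. This is a generic sketch of decoder-only attention, not Phi-4’s actual implementation.

```python
# Sketch of decoder-only (causal) attention: positions after the current token are masked out.
import numpy as np

rng = np.random.default_rng(1)
seq_len, dim = 5, 8
queries = rng.standard_normal((seq_len, dim))
keys = rng.standard_normal((seq_len, dim))

scores = queries @ keys.T / np.sqrt(dim)
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)       # True above the diagonal = future tokens
scores[mask] = -np.inf                                             # future tokens receive zero attention weight
weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
print(weights.round(2))   # lower-triangular: each row attends only to itself and earlier tokens
```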
In a research paper, Microsoft described how it used two post-training optimization techniques, supervised fine-tuning and direct preference optimization, to improve Phi-4’s output quality. Both involve supplying a language model with examples of how it should respond to prompts.
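The sketch below conveys the spirit of those two steps rather than Microsoft’s actual training code: supervised fine-tuning maximizes the likelihood of curated reference answers, while direct preference optimization nudges the model toward a “chosen” answer over a “rejected” one relative to a frozen reference model. The function names, beta value and tensor shapes are illustrative assumptions that follow the published DPO formulation.

```python
# Simplified sketch of the two post-training objectives (not Microsoft's implementation).
import torch
import torch.nn.functional as F

def sft_loss(logits, target_ids):
    """Supervised fine-tuning: standard next-token cross-entropy on curated prompt/answer pairs."""
    return F.cross_entropy(logits.view(-1, logits.size(-1)), target_ids.view(-1))

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct preference optimization: reward the policy for preferring the chosen answer
    more strongly than the frozen reference model does. beta scales the preference margin."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```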
In an internal evaluation, Microsoft compared Phi-4 with Llama 3.3 70B, an LLM with five times as many parameters. The company says Phi-4 scored higher on the well-known GPQA and MATH benchmarks, two test datasets that contain science questions and math problems, respectively.
Phi-4 is the latest in a series of small language models that large tech companies have made openly available over the past year.
Google LLC unveiled its Gemma line of small language models last February. The algorithms in the series range from 2 billion to 27 billion parameters, and Google claims the 27-billion-parameter version can outperform models more than twice its size.
Meta Platforms Inc., in turn, recently released two Llama 3.2 models with fewer than 5 billion parameters. The company later open-sourced even more efficient versions of those models built with a machine learning technique called quantization, which compresses the data a neural network ingests and thereby reduces the hardware needed to process it.
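As a rough idea of what quantization does, the snippet below converts floating-point weights to 8-bit integers plus a single scale factor and then reconstructs them. Real schemes, including the 4-bit variants Meta shipped, are more sophisticated, so treat this as a simplified sketch.

```python
# Simplified sketch of weight quantization: store int8 values plus one scale factor
# instead of 32-bit floats, roughly quartering the memory the weights occupy.
import numpy as np

weights = np.random.default_rng(0).standard_normal(1000).astype(np.float32)

scale = np.abs(weights).max() / 127.0                 # map the float range onto the int8 range
quantized = np.round(weights / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale    # approximate reconstruction at inference time

print(f"memory: {weights.nbytes} bytes -> {quantized.nbytes} bytes")
print(f"max reconstruction error: {np.abs(weights - dequantized).max():.4f}")
```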