Highlights:
- MAI-1's reported 500 billion parameters suggest it could be positioned as a middle ground between GPT-3 and GPT-4.
- Microsoft is reportedly drawing on training data and other resources from Inflection AI to build MAI-1.
Microsoft Corp. is developing a large language model with over 500 billion parameters.
It is anticipated that the LLM, internally referred to as MAI-1, will launch as early as this month.
In mid-2020, OpenAI unveiled GPT-3, revealing that its first iteration contained 175 billion parameters. Although it hasn't released precise figures, the company has stated that GPT-4 is larger. According to some reports, Google LLC's Gemini Ultra, which performs similarly to GPT-4, has 1.6 trillion parameters, while OpenAI's flagship LLM is said to have 1.76 trillion.
MAI-1's reported 500 billion parameters suggest it could be positioned as a middle ground between GPT-3 and GPT-4. Compared with OpenAI's flagship LLM, such a configuration could deliver high response accuracy while consuming substantially less electricity, which would lower Microsoft's inference costs.
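To make the "middle ground" point concrete, here is a rough back-of-the-envelope sketch using a common rule of thumb (an assumption, not a figure from the reports): a dense transformer needs on the order of 2 FLOPs per parameter per generated token, so per-token compute, and hence electricity, scales roughly with parameter count.

```python
# Rough, assumption-laden comparison of per-token inference compute.
# Rule of thumb: a dense transformer needs ~2 FLOPs per parameter per
# generated token (ignores mixture-of-experts routing, KV-cache reuse,
# batching, and hardware efficiency, so these are only relative figures).
PARAMS = {
    "GPT-3 (175B)": 175e9,
    "MAI-1 (reported 500B)": 500e9,
    "GPT-4 (rumored 1.76T)": 1.76e12,
}

for name, n_params in PARAMS.items():
    flops_per_token = 2 * n_params
    print(f"{name:>24}: ~{flops_per_token:.2e} FLOPs per generated token")
```

On these assumed figures, a dense 500-billion-parameter model would need well under a third of the per-token compute of a dense 1.76-trillion-parameter model, which is the intuition behind the lower-inference-cost claim.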
Mustafa Suleyman, co-founder of LLM developer Inflection AI Inc., is reportedly overseeing the development of MAI-1. Suleyman and most of the startup's staff joined Microsoft in March in a deal reportedly worth USD 625 million. The executive previously co-founded DeepMind, the AI research group now owned by Google LLC.
It has been reported that Microsoft may draw on training data and other resources from Inflection AI for MAI-1. The model's training dataset allegedly includes web content and text generated by GPT-4. According to the reports, Microsoft is carrying out the training on a "large cluster of servers" equipped with graphics cards made by Nvidia Corp.
According to the reports, the company has not yet decided how it will use MAI-1. If the model really has 500 billion parameters, it is too large to run on consumer devices, so Microsoft will most likely host MAI-1 in its data centers, where the LLM could be incorporated into Azure and Bing services.
If the model shows enough promise by May 16, the company is expected to introduce MAI-1 at its Build developer conference. This suggests Microsoft anticipates having a functioning prototype within a few weeks, if it does not have one already.
Less than two weeks have passed since Microsoft released a small language model named Phi-3 Mini for public use, and now reports indicate it is working on MAI-1. According to the company, Phi-3 Mini has 3.8 billion parameters and can outperform LLMs more than ten times its size. Phi-3 Mini is part of an AI model family that also includes two larger, more capable neural networks.
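For readers who want to try the already-released small model, below is a minimal sketch of loading Phi-3 Mini locally. The Hugging Face model ID microsoft/Phi-3-mini-4k-instruct and the chat-template usage are assumptions based on Microsoft's public release, not details from the report above.

```python
# Minimal sketch: run Microsoft's Phi-3 Mini (3.8B parameters) locally.
# Assumes the Hugging Face model ID "microsoft/Phi-3-mini-4k-instruct";
# requires the transformers and torch packages and several GB of memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Build a chat-formatted prompt and generate a short reply.
messages = [{"role": "user", "content": "In one sentence, what is an LLM parameter?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```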