Highlights:

  • With Gemini 1.5 Flash-8B, Google is presenting one of the most cost-effective lightweight LLMs currently available on the market.
  • Gemini 1.5 Flash-8B is freely accessible to developers via the Gemini API and Google AI Studio.

Google LLC is releasing a new version of its popular Gemini 1.5 Flash artificial intelligence model that is smaller and faster than the original.

This version, called Gemini 1.5 Flash-8B, is much more affordable, priced at half the cost of its predecessor. Gemini 1.5 Flash is a lightweight iteration of Google’s Gemini family of large language models, optimized for speed and efficiency and designed for deployment on low-powered devices like smartphones and sensors.

The company unveiled Gemini 1.5 Flash at Google I/O 2024 in May, and it was made available to select paying customers a few weeks later. It subsequently became accessible for free through the Gemini mobile app, though with certain usage restrictions.

It officially became widely available at the end of June, featuring competitive pricing and a one million-token context window alongside high-speed processing. At its launch, Google highlighted that its input size is 60 times greater than that of OpenAI’s GPT-3.5 Turbo and is, on average, 40% faster.

The initial version was designed with a very low token input cost, making it highly competitive for developers. It was adopted by customers such as Uber Technologies Inc., which uses it to power the Eats AI assistant for its Uber Eats food delivery service.

With Gemini 1.5 Flash-8B, Google is launching one of the most affordable lightweight LLMs on the market, featuring a 50% lower price and double the rate limits compared to the original 1.5 Flash. Additionally, the company stated that it provides reduced latency on smaller prompts.

Gemini 1.5 Flash-8B is available to developers for free via the Gemini API and Google AI Studio.
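As a rough sketch of what calling the model through the Gemini API looks like, the snippet below builds a `generateContent` request against the public REST endpoint using only the Python standard library. The model name and endpoint follow the documented v1beta API; the API key is a placeholder you would obtain from Google AI Studio.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: get a real key from Google AI Studio
MODEL = "gemini-1.5-flash-8b"
URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent?key={API_KEY}"
)

def build_payload(prompt: str) -> dict:
    """Minimal request body for a single-turn text prompt."""
    return {"contents": [{"parts": [{"text": prompt}]}]}

payload = build_payload("Summarize this article in one sentence.")
request = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment to actually send the request (requires a valid API key):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Google's official client libraries wrap this same endpoint; the raw request is shown here only to make the shape of the API concrete.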

In a blog post, Logan Kilpatrick, Senior Product Manager for the Gemini API, stated that the company has made “significant progress” in enhancing 1.5 Flash by considering feedback from developers and “testing the limits” of what can be achieved with lightweight LLMs.

He noted that the company introduced an experimental version of Gemini 1.5 Flash-8B last month. Since then, it has undergone further refinement and is now available for general production use.

Kilpatrick stated that the 8B version nearly matches the performance of the original 1.5 Flash model across many key benchmarks and has proven particularly effective in tasks such as chat, transcription, and long-context language translation.

“Our release of best-in-class small models continues to be informed by developer feedback and our own testing of what is possible with these models. We see the most potential for this model in tasks ranging from high volume multimodal use cases to long context summarization tasks,” Kilpatrick added.

The pricing is competitive with similar models from OpenAI and Anthropic PBC. OpenAI’s least expensive model remains GPT-4o mini, priced at USD 0.15 per million input tokens, though this cost is halved for reused prompt prefixes and batched requests. Anthropic’s most affordable model, Claude 3 Haiku, costs USD 0.25 per million input tokens, with the price dropping to USD 0.03 per million for cached tokens.
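To put those per-million-token rates in concrete terms, the back-of-the-envelope calculation below prices a single large prompt under each quoted rate. The figures are the ones cited above; actual vendor pricing can change over time.

```python
# Per-million-input-token prices as quoted in the article (USD).
PRICES_PER_M = {
    "gpt-4o-mini": 0.15,
    "gpt-4o-mini (cached/batched)": 0.075,  # halved for reused prefixes / batching
    "claude-3-haiku": 0.25,
    "claude-3-haiku (cached)": 0.03,
}

def input_cost(model: str, tokens: int) -> float:
    """USD cost to process `tokens` input tokens at the listed rate."""
    return PRICES_PER_M[model] * tokens / 1_000_000

# Example: one 200,000-token prompt under each rate.
for model in PRICES_PER_M:
    print(f"{model}: ${input_cost(model, 200_000):.4f}")
```

At these rates a 200,000-token prompt costs three cents on GPT-4o mini versus five cents on Claude 3 Haiku, which is why cached-token discounts matter so much at high volume.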

Furthermore, Kilpatrick mentioned that the company is doubling the rate limits of 1.5 Flash-8B to enhance its utility for straightforward, high-volume tasks. Consequently, developers can now submit up to 4,000 requests per minute, he stated.
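For high-volume workloads, staying under a requests-per-minute cap like that one is typically handled client-side. The sketch below is a hypothetical sliding-window throttle (not part of any Google SDK) that blocks until a request slot is free; the 4,000 figure matches the limit mentioned above.

```python
import time
from collections import deque

class RateLimiter:
    """Sliding-window throttle: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.timestamps = deque()  # monotonic times of recent requests

    def acquire(self) -> None:
        """Block until a request slot is available, then record it."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            # Sleep until the oldest request leaves the window, then retry.
            time.sleep(self.window - (now - self.timestamps[0]))
            return self.acquire()
        self.timestamps.append(now)

limiter = RateLimiter(max_requests=4000)
# Call limiter.acquire() before each API request to stay under the cap.
```

Production systems usually also back off on HTTP 429 responses, but a local throttle like this avoids hitting the limit in the first place.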