AWS Unveils the Nova Series of Multimodal AI Foundation Models

Published by: Insights Desk | Released: Dec 04, 2024 | Source: DemandTalk

Highlights:

The suite includes four text-focused models: Micro, Lite, Pro, and Premier, each representing a progressive increase in size, complexity, and capabilities.
Amazon also introduced two creative models: Nova Canvas for image generation and Nova Reel for video creation.

AWS, the cloud computing arm of Amazon.com, has unveiled Nova, a new series of multimodal generative AI models.

Amazon Chief Executive Andy Jassy unveiled the suite at the AWS re:Invent conference in Las Vegas. The lineup includes four text-focused models (Micro, Lite, Pro, and Premier) that offer increasing levels of size, complexity, and functionality. Micro, Lite, and Pro are already available; the most advanced model, Premier, is still in training and is expected to launch in early 2025.

In addition to the four text-focused models, Amazon introduced two creative models: Nova Canvas, designed for generating images, and Nova Reel, tailored for video creation.

Micro, the smallest large language model in the lineup, is text-only and optimized for speed, delivering quick responses at minimal cost. It is specifically designed to handle tasks such as text summarization, translation, question-answering, conversational chat, and brainstorming.

Lite is the next-tier model, offering cost-effective multimodal capabilities: it processes text, image, and video inputs and generates text-based responses. According to Amazon, this model is ideal for scenarios like real-time customer interactions and analysis of documents that contain visuals. It can handle up to 300,000 tokens, roughly the length of three average novels, and can analyze multiple images simultaneously or process up to 30 minutes of video in a single request.

Pro, currently AWS’s most advanced available multimodal large language model, combines the capabilities of the smaller models and is positioned as a strong foundation for AI agents. These agents operate autonomously on behalf of users, performing complex tasks with the help of third-party tools. For instance, Pro can draft and send emails, gather and analyze data, compile reports, and distribute them without requiring human intervention.

Amazon notes that the Pro model can serve as a “teacher” to develop customized versions of the Nova Micro and Lite models. In this approach, known as distillation, a larger, more sophisticated model transfers its knowledge to smaller, less complex “student” models through fine-tuning. This method enables the smaller models to deliver comparable performance on the targeted tasks while requiring less computational power and memory.
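The teacher-student idea can be illustrated with a minimal Python sketch. It assumes the simplest form of distillation, in which a teacher's responses to a set of prompts become the fine-tuning data for a student. It is not Amazon's pipeline, and `toy_teacher`, `build_distillation_set`, and the `prompt`/`completion` record fields are hypothetical names chosen for illustration.

```python
import json
from typing import Callable, Iterable


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> list[dict[str, str]]:
    """Pair each prompt with the teacher's answer to form student training data."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(records: list[dict[str, str]]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a call to a large "teacher" model.
    return f"Answer to: {prompt}"


records = build_distillation_set(
    ["What is AWS?", "Summarize this memo."], toy_teacher
)
jsonl = to_jsonl(records)
```

In a real setup, the stand-in teacher would be replaced by calls to the larger model, and the resulting file would be handed to whatever fine-tuning job trains the smaller model.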

Amazon highlights Nova’s adaptability to specific enterprise and industry requirements as a key advantage. These foundation models serve as customizable starting points that businesses can fine-tune to fit specialized needs. Nova can be tuned to match a company’s brand voice, understand niche terminology, and draw on enterprise-specific data. A healthcare organization, for example, could adapt Nova to process medical terms, interpret forms, and understand relationships specific to the healthcare sector.

Nova Canvas enables advanced image generation, allowing users to create professional-grade visuals from text descriptions or existing images. It also supports text-based image editing, where users can specify objects or areas to modify. For example, naming “shirt” as the target and supplying a prompt like “add stripes” causes Canvas to add stripes to the shirt in the image.

Users can also instruct Canvas to adjust or retain specific backgrounds and color schemes based on their preferences. Every element of the original or edited image can be modified by providing tailored prompts, offering full flexibility in customizing visuals.

Reel generates short videos based on text prompts, similar to other advanced text-to-video AI models. Users can include natural language descriptions for camera movements, such as zoom, side-to-side motion, and rotation, enabling the creation of cinematic video shots with ease.

Amazon Nova’s text-generating models support content creation in over 200 languages, with particularly strong capabilities in languages like English, German, Spanish, French, Italian, Japanese, Korean, Arabic, Simplified Chinese, and Russian. However, the creative models, Canvas and Reel, currently only support prompts in English.

The new Nova models, excluding Premier, are now available on Amazon Bedrock. This AWS-managed service offers access to cloud-hosted cutting-edge AI models from Amazon and other providers, along with tools to help build AI applications.
