Elon Musk-led xAI Corp. Launches First Multimodal Model Grok-1.5V

Published by: Insights Desk Released: Apr 15, 2024 Source: DemandTalk

Highlights:

The company claims that Grok-1.5V, which specializes in what it terms “multidisciplinary reasoning,” can compete with current multimodal models across a range of fields.
According to benchmark data provided by xAI, Grok-1.5V performs better than industry competitors including GPT-4V, Claude 3 Sonnet, Claude 3 Opus, and Gemini Pro 1.5.

Elon Musk-led xAI Corp. recently launched its first multimodal model. The development adds to an AI arms race that shows no sign of ending.

Grok-1.5 Vision, also known as Grok-1.5V, is a considerably more advanced large language model than the original Grok-1 because it can comprehend both text and visuals, including documents, images, screenshots, charts, diagrams, and more.

The company claims that Grok-1.5V, which specializes in what it terms “multidisciplinary reasoning,” can compete with current multimodal models across a range of fields. It has spatiotemporal perception capabilities, known in the AI community as real-world spatial understanding, which enable it to reason over complex text, analyze scientific images, and engage with visual content in a manner akin to that of a human.

The developer provided several real-world applications for Grok-1.5V. For example, it can be used to convert drawings into kid-friendly stories, determine which object in a group is the largest, help drivers navigate obstacles by checking whether there is enough room, convert tables into CSV files, and determine whether a decaying wooden deck needs to be replaced. It can even explain the context of internet memes that the user is unfamiliar with.

According to benchmark data provided by xAI, Grok-1.5V performs better than industry competitors including GPT-4V, Claude 3 Sonnet, Claude 3 Opus, and Gemini Pro 1.5. Grok-1.5V outperformed its competitors by a significant margin on RealWorldQA, a new benchmark the company developed to assess real-world spatial comprehension.

Less than a month has passed since Musk’s team debuted the regular Grok-1.5 LLM, which outperformed Grok-1 in math and coding. Grok-1.5 also demonstrated that it could handle far longer contexts than the original, allowing it to verify information from other sources and improve answer accuracy. Now, Grok is available in multimodal form.

xAI says that Grok-1.5V will soon be made accessible to early testers, beginning with those who have enrolled in X’s Premium service, which offers extra benefits to users of the social media platform formerly known as Twitter.

The startup, which debuted in July 2023, has advanced rapidly. Musk stated at the time that he was starting the business in response to AI developers such as OpenAI and Google, which he said are very secretive about the inner workings of their AI models. According to Musk, the objective is to develop AI that is more accountable and transparent than that of its competitors.
