Highlights:
- DALL-E 3, upon which this feature is built, marks the third iteration of an AI image generator unveiled by OpenAI in 2021.
- The company’s researchers trained the AI on an extensive dataset of images along with their corresponding captions.
OpenAI recently debuted an enhanced version of the DALL-E editor, the AI-powered image generator available in the premium tiers of ChatGPT.
The feature is powered by an AI model called DALL-E 3, which the company unveiled in September of last year. Shortly afterward, OpenAI integrated the model into ChatGPT. The initial version of the DALL-E editor, released in the previous year, let users create images from text prompts and visual references and then modify them.
The latest update makes it easier for users to modify the images they create.
In ChatGPT, users can access the DALL-E editor through the same chatbot interface used for the service’s other features. A new “Select” button at the top of the interface lets users mark the specific section of the image they want to edit. They can then enter natural language instructions describing the desired changes.
For instance, a user might draw a circle around a tree in a forest photograph and instruct the DALL-E editor to remove it. Users can also change the appearance of existing objects in an image or add new ones. “We recommend selecting a large space around the area you intend to edit to obtain better results,” OpenAI advised in a knowledge base article outlining the update.
The company’s engineers have incorporated several user-friendly features as well. Within the DALL-E editor, the addition of Undo and Redo buttons enables users to swiftly deselect sections of an image highlighted using the Select tool. Furthermore, customers can now modify the aspect ratio of the generated image and access suggestions for drawing styles.
The DALL-E editor is available in ChatGPT Plus, a paid version of the chatbot for individual users, as well as in two higher-end tiers that OpenAI offers to organizations. The feature is available on both web and mobile.
DALL-E 3, the foundation of this feature, is the third iteration of an AI image generator that OpenAI introduced in 2021. It produces higher-quality images than its predecessors and follows user instructions more accurately. OpenAI attributes the improvement to the dataset used to train DALL-E 3.
The company’s researchers trained the model on a large dataset of images and corresponding captions. OpenAI reports that 95% of these captions were generated by a custom language model built specifically for DALL-E 3. This language model produces succinct image descriptions that focus only on an image’s essential elements, an approach that OpenAI says is beneficial for AI training.
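As a rough illustration of the 95% figure, the Python sketch below shows one way a training set could mix model-generated captions with original ones. This is not OpenAI's pipeline: the record fields (`image`, `original`, `synthetic`), the function names, and the per-image random choice are all assumptions made for the example.

```python
import random


def pick_caption(original, synthetic, rng, synthetic_share=0.95):
    """Return the synthetic caption with probability `synthetic_share`, else the original."""
    return synthetic if rng.random() < synthetic_share else original


def build_training_pairs(records, seed=0, synthetic_share=0.95):
    """Pair each image with one caption, chosen independently per image.

    `records` is a hypothetical list of dicts with the keys
    "image", "original", and "synthetic".
    """
    rng = random.Random(seed)  # seeded so the mix is reproducible
    return [
        (r["image"], pick_caption(r["original"], r["synthetic"], rng, synthetic_share))
        for r in records
    ]


# Toy data standing in for a real image/caption dataset.
records = [
    {
        "image": f"img_{i}.png",
        "original": f"alt text {i}",
        "synthetic": f"generated caption {i}",
    }
    for i in range(10_000)
]

pairs = build_training_pairs(records)
share = sum(caption.startswith("generated") for _, caption in pairs) / len(pairs)
print(f"synthetic share: {share:.3f}")
```

Because each caption is sampled independently, the realized share lands close to 95% rather than exactly on it. A pipeline that needs an exact split could instead shuffle the records and assign the first 95% to synthetic captions.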
DALL-E 3 is one of several models the company has built for multimedia generation tasks. Others include Voice Engine, an AI system that produces synthetic speech, and the Sora text-to-video model. Of these, DALL-E 3 is the only one that OpenAI has made widely available.