ChatGPT Integrates DALL-E 3: Multimodal AI Is Coming
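OpenAI plans to integrate DALL-E 3[1] into ChatGPT Plus in October. It understands detail better than earlier systems, so users' ideas can be translated into images more precisely.
This is not only a response to Midjourney; it also foreshadows the coming large-scale showdown between multimodal LLMs and DeepMind's Gemini (Gemini is Google's upcoming AI system, meant to compete with OpenAI and built on DeepMind's multimodal work). In addition, the fact that DALL-E 3 is built on ChatGPT and aligns so well with language underscores that, when building multimodal AI, reasoning ability matters more than pixel-level detail (paraphrased from the views of @DrJimFan[2]).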
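Product integration: OpenAI has announced DALL-E 3, its generative AI visual art platform. This version is integrated with ChatGPT, so users no longer need to come up with image-generation prompts themselves (a major simplification compared with earlier versions).
Capabilities: Compared with DALL-E 2, the new version understands context better. What is new in DALL-E 3 is that prompts can be created automatically through ChatGPT (users who have a specific idea in mind can still use their own prompts).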
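Version history: DALL-E was first released in 2021, followed by DALL-E 2, which was opened to the public in 2022. DALL-E 3 is scheduled for release in October, first to ChatGPT Plus and ChatGPT Enterprise users, then to research labs and the API service.
Safety: OpenAI has added safety measures to DALL-E 3 to prevent it from generating inappropriate or potentially hateful images. Even when a prompt mentions a name, DALL-E 3 will not reproduce images of public figures.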
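Artist rights: When prompted, DALL-E 2 could imitate the style of certain artists. DALL-E 3 has been trained to refuse to generate images in the style of a living artist. OpenAI lets artists opt out of future versions of its text-to-image models and request the removal of images that resemble their work (a response to earlier copyright disputes and litigation risk).
Litigation risk: OpenAI offers the artist opt-out described above to avoid lawsuits, since artists have previously sued DALL-E's competitors (such as Stability AI and Midjourney) over copyright.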
DALL-E 3 vs MJ
Below are some examples comparing DALL-E 3 with MJ (Midjourney), so you can get a feel for the differences (created by @MattGarciaEth).
📌 Prompt: Photo of a lychee-inspired spherical chair, with a bumpy white exterior and plush interior, set against a tropical wallpaper.
📌 Prompt: An expressive oil painting of a basketball player dunking, depicted as an explosion of a nebula.
📌 Prompt: Close-up photograph of a hermit crab nestled in wet sand, with sea foam nearby and the details of its shell and texture of the sand accentuated.
📌 Prompt: An illustration of an avocado sitting in a therapist's chair, saying 'I just feel so empty inside' with a pit-sized hole in its center. The therapist, a spoon, scribbles notes.
📌 Prompt: An illustration of a human heart made of translucent glass standing on a pedestal amidst a stormy sea. Rays of sunlight pierce the clouds illuminating the heart revealing a tiny universe within. The quote 'Find the universe within you' is etched in bold letters across the horizon.
📌 Prompt: A vibrant yellow banana-shaped couch sits in a cozy living room, its curve cradling a pile of colorful cushions. on the wooden floor a patterned rug adds a touch of eclectic charm, and a potted plant sits in the corner, reaching towards the sunlight filtering through the window.
📌 Prompt: A detailed oil painting of an old sea captain, steering his ship through a storm. Saltwater is splashing against his weathered face, determination in his eyes. Twirling malevolent clouds are seen above and stern waves threaten to submerge the ship while seagulls dive and twirl through the chaotic landscape. Thunder and lights embark in the distance, illuminating the scene with an eerie green glow.
📌 Prompt: An ink sketch style illustration of a small hedgehog holding a piece of watermelon with its tiny paws, taking little bites with its eyes closed in delight.
📌 Prompt: An antique botanical illustration drawn with fine lines and a touch of watercolour whimsy, depicting a strange lily crossed with a Venus flytrap, its petals poised as if ready to snap shut on any unsuspecting insects.
📌 Prompt: A vast landscape made entirely of various meats spreads out before the viewer. tender, succulent hills of roast beef, chicken drumstick trees, bacon rivers, and ham boulders create a surreal, yet appetizing scene. the sky is adorned with pepperoni sun and salami clouds.
📌 Prompt: A 2D animation of a folk music band composed of anthropomorphic autumn leaves, each playing traditional bluegrass instruments, amidst a rustic forest setting dappled with the soft light of a harvest moon.
References
[1] DALL-E 3: https://openai.com/dall-e-3
[2] @DrJimFan: https://twitter.com/DrJimFan