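Drawing 9-frame GIFs and 16-panel comic strips with DALLE3: the full workflow
1. The original inspiration for this article was Professor Synapse's ChatGPT Gifs video (https://www.youtube.com/watch?v=fefLrYgeWTM), in which the author shows how to make a GIF of a bat with DALLE3.
2. Synapse's prompt: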
Prompt:
Step 1 (DALL-E)
# MISSION
Act as a professional 8-bit animator who specializes in creating animals. Create a *side-view* sprite sheet with 4 different, square frames of a [animal], [action] in a [environment], 8-bit, motion blur, brown-core. Your task is complete when there is a single image with 4 panels as described below.
# IMAGES
1. Top Left: [motion 1 description]
2. Top Right: [motion 2 description]
3. Bottom Left: [motion 3 description]
4. Bottom Right: [motion 4 description]
# RULES
- Ensure the subject is centered in each frame
- Ensure each frame is from a side view
- Ensure the subject is always facing to the right
- Output ALL 4 FRAMES from the same seed
Step 2 (Data Analysis)
Create a GIF using the attached image. It's a sprite sheet consisting of 4 different frames of an animation, arranged as follows:
Frames:
[1] Top Left
[2] Top Right
[3] Bottom Left
[4] Bottom Right
To create the GIF animation, use the frames in this order: 1, 3, 2, 4. After that, play the sequence 3 more times to add motion.
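The Step 2 instructions can also be reproduced locally instead of through Data Analysis. The following is a minimal sketch, assuming Pillow is installed and that the sprite sheet really is an evenly divided 2x2 grid. The function names, file names, and the 150 ms frame duration are illustrative choices, not something taken from Synapse's video.

```python
def grid_boxes(width, height, rows, cols):
    """Return crop boxes (left, top, right, bottom) for each cell, in reading order."""
    cell_w, cell_h = width // cols, height // rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


def frame_sequence(order, plays):
    """Repeat a 1-based frame order `plays` times and convert it to 0-based indices."""
    return [i - 1 for i in order] * plays


def make_gif(sheet_path, gif_path, rows=2, cols=2,
             order=(1, 3, 2, 4), plays=4, ms_per_frame=150):
    """Slice the sprite sheet and write an animated GIF (requires Pillow)."""
    from PIL import Image  # imported here so the helpers above work without Pillow

    sheet = Image.open(sheet_path).convert("RGB")
    boxes = grid_boxes(sheet.width, sheet.height, rows, cols)
    # One play of 1, 3, 2, 4 plus three more plays gives plays=4.
    frames = [sheet.crop(boxes[i]) for i in frame_sequence(order, plays)]
    frames[0].save(
        gif_path,
        save_all=True,
        append_images=frames[1:],
        duration=ms_per_frame,  # milliseconds per frame
        loop=0,                 # 0 means loop forever
    )
```

Calling `make_gif("bat_sheet.png", "bat.gif")` (hypothetical file names) would write the animation. If DALLE3 draws borders or uneven cells, the crop boxes will need manual adjustment.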
3. Following Synapse's approach with a few changes, I generated the following 9-frame frog GIF:
Prompt:
MISSION
Act as a professional 8-bit animator who specializes in creating animals. Create a side-view sprite sheet with 9 different, square frames of a frog jumping through a swamp, 8-bit, motion blur, green-core. Your task is complete when there is a single image with 9 panels as described below.
IMAGES
Top Left: The frog's legs are bunched as it prepares to jump.
Top Middle: The frog's hind legs are extended as it starts its jump.
Top Right: The frog is mid-air with legs and arms extended.
Middle Left: The frog's front legs are forward and hind legs back in the air.
Middle Middle: The frog's limbs are extended before landing.
Middle Right: The frog lands with an elongated blur of motion.
Bottom Left: The frog crouches down after landing.
Bottom Middle: The frog's limbs are bunched as it prepares for another jump.
Bottom Right: The frog pushes off with its hind legs to start another jump.
RULES
- Ensure the frog is centered in each frame
- Ensure each frame is from a side view
- Ensure the frog is always facing to the right
- Output ALL 9 FRAMES from the same seed
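A small tip here: I asked Claude to rewrite the parts of the prompt that needed replacing.
4. Two days later it occurred to me that the key point in Synapse's prompt is that it states how many panels the image contains and what happens in each one. If I ask DALLE3 to draw 16 or 25 panels and describe each frame specifically, couldn't I produce a comic strip with a consistent character and a consistent style inside a single image? (AI image generation is known to struggle with keeping characters consistent.)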
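Writing out 16 or 25 panel descriptions by hand is tedious, so the prompt can be assembled programmatically. The sketch below is only an illustration of that idea: the "Row r, Column c" labels, the function names, and the exact wording of the mission and rules lines are my own assumptions modeled on Synapse's template, not a tested recipe.

```python
def panel_labels(rows, cols):
    """Label each panel by position, in reading order (left to right, top to bottom)."""
    return [
        f"Row {r}, Column {c}"
        for r in range(1, rows + 1)
        for c in range(1, cols + 1)
    ]


def build_prompt(subject, style, descriptions, rows, cols):
    """Assemble a multi-panel prompt; `descriptions` needs one entry per panel."""
    labels = panel_labels(rows, cols)
    if len(descriptions) != len(labels):
        raise ValueError(
            f"expected {len(labels)} panel descriptions, got {len(descriptions)}"
        )
    total = rows * cols
    lines = [
        "# MISSION",
        f"Create a single image containing a {rows}x{cols} grid of {total} square "
        f"panels showing {subject}, {style}. Your task is complete when there is "
        f"a single image with {total} panels as described below.",
        "# IMAGES",
    ]
    lines += [
        f"{i}. {label}: {text}"
        for i, (label, text) in enumerate(zip(labels, descriptions), start=1)
    ]
    lines += [
        "# RULES",
        "- Keep the same characters and art style in every panel",
        f"- Output ALL {total} PANELS from the same seed",
    ]
    return "\n".join(lines)
```

For example, `build_prompt("a lamb and a rabbit", "children's picture-book style", scenes, 4, 4)` with a list of 16 scene descriptions yields a 16-panel prompt. A well-formed prompt does not guarantee that DALLE3 draws the requested number of panels.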
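5. Before drawing, a quick look at the concept of gen_id.
(1) gen_id is a unique identifier for an image that DALL·E generated from a particular description. Each time DALL·E generates a new image, it assigns a new gen_id, which makes it possible to track, reference, or regenerate a specific image when needed. In short, gen_id is the image's unique ID.
(2) Ask GPT to return the JSON response for the image and you will receive the seed and the gen_id.
(3) With the gen_id, even after ten rounds of conversation with GPT, you can still locate a specific earlier image.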
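6. In my earlier article, "Hands-on with DALLE3: precise control over images and text, generating comics", I showed comic strips that were usually made of 4 images. Back then I split the story into scenes and had DALLE3 draw them one scene at a time. At that time DALLE3 inside GPT could generate 4 or more images per request, whereas now it generates at most 2. So this time I try to get DALLE3 to fit as many panels as possible into a single image.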
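7. I tried 9 panels first, and the result was decent.
With 16 panels, two problems appeared: the panels were incomplete, and the characters came out wrong (part of the story of a lamb and a rabbit turned into a story of a lamb and a lamb).
With 15 panels, I specified 3 rows of 5 panels each, and the prompt passed to DALLE3 also clearly said 15, but the generated image did not contain 15 panels. In other words, DALLE3 is not good at counting.
When I asked DALLE3 for 25 panels, it still ended up generating 16.
I also tried other 16-panel images.
In the amusement-park image, the rabbit's outfit and appearance stay quite consistent.
The following one is unusual: I asked for 16 panels and it drew 20.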
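8. Finally, a demonstration of generating a 9-panel picture-book story.
(1) Use Claude to adapt the story into per-panel content.
(2) Use GPT4 to generate the different scenes.
(3) Use DALLE3 to draw the image.
Look closely at the series of images below: the 8-year-old girl's rosy cheeks and big eyes stay fairly consistent, don't they?
gen_id was used here to maintain that consistency.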
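10. When a single image contains several panels, each cropped panel has a low resolution. I recommend the open-source tool upscayl for upscaling. For more refined results, move the split images into Photoshop, Figma, or Canva for further adjustment.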
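Before upscaling, the panels have to be cut out of the sheet. Here is a minimal sketch of that cropping step, assuming Pillow is installed and the grid is evenly spaced. The `inset` parameter (pixels trimmed from each side to drop panel borders), the function names, and the output file naming are illustrative assumptions. The upscaling itself is left to upscayl.

```python
from pathlib import Path


def panel_boxes(width, height, rows, cols, inset=0):
    """Return one crop box (left, top, right, bottom) per panel, in reading order."""
    cell_w, cell_h = width / cols, height / rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            # Round so grids that do not divide evenly (e.g. 1024 / 3) still tile.
            left, top = round(c * cell_w), round(r * cell_h)
            right, bottom = round((c + 1) * cell_w), round((r + 1) * cell_h)
            boxes.append((left + inset, top + inset, right - inset, bottom - inset))
    return boxes


def split_sheet(sheet_path, out_dir, rows, cols, inset=0):
    """Crop every panel and save it as panel_01.png, panel_02.png, ... (requires Pillow)."""
    from PIL import Image  # imported here so panel_boxes works without Pillow

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    sheet = Image.open(sheet_path)
    paths = []
    boxes = panel_boxes(sheet.width, sheet.height, rows, cols, inset)
    for i, box in enumerate(boxes, start=1):
        path = out / f"panel_{i:02d}.png"
        sheet.crop(box).save(path)
        paths.append(path)
    return paths
```

Since DALLE3 does not always draw the number of panels requested, it is worth checking the actual grid in the image before choosing `rows` and `cols`.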