Text to Video
Generate anime scenes and cinematic motion clips from text prompts using Veo, Sora, Kling, and Wan.
Text to Video (T2V)
The Text to Video engine is Anime Builder's core motion tool. It turns written scene descriptions into high-fidelity anime clips that stay consistent from frame to frame.
Supported Models
1. Google Veo 3.1
- Strengths: Photorealistic motion, strong camera control, and stable scene rendering.
- Best For: Cinematic anime shots and detailed environments.
2. OpenAI Sora 2
- Strengths: Creative motion, expressive scenes, and long-form coherence.
- Best For: Story-driven anime clips and stylized action.
3. Kling & Wan Video
- Strengths: Efficient processing and strong character motion.
- Best For: Social clips, animation tests, and motion studies.
Technical Specs
| Parameter | Specification |
|---|---|
| Output Format | MP4 |
| Frame Rate | 24 fps or 30 fps |
| Aspect Ratios | 16:9, 9:16, 1:1, 2.35:1 |
| Max Resolution | Up to 1080p or 2K, depending on model |
| Generation Time | Depends on queue and model load |
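If you script your own pre-flight checks, the table above can be sketched as a small lookup. This is a minimal illustration, not an Anime Builder API: the names `SUPPORTED_FRAME_RATES`, `SUPPORTED_ASPECT_RATIOS`, `validate_settings`, and `width_for_height` are hypothetical, and the pixel widths are plain arithmetic from the aspect ratio, not confirmed output sizes for any model.

```python
from fractions import Fraction

# Values copied from the Technical Specs table. All names here are
# hypothetical helpers for illustration, not part of any real API.
SUPPORTED_FRAME_RATES = (24, 30)
SUPPORTED_ASPECT_RATIOS = {
    "16:9": Fraction(16, 9),
    "9:16": Fraction(9, 16),
    "1:1": Fraction(1, 1),
    "2.35:1": Fraction(47, 20),  # 2.35 written as an exact fraction
}


def validate_settings(frame_rate: int, aspect_ratio: str) -> None:
    """Raise ValueError if a setting is not listed in the specs table."""
    if frame_rate not in SUPPORTED_FRAME_RATES:
        raise ValueError(f"unsupported frame rate: {frame_rate}")
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")


def width_for_height(aspect_ratio: str, height: int) -> int:
    """Return the pixel width for a given height.

    Odd widths are bumped up by one pixel, on the assumption that the
    encoder wants even dimensions.
    """
    ratio = SUPPORTED_ASPECT_RATIOS[aspect_ratio]
    width = round(ratio * height)
    return width + (width % 2)
```

For example, `width_for_height("2.35:1", 1080)` works out to 2538 pixels, which shows how much wider a 2.35:1 frame is than a 16:9 frame of the same height (1920 pixels).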
Workflow
Step 1: Choose a Model
- Use Veo for polished motion and detailed environments.
- Use Sora for story-driven clips, stylized action, and expressive scenes.
- Use Kling or Wan for social clips, animation tests, and motion studies.
Step 2: Write a Scene Prompt
Use a structure like:
[Subject] + [Action] + [Environment] + [Lighting/Camera] + [Style]
Example Prompt:
"A swordswoman standing on a rooftop in the rain, neon reflections, slow camera push-in, cel-shaded anime style, dramatic lighting."
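If you assemble prompts in a script, the five-part structure above can be sketched as a small helper. This is only an illustration: `build_prompt` and its parameter names are hypothetical, and joining the parts with commas is an assumption based on the example prompt, not a documented requirement of any model.

```python
def build_prompt(
    subject: str,
    action: str,
    environment: str = "",
    lighting_camera: str = "",
    style: str = "",
) -> str:
    """Join the five prompt parts into one comma-separated sentence.

    Hypothetical helper: subject and action form the opening phrase,
    and any empty part is skipped.
    """
    lead = f"{subject.strip()} {action.strip()}".strip()
    parts = [lead, environment, lighting_camera, style]
    # Drop empty parts and trailing punctuation so the join stays clean.
    cleaned = [p.strip().rstrip(".,") for p in parts if p and p.strip()]
    if not cleaned:
        raise ValueError("at least one prompt part is required")
    return ", ".join(cleaned) + "."


prompt = build_prompt(
    subject="A swordswoman",
    action="standing on a rooftop in the rain",
    environment="neon reflections",
    lighting_camera="slow camera push-in, dramatic lighting",
    style="cel-shaded anime style",
)
print(prompt)
```

The result contains the same elements as the example prompt, with the style placed last to match the bracketed structure. Keeping each part in its own argument makes it easy to change one element, such as the camera move, while holding the rest of the scene fixed.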
Step 3: Set the Basics
Choose a clip length and resolution, keeping these trade-offs in mind:
- Longer clips cost more credits.
- Higher resolution improves quality but uses more compute.
Optimization Tips
- Keep prompts focused when you want stable motion.
- Use medium shots for better character fidelity.
- Add phrases like "slow smooth camera movement" when you want calmer motion.