Scene Schema

Scenes appear in two main places:

  • AnalysisResult.scenes[] — Output from video analysis.
  • ScenarioScene[] — Output from scenario generation and input to image/video generation.

Defined in packages/shared/src/types/scenario.ts as UserIntentSchema:

import { z } from "zod";

// What the user wants from generation; only `mode` is required.
export const UserIntentSchema = z.object({
  mode: z.enum(["style_transfer", "content_remix"]),
  product_name: z.string().optional(),
  product_description: z.string().optional(),
  target_audience: z.string().optional(),
  language: z.string().optional(),
  video_duration: z.number().optional(),
  scene_count: z.number().optional(),
});

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| mode | `"style_transfer" \| "content_remix"` | Yes | Video generation mode. |
| product_name | string | No | Product or subject name. |
| product_description | string | No | Description of the product/subject. |
| target_audience | string | No | Target audience. |
| language | string | No | Output language code. |
| video_duration | number | No | Total video duration in seconds. |
| scene_count | number | No | Number of scenes to generate. |
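
Where pulling in zod is not an option, the same field rules can be checked by hand. The following is a minimal sketch under that assumption: the `UserIntent` interface and the `isUserIntent` helper are hypothetical names for illustration, not part of the codebase, and the check may differ from zod's own parsing in edge cases such as `NaN`.

```typescript
// Plain TypeScript mirror of the UserIntent fields listed above (illustrative).
type UserIntentMode = "style_transfer" | "content_remix";

interface UserIntent {
  mode: UserIntentMode;
  product_name?: string;
  product_description?: string;
  target_audience?: string;
  language?: string;
  video_duration?: number;
  scene_count?: number;
}

// Hypothetical helper: a structural check that does not depend on zod.
function isUserIntent(value: unknown): value is UserIntent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;

  // `mode` is the only required field and must be one of the two enum values.
  if (v.mode !== "style_transfer" && v.mode !== "content_remix") return false;

  // Every other field may be absent, but must have the right type when present.
  const optionalStrings = ["product_name", "product_description", "target_audience", "language"];
  const optionalNumbers = ["video_duration", "scene_count"];
  const stringsOk = optionalStrings.every((k) => v[k] === undefined || typeof v[k] === "string");
  const numbersOk = optionalNumbers.every((k) => v[k] === undefined || typeof v[k] === "number");
  return stringsOk && numbersOk;
}
```

In code that already imports the schema, `UserIntentSchema.safeParse(input)` is the more direct route; the hand-written guard is only useful where the shared package is unavailable.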

Defined as ScenarioSceneSchema in packages/shared/src/types/scenario.ts and mirrored in the OpenAPI ScenarioResponse:

import { z } from "zod";

// One generated scene; negative_prompt and transition fall back to defaults when omitted.
export const ScenarioSceneSchema = z.object({
  scene_index: z.number(),
  duration_seconds: z.number(),
  image_prompt: z.string(),
  negative_prompt: z.string().nullable().default(""),
  video_prompt: z.string(),
  text_overlay: z.string().nullable().optional(),
  transition: z.string().nullable().default("cut"),
});

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| scene_index | number | Yes | 0-based scene index. |
| duration_seconds | number | Yes | Scene duration in seconds. |
| image_prompt | string | Yes | Prompt for image generation. |
| negative_prompt | `string \| null` | No | What to avoid in the image. Defaults to "". |
| video_prompt | string | Yes | Motion/camera prompt for image-to-video. |
| text_overlay | `string \| null` | No | Optional on-screen text for the scene. |
| transition | `string \| null` | No | Transition type to the next scene (e.g. "cut", "fade", "dissolve"). Defaults to "cut". |
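
The defaults on negative_prompt and transition apply only when the field is omitted; because both fields are also nullable, an explicit null is kept as null. A minimal sketch of that behavior in plain TypeScript, assuming the usual zod semantics for `.nullable().default(...)`; `RawScenarioScene` and `applySceneDefaults` are hypothetical names, not part of the codebase:

```typescript
// Shape of a scene before defaults are applied (illustrative).
interface RawScenarioScene {
  scene_index: number;
  duration_seconds: number;
  image_prompt: string;
  negative_prompt?: string | null;
  video_prompt: string;
  text_overlay?: string | null;
  transition?: string | null;
}

// Shape after defaults: negative_prompt and transition are always present.
interface ScenarioScene {
  scene_index: number;
  duration_seconds: number;
  image_prompt: string;
  negative_prompt: string | null;
  video_prompt: string;
  text_overlay?: string | null;
  transition: string | null;
}

// Hypothetical helper: fill in defaults for omitted fields, keep explicit nulls.
function applySceneDefaults(raw: RawScenarioScene): ScenarioScene {
  return {
    ...raw,
    negative_prompt: raw.negative_prompt === undefined ? "" : raw.negative_prompt,
    transition: raw.transition === undefined ? "cut" : raw.transition,
  };
}
```

Consumers that need a plain string should therefore still handle null for both fields.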

In ARCHITECTURE.md and the OpenAPI ScenarioAnalysisInput, analysis scenes have the following fields:

| Field | Type | Description |
| --- | --- | --- |
| index | number | 1-based scene index. |
| duration_seconds | number | Scene duration. |
| description | string | Visual description of the scene. |
| transition | string | Transition type between scenes. |
| text_overlay | string | Text shown in the scene, if any. |
| camera_movement | string | Camera movement (e.g. "pan left", "zoom in"). |

These analysis scenes can be passed into scenario generation (as analysis) to influence the resulting ScenarioScene array.
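
Note that the two scene types count differently: analysis scenes use a 1-based index, while ScenarioScene uses a 0-based scene_index. The sketch below shows one way to look up the analysis scene at the same position as a scenario scene. It assumes the scenes correspond one to one, which the source does not guarantee; the `AnalysisScene` interface and `analysisSceneFor` helper are hypothetical, and treating text_overlay as optional is a guess based on "if any".

```typescript
// Illustrative mirror of the analysis scene fields listed above.
interface AnalysisScene {
  index: number; // 1-based
  duration_seconds: number;
  description: string;
  transition: string;
  text_overlay?: string; // assumed optional ("if any")
  camera_movement: string;
}

// Hypothetical helper: find the analysis scene matching a 0-based scene_index.
function analysisSceneFor(
  sceneIndex: number,
  analysis: AnalysisScene[],
): AnalysisScene | undefined {
  // Shift by one to convert from 0-based (scenario) to 1-based (analysis).
  return analysis.find((a) => a.index === sceneIndex + 1);
}
```

The function returns undefined when the scenario has more scenes than the analysis, so callers need to handle a missing match.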

  • Analyze step → returns AnalysisResult.scenes[] with structural info.
  • Scenario step → takes UserIntent and optional analysis.scenes[] to produce ScenarioScene[].
  • Generate step → uses ScenarioScene.image_prompt, negative_prompt, video_prompt, and duration_seconds for image and video generation.
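
The hand-off from the scenario step to the generate step can be sketched as picking out the four fields named above. This is an illustration only: `GenerationInput`, `toGenerationInput`, and `totalDurationSeconds` are hypothetical names, and coercing a null negative_prompt to "" is an assumption about what the generator expects.

```typescript
// Illustrative mirror of a parsed ScenarioScene (after schema defaults).
interface ScenarioScene {
  scene_index: number;
  duration_seconds: number;
  image_prompt: string;
  negative_prompt: string | null;
  video_prompt: string;
  text_overlay?: string | null;
  transition: string | null;
}

// Hypothetical payload: only the fields the generate step is said to use.
interface GenerationInput {
  image_prompt: string;
  negative_prompt: string;
  video_prompt: string;
  duration_seconds: number;
}

// Pick the generation fields; a null negative_prompt becomes "" (assumption).
function toGenerationInput(scene: ScenarioScene): GenerationInput {
  return {
    image_prompt: scene.image_prompt,
    negative_prompt: scene.negative_prompt ?? "",
    video_prompt: scene.video_prompt,
    duration_seconds: scene.duration_seconds,
  };
}

// Sum of scene durations, e.g. to compare against UserIntent.video_duration.
function totalDurationSeconds(scenes: ScenarioScene[]): number {
  return scenes.reduce((sum, s) => sum + s.duration_seconds, 0);
}
```

Keeping text_overlay and transition out of the payload reflects the list above, which names only the prompts and the duration as generation inputs.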