AI Endpoints
Endpoints that call AI providers or run the export pipeline. All require authentication and the appropriate scope (e.g. ai:analyze, ai:generate).
Analyze
POST /api/analyze
Analyze a reference video with vision AI.
Headers: x-vision-provider, x-model-id (optional), x-api-key
Request: Multipart form with video (file), mode (style_transfer | content_remix), analyzerPrompt (optional).
Response: AnalysisResult — scene_count, overall_tone, color_palette, tempo, has_text_overlay, scenes[].
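As a rough illustration of the request shape described above, the following sketch assembles the headers and multipart body for POST /api/analyze using only the standard library. The function name, the `video/mp4` content type, and the provider value used in the usage note are assumptions made for the example; the authentication this page requires is not shown because its mechanism is not described here.

```python
import uuid

# Modes listed in the request description above.
VALID_MODES = {"style_transfer", "content_remix"}


def build_analyze_request(video_bytes, filename, mode, provider, api_key,
                          model_id=None, analyzer_prompt=None,
                          content_type="video/mp4"):
    """Return (headers, body) for a multipart POST to /api/analyze.

    Hypothetical helper: only the header and field names come from the docs.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    boundary = uuid.uuid4().hex
    fields = {"mode": mode}
    if analyzer_prompt is not None:
        fields["analyzerPrompt"] = analyzer_prompt

    parts = []
    # Plain text form fields.
    for name, value in fields.items():
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f'{value}\r\n').encode("utf-8")
        )
    # The video file part (content type is an assumption).
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="video"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'
    ).encode("utf-8")
    parts.append(file_header + video_bytes + b"\r\n")
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    headers = {
        "x-vision-provider": provider,
        "x-api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    if model_id is not None:
        headers["x-model-id"] = model_id  # optional header
    return headers, b"".join(parts)
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, data=body, headers=headers, method="POST")`.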
Scenario
POST /api/scenario
Generate a scene-by-scene scenario with the LLM.
Headers: x-llm-provider, x-model-id (optional), x-api-key
Request body:

    {
      "analysis": { ... },
      "intent": {
        "mode": "content_remix",
        "product_name": "EcoBottle",
        "product_description": "...",
        "target_audience": "...",
        "language": "en",
        "video_duration": 30,
        "scene_count": 5
      },
      "systemPrompt": "...",
      "knowledge": "...",
      "styleGuide": { ... }
    }

Response: { scenes: ScenarioScene[] }. Each scene has scene_index, duration_seconds, image_prompt, negative_prompt, video_prompt, text_overlay, transition.
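A minimal sketch of building this request and reading the response, assuming a JSON body and treating systemPrompt, knowledge, and styleGuide as optional (the page does not say which fields are required). The helper names are hypothetical, and comparing the summed scene durations against intent.video_duration is only a client-side sanity check, not something the page says the backend guarantees.

```python
import json


def build_scenario_request(analysis, intent, provider, api_key, model_id=None,
                           system_prompt=None, knowledge=None, style_guide=None):
    """Return (headers, body) for POST /api/scenario (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json",
        "x-llm-provider": provider,
        "x-api-key": api_key,
    }
    if model_id is not None:
        headers["x-model-id"] = model_id  # optional header

    payload = {"analysis": analysis, "intent": intent}
    # Assumption: these three fields may be left out when not needed.
    optional = {
        "systemPrompt": system_prompt,
        "knowledge": knowledge,
        "styleGuide": style_guide,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return headers, json.dumps(payload).encode("utf-8")


def total_duration(scenes):
    """Sum duration_seconds over the scenes array of a response."""
    return sum(scene["duration_seconds"] for scene in scenes)
```

After decoding a response, `total_duration(response["scenes"])` can be compared with the requested `video_duration` to spot a scenario that runs long or short.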
Generate image
POST /api/generate/image
Generate one image from a text prompt.
Headers: x-image-provider, x-model-id (optional), x-api-key
Request body:

    {
      "prompt": "Photo of a bottle on a table...",
      "negative_prompt": "blurry, text",
      "image_instruction": "...",
      "styleGuide": { ... },
      "resolution": { "width": 1080, "height": 1920 }
    }

| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Image generation prompt. |
| negative_prompt | string | No | What to avoid in the image. |
| image_instruction | string | No | Extra style instructions appended to the prompt. |
| styleGuide | object | No | Brand style guide. |
| resolution | object | No | Output resolution { width, height }. Maps to a DALL-E 3 size (portrait→1024x1792, landscape→1792x1024, square→1024x1024). Has no effect with the Gemini image provider, which takes no resolution parameter. |
Response: { imageUrl: string }. The URL of the generated image, either as returned by the provider or, if the image was uploaded to work media, the URL of that copy.
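The resolution-to-size mapping in the table above can be sketched as follows. How the backend classifies a resolution as portrait, landscape, or square is not stated on this page; comparing width and height is an assumption, and the function names are hypothetical.

```python
# Size strings taken from the resolution row of the table above.
DALLE3_SIZES = {
    "portrait": "1024x1792",
    "landscape": "1792x1024",
    "square": "1024x1024",
}


def orientation(width, height):
    """Classify a resolution (assumed rule: compare width and height)."""
    if height > width:
        return "portrait"
    if width > height:
        return "landscape"
    return "square"


def dalle3_size(resolution):
    """Map a { width, height } resolution to a DALL-E 3 size string."""
    return DALLE3_SIZES[orientation(resolution["width"], resolution["height"])]


print(dalle3_size({"width": 1080, "height": 1920}))  # prints 1024x1792
```

Under this rule the requested pixel dimensions only select an orientation; the output is always one of the three fixed sizes.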
Generate video
POST /api/generate/video
Image-to-video: animate an image with a prompt.
Headers: x-video-provider, x-model-id (optional), x-api-key
Request body:

    {
      "image_url": "https://... or /api/.../media/scene/0/image",
      "prompt": "Slow push in toward the bottle",
      "duration": 5,
      "video_instruction": "...",
      "styleGuide": { ... },
      "resolution": { "width": 1080, "height": 1920 }
    }

| Field | Type | Required | Description |
|---|---|---|---|
| image_url | string | Yes | Source image URL or internal media path. |
| prompt | string | Yes | Motion/animation prompt. |
| duration | number | No | Duration in seconds (default: 5). |
| video_instruction | string | No | Extra instructions appended to the prompt. |
| styleGuide | object | No | Brand style guide. |
| resolution | object | No | Output resolution { width, height }. Maps to Runway ratio (portrait→"768:1280", landscape→"1280:768", square→omitted) or Gemini Veo aspectRatio ("9:16", "16:9", "1:1"). |
If image_url is an internal media URL, the backend resolves it with auth and passes a data URL to the provider.
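To make the data URL step concrete, here is a small sketch of encoding image bytes as a base64 data URL. It is not the backend's code: the function names are hypothetical, and treating any path that starts with /api/ as internal is an assumption based on the example value in the request body.

```python
import base64


def is_internal_media_url(image_url: str) -> bool:
    """Assumed rule: internal media paths start with /api/."""
    return image_url.startswith("/api/")


def to_data_url(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


print(to_data_url(b"hello", "text/plain"))
# prints data:text/plain;base64,aGVsbG8=
```

Passing a data URL means the provider never needs credentials for the internal media endpoint.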
Response: { videoUrl: string }
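The resolution mapping described for the video providers might look like the sketch below. As with images, deciding orientation by comparing width and height is an assumption, and the function names are hypothetical. For Runway, a square resolution returns None to mirror the "omitted" case in the table.

```python
from typing import Optional

# Values taken from the resolution row of the table above.
RUNWAY_RATIOS = {"portrait": "768:1280", "landscape": "1280:768"}  # square: omitted
VEO_ASPECT_RATIOS = {"portrait": "9:16", "landscape": "16:9", "square": "1:1"}


def orientation(width, height):
    """Classify a resolution (assumed rule: compare width and height)."""
    if height > width:
        return "portrait"
    if width > height:
        return "landscape"
    return "square"


def runway_ratio(resolution) -> Optional[str]:
    """Runway ratio string, or None when the parameter should be omitted."""
    return RUNWAY_RATIOS.get(orientation(resolution["width"], resolution["height"]))


def veo_aspect_ratio(resolution) -> str:
    """Gemini Veo aspectRatio string for a { width, height } resolution."""
    return VEO_ASPECT_RATIOS[orientation(resolution["width"], resolution["height"])]
```

A caller building the Runway payload would add the ratio field only when `runway_ratio(...)` is not None.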
Export
POST /api/export
Produce the final MP4 from the editor state.
Request body: ExportRequest, with these fields:
- projectId, workId
- clips: sceneIndex, position, end, cutFrom, transition, transitionDuration
- texts: text overlays
- images: assetId, position, end, width, height, x, y, opacity, rotation
- audio: volume, audioUrl
- options: width, height, fps

Response: Binary MP4 (e.g. Content-Disposition: attachment).
The backend uses FFmpeg to composite video clips (with transitions), text overlays, image overlays (project assets), and audio into one file.
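As a hedged sketch of what an ExportRequest could look like, the helper below lays clips end to end on the timeline and wraps them in a request body. Only the field names come from this page. Treating position, end, cutFrom, and transitionDuration as seconds, treating audio as a single object, and the transition name used in the usage note are all assumptions for illustration.

```python
import json


def build_clips(durations, transition, transition_duration):
    """Place scenes back to back; all times are assumed to be in seconds."""
    clips = []
    position = 0.0
    for scene_index, duration in enumerate(durations):
        clips.append({
            "sceneIndex": scene_index,
            "position": position,        # assumed: start time on the timeline
            "end": position + duration,  # assumed: end time on the timeline
            "cutFrom": 0.0,              # assumed: offset into the source clip
            "transition": transition,
            "transitionDuration": transition_duration,
        })
        position += duration
    return clips


def build_export_request(project_id, work_id, clips, width, height, fps,
                         audio_url=None, volume=1.0):
    """Assemble an ExportRequest-shaped dict (hypothetical helper)."""
    return {
        "projectId": project_id,
        "workId": work_id,
        "clips": clips,
        "texts": [],   # text overlays, left empty in this sketch
        "images": [],  # image overlays, left empty in this sketch
        "audio": {"volume": volume, "audioUrl": audio_url},  # assumed shape
        "options": {"width": width, "height": height, "fps": fps},
    }
```

For example, `build_export_request("p1", "w1", build_clips([3.0, 5.0], "fade", 0.5), 1080, 1920, 30)` produces a two-clip request whose second clip starts at 3.0 and ends at 8.0; `json.dumps` of the result gives the request body.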
See also
- Providers: List providers
- System Prompts