Grok Imagine Video API — Text & Image to Video

Generate cinematic videos with native audio using xAI's Grok Imagine through the VicSee API. Both text-to-video and image-to-video are supported, with up to three creative modes. Each generation costs 15 to 55 credits, depending on duration and resolution.

Try it now: Select Grok Imagine from the model picker in the Studio.

Pricing

Duration	Resolution	Credits	Price (Pro Yearly)	Price (Pro Monthly)
6s	480p	15	$0.09	$0.18
6s	720p	28	$0.17	$0.34
10s	480p	28	$0.17	$0.34
10s	720p	40	$0.24	$0.48
15s	480p	40	$0.24	$0.48
15s	720p	55	$0.33	$0.66

Credits are deducted only on successful generation.
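
For budgeting before a job is submitted, the pricing table can be expressed as a small lookup. This is a minimal Python sketch, not part of the VicSee API: the names `CREDITS`, `USD_PER_CREDIT`, and `estimate_cost` are illustrative, and the per-credit rates are inferred from the table (roughly $0.006 on Pro Yearly and $0.012 on Pro Monthly), so treat the dollar figures as approximations.

```python
from __future__ import annotations

# Credits per (duration in seconds, resolution), copied from the pricing table.
CREDITS = {
    (6, "480p"): 15,
    (6, "720p"): 28,
    (10, "480p"): 28,
    (10, "720p"): 40,
    (15, "480p"): 40,
    (15, "720p"): 55,
}

# Approximate USD per credit, inferred from the table (assumption, not an official rate).
USD_PER_CREDIT = {"pro_yearly": 0.006, "pro_monthly": 0.012}


def estimate_cost(duration: int, resolution: str, plan: str = "pro_yearly") -> tuple[int, float]:
    """Return (credits, approximate USD price) for one generation."""
    try:
        credits = CREDITS[(duration, resolution)]
    except KeyError:
        raise ValueError(f"unsupported combination: {duration}s at {resolution}") from None
    return credits, round(credits * USD_PER_CREDIT[plan], 2)
```

For example, `estimate_cost(10, "720p")` looks up the 10s / 720p row and returns its credit count together with the approximate Pro Yearly price.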

Endpoint

POST https://vicsee.com/api/v1/generate

See Authentication for API key setup.

Text to Video

Generate videos with native audio from text prompts.

Request Parameters

Parameter	Type	Required	Description
model	string	Yes	`grok-imagine-text-to-video`
input.prompt	string	Yes	Description of the video (max 5000 chars)
input.aspect_ratio	string	No	`"16:9"`, `"9:16"`, `"1:1"`, `"3:2"`, `"2:3"` (default: "16:9")
input.duration	number	No	`6`, `10`, or `15` seconds (default: 6)
input.resolution	string	No	`"480p"` or `"720p"` (default: "720p")
input.mode	string	No	`"fun"`, `"normal"`, or `"spicy"` (default: "normal")

Creative Modes

fun — Playful, exaggerated motion with creative flair
normal — Balanced, realistic motion (default)
spicy — Intense, dramatic motion with bold creative choices (text-to-video only)
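
Checking parameters on the client before sending a request can catch typos early. The sketch below encodes the documented values and defaults for text-to-video; the function name `build_text_to_video_payload` and the constants are illustrative, and the server remains the authority on what it accepts.

```python
# Allowed values and defaults as listed in the text-to-video parameter table.
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "3:2", "2:3"}
DURATIONS = {6, 10, 15}
RESOLUTIONS = {"480p", "720p"}
TEXT_MODES = {"fun", "normal", "spicy"}
MAX_PROMPT_CHARS = 5000


def build_text_to_video_payload(prompt, aspect_ratio="16:9", duration=6,
                                resolution="720p", mode="normal"):
    """Validate arguments against the documented values and return the request body."""
    if not prompt or len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt is required and limited to 5000 characters")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if duration not in DURATIONS:
        raise ValueError(f"unsupported duration: {duration!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if mode not in TEXT_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    return {
        "model": "grok-imagine-text-to-video",
        "input": {
            "prompt": prompt,
            "aspect_ratio": aspect_ratio,
            "duration": duration,
            "resolution": resolution,
            "mode": mode,
        },
    }
```

Calling `build_text_to_video_payload("A paper boat drifting down a gutter")` yields a body that relies entirely on the documented defaults.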

Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-imagine-text-to-video",
    "input": {
      "prompt": "A golden retriever runs through a sunlit meadow, wildflowers swaying in the breeze, cinematic slow motion",
      "aspect_ratio": "16:9",
      "duration": 10,
      "resolution": "720p",
      "mode": "normal"
    }
  }'
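
The same request can be issued from Python using only the standard library. This is a sketch under the assumptions already visible in the curl example (endpoint URL, Bearer authentication, JSON body); the helper names `make_generate_request` and `submit` are illustrative rather than part of any official client.

```python
import json
import urllib.request

API_URL = "https://vicsee.com/api/v1/generate"


def make_generate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit(api_key: str, payload: dict) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    request = make_generate_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect or log the outgoing body before any credits are at stake.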

Image to Video

Animate a single image, optionally guided by a motion description.

Request Parameters

Parameter	Type	Required	Description
model	string	Yes	`grok-imagine-image-to-video`
input.prompt	string	No	Motion description (max 5000 chars)
input.image_urls	string[]	Yes	Array with one image URL
input.aspect_ratio	string	No	`"16:9"`, `"9:16"`, `"1:1"`, `"3:2"`, `"2:3"` (default: "16:9")
input.duration	number	No	`6`, `10`, or `15` seconds (default: 6)
input.resolution	string	No	`"480p"` or `"720p"` (default: "720p")
input.mode	string	No	`"fun"` or `"normal"` (default: "normal")

Note: "spicy" mode is not available for image-to-video. If provided, it will be automatically set to "normal".
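
Because the fallback happens silently, a client may want to mirror it locally so that logs and UI reflect the mode that will actually be used. A minimal sketch, with the hypothetical helper name `effective_image_mode`:

```python
IMAGE_MODES = {"fun", "normal"}


def effective_image_mode(mode: str = "normal") -> str:
    """Return the mode image-to-video will use, mirroring the documented fallback."""
    if mode == "spicy":
        # Documented behavior: "spicy" is replaced by "normal" for image-to-video.
        return "normal"
    if mode not in IMAGE_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    return mode
```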

Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-imagine-image-to-video",
    "input": {
      "prompt": "Camera slowly zooms in as the character turns to face the viewer",
      "image_urls": ["https://example.com/your-image.jpg"],
      "aspect_ratio": "16:9",
      "duration": 6,
      "resolution": "720p"
    }
  }'

Input Image Requirements

Formats: JPEG, PNG, WebP
Max size: 10MB
One image only — the array must contain exactly one URL
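
Some of these requirements can be pre-checked on the client. The sketch below enforces the one-URL rule and guesses the format from the file extension, which is only a heuristic (a URL without an extension may still serve a valid image). It also assumes "10MB" means 10 × 1024 × 1024 bytes. The names `validate_image_urls` and `fits_size_limit` are illustrative.

```python
import posixpath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def validate_image_urls(image_urls) -> str:
    """Return the single image URL, or raise ValueError if a requirement is not met."""
    if not isinstance(image_urls, list) or len(image_urls) != 1:
        raise ValueError("image_urls must contain exactly one URL")
    url = image_urls[0]
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an HTTP(S) URL: {url!r}")
    extension = posixpath.splitext(parsed.path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        # Heuristic only: the extension is a hint, not proof of the actual format.
        raise ValueError(f"extension {extension!r} does not look like JPEG, PNG, or WebP")
    return url


def fits_size_limit(num_bytes: int) -> bool:
    """Check a known file size against the documented 10MB limit."""
    return 0 < num_bytes <= MAX_IMAGE_BYTES
```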

Response

Success (200)

{
  "success": true,
  "data": {
    "id": "task_abc123xyz",
    "model": "grok-imagine-text-to-video",
    "status": "pending",
    "creditsUsed": 40,
    "creditsRemaining": 960,
    "createdAt": "2026-03-07T12:00:00Z"
  }
}

Poll for completion using the Tasks API.

Task Complete

{
  "taskId": "task_abc123xyz",
  "status": "completed",
  "output": {
    "url": "https://cdn.vicsee.com/outputs/video_xyz.mp4",
    "duration": 10,
    "resolution": "720p",
    "format": "mp4"
  }
}
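
A polling loop around the task lookup might look like the following sketch. The Tasks API endpoint is not described on this page, so the lookup is passed in as a caller-supplied `fetch_task` function (hypothetical) that returns the task JSON as a dict. The `"completed"` status and `output.url` field come from the example above; the `"failed"` status is an assumption and should be checked against the Tasks API reference.

```python
import time


def wait_for_task(fetch_task, task_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_task(task_id) until the task completes, then return the video URL."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "completed":
            return task["output"]["url"]
        if status == "failed":  # assumed status name, not confirmed by this page
            raise RuntimeError(f"task {task_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout} seconds")
        sleep(interval)
```

Injecting `fetch_task` and `sleep` keeps the loop independent of any particular HTTP client and lets it be exercised without network access.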

Veo 3.1 - Cinematic videos with native audio
Kling 3.0 - Multi-shot videos with flexible duration
Seedance 1.5 Pro - Multilingual audio with reference images
