Veo 3.1 API

Generate cinematic videos with native audio using Google Veo 3.1

Veo 3.1

Generate cinematic videos of about 8 seconds with native audio. Veo 3.1 produces high-quality video with sound effects and ambient audio built in. It is available in two variants: Fast and Quality.

Try it now: Use the Veo 3.1 Generator to create cinematic videos with native audio.

Pricing

Variant          Credits  Price (Pro Yearly)  Price (Pro Monthly)
Veo 3.1 Fast     60       $0.36               $0.72
Veo 3.1 Quality  240      $1.44               $2.88

Credits are deducted only on successful generation.
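As a rough budgeting aid, the table above can be turned into a small lookup. This is a minimal sketch: the figures are copied from the table, while the helper name `estimate_cost` and the plan keys `yearly` and `monthly` are illustrative and not part of the API. It assumes every generation succeeds, since failed generations are not charged.

```python
from decimal import Decimal

# Credits and per-video prices copied from the pricing table.
# The dictionary layout and key names are illustrative, not part of the API.
PRICING = {
    "Veo 3.1 Fast": {"credits": 60, "yearly": Decimal("0.36"), "monthly": Decimal("0.72")},
    "Veo 3.1 Quality": {"credits": 240, "yearly": Decimal("1.44"), "monthly": Decimal("2.88")},
}


def estimate_cost(variant: str, videos: int, plan: str = "yearly"):
    """Return (credits, dollars) for `videos` successful generations."""
    if videos < 0:
        raise ValueError("videos must be non-negative")
    row = PRICING[variant]
    return row["credits"] * videos, row[plan] * videos


credits, dollars = estimate_cost("Veo 3.1 Quality", 5, plan="monthly")
print(credits, dollars)  # 1200 14.40
```

Decimal is used instead of float so that dollar amounts stay exact when multiplied.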

Endpoint

POST https://vicsee.com/api/v1/generate

See Authentication for API key setup.


Veo 3.1 Fast

Quick iterations with good quality. Recommended for testing and drafts.

Request Parameters

Parameter            Type    Required  Description
model                string  Yes       veo-3-1
prompt               string  Yes       Description including visual and audio elements
options.aspectRatio  string  No        16:9, 9:16, or 1:1 (default: 16:9)
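The parameters above can be checked on the client before a request is sent, which avoids a round trip for obvious mistakes. The sketch below is one possible approach, not the service's own validation: `build_payload` is a hypothetical helper, and the server may apply additional or different rules.

```python
# Values taken from the parameter tables on this page.
MODEL_IDS = {"veo-3-1", "veo-3-1-quality"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_payload(model: str, prompt: str, aspect_ratio=None) -> dict:
    """Build the JSON body for POST /api/v1/generate (hypothetical helper)."""
    if model not in MODEL_IDS:
        raise ValueError(f"unknown model: {model!r}")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")

    payload = {"model": model, "prompt": prompt}
    # options.aspectRatio is optional; when omitted, the documented default is 16:9.
    if aspect_ratio is not None:
        if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
        payload["options"] = {"aspectRatio": aspect_ratio}
    return payload
```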

Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3-1",
    "prompt": "Drone shot flying over a tropical beach at golden hour, waves crashing on shore, seagulls calling in the distance",
    "options": {
      "aspectRatio": "16:9"
    }
  }'
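The same request can be sent from Python using only the standard library. This is a sketch under the assumptions already visible in the curl example (bearer token, JSON body); the function names `make_request` and `submit` and the 30-second timeout are illustrative choices.

```python
import json
import urllib.request

API_URL = "https://vicsee.com/api/v1/generate"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(make_request(api_key, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `submit("YOUR_API_KEY", {"model": "veo-3-1", "prompt": "..."})` should return the task object shown in the Response section.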

Veo 3.1 Quality

Highest quality output for final production. Longer processing time.

Request Parameters

Parameter            Type    Required  Description
model                string  Yes       veo-3-1-quality
prompt               string  Yes       Description including visual and audio elements
options.aspectRatio  string  No        16:9, 9:16, or 1:1 (default: 16:9)

Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3-1-quality",
    "prompt": "Cinematic shot of a chef plating a gourmet dish in a professional kitchen, sizzling sounds, ambient restaurant chatter",
    "options": {
      "aspectRatio": "16:9"
    }
  }'

Tips for Native Audio

Veo 3.1 generates audio from your prompt, so describe the sounds you want as well as the visuals.

Audio cues that work well in prompts:

  • "waves crashing on shore"
  • "birds chirping in the forest"
  • "busy city street with traffic sounds"
  • "person speaking to the camera"
  • "rain pattering on a window"

Example with rich audio:

{
  "model": "veo-3-1",
  "prompt": "A cozy coffee shop interior, barista steaming milk, espresso machine hissing, soft jazz playing in background, customers chatting quietly",
  "options": {
    "aspectRatio": "16:9"
  }
}
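One way to keep audio cues from being forgotten is to store the scene and the sounds separately and join them when building the prompt. The helper below is purely illustrative (`compose_prompt` is not part of the API); it reproduces the prompt from the example above.

```python
def compose_prompt(scene: str, audio_cues: list) -> str:
    """Join a visual description with audio cues into a single prompt string."""
    # Drop empty or whitespace-only cues so the prompt has no stray commas.
    cues = [cue.strip() for cue in audio_cues if cue.strip()]
    return ", ".join([scene.strip(), *cues])


prompt = compose_prompt(
    "A cozy coffee shop interior, barista steaming milk",
    [
        "espresso machine hissing",
        "soft jazz playing in background",
        "customers chatting quietly",
    ],
)
print(prompt)
```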

Response

Success (200)

{
  "taskId": "task_abc123xyz",
  "status": "pending",
  "model": "veo-3-1",
  "createdAt": "2025-12-29T12:00:00Z"
}

Poll for completion using the Tasks API.

Task Complete

{
  "taskId": "task_abc123xyz",
  "status": "completed",
  "output": {
    "url": "https://cdn.vicsee.com/outputs/video_xyz.mp4",
    "duration": 8,
    "format": "mp4",
    "hasAudio": true
  }
}
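A polling loop for the two task states shown above might look like the following sketch. The Tasks API endpoint is not documented on this page, so the lookup is passed in as `fetch_task`, a caller-supplied function that returns the task JSON for a given `taskId`. The 5-second interval and 10-minute timeout are assumptions, not documented limits. Only `pending` and `completed` appear in this page's examples, so any other status is treated as still in progress and the timeout is the only exit on failure.

```python
import time
from typing import Callable


def wait_for_video(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval: float = 5.0,   # assumed polling interval, in seconds
    timeout: float = 600.0,  # assumed upper bound, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task reports "completed", then return output.url."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") == "completed":
            return task["output"]["url"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not complete within {timeout}s")
        sleep(interval)
```

Injecting `fetch_task` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.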

Related Models

  • Sora 2 - 10-15s videos with physics-accurate motion
  • Kling 2.6 - Videos with dialogue and lip-sync