Kling 3.0

Generate premium 3-15 second videos with native audio using Kling 3.0 through the VicSee API. The model offers Standard and Professional quality, any whole-second duration in that range, and start and end frame support for image-to-video. A single generation costs 84-840 credits.

Try it now: Use the Kling 3.0 Generator to create premium AI videos. Pro plan required.

Pricing

Kling 3.0 uses per-second pricing. Multiply the rate by your chosen duration (3-15 seconds).

| Quality | Audio | Credits per second |
| --- | --- | --- |
| Standard | Off | 28 |
| Standard | On | 42 |
| Professional | Off | 38 |
| Professional | On | 56 |

Examples

| Configuration | Calculation | Credits |
| --- | --- | --- |
| 5s Standard + Audio (default) | 42 x 5 | 210 |
| 3s Standard, no audio (cheapest) | 28 x 3 | 84 |
| 10s Professional + Audio | 56 x 10 | 560 |
| 15s Professional + Audio (max) | 56 x 15 | 840 |

Credit range: 84-840 credits. Credits are deducted only on successful generation.
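
The pricing rule above is simple enough to check before submitting a request. The following is a minimal sketch, not part of the VicSee API: `estimate_credits` and `RATES` are hypothetical names, and the rates are copied from the pricing table.

```python
# Per-second rates from the pricing table: (mode, audio) -> credits per second.
RATES = {
    ("standard", False): 28,
    ("standard", True): 42,
    ("professional", False): 38,
    ("professional", True): 56,
}


def estimate_credits(duration: int, mode: str = "standard", audio: bool = True) -> int:
    """Estimate the credit cost of one generation (hypothetical helper)."""
    if isinstance(duration, bool) or not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15 seconds")
    if (mode, audio) not in RATES:
        raise ValueError("mode must be 'standard' or 'professional'")
    return RATES[(mode, audio)] * duration


print(estimate_credits(5))                        # default configuration: 42 x 5
print(estimate_credits(3, audio=False))           # cheapest: 28 x 3
print(estimate_credits(15, mode="professional"))  # maximum: 56 x 15
```

The defaults mirror the documented request defaults (5 seconds, standard mode, audio on), so calling the helper with no overrides gives the cost of a default request.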

Endpoint

POST https://vicsee.com/api/v1/generate

See Authentication for API key setup.


Text to Video

Generate premium videos from text descriptions with native audio.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | kling-3-0-text-to-video |
| input.prompt | string | Yes | Video description (max 2500 chars) |
| input.duration | number | No | 3-15 seconds, any integer (default: 5) |
| input.mode | string | No | standard or professional (default: standard) |
| input.audio | boolean | No | Enable audio generation (default: true) |
| input.aspect_ratio | string | No | 16:9, 9:16, 1:1 (default: 16:9) |
| input.multi_shots | boolean | No | Enable manual shot control (default: false). When false, AI auto-segments shots. |
| input.multi_prompt | object[] | No | Per-shot prompts. Required when multi_shots is true. See Multi-Shot Control. |

Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-3-0-text-to-video",
    "input": {
      "prompt": "A cinematic tracking shot through a neon-lit Tokyo street at night",
      "duration": 10,
      "mode": "professional",
      "audio": true,
      "aspect_ratio": "16:9"
    }
  }'
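
Checking the request body against the parameter table before sending it can catch mistakes without spending an API call. The sketch below is one possible way to do that: `build_text_to_video_payload` is a hypothetical helper, and its checks cover only the limits stated in the table above.

```python
import json

ALLOWED_MODES = {"standard", "professional"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_text_to_video_payload(prompt, duration=5, mode="standard",
                                audio=True, aspect_ratio="16:9"):
    """Validate inputs against the documented limits and return the request body."""
    if not prompt or len(prompt) > 2500:
        raise ValueError("prompt is required and limited to 2500 characters")
    if isinstance(duration, bool) or not isinstance(duration, int) or not 3 <= duration <= 15:
        raise ValueError("duration must be an integer from 3 to 15")
    if mode not in ALLOWED_MODES:
        raise ValueError("mode must be 'standard' or 'professional'")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError("aspect_ratio must be 16:9, 9:16, or 1:1")
    return {
        "model": "kling-3-0-text-to-video",
        "input": {
            "prompt": prompt,
            "duration": duration,
            "mode": mode,
            "audio": bool(audio),
            "aspect_ratio": aspect_ratio,
        },
    }


payload = build_text_to_video_payload(
    "A cinematic tracking shot through a neon-lit Tokyo street at night",
    duration=10,
    mode="professional",
)
print(json.dumps(payload, indent=2))  # same body as the curl example
```

The printed JSON can be passed to `curl -d` or to any HTTP client as the POST body.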

Image to Video

Animate images into premium video. Supports start frame and optional end frame.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | kling-3-0-image-to-video |
| input.prompt | string | Yes | Description of the animation |
| input.image_urls | string[] | Yes | Array of 1-2 images (start frame, optional end frame) |
| input.duration | number | No | 3-15 seconds, any integer (default: 5) |
| input.mode | string | No | standard or professional (default: standard) |
| input.audio | boolean | No | Enable audio generation (default: true) |
| input.multi_shots | boolean | No | Enable manual shot control (default: false). When false, AI auto-segments shots. |
| input.multi_prompt | object[] | No | Per-shot prompts. Required when multi_shots is true. See Multi-Shot Control. |

Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-3-0-image-to-video",
    "input": {
      "prompt": "The person walks forward through the garden, looking around with curiosity",
      "image_urls": ["https://example.com/start-frame.jpg"],
      "duration": 5,
      "mode": "standard",
      "audio": true
    }
  }'
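
The `image_urls` array carries the start frame and, optionally, the end frame. The sketch below shows one way to assemble it. `build_image_to_video_payload` is a hypothetical helper, and it assumes the start frame comes first, as the parameter description suggests.

```python
import json


def build_image_to_video_payload(prompt, start_frame_url, end_frame_url=None,
                                 duration=5, mode="standard", audio=True):
    """Return an image-to-video request body with 1-2 frame URLs."""
    if not prompt:
        raise ValueError("prompt is required")
    if not start_frame_url:
        raise ValueError("a start frame URL is required")
    # Assumed order: start frame first, then the optional end frame.
    image_urls = [start_frame_url]
    if end_frame_url:
        image_urls.append(end_frame_url)
    return {
        "model": "kling-3-0-image-to-video",
        "input": {
            "prompt": prompt,
            "image_urls": image_urls,
            "duration": duration,
            "mode": mode,
            "audio": audio,
        },
    }


payload = build_image_to_video_payload(
    "The person walks forward through the garden, looking around with curiosity",
    "https://example.com/start-frame.jpg",
)
print(json.dumps(payload, indent=2))  # same body as the curl example
```

Passing a second URL as `end_frame_url` produces a two-element `image_urls` array for start-to-end frame animation.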

Response

Success (200)

{
  "success": true,
  "data": {
    "id": "task_abc123xyz",
    "model": "kling-3-0-text-to-video",
    "status": "pending",
    "creditsUsed": 560,
    "creditsRemaining": 440,
    "createdAt": "2026-02-11T12:00:00Z"
  }
}

The task starts in the pending state. Poll for completion using the Tasks API.

Task Complete

{
  "taskId": "task_abc123xyz",
  "status": "completed",
  "output": {
    "url": "https://cdn.vicsee.com/outputs/video_xyz.mp4",
    "duration": 10,
    "format": "mp4",
    "hasAudio": true
  }
}
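
Because generation is asynchronous, a client typically polls until the status changes from pending to completed. The sketch below shows one way to structure that loop without assuming anything about the Tasks API itself: `wait_for_task` is a hypothetical helper, `fetch_task` is a caller-supplied function that performs the actual Tasks API request, and the `"failed"` status is an assumption not confirmed by this page.

```python
import time
from typing import Callable

# Assumption: the page documents only "pending" and "completed";
# "failed" is a guess at how an unsuccessful task is reported.
ASSUMED_FAILURE_STATUSES = {"failed"}


def wait_for_task(fetch_task: Callable[[str], dict], task_id: str,
                  interval: float = 5.0, timeout: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_task(task_id) until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        if status in ASSUMED_FAILURE_STATUSES:
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)


# Simulated responses standing in for real Tasks API calls.
_responses = iter([
    {"taskId": "task_abc123xyz", "status": "pending"},
    {"taskId": "task_abc123xyz", "status": "pending"},
    {"taskId": "task_abc123xyz", "status": "completed",
     "output": {"url": "https://cdn.vicsee.com/outputs/video_xyz.mp4"}},
])
result = wait_for_task(lambda _id: next(_responses), "task_abc123xyz",
                       sleep=lambda _s: None)
print(result["output"]["url"])
```

Injecting `fetch_task` and `sleep` keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses, as the simulated run does.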

Multi-Shot Control

By default (multi_shots: false), Kling 3.0 automatically segments your prompt into multiple shots, and the AI decides how to break the scene into cinematic shots.

When multi_shots: true, you define each shot manually using multi_prompt. Each shot has its own prompt and duration.

multi_prompt Format

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Description for this shot |
| duration | number | Yes | Duration of this shot in seconds |

The sum of all shot durations should equal the total duration parameter.

Example: Multi-Shot Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-3-0-text-to-video",
    "input": {
      "prompt": "A dramatic car chase through a city",
      "duration": 10,
      "mode": "professional",
      "audio": true,
      "multi_shots": true,
      "multi_prompt": [
        { "prompt": "A sports car idles at a red light, engine rumbling, driver gripping the wheel", "duration": 3 },
        { "prompt": "The light turns green, tires screech, the car rockets forward down the street", "duration": 4 },
        { "prompt": "The car drifts around a corner, smoke billowing from the tires, camera follows", "duration": 3 }
      ]
    }
  }'

When multi_shots is true, multi_prompt is required and must contain at least one shot. Each shot must have both prompt and duration.
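
The multi-shot rules above can be checked on the client before a request is sent. The sketch below is a hypothetical pre-flight check, not the server's actual validation: `validate_multi_shot` is an invented name, and it treats the "sum of shot durations equals total duration" guidance as a hard rule.

```python
def validate_multi_shot(input_params: dict) -> None:
    """Raise ValueError if the multi-shot fields break the documented rules."""
    if not input_params.get("multi_shots", False):
        return  # AI auto-segmentation: multi_prompt is not needed
    shots = input_params.get("multi_prompt")
    if not shots:
        raise ValueError("multi_prompt is required and must contain at least one shot")
    for index, shot in enumerate(shots, start=1):
        if not shot.get("prompt"):
            raise ValueError(f"shot {index} is missing a prompt")
        if "duration" not in shot:
            raise ValueError(f"shot {index} is missing a duration")
    total = sum(shot["duration"] for shot in shots)
    expected = input_params.get("duration", 5)  # documented default is 5
    if total != expected:
        raise ValueError(f"shot durations sum to {total}, but duration is {expected}")


car_chase = {
    "prompt": "A dramatic car chase through a city",
    "duration": 10,
    "multi_shots": True,
    "multi_prompt": [
        {"prompt": "A sports car idles at a red light", "duration": 3},
        {"prompt": "The light turns green, tires screech", "duration": 4},
        {"prompt": "The car drifts around a corner", "duration": 3},
    ],
}
validate_multi_shot(car_chase)  # passes: 3 + 4 + 3 equals the 10-second total
print("multi-shot input is consistent")
```

If the server tolerates a mismatch between shot durations and the total, the final check can be relaxed to a warning.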


  • Kling 2.6 - Video with dialogue and lip-sync
  • Sora 2 - Physics-accurate videos with audio
  • Veo 3.1 - Cinematic videos with native audio