Lipsync

Lip-sync video characters to any audio track with a single API call. The tool modifies lip movements in AI-generated characters, animation, and live-action footage to match voiceover narration. Pricing is tiered and starts at 50 credits. The maximum video duration is 30 seconds.

Pricing

Tiered by video duration. Video is probed automatically before processing.

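| Duration | Credits | Price (Pro Yearly) | Price (Pro Monthly) |
|---|---|---|---|
| ≤5s | 50 | $0.30 | $0.40 |
| ≤10s | 100 | $0.60 | $0.80 |
| ≤15s | 150 | $0.90 | $1.20 |
| ≤20s | 200 | $1.20 | $1.60 |
| ≤25s | 250 | $1.50 | $2.00 |
| ≤30s | 300 | $1.80 | $2.40 |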

Formula: ceil(duration / 5) × 50 credits. Maximum video duration: 30 seconds.

Credits are deducted upfront based on video duration and are refunded automatically if the lip-sync fails.
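To estimate the cost of a job before submitting it, the pricing formula can be sketched as a small shell helper. The function name `lipsync_credits` is a hypothetical local helper, not part of the VicSee API, and the server's own probe of the video remains the authoritative source for the charged amount.

```shell
# Estimate the credit cost of a lip-sync job from a duration in seconds.
# Implements ceil(duration / 5) x 50 from the pricing formula.
# lipsync_credits is a hypothetical helper name, not part of the API.
lipsync_credits() {
  awk -v d="$1" 'BEGIN {
    d += 0                             # force numeric interpretation
    if (d <= 0 || d > 30) { exit 1 }   # outside the accepted range (max 30 s)
    blocks = int(d / 5)                # whole 5-second blocks
    if (blocks * 5 < d) blocks++       # round a partial block up
    print blocks * 50
  }'
}

# Example:
#   lipsync_credits 12.3   # prints 150
#   lipsync_credits 31     # prints nothing and returns a non-zero status
```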


Endpoint

POST https://vicsee.com/api/v1/tools/lipsync

See Authentication for API key setup.

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| video_url | string | Yes | Source video URL (MP4, MOV, WebM, M4V, GIF). Max 30 seconds. |
| audio_url | string | Yes | Audio track URL (MP3, WAV, AAC, OGG, M4A) |
| sync_mode | string | No | How to handle duration mismatch. Default: "cut_off". |

Sync Modes

| Mode | Behavior |
|---|---|
| cut_off | Truncates whichever input is longer (default, most predictable) |
| loop | Repeats the video to match longer audio |
| bounce | Palindrome playback of video to match audio length |
| silence | Extends video with frozen frames if audio is longer |
| remap | Time-stretches the video to match audio duration |
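Because an unrecognized mode is rejected with INVALID_SYNC_MODE, it can be useful to check the value locally before sending the request. The following is a minimal sketch; `is_valid_sync_mode` and `lipsync_payload` are hypothetical helper names, and the payload builder assumes the URLs contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Succeed only for the five documented sync modes.
is_valid_sync_mode() {
  case "$1" in
    cut_off|loop|bounce|silence|remap) return 0 ;;
    *) return 1 ;;
  esac
}

# Print a JSON request body. sync_mode falls back to the documented default.
# Assumption: the URLs need no JSON escaping (no quotes or backslashes).
lipsync_payload() {
  local video_url="$1"
  local audio_url="$2"
  local sync_mode="${3:-cut_off}"
  if ! is_valid_sync_mode "$sync_mode"; then
    echo "invalid sync_mode: $sync_mode" >&2
    return 1
  fi
  printf '{"video_url":"%s","audio_url":"%s","sync_mode":"%s"}\n' \
    "$video_url" "$audio_url" "$sync_mode"
}

# Example (pipes the body to curl on stdin):
#   lipsync_payload "https://example.com/scene.mp4" \
#                   "https://example.com/voiceover.mp3" loop |
#     curl -X POST https://vicsee.com/api/v1/tools/lipsync \
#       -H "Authorization: Bearer YOUR_API_KEY" \
#       -H "Content-Type: application/json" \
#       -d @-
```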

Example Request

curl -X POST https://vicsee.com/api/v1/tools/lipsync \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://example.com/scene.mp4",
    "audio_url": "https://example.com/voiceover.mp3"
  }'

Response

{
  "success": true,
  "data": {
    "id": "task_abc123",
    "model": "lipsync",
    "status": "completed",
    "output": {
      "url": "https://cdn.vicsee.com/results/dew/user-id/abc123.mp4",
      "duration": 5.0
    },
    "creditsUsed": 50,
    "creditsRemaining": 450,
    "createdAt": "2026-03-01T12:00:00.000Z"
  }
}

The request blocks until processing completes, typically 30-120 seconds depending on video length, so the HTTP client's timeout should allow for that (processing is capped at 5 minutes). The output URL points to the lip-synced video on the VicSee CDN, where it remains available for 7 days.
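Since the call is synchronous, a client usually needs a generous timeout and a check that the job actually completed before using the URL. The sketch below assumes `jq` is installed and that the key is stored in an environment variable named `VICSEE_API_KEY` (a hypothetical name). The 330-second timeout is simply the 5-minute processing limit plus a margin, not a documented value.

```shell
# Read a response body on stdin and print data.output.url, but only when
# success is true and status is "completed". Otherwise print nothing and fail.
lipsync_output_url() {
  jq -er 'select(.success == true and .data.status == "completed")
          | .data.output.url // empty'
}

# Submit a job and print the output URL.
# Assumptions: VICSEE_API_KEY is set, and both URLs need no JSON escaping.
lipsync_run() {
  local body
  body=$(curl -sS --max-time 330 -X POST https://vicsee.com/api/v1/tools/lipsync \
    -H "Authorization: Bearer $VICSEE_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"video_url\": \"$1\", \"audio_url\": \"$2\"}") || return 1
  printf '%s' "$body" | lipsync_output_url
}

# Example:
#   url=$(lipsync_run "https://example.com/scene.mp4" \
#                     "https://example.com/voiceover.mp3") || echo "lip-sync failed" >&2
```

Because the CDN link expires after 7 days, downloading the file once `lipsync_run` succeeds is the safer choice for anything that needs to be kept.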


Pipeline Example

Generate a voiceover with ElevenLabs, then lip-sync it to an AI-generated video:

# Step 1: Generate voiceover
AUDIO=$(curl -s -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-v3-dialogue",
    "prompt": "Welcome to the future of AI video creation."
  }' | jq -r '.data.output.url')

# Step 2: Lip-sync the voiceover to your video
curl -X POST https://vicsee.com/api/v1/tools/lipsync \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"video_url\": \"https://example.com/ai-generated-scene.mp4\",
    \"audio_url\": \"$AUDIO\"
  }"
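One fragile spot in a pipeline like this is the hand-off between the two steps: `jq -r` prints the literal string `null` when the selected field is missing, so a failed Step 1 would otherwise pass `"null"` as the audio URL. A small guard, sketched here with the hypothetical helper name `require_url`, stops the script before Step 2 spends credits.

```shell
# Succeed only if the argument looks like an http(s) URL.
# Rejects the empty string and the literal "null" that jq -r prints
# when the requested field is absent.
require_url() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) echo "expected a URL, got: '$1'" >&2; return 1 ;;
  esac
}

# Example, placed between Step 1 and Step 2:
#   require_url "$AUDIO" || exit 1
```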

Limits

| Constraint | Value |
|---|---|
| Maximum video duration | 30 seconds |
| Supported video formats | MP4, MOV, WebM, M4V, GIF |
| Supported audio formats | MP3, WAV, AAC, OGG, M4A |
| Max processing time | 5 minutes |
| Multi-person support | Yes (active speaker detection) |
| AI-generated characters | Yes |

Errors

| Code | HTTP | Description |
|---|---|---|
| MISSING_VIDEO_URL | 400 | video_url not provided |
| MISSING_AUDIO_URL | 400 | audio_url not provided |
| INVALID_SYNC_MODE | 400 | Sync mode not one of the valid options |
| PROBE_FAILED | 422 | Could not read video duration |
| VIDEO_TOO_LONG | 422 | Video exceeds 30 second maximum |
| INSUFFICIENT_CREDITS | 402 | Not enough credits |
| LIPSYNC_SUBMIT_FAILED | 500 | Failed to submit to Replicate |
| LIPSYNC_FAILED | 500 | Processing failed or timed out |
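A client will typically want to react differently to each group of errors. The sketch below maps the documented codes to a suggested action. The function name and the action labels are illustrative, not part of the API. Treating the 500-level codes as retryable is an assumption, and the shape of the error response body is not documented here, so the helper takes the code as an argument instead of parsing a response.

```shell
# Map a documented error code to a suggested client action.
# The action labels are illustrative; retrying 500-level errors is an assumption.
lipsync_error_action() {
  case "$1" in
    MISSING_VIDEO_URL|MISSING_AUDIO_URL|INVALID_SYNC_MODE)
      echo fix_request ;;     # 400: the request body is wrong
    PROBE_FAILED|VIDEO_TOO_LONG)
      echo fix_input ;;       # 422: the video is unreadable or too long
    INSUFFICIENT_CREDITS)
      echo add_credits ;;     # 402: not enough credits
    LIPSYNC_SUBMIT_FAILED|LIPSYNC_FAILED)
      echo retry_later ;;     # 500: server-side failure
    *)
      echo unknown ;;
  esac
}

# Example:
#   lipsync_error_action VIDEO_TOO_LONG   # prints fix_input
```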