Lipsync
Lip-sync video characters to an audio track. Modify lip movements in AI-generated or real video to match voiceover narration. Tiered pricing from 50 credits. Max 30s.
New: Lip-sync video characters to any audio track with a single API call. Works with AI-generated characters, animation, and live-action footage.
Pricing
Pricing is tiered by video duration. The video is probed automatically to read its duration before processing.
| Duration | Credits | Price (Pro Yearly) | Price (Pro Monthly) |
|---|---|---|---|
| ≤5s | 50 | $0.30 | $0.40 |
| ≤10s | 100 | $0.60 | $0.80 |
| ≤15s | 150 | $0.90 | $1.20 |
| ≤20s | 200 | $1.20 | $1.60 |
| ≤25s | 250 | $1.50 | $2.00 |
| ≤30s | 300 | $1.80 | $2.40 |
Formula: ceil(duration / 5) × 50 credits. Maximum video duration: 30 seconds.
Credits are deducted upfront based on video duration and are refunded automatically if the lipsync fails.
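To estimate the cost of a job before submitting it, the formula above can be reproduced locally. The following is a minimal sketch, not part of the VicSee API: `lipsync_credits` is a hypothetical helper name, and it assumes you already know the video duration in seconds.

```shell
# Hypothetical helper (not part of the VicSee API): estimate the credit cost
# of a lipsync job from a video duration in seconds.
# Implements ceil(duration / 5) * 50 and fails for durations outside (0, 30].
lipsync_credits() {
  awk -v d="$1" 'BEGIN {
    d = d + 0                        # force numeric interpretation
    if (d <= 0 || d > 30) exit 1     # outside the supported range
    tiers = int(d / 5)               # whole 5-second tiers
    if (tiers * 5 < d) tiers++       # round a partial tier up (ceil)
    print tiers * 50
  }'
}

lipsync_credits 12.4   # prints 150 (third tier, up to 15s)
```

The function exits non-zero for a duration above 30 seconds, mirroring the documented maximum, so it can also serve as a quick pre-check in a script.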
Endpoint
POST https://vicsee.com/api/v1/tools/lipsync
See Authentication for API key setup.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| video_url | string | Yes | Source video URL (MP4, MOV, WebM, M4V, GIF). Max 30 seconds. |
| audio_url | string | Yes | Audio track URL (MP3, WAV, AAC, OGG, M4A) |
| sync_mode | string | No | How to handle a duration mismatch between video and audio: cut_off, loop, bounce, silence, or remap. Default: "cut_off". |
Sync Modes
| Mode | Behavior |
|---|---|
| cut_off | Truncates whichever input is longer (default, most predictable) |
| loop | Repeats the video to match longer audio |
| bounce | Palindrome playback of video to match audio length |
| silence | Extends video with frozen frames if audio is longer |
| remap | Time-stretches the video to match audio duration |
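The example request further down omits sync_mode and so uses the default. As a rough sketch of how a script might validate a mode and build a request body that sets it, consider the helpers below. Both function names are hypothetical (they are not part of the VicSee API), and the payload builder assumes the URLs contain no double quotes or backslashes.

```shell
# Hypothetical helpers (not part of the VicSee API).

# Succeeds only for the five sync modes listed in the table above.
is_valid_sync_mode() {
  case "$1" in
    cut_off|loop|bounce|silence|remap) return 0 ;;
    *) return 1 ;;
  esac
}

# Print a JSON request body: lipsync_payload VIDEO_URL AUDIO_URL [SYNC_MODE]
# SYNC_MODE is optional and falls back to cut_off, the documented default.
# Assumes the URLs contain no double quotes or backslashes.
lipsync_payload() {
  set -- "$1" "$2" "${3:-cut_off}"
  is_valid_sync_mode "$3" || { echo "invalid sync_mode: $3" >&2; return 1; }
  printf '{"video_url":"%s","audio_url":"%s","sync_mode":"%s"}\n' "$1" "$2" "$3"
}

# Example use with the documented endpoint (not executed here):
#   curl -X POST https://vicsee.com/api/v1/tools/lipsync \
#     -H "Authorization: Bearer YOUR_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d "$(lipsync_payload https://example.com/scene.mp4 https://example.com/voiceover.mp3 loop)"
```

Validating the mode locally avoids a round trip that would otherwise end in INVALID_SYNC_MODE.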
Example Request
curl -X POST https://vicsee.com/api/v1/tools/lipsync \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://example.com/scene.mp4",
    "audio_url": "https://example.com/voiceover.mp3"
  }'
Response
{
  "success": true,
  "data": {
    "id": "task_abc123",
    "model": "lipsync",
    "status": "completed",
    "output": {
      "url": "https://cdn.vicsee.com/results/dew/user-id/abc123.mp4",
      "duration": 5.0
    },
    "creditsUsed": 50,
    "creditsRemaining": 450,
    "createdAt": "2026-03-01T12:00:00.000Z"
  }
}
The response returns after processing completes (typically 30-120 seconds depending on video length). The output URL contains the lip-synced video hosted on VicSee CDN (available for 7 days).
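Because the output URL is only available for 7 days, a script will usually want to pull the URL out of the response and download the file. The following is a small sketch using jq (already used in the pipeline example); `lipsync_output_url` is a hypothetical helper name, and it relies only on the fields shown in the response above.

```shell
# Hypothetical helper (not part of the VicSee API): read a lipsync response
# on stdin and print the output URL. Exits non-zero if "success" is not true
# or the status is not "completed". Requires jq.
lipsync_output_url() {
  jq -er 'select(.success == true and .data.status == "completed") | .data.output.url'
}

# Example use (not executed here): save the response, then download the result.
#   url=$(lipsync_output_url < response.json) && curl -fsSL -o result.mp4 "$url"
```

The `-e` flag makes jq return a non-zero exit status when the filter produces no value, so the download step is skipped for failed responses.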
Pipeline Example
Generate a voiceover with ElevenLabs, then lip-sync it to an AI-generated video:
# Step 1: Generate voiceover
AUDIO=$(curl -s -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-v3-dialogue",
    "prompt": "Welcome to the future of AI video creation."
  }' | jq -r '.data.output.url')
# Step 2: Lip-sync the voiceover to your video
curl -X POST https://vicsee.com/api/v1/tools/lipsync \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"video_url\": \"https://example.com/ai-generated-scene.mp4\",
    \"audio_url\": \"$AUDIO\"
  }"
Limits
| Constraint | Value |
|---|---|
| Maximum video duration | 30 seconds |
| Supported video formats | MP4, MOV, WebM, M4V, GIF |
| Supported audio formats | MP3, WAV, AAC, OGG, M4A |
| Max processing time | 5 minutes |
| Multi-person support | Yes (active speaker detection) |
| AI-generated characters | Yes |
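A client can screen inputs against the supported formats before spending a request. The sketch below infers the format from the URL's file extension, which is an assumption: the API may inspect the file contents rather than the name, so treat this as a convenience check only. All three function names are hypothetical.

```shell
# Hypothetical helpers (not part of the VicSee API). They guess the format
# from the URL's file extension, which is an assumption about the input.

# Print the lowercase extension of a URL, ignoring query string and fragment.
url_extension() {
  set -- "${1%%[?#]*}"              # strip everything from the first ? or #
  case "${1##*/}" in                # inspect the last path segment only
    *.*) printf '%s\n' "${1##*.}" | tr '[:upper:]' '[:lower:]' ;;
    *) return 1 ;;                  # no extension present
  esac
}

# Succeeds if the URL ends in one of the listed video formats.
is_supported_video() {
  case "$(url_extension "$1")" in
    mp4|mov|webm|m4v|gif) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the URL ends in one of the listed audio formats.
is_supported_audio() {
  case "$(url_extension "$1")" in
    mp3|wav|aac|ogg|m4a) return 0 ;;
    *) return 1 ;;
  esac
}
```

URLs without an extension fail the check here even though the server might accept them, so a script may prefer to warn rather than abort.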
Errors
| Code | HTTP | Description |
|---|---|---|
| MISSING_VIDEO_URL | 400 | video_url not provided |
| MISSING_AUDIO_URL | 400 | audio_url not provided |
| INVALID_SYNC_MODE | 400 | Sync mode not one of the valid options |
| PROBE_FAILED | 422 | Could not read video duration |
| VIDEO_TOO_LONG | 422 | Video exceeds 30 second maximum |
| INSUFFICIENT_CREDITS | 402 | Not enough credits |
| LIPSYNC_SUBMIT_FAILED | 500 | Failed to submit to Replicate |
| LIPSYNC_FAILED | 500 | Processing failed or timed out |
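One way a script might react to these codes is sketched below. The grouping is an assumed client-side policy, not documented VicSee behavior: it treats the two 500-level codes as worth retrying (credits are refunded on failure) and everything else as a problem with the request, the input, or the account. `lipsync_error_action` is a hypothetical name.

```shell
# Hypothetical client-side policy (an assumption, not documented behavior):
# map an error code from the table above to a suggested next step.
lipsync_error_action() {
  case "$1" in
    MISSING_VIDEO_URL|MISSING_AUDIO_URL|INVALID_SYNC_MODE)
      echo "fix-request" ;;   # 400: the request body needs correcting
    PROBE_FAILED|VIDEO_TOO_LONG)
      echo "fix-input" ;;     # 422: the video itself is the problem
    INSUFFICIENT_CREDITS)
      echo "top-up" ;;        # 402: add credits before retrying
    LIPSYNC_SUBMIT_FAILED|LIPSYNC_FAILED)
      echo "retry" ;;         # 500: assumed transient, retry later
    *)
      echo "unknown" ;;       # any code not listed in the table
  esac
}

lipsync_error_action VIDEO_TOO_LONG   # prints fix-input
```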
Related
- ElevenLabs Audio — Generate voiceover audio to lip-sync
- Merge Audio + Video — Merge audio and video without lipsync
- Merge Videos — Combine multiple videos into one
- Tasks API — Check task history
- Credits API — Check your credit balance