Sora 2

Generate 10- or 15-second AI videos with Sora 2 through the VicSee API. Supports text-to-video and image-to-video modes, with physics-accurate motion and synchronized audio. A generation costs 20 credits (10 seconds) or 30 credits (15 seconds).

Try it now: Use the Sora 2 Generator to create videos without code.

Need HD 1080p? Upgrade to Sora 2 Pro for higher resolution output.

Pricing

Variant       Credits   Price (Pro Yearly)   Price (Pro Monthly)
10 seconds    20        $0.12                $0.24
15 seconds    30        $0.18                $0.36

Credits are deducted only on successful generation.
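
The table implies a flat per-credit rate on each plan: $0.006 per credit on Pro Yearly ($0.12 / 20) and $0.012 on Pro Monthly ($0.24 / 20). A minimal Python sketch for estimating the cost of a batch from those figures is shown here. The function name and the plan keys are illustrative and are not part of the VicSee API.

```python
from decimal import Decimal

# Credits per clip, taken from the pricing table.
CREDITS_BY_DURATION = {10: 20, 15: 30}

# Per-credit rates derived from the table (e.g. $0.12 / 20 credits).
# The plan keys are illustrative labels, not API values.
USD_PER_CREDIT = {
    "pro_yearly": Decimal("0.006"),
    "pro_monthly": Decimal("0.012"),
}


def estimate_cost(duration: int, plan: str, count: int = 1) -> tuple[int, Decimal]:
    """Return (credits, USD) for `count` successful generations."""
    if duration not in CREDITS_BY_DURATION:
        raise ValueError("duration must be 10 or 15")
    credits = CREDITS_BY_DURATION[duration] * count
    return credits, credits * USD_PER_CREDIT[plan]
```

For example, estimate_cost(15, "pro_monthly", count=2) gives 60 credits. Failed generations are not charged, so this is an upper bound for a batch.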

Endpoint

POST https://vicsee.com/api/v1/generate

See Authentication for API key setup.


Text to Video

Generate videos from text descriptions.

Parameters


model · string · required

The model ID for text-to-video generation.

Value: "sora-2-text-to-video"


prompt · string · required

Text description of the video to generate. Be specific about visual details, camera movements, lighting, and scene elements for best results.

Constraints:

  • Maximum length: 10,000 characters

Example: "A cat walking across a piano, playing random notes, sunlight streaming through window"


duration · number · optional

Video length in seconds.

Supported values:

  • 10 — 10 second video
  • 15 — 15 second video

Default: 10


aspect_ratio · string · optional

Video aspect ratio.

Supported values:

  • "landscape" — 16:9 horizontal
  • "portrait" — 9:16 vertical

Default: "landscape"
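
The constraints above can be checked client-side before spending a request. The following is a sketch only: the helper name is hypothetical, and the nesting of the parameters under "input" is taken from the example request rather than from a formal schema.

```python
MAX_PROMPT_CHARS = 10_000
DURATIONS = (10, 15)
ASPECT_RATIOS = ("landscape", "portrait")


def build_text_to_video_payload(prompt, duration=10, aspect_ratio="landscape"):
    """Validate the documented constraints and return the request body."""
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds 10,000 characters")
    if duration not in DURATIONS:
        raise ValueError("duration must be 10 or 15")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError('aspect_ratio must be "landscape" or "portrait"')
    return {
        "model": "sora-2-text-to-video",
        # Parameters are nested under "input", as in the example request.
        "input": {
            "prompt": prompt,
            "duration": duration,
            "aspect_ratio": aspect_ratio,
        },
    }
```

The defaults mirror the documented ones, so omitting duration and aspect_ratio yields a 10-second landscape request.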


Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2-text-to-video",
    "input": {
      "prompt": "A cat walking across a piano, playing random notes, sunlight streaming through window",
      "duration": 10,
      "aspect_ratio": "landscape"
    }
  }'
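
The same request can be made from Python using only the standard library. This is a sketch rather than an official client: the function names and the 30-second timeout are assumptions, while the URL, headers, and body shape come from the curl example.

```python
import json
import urllib.request

API_URL = "https://vicsee.com/api/v1/generate"


def make_generate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = make_generate_request(api_key, payload)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the first function free of network access, which makes it easy to inspect or unit-test.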

Image to Video

Animate a static image into a video.

Human faces: OpenAI policy may reject images containing real human faces. Use illustrations, objects, landscapes, or anime characters for best results.

Parameters


model · string · required

The model ID for image-to-video generation.

Value: "sora-2-image-to-video"


prompt · string · required

Text description of how to animate the image. Describe the motion, camera movement, and any changes you want to see.

Constraints:

  • Maximum length: 10,000 characters

Example: "The character turns and smiles at the camera, wind blowing through their hair"


image_urls · array<string> · required

Array containing the image URL to animate.

Constraints:

  • Exactly 1 image URL
  • Must be publicly accessible (http:// or https://)
  • Supported formats: .jpg, .jpeg, .png, .webp
  • Maximum file size: 10MB

Example: ["https://example.com/portrait.jpg"]
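
Some of these constraints can be checked before sending the request. The sketch below covers the count, scheme, and extension rules only; public accessibility and the 10MB limit require fetching the file and are left out. The helper name is hypothetical, and judging the format by file extension is an assumption, since the API may inspect the file contents instead.

```python
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp")


def check_image_urls(image_urls):
    """Raise ValueError if image_urls breaks a constraint checkable offline."""
    if (
        not isinstance(image_urls, list)
        or len(image_urls) != 1
        or not isinstance(image_urls[0], str)
    ):
        raise ValueError("image_urls must contain exactly 1 URL")
    parsed = urlparse(image_urls[0])
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("image URL must start with http:// or https://")
    # Assumption: the format is judged by the extension of the URL path.
    if not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError("image must be .jpg, .jpeg, .png, or .webp")
    return image_urls
```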


duration · number · optional

Video length in seconds.

Supported values:

  • 10 — 10 second video
  • 15 — 15 second video

Default: 10


aspect_ratio · string · optional

Video aspect ratio. Should match your input image for best results.

Supported values:

  • "landscape" — 16:9 horizontal
  • "portrait" — 9:16 vertical

Default: "landscape"
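
Because the aspect ratio should match the input image, it can be derived from the image's pixel dimensions. A minimal sketch, with a hypothetical helper name; treating square images as landscape (the documented default) is an assumption.

```python
def pick_aspect_ratio(width: int, height: int) -> str:
    """Return the documented aspect_ratio value closest to the image shape."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Assumption: square images fall back to the documented default.
    return "portrait" if height > width else "landscape"
```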


Example Request

curl -X POST https://vicsee.com/api/v1/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2-image-to-video",
    "input": {
      "prompt": "The character turns and smiles at the camera, wind blowing through their hair",
      "image_urls": ["https://example.com/portrait.jpg"],
      "duration": 10,
      "aspect_ratio": "portrait"
    }
  }'

Response

Success (200)

{
  "success": true,
  "data": {
    "id": "task_abc123xyz",
    "model": "sora-2-text-to-video",
    "status": "pending",
    "creditsUsed": 20,
    "creditsRemaining": 480,
    "createdAt": "2026-02-11T12:00:00Z"
  }
}

The request returns immediately with a task in "pending" status. Poll the Tasks API with the returned id until the task completes.
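
A polling loop might look like the sketch below. The Tasks API endpoint is not described in this section, so the loop takes a caller-supplied fetch_task function (hypothetical) that queries it and returns the decoded JSON. The 5-second interval and 10-minute timeout are assumptions, and only the "completed" status shown in this section is treated as terminal; add any failure statuses listed in the Tasks API reference.

```python
import time


def wait_for_task(
    fetch_task,
    task_id,
    interval=5.0,
    timeout=600.0,
    terminal=("completed",),
    sleep=time.sleep,
):
    """Call fetch_task(task_id) until its "status" is in `terminal`.

    fetch_task is a caller-supplied function that queries the Tasks API
    and returns the decoded JSON as a dict with a "status" key.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in terminal:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        sleep(interval)
```

The timeout guards against looping forever if the task ends in a status that is not listed in terminal.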

Task Complete

{
  "taskId": "task_abc123xyz",
  "status": "completed",
  "output": {
    "url": "https://cdn.vicsee.com/outputs/video_xyz.mp4",
    "duration": 10,
    "format": "mp4"
  },
  "createdAt": "2026-01-17T12:00:00.000Z",
  "completedAt": "2026-01-17T12:02:30.000Z"
}
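
Once the task is complete, the video location is read from the output object. A minimal sketch, assuming the completed task carries top-level "status" and "output" fields as in the sample; the helper name is hypothetical.

```python
def video_url(task: dict) -> str:
    """Return output.url from a completed task, or raise if it is not ready."""
    if task.get("status") != "completed":
        raise ValueError(f"task is not completed (status={task.get('status')!r})")
    return task["output"]["url"]
```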

Related Models

  • Sora 2 Pro — HD 1080p quality with extended duration options
  • Veo 3.1 — Cinematic videos with native audio synthesis
  • Kling 2.6 — Videos with dialogue and lip-sync