Grok Imagine Image API — Text & Image to Image

Generate images with xAI's Grok Imagine through the VicSee API. Text-to-image returns 6 variations per request and image-to-image returns 2. Each generation costs 10 credits.

Try it now: Select Grok Imagine from the model picker in the Studio.

Pricing

Capability	Credits	Price (Pro Yearly)	Price (Pro Monthly)
Text to Image (6 images)	10	$0.06	$0.12
Image to Image (2 images)	10	$0.06	$0.12

Credits are deducted only on successful generation.
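Because both capabilities cost the same 10 credits but return a different number of images, the effective price per image differs. The arithmetic can be sketched as follows; the plan keys (`pro_yearly`, `pro_monthly`) and the helper name are illustrative labels, not API identifiers, and the prices are copied from the table above.

```python
from fractions import Fraction

# USD price of one 10-credit generation, copied from the pricing table.
PRICE_PER_GENERATION = {
    "pro_yearly": Fraction("0.06"),
    "pro_monthly": Fraction("0.12"),
}

# Number of images returned per generation, per model.
IMAGES_PER_GENERATION = {
    "grok-imagine-text-to-image": 6,
    "grok-imagine-image-to-image": 2,
}


def price_per_image(model: str, plan: str) -> float:
    """Effective USD cost of a single image for the given model and plan."""
    # Fractions keep the division exact before converting to float.
    return float(PRICE_PER_GENERATION[plan] / IMAGES_PER_GENERATION[model])


if __name__ == "__main__":
    for model in IMAGES_PER_GENERATION:
        for plan in PRICE_PER_GENERATION:
            print(f"{model} on {plan}: ${price_per_image(model, plan):.2f} per image")
```

On Pro Yearly this works out to $0.01 per image for text-to-image and $0.03 per image for image-to-image.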

Endpoint

POST https://vicsee.com/api/v1/generate

See Authentication for API key setup.
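A minimal sketch of how a request to this endpoint might be assembled with Python's standard library. The `Authorization: Bearer ...` header is an assumption, since this page does not state the auth scheme; confirm it on the Authentication page. The helper names `build_request` and `submit` are hypothetical.

```python
import json
import urllib.request

API_URL = "https://vicsee.com/api/v1/generate"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the generate endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token auth. Check the Authentication page.
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit(payload: dict, api_key: str) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(build_request(payload, api_key), timeout=30) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the payload and headers easy to inspect before any credits are spent.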

Text to Image

Generate 6 image variations from a text prompt.

Request Parameters

Parameter	Type	Required	Description
model	string	Yes	`grok-imagine-text-to-image`
input.prompt	string	Yes	Description of the image (max 5000 chars)
input.aspect_ratio	string	No	`"1:1"`, `"16:9"`, `"9:16"`, `"3:2"`, `"2:3"` (default: `"1:1"`)

Example

{
  "model": "grok-imagine-text-to-image",
  "input": {
    "prompt": "Cinematic portrait of a woman sitting by a vinyl record player, retro living room, warm earthy tones, 1970s aesthetic, film grain texture",
    "aspect_ratio": "3:2"
  }
}
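The parameter constraints above (required prompt, 5000-character limit, fixed set of aspect ratios) can be checked client-side before a request is sent. The following is a sketch only; `text_to_image_payload` is a hypothetical helper, and the server remains the authority on what is valid.

```python
ALLOWED_ASPECT_RATIOS = {"1:1", "16:9", "9:16", "3:2", "2:3"}
MAX_PROMPT_CHARS = 5000


def text_to_image_payload(prompt: str, aspect_ratio: str = "1:1") -> dict:
    """Return a request body for grok-imagine-text-to-image, validating inputs."""
    if not prompt:
        raise ValueError("input.prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"input.prompt exceeds {MAX_PROMPT_CHARS} characters")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    return {
        "model": "grok-imagine-text-to-image",
        "input": {"prompt": prompt, "aspect_ratio": aspect_ratio},
    }


if __name__ == "__main__":
    print(text_to_image_payload("Retro living room, film grain", "3:2"))
```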

Response

The request returns a task ID. Poll the Tasks endpoint with that ID to retrieve the result.

{
  "success": true,
  "data": {
    "id": "task_abc123",
    "status": "pending",
    "model": "grok-imagine-text-to-image"
  }
}

When complete, the task result contains 6 image URLs.
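Since generation is asynchronous, a client has to read the task ID from the response and poll until the task leaves the `pending` state. The sketch below shows one way to structure that loop. The Tasks endpoint itself is not described on this page, so the lookup is passed in as a caller-supplied `fetch_task` function (hypothetical). Only `pending` appears in the example response; treating every other status as terminal is an assumption, as are the 5-second interval and 180-second timeout.

```python
import time
from typing import Callable

# Assumption: "pending" is the only in-progress status; add others if the
# Tasks endpoint documents them.
IN_PROGRESS_STATUSES = {"pending"}


def extract_task_id(response: dict) -> str:
    """Pull the task ID out of a generate response shaped like the one above."""
    if not response.get("success"):
        raise RuntimeError(f"request was not accepted: {response!r}")
    return response["data"]["id"]


def wait_for_task(
    task_id: str,
    fetch_task: Callable[[str], dict],
    interval: float = 5.0,
    timeout: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_task(task_id) until the status is no longer in progress."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") not in IN_PROGRESS_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        sleep(interval)
```

Passing `fetch_task` and `sleep` as arguments keeps the loop independent of any particular HTTP client and makes it testable without network access.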

Image to Image

Transform an existing image into 2 new variations, optionally guided by a text prompt.

Request Parameters

Parameter	Type	Required	Description
model	string	Yes	`grok-imagine-image-to-image`
input.prompt	string	No	Text prompt to guide the transformation
input.image_urls	string[]	Yes	Array with 1 image URL (max 10MB, JPEG/PNG/WebP)

Example

{
  "model": "grok-imagine-image-to-image",
  "input": {
    "prompt": "Recreate this scene in the style of a Studio Ghibli film",
    "image_urls": ["https://example.com/photo.jpg"]
  }
}
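As a sketch of the image-to-image parameters, the helper below builds the request body, requires a single image URL, and leaves out `input.prompt` when none is given. `image_to_image_payload` is a hypothetical name. The size and format limits (10MB, JPEG/PNG/WebP) cannot be verified from a URL alone and are not checked here.

```python
from typing import Optional


def image_to_image_payload(image_url: str, prompt: Optional[str] = None) -> dict:
    """Return a request body for grok-imagine-image-to-image."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url must be an http(s) URL")
    # image_urls is an array, but the API accepts exactly one entry.
    payload_input: dict = {"image_urls": [image_url]}
    if prompt:
        # The prompt is optional, so only include it when provided.
        payload_input["prompt"] = prompt
    return {"model": "grok-imagine-image-to-image", "input": payload_input}


if __name__ == "__main__":
    print(image_to_image_payload(
        "https://example.com/photo.jpg",
        "Recreate this scene in the style of a Studio Ghibli film",
    ))
```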

Response

Returns a task ID in the same format as Text to Image. When the task completes, its result contains 2 image URLs.

Notes

Text to Image generates 6 images per request, and all 6 are returned in the task result
Image to Image generates 2 variations per request
Maximum prompt length: 5,000 characters
Image input: JPEG, PNG, or WebP, max 10MB
Typical generation time: 30-90 seconds
