Learn the steps and key endpoints for creating videos with the Visla Open API.
Setup Open API Key/Secret
Visla provides an API key and secret for secure access to the OpenAPI. Include these credentials in your API request headers using an encrypted format. For authentication and encryption guidelines, see authenticated interactions.
Video Creation Methods
Visla’s OpenAPI supports multiple ways to create videos depending on your input type — webpage, text script, or document.
Idea to Video
Provide a brief idea or topic to generate a video. Visla uses AI to expand the idea into a structured script and automatically create a cohesive video with narration and visuals.
For full request and response details, see idea to video.
Webpage to Video
Provide a webpage URL in your request, and Visla will automatically extract the content and generate a video. This is the simplest way to create videos directly from online articles or landing pages.
For full request and response details, see webpage to video.
Script to Video
Submit your own text script to generate a video. Visla uses the provided script as narration and visual guidance to automatically create a cohesive video.
You can control how the script is processed using the script_text_mode parameter:
- ai_rewrite – Visla will rewrite and summarize the provided script using AI to create a more concise narration structure.
- direct_script – Visla will use the script exactly as provided without rewriting.
For full request and response details, see script to video.
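As a rough illustration of how a client might guard the script_text_mode value before sending a request, the sketch below builds a request body in Python. Only script_text_mode and its two values come from this page; the "script" field name and the build_script_request helper are hypothetical, so check the script to video reference for the real schema.

```python
from typing import Dict

# The two script_text_mode values described above.
SCRIPT_TEXT_MODES = ("ai_rewrite", "direct_script")


def build_script_request(script: str, script_text_mode: str) -> Dict[str, str]:
    """Return a request body for script to video.

    "script" is a hypothetical field name; only script_text_mode
    and its values are taken from this page.
    """
    if not script.strip():
        raise ValueError("script must not be empty")
    if script_text_mode not in SCRIPT_TEXT_MODES:
        raise ValueError(
            f"script_text_mode must be one of {SCRIPT_TEXT_MODES}, "
            f"got {script_text_mode!r}"
        )
    return {"script": script, "script_text_mode": script_text_mode}
```

Rejecting an unknown mode on the client side surfaces a typo immediately instead of after a round trip to the API.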
Visual to Video
Provide images or short video clips to generate a story-driven video. Visla organizes the visual assets and automatically creates a cohesive video with transitions, pacing, and supporting narration.
To create a video from visual assets:
- Upload your images or video clips using the resource upload process below.
- Create a video project using the visual to video endpoint.
You can control the structure of the generated video using the video_style parameter:
- montage – A silent visual montage without narration.
- storytelling – An AI-generated narrative video that tells a story.
- explainer – An AI-generated video designed to explain a concept or topic.
Speech to Video
Submit an audio file or speech-centric video to generate a video. Visla analyzes the spoken content and transforms it into a polished video with synchronized visuals and narration.
To create a video from speech content:
- Upload your audio or video file using the resource upload process below.
- Create a video project using the speech to video endpoint with the uploaded resource URLs.
You can control the output format of the generated video using the project_function parameter:
- SPEECH_TO_VIDEO_SUMMARY – Generates a concise highlight-style video summarizing the key points from the speech.
- SPEECH_TO_VIDEO_FULL_LENGTH – Produces a full-length video that closely follows the entire spoken content with synchronized visuals.
Document to Video
To create a video from a document:
- Upload your document using the resource upload process below.
- Create a video project using the document to video endpoint with the uploaded document URL.
Control how the document is used to generate the video with the doc_usage parameter:
- page_by_page_walkthrough (default)
  - Equivalent to: Visla AI agent "Present Your Visuals".
  - Scenes: One scene per page. The target video duration is determined by the number of pages and user-defined pacing.
  - Narration: AI explains each page based on its visuals and text content.
  - Footage: The page image only (no stock footage).
- content_source
  - Equivalent to: Visla AI agent "Convert Text to Video".
  - Scenes: AI composes scenes based on the user-defined target video duration and pacing.
  - Narration: Document text is the content source for the voiceover script.
  - Footage: Mix of relevant page images and stock footage.
PPT override: For .ppt / .pptx files, page_by_page_walkthrough is always applied. Any value provided for doc_usage is ignored.
A second parameter applies to PowerPoint input:
- speaker_notes_verbatim: When enabled for a PowerPoint (PPT) file that has speaker notes, the notes are read verbatim in the video narration. Default is false.
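The PPT override is easy to miss on the client side, so a small Python sketch of the rule may help. It assumes the override is keyed on the file extension exactly as listed above (.ppt and .pptx); whether .pps and .ppsx behave the same way is not stated here, so they are left out. The effective_doc_usage helper is hypothetical.

```python
from pathlib import PurePath

# doc_usage values and the override extensions as listed above.
DOC_USAGE_VALUES = ("page_by_page_walkthrough", "content_source")
PPT_SUFFIXES = (".ppt", ".pptx")


def effective_doc_usage(
    filename: str, doc_usage: str = "page_by_page_walkthrough"
) -> str:
    """Return the doc_usage mode that will actually be applied."""
    if doc_usage not in DOC_USAGE_VALUES:
        raise ValueError(f"doc_usage must be one of {DOC_USAGE_VALUES}")
    # PPT override: any requested value is ignored for .ppt / .pptx.
    if PurePath(filename).suffix.lower() in PPT_SUFFIXES:
        return "page_by_page_walkthrough"
    return doc_usage
```

A client can call this before creating the project to warn users that a content_source request on a slide deck will not take effect.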
Key Workflow Steps
| Step | Endpoint | Purpose | Wait Time |
|---|---|---|---|
| 1 | GET /openapi/v1/project/get-asset-upload-url | Get S3 upload URL | Instant |
| 2 | PUT {uploadUrl} | Upload PDF to S3 | 10s–2min |
| 3 | POST /openapi/v1/project/doc-to-video | Create project & wait for editing status | 2–10min |
| 4 | POST /openapi/v1/project/{projectUuid}/export-video | Export video | Async via webhook |
video_duration_in_seconds applies when creating a video from a script, a webpage, or a document with doc_usage=content_source. In other modes (e.g., page_by_page_walkthrough), it may be ignored and the system will choose the length.
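The rule for video_duration_in_seconds can be written as a small predicate. This is a sketch of one reading of the note above, in which the doc_usage condition applies only to documents; the method labels "script", "webpage", and "doc" and the duration_applies helper are hypothetical names, not API values.

```python
from typing import Optional


def duration_applies(method: str, doc_usage: Optional[str] = None) -> bool:
    """Return True when video_duration_in_seconds is documented as applying.

    method is a hypothetical label for the creation method.
    A False result means the value may be ignored, not that it is rejected.
    """
    if method in ("script", "webpage"):
        return True
    if method == "doc":
        # An unset doc_usage falls back to page_by_page_walkthrough.
        return doc_usage == "content_source"
    return False
```

For example, duration_applies("doc", "content_source") is true, while duration_applies("doc") is false because the default mode is page_by_page_walkthrough.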
Resource upload process
Step 1: Get Upload URL
Obtain a pre-signed URL for file upload. This URL allows you to securely upload your resource file to S3.
- Parameters:
  - mediaType: Resource type
  - suffix: File extension
- Purpose: Generate a secure, temporary URL for file upload
- Response: Returns upload URL and upload ID
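A minimal sketch of assembling the Step 1 request URL in Python follows. It assumes the two parameters travel as query-string values on the GET request and that suffix is sent without a leading dot; neither detail is confirmed on this page. The base URL and the media_type value are supplied by the caller, and upload_url_request is a hypothetical helper. Authentication headers are out of scope here.

```python
from pathlib import PurePath
from urllib.parse import urlencode

# Endpoint path from the Key Workflow Steps table.
UPLOAD_URL_PATH = "/openapi/v1/project/get-asset-upload-url"


def upload_url_request(base_url: str, filename: str, media_type: str) -> str:
    """Build the GET URL that asks for a pre-signed upload URL.

    Assumes mediaType and suffix are query parameters and that
    suffix carries no leading dot.
    """
    suffix = PurePath(filename).suffix.lstrip(".").lower()
    if not suffix:
        raise ValueError("filename needs an extension to derive suffix")
    query = urlencode({"mediaType": media_type, "suffix": suffix})
    return f"{base_url.rstrip('/')}{UPLOAD_URL_PATH}?{query}"
```

Deriving suffix from the file name keeps the extension sent to the API consistent with the file that is later uploaded.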
Step 2: Upload File to S3
Once you have the upload URL, upload your resource file directly to S3 using the HTTP PUT method.
- Method: HTTP PUT
- URL: Use the exact uploadUrl from the Step 1 response
- Headers:
  - Content-Type: Set the appropriate MIME type (application/octet-stream, application/pdf, application/vnd.ms-powerpoint, etc.)
  - Do NOT include authentication headers for the S3 upload
- Body: Raw binary file data
- Timeout: Complete the upload within 10 minutes (the URL expires after 600 seconds)
- Security: Uses a pre-signed URL for secure, time-limited access
```python
import requests

# Use the pre-signed URL from Step 1; no auth headers are needed for S3.
# upload_url holds the uploadUrl value returned by Step 1.
with open("resource.pdf", "rb") as file:
    response = requests.put(
        upload_url,  # URL from Step 1
        data=file.read(),  # Raw binary data
        headers={"Content-Type": "application/octet-stream"},
    )

if response.status_code == 200:
    print("Upload successful")
else:
    print(f"Upload failed: {response.status_code}")
```
File Constraints
- Maximum file size: 100MB
- Supported formats: PDF, PPT, PPTX, PPS, PPSX, MP3, WAV, M4A, AAC, FLAC, OGG, MP4, MOV, AVI, WEBM, MKV
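Checking these constraints locally avoids spending the 10-minute upload window on a file that will be rejected. The Python sketch below mirrors the list above; it assumes "100MB" means 100,000,000 bytes (the more conservative reading), and upload_problems is a hypothetical helper.

```python
from pathlib import PurePath
from typing import List

# Assumption: 100MB is read as 100,000,000 bytes.
MAX_UPLOAD_BYTES = 100 * 1000 * 1000

# Extensions taken from the supported formats list above.
SUPPORTED_SUFFIXES = frozenset({
    "pdf", "ppt", "pptx", "pps", "ppsx",
    "mp3", "wav", "m4a", "aac", "flac", "ogg",
    "mp4", "mov", "avi", "webm", "mkv",
})


def upload_problems(filename: str, size_bytes: int) -> List[str]:
    """Return a list of constraint violations; empty means OK to upload."""
    problems = []
    suffix = PurePath(filename).suffix.lstrip(".").lower()
    if suffix not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

In practice the size would come from os.path.getsize(path) or a similar call before Step 1 is requested.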
Video Status Lifecycle
Understanding the project status is crucial for successful integration:
| Status | Description | Next Action |
|---|---|---|
| init | Project created; initial setup in progress. AI is analyzing document content. | Wait (typically 1–2 minutes) |
| preparation | Basic configuration is being set (sound, avatar, subtitles, etc.) and initial video generation begins. | Wait while video is generated (typically 3–10 minutes) |
| generated_videos | Video content is being created. | Wait 5–15 minutes |
| editing | Project is ready for preview. | Proceed with editing or exporting |
| error | The process failed. | |
After creating a video project, poll the Get Project Info API repeatedly to check the project's status. Once the status reaches editing, the project is ready for preview or export.
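The polling described above can be sketched as a loop that stops on editing or error. To keep the example independent of the real request and response shapes, the status lookup is passed in as a callable; wait_for_editing, the 15-second interval, and the 30-minute timeout are illustrative assumptions, not values from the API.

```python
import time
from typing import Callable

# Non-terminal project statuses from the table above.
IN_PROGRESS = frozenset({"init", "preparation", "generated_videos"})


def wait_for_editing(
    get_status: Callable[[], str],
    interval_s: float = 15.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until the project reaches editing.

    get_status should call the Get Project Info API and return
    the status string; how it does so is left to the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "editing":
            return status
        if status == "error":
            raise RuntimeError("project generation failed")
        if status not in IN_PROGRESS:
            raise RuntimeError(f"unexpected project status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"project still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Passing sleep as a parameter makes the loop easy to test without real delays, and the explicit timeout keeps a stuck project from blocking the caller indefinitely.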
Clip Status Lifecycle
After exporting a project, monitoring the clip status lifecycle ensures you know exactly when the video is ready for use:
| Status | Description | Typical Use Case |
|---|---|---|
| init | Clip just created; initial setup in progress | Immediately after export request |
| downloading | Downloading source content | Processing remote assets |
| uploading | Uploading to storage | File transfer in progress |
| processing | Video processing in progress | Encoding, compression, etc. |
| publishing | Finalizing and publishing | Making clip available |
| completed | Clip ready for download or viewing | Fully processed and available |
| failed | Processing failed | An error occurred during processing |
After exporting a project, poll the Get Clip Info API repeatedly to monitor clip progress. The lifecycle ends when the status reaches completed, indicating the video is fully processed and ready for download or viewing.
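When handling Get Clip Info responses, it can help to collapse the seven statuses into three outcomes so the polling code only branches on what to do next. The Python sketch below does that; clip_outcome and the labels "ready", "failed", and "pending" are hypothetical, and treating failed as a terminal state is an assumption based on its description in the table.

```python
# Clip statuses from the table above that mean work is still in progress.
PENDING_CLIP_STATUSES = frozenset(
    {"init", "downloading", "uploading", "processing", "publishing"}
)


def clip_outcome(status: str) -> str:
    """Map a clip status to 'ready', 'failed', or 'pending'."""
    if status == "completed":
        return "ready"
    if status == "failed":
        return "failed"
    if status in PENDING_CLIP_STATUSES:
        return "pending"
    # Unknown values are surfaced rather than silently polled forever.
    raise ValueError(f"unknown clip status: {status!r}")
```

A caller would keep polling while the outcome is "pending", download the video on "ready", and report an error on "failed".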