Overview

The Clipzy ML Engine is a REST API that analyzes video content and applies learned visual and audio styles to new footage. You send a video; the engine extracts its stylistic fingerprint, called Style DNA, and can then apply that fingerprint to your own clips.

How it works

The engine follows a simple input/output model:

Input

A source video (template) whose style you want to extract, plus your own footage to apply it to.

Output

A processed video styled to match the template, plus the Style DNA JSON that describes the extracted style.

You can use the API in two ways:

Extract only — submit a template video and receive its Style DNA JSON for inspection or storage.
Extract and apply — submit a template video and target footage, and receive a fully rendered output video.

Asynchronous jobs

All processing runs asynchronously. When you submit a request, the API immediately returns a job ID. You then poll the job status endpoint or configure a webhook to be notified when processing completes.

# Submit a job
POST /api/v1/jobs

# Check job status
GET /api/v1/jobs/{job_id}

A job moves through a defined set of statuses: queued → processing → completed (or failed / cancelled). See Job Lifecycle for the full state machine.

Jobs typically complete within 30–120 seconds depending on video length and the processing stages required. The progress_percent field on the job object lets you track progress in real time.
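The polling approach described above can be sketched as a small loop. This is a minimal illustration rather than an official client: `wait_for_job` and `fetch_job` are hypothetical names, and the `status` key on the job object is an assumption (only `progress_percent` is named in this page). `fetch_job` stands in for whatever function performs `GET /api/v1/jobs/{job_id}` and returns the decoded JSON.

```python
import time
from typing import Any, Callable, Dict

# Terminal statuses taken from the lifecycle described above.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal status or the timeout expires.

    fetch_job is a hypothetical callable wrapping GET /api/v1/jobs/{job_id}.
    The "status" key is an assumed field name on the job object.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        # progress_percent could be logged here before sleeping.
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays. Given the typical 30–120 second completion time, a polling interval of a few seconds and a timeout of several minutes seem like reasonable starting points; a webhook avoids polling altogether.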

Template videos

A template video is the source of truth for style extraction. The engine analyzes its color grading, cuts, transitions, audio treatment, motion intensity, and more to build a Style DNA object. Choose template videos that clearly represent the style you want to replicate:

A consistent color grade throughout the clip improves extraction accuracy.
Music-driven edits with clear beat alignment produce the most reliable detected_beats and tempo_bpm values.
Clips under 10 minutes process faster and yield a higher extraction_confidence score.

The extraction_confidence field (0–1) in the Style DNA output tells you how reliably the engine was able to analyze the template. Scores above 0.85 indicate a high-quality extraction.
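As a rough illustration of how a client might act on this score, the sketch below classifies a Style DNA object by its confidence. The function name and the `"high"` / `"review"` labels are hypothetical, and it assumes `extraction_confidence` sits at the top level of the Style DNA JSON, which this page does not confirm.

```python
HIGH_QUALITY_THRESHOLD = 0.85  # "Scores above 0.85 indicate a high-quality extraction."


def classify_extraction(style_dna: dict) -> str:
    """Return "high" for scores above the threshold, otherwise "review".

    Assumes extraction_confidence is a top-level key in the Style DNA JSON.
    """
    confidence = style_dna["extraction_confidence"]
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"extraction_confidence out of range: {confidence}")
    # Strictly greater than, matching "above 0.85" in the text.
    return "high" if confidence > HIGH_QUALITY_THRESHOLD else "review"
```

A result of `"review"` might prompt you to try a different template, for example one with a more consistent color grade or clearer beat alignment.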

Supported formats and file limits

Video formats

Format	Extension
MP4	`.mp4`
MOV	`.mov`
AVI	`.avi`
MKV	`.mkv`
WebM	`.webm`

File size limit

The maximum upload size is set by the MAX_FILE_SIZE_MB environment variable (default: 1000 MB). Uploads exceeding this limit are rejected at upload time, before processing begins, so check your file size before submitting.
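A client-side pre-flight check can catch unsupported formats and oversized files before an upload is attempted. The following is a sketch under stated assumptions: `check_upload` is a hypothetical helper, the limit is passed in explicitly because MAX_FILE_SIZE_MB is a server-side setting, and one megabyte is treated as 1,000,000 bytes, which is the stricter of the two common interpretations and may not match how the server counts.

```python
from pathlib import Path
from typing import List

# Extensions from the supported formats table above.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}
DEFAULT_MAX_FILE_SIZE_MB = 1000  # documented default for MAX_FILE_SIZE_MB
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes (the stricter reading)


def check_upload(
    filename: str,
    size_bytes: int,
    max_mb: int = DEFAULT_MAX_FILE_SIZE_MB,
) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > max_mb * BYTES_PER_MB:
        problems.append(f"file exceeds {max_mb} MB limit")
    return problems
```

For a file on disk, `check_upload(path, os.path.getsize(path))` would supply the size. If your deployment overrides MAX_FILE_SIZE_MB, pass the same value as `max_mb`.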

Output formats

The rendered output video is returned as MP4 (H.264). The output resolution matches your input footage.

Next steps

Style DNA

Learn about the StyleJSON schema and all the fields it contains.

Processing pipeline

See the six stages a job goes through from submission to output.

Job lifecycle

Understand job statuses, progress tracking, and completion detection.
