How it works

Direction, not just generation.

Most AI tools give you ten seconds of random footage. Arch gives you a director who reads your brief, plans every cut, locks every character, and edits to your music — shot by shot, beat by beat.

The pipeline

Three phases. One vision.

Every render flows through the same three-step craft. Most of the quality you see comes from phase one — the part nobody else bothers to do.

01

Direction

Your brief becomes a film treatment.

  • A story bible captures the world, the characters, and the visual identity.
  • A shot plan breaks the song into precisely timed cuts, each with its own framing and intent.
  • Every reference image is mapped to its role: main, supporting, environment, style.

02

Filming

Each shot is filmed individually.

  • A starting frame is generated from the shot plan.
  • The video model animates it — 1.5 to 5 seconds, motion-driven.
  • The last frame of each shot anchors the next, keeping the world coherent.
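The hand-off between shots can be sketched in a few lines. This is a minimal illustration, not Arch's actual code: `film_shots`, `make_start_frame`, and `animate` are hypothetical names, and the image and video models are passed in as plain functions so the chaining logic stands on its own.

```python
from typing import Callable, List, Optional

# Hypothetical stand-ins: a Frame is any handle to a still image,
# a Clip is a dict holding the shot prompt and its first and last frames.
Frame = str
Clip = dict


def film_shots(
    shot_prompts: List[str],
    make_start_frame: Callable[[str, Optional[Frame]], Frame],
    animate: Callable[[Frame, str], Clip],
) -> List[Clip]:
    """Film shots one at a time, passing each shot's last frame forward."""
    clips: List[Clip] = []
    anchor: Optional[Frame] = None  # there is no anchor before the first shot
    for prompt in shot_prompts:
        start = make_start_frame(prompt, anchor)  # starting frame from the plan
        clip = animate(start, prompt)             # the video model animates it
        anchor = clip["last"]                     # the next shot builds on this frame
        clips.append(clip)
    return clips
```

Because every shot after the first receives the previous shot's final frame, the world stays continuous from cut to cut.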

03

Assembly

Cuts, audio, delivery.

  • Every shot is concatenated in narrative order.
  • Your audio is muxed cleanly with the picture.
  • The final video is uploaded — ready to download or share.
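For the curious, concatenation and muxing of this kind can be done with ffmpeg's concat demuxer. The sketch below only builds the file list and the command. It assumes ffmpeg as the tool (the page does not say what Arch uses), and the helper names and file names are made up for illustration.

```python
from typing import List


def concat_list_text(shot_paths: List[str]) -> str:
    """File list for ffmpeg's concat demuxer, one shot per line in narrative order."""
    return "".join(f"file '{path}'\n" for path in shot_paths)


def build_mux_command(list_path: str, audio_path: str, out_path: str) -> List[str]:
    """ffmpeg command that joins the shots and lays the song under the picture."""
    return [
        "ffmpeg",
        "-f", "concat", "-safe", "0", "-i", list_path,  # input 0: the joined shots
        "-i", audio_path,                               # input 1: the song
        "-map", "0:v:0", "-map", "1:a:0",               # picture from shots, sound from song
        "-c:v", "copy", "-c:a", "aac",                  # copy video as-is, encode audio
        "-shortest",                                    # stop at the shorter of the two
        out_path,
    ]
```

Stream copy (`-c:v copy`) only works when every shot shares the same codec, resolution, and frame rate. Otherwise the video would need re-encoding.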

The craft

Where the difference lives.

Six things Arch does that turn an AI clip generator into a real video studio.

A real AI director, not a single prompt

Most tools take your prompt and roll the dice. Arch runs multiple specialised AI passes in sequence: a story builder, a section planner, a shot planner, a prompt writer, a validator, a sanitizer. Each one has a job. The result is direction — not chance.
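As a rough mental model, passes in sequence look like this: each pass reads the shared state and adds its own piece. The pass names echo the list above, but the bodies are toy placeholders and `run_passes` is a hypothetical helper, not Arch's internals.

```python
from typing import Callable, Dict, List

Pass = Callable[[Dict], Dict]


def run_passes(brief: str, passes: List[Pass]) -> Dict:
    """Run each pass in order; every pass receives what the previous ones produced."""
    state: Dict = {"brief": brief}
    for step in passes:
        state = step(state)
    return state


# Toy passes for illustration only.
def story_builder(state: Dict) -> Dict:
    return {**state, "bible": f"World and characters for: {state['brief']}"}


def shot_planner(state: Dict) -> Dict:
    return {**state, "shots": ["wide establishing shot", "close-up on the singer"]}


def validator(state: Dict) -> Dict:
    if not state["shots"]:
        raise ValueError("the shot plan is empty")
    return state
```

Calling `run_passes("a rainy night drive", [story_builder, shot_planner, validator])` returns a state holding the brief, the bible, and the shots.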

Editing locked to your music

Beat structure, vocal energy, BPM, dynamic shifts — all analysed from your audio. Cuts land on downbeats. Pacing follows the song's arc. Choruses cut faster than verses. You can also pin who sings what, exactly when.
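Here is a simplified sketch of what landing cuts on downbeats can mean. It assumes a constant tempo in 4/4 and chorus positions that are already known. Real audio analysis is far richer, and every name and default below is an assumption made for illustration.

```python
from typing import List, Tuple

MIN_SHOT, MAX_SHOT = 1.5, 5.0  # shot length range quoted in the Filming phase


def downbeats(bpm: float, duration: float, beats_per_bar: int = 4) -> List[float]:
    """Downbeat times in seconds for a constant-tempo song."""
    bar = beats_per_bar * 60.0 / bpm
    times = []
    n = 0
    while n * bar < duration:
        times.append(n * bar)
        n += 1
    return times


def plan_cuts(
    beat_times: List[float],
    chorus_spans: List[Tuple[float, float]],
    verse_len: float = 4.0,
    chorus_len: float = 2.0,
) -> List[float]:
    """Pick cut points on downbeats, with shorter shots inside choruses."""
    cuts = [beat_times[0]]
    for t in beat_times[1:]:
        in_chorus = any(start <= cuts[-1] < end for start, end in chorus_spans)
        target = chorus_len if in_chorus else verse_len
        target = min(max(target, MIN_SHOT), MAX_SHOT)
        if t - cuts[-1] >= target - 1e-9:  # tolerance for float rounding
            cuts.append(t)
    return cuts
```

At 120 BPM a bar lasts two seconds, so this sketch cuts every two bars in a verse and every bar in a chorus. It does not handle very slow tempos, where a single bar can exceed the maximum shot length.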

Characters that don't drift

Each character's physical identity is described once and injected into every shot prompt — same hair, same outfit, same distinctive features across the whole video. No morphing faces between cuts.
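In its simplest form, the idea is that each character's description is written once and then prefixed to every shot that character appears in. The function name, the prompt layout, and the example character below are illustrative assumptions, not Arch's real prompt format.

```python
from typing import Dict, List


def build_shot_prompt(action: str, cast: List[str], identities: Dict[str, str]) -> str:
    """Prefix a shot's action with the fixed description of each character in it."""
    parts = [f"{name}: {identities[name]}." for name in cast]
    return " ".join(parts + [action])


# One description per character, reused unchanged for every shot.
identities = {"Mara": "short silver hair, red bomber jacket"}
prompt = build_shot_prompt("Mara walks into the rain.", ["Mara"], identities)
```

Because the description is never rewritten between shots, the wording that defines the character cannot drift from one prompt to the next.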

You write briefs, not prompts

Tell Arch the story, the mood, the look. It translates that into dozens, sometimes hundreds, of carefully crafted cinematographic prompts, one per shot, using the right vocabulary for each model. You stay creative; the machine handles the wording.

Refine shot by shot

A bad shot doesn't mean restarting the render. Hover over any cut in your timeline, click recreate, and only that shot regenerates. The final video rebuilds itself. Credits go toward what actually needs work.

Your keys, your costs

Bring your own Gemini API key. We don't resell tokens or hide markup. You see every cost and you keep control. Arch charges only for the rendering, the part we actually do.

What you give

Four inputs. A whole video.

Audio
Up to 5 minutes, MP3 or WAV. Arch reads tempo, beats, vocals, and dynamic structure to drive the editing.
Creative guidance
One sentence or a full treatment — your story, mood, characters, locations, visual style. The depth is up to you.
Reference images
Optional. Up to 9 slots: main character, supporting cast, environments, and one style reference. They lock identity across the whole video.
Lyrics with section tags
Optional. [Verse], [Chorus], [Bridge] tags help align cuts to the song's structure. Auto-detected if omitted.
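Tagged lyrics are easy to produce and easy to read back. As a sketch of how such tags could be split into sections, here is a small parser. The exact format Arch accepts is not specified beyond the three tags above, so the regex and the "Untagged" fallback are assumptions.

```python
import re
from typing import List, Tuple

# A tag is a line holding only a bracketed name, such as [Verse] or [Chorus].
TAG = re.compile(r"^\[(?P<name>[^\]]+)\]\s*$")


def split_sections(lyrics: str) -> List[Tuple[str, List[str]]]:
    """Group lyric lines under the most recent section tag."""
    sections: List[Tuple[str, List[str]]] = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between sections
        match = TAG.match(line)
        if match:
            sections.append((match.group("name"), []))
        else:
            if not sections:
                sections.append(("Untagged", []))  # lines before any tag
            sections[-1][1].append(line)
    return sections
```

Given `"[Verse]\nCity lights\n\n[Chorus]\nRun with me"`, this yields one Verse section and one Chorus section, each with its own lines.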

That's it. No timeline editing, no per-shot prompting, no model settings to tune. Arch handles the craft; you stay focused on direction.

Visual range

Beyond live-action.

The default look is premium cinematic. Ask for anything else and Arch adapts — wording, lighting, composition all shift to match the medium.

Cinematic live-action
Hand-drawn animation
Anime
Cel-shaded
3D animated
Stop motion
Claymation
Watercolor
Oil painting
Comic panels
Pixel art
Y2K music video
VHS home video
70s grainy doc
35mm film
Low-poly 3D

Stop generating. Start directing.

The full Arch is one signup away. Three free credits to start — no credit card.