AI tutorial video maker
Produce a tutorial video for your Shopify app in four steps: clone your voice, generate an AI avatar intro, re-voice your screen recording in the same voice, and stitch it all into one MP4.
Clone your voice
Upload a clean 10–60 second audio sample of the voice that the avatar and the re-voiced screen recording should speak in.
Generate the avatar intro
Upload a portrait (a clear face shot works best), write the script, and pick a model. Output: a 15–60s intro MP4.
Best overall quality: the 2026 benchmark for studio-grade avatars.
Re-voice your screen recording
Upload the screen recording you narrated in your own voice. We transcribe it, let you clean up the text, re-speak it in the cloned voice, and swap the audio track. Video processing runs locally in your browser via ffmpeg.wasm (~25 MB, downloaded once).
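As a rough illustration of the audio swap, the sketch below builds an ffmpeg argument list that keeps the video stream from the recording and takes the audio stream from the re-spoken track. The function name and file names are hypothetical, and the tool's actual arguments are not documented here. The list could be handed to whatever ffmpeg.wasm call the app uses to run a command.

```typescript
// Hypothetical helper: build ffmpeg arguments that replace the audio track
// of a screen recording with a re-spoken narration file.
function buildAudioSwapArgs(
  videoIn: string,
  audioIn: string,
  output: string,
): string[] {
  return [
    "-i", videoIn,      // input 0: original screen recording
    "-i", audioIn,      // input 1: narration in the cloned voice
    "-map", "0:v:0",    // take the video stream from input 0
    "-map", "1:a:0",    // take the audio stream from input 1
    "-c:v", "copy",     // leave the video untouched (no re-encode)
    "-c:a", "aac",      // encode the new narration as AAC
    output,
  ];
}

// Example with placeholder file names.
const swapArgs = buildAudioSwapArgs("recording.mp4", "narration.mp3", "revoiced.mp4");
console.log(swapArgs.join(" "));
```

Copying the video stream keeps this step fast because only the audio is encoded. The re-spoken narration will rarely match the original length exactly, and this sketch does nothing to reconcile the two durations.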
Stitch intro + tutorial
Concatenates the avatar intro and the re-voiced screen recording into a single MP4 (normalized to 1280×720, 30 fps, H.264 / AAC).
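One way to get two clips with different sizes and frame rates into a single 1280×720, 30 fps file is ffmpeg's `concat` filter with a per-input normalization chain. The sketch below builds such an argument list. The function names and file names are hypothetical, the 48 kHz audio resample is an assumption (the text above only specifies AAC), and the app's real filter graph may differ.

```typescript
// Hypothetical helper: normalization chain for one video input.
// Scales to fit inside the target frame, pads the rest (letterbox), and
// forces square pixels and a fixed frame rate.
function normalizeVideo(input: number, label: string, w: number, h: number, fps: number): string {
  return (
    `[${input}:v]scale=${w}:${h}:force_original_aspect_ratio=decrease,` +
    `pad=${w}:${h}:(ow-iw)/2:(oh-ih)/2,setsar=1,fps=${fps}[${label}]`
  );
}

// Hypothetical helper: ffmpeg arguments that concatenate intro + tutorial.
function buildConcatArgs(intro: string, tutorial: string, output: string): string[] {
  const w = 1280, h = 720, fps = 30;
  const graph = [
    normalizeVideo(0, "v0", w, h, fps),
    normalizeVideo(1, "v1", w, h, fps),
    "[0:a]aresample=48000[a0]", // assumed common sample rate
    "[1:a]aresample=48000[a1]",
    "[v0][a0][v1][a1]concat=n=2:v=1:a=1[v][a]",
  ].join(";");
  return [
    "-i", intro,
    "-i", tutorial,
    "-filter_complex", graph,
    "-map", "[v]",
    "-map", "[a]",
    "-c:v", "libx264",      // H.264 video
    "-pix_fmt", "yuv420p",  // widely compatible pixel format
    "-c:a", "aac",          // AAC audio
    output,
  ];
}

console.log(buildConcatArgs("intro.mp4", "revoiced.mp4", "final.mp4").join(" "));
```

Padding instead of stretching means a portrait-shaped avatar clip ends up letterboxed rather than distorted. Re-encoding both clips is slower than a stream copy, but the `concat` filter needs matching resolution across segments.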
How this works
- Upload a 10–60s clean voice sample. MiniMax Speech 02 HD clones it into a reusable voice ID.
- Upload a portrait + write an intro script. We generate voice audio (TTS with the cloned voice) and hand it to your chosen avatar model (Creatify Aurora, ByteDance Omnihuman, or Hedra Character-3).
- Upload your screen recording. Whisper transcribes, you edit the transcript if needed, TTS re-speaks it in the cloned voice, and ffmpeg.wasm swaps the audio track on the original video.
- ffmpeg.wasm concatenates the intro and the re-voiced screen recording into a single 1280×720 MP4, H.264 + AAC, ready for upload to Shopify, YouTube, or the Shopify App Store video slot.
Costs
- Voice clone (one-time per voice): ~$0.01–0.02
- TTS (intro + screen recording): ~$0.001 per second of audio
- Avatar video: Aurora 720p $0.14/sec · 480p $0.07/sec · Omnihuman $0.14/sec · Hedra ~$0.02–0.05/sec
- Whisper transcription: pennies per minute
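Rough total for a 30s intro + 90s screen recording at Aurora 720p: ~$4.50 (the avatar alone is 30 s × $0.14 = $4.20, and TTS adds about $0.12). Switching to Hedra cuts the avatar cost to $0.60–1.50, which puts the total under $1 at the lower end of Hedra's price range.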
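The arithmetic behind these estimates can be sketched as a small function. The rates come from the list above, except the transcription rate: "pennies per minute" is not a number, so the $0.01 per minute default is a placeholder assumption. The function and type names are hypothetical.

```typescript
type AvatarModel = "aurora-720p" | "aurora-480p" | "omnihuman" | "hedra";

interface CostRange {
  low: number;  // USD, cheapest case
  high: number; // USD, most expensive case
}

// Avatar rates in USD per second of generated video, from the list above.
const AVATAR_RATE: Record<AvatarModel, CostRange> = {
  "aurora-720p": { low: 0.14, high: 0.14 },
  "aurora-480p": { low: 0.07, high: 0.07 },
  "omnihuman": { low: 0.14, high: 0.14 },
  "hedra": { low: 0.02, high: 0.05 },
};

const VOICE_CLONE: CostRange = { low: 0.01, high: 0.02 }; // one-time per voice
const TTS_PER_SECOND = 0.001;
// Assumption: placeholder value standing in for "pennies per minute".
const ASSUMED_WHISPER_PER_MINUTE = 0.01;

function estimateCost(
  introSeconds: number,
  recordingSeconds: number,
  model: AvatarModel,
  whisperPerMinute: number = ASSUMED_WHISPER_PER_MINUTE,
): CostRange {
  // TTS covers both the intro script and the re-voiced recording.
  const tts = (introSeconds + recordingSeconds) * TTS_PER_SECOND;
  // Only the screen recording is transcribed.
  const whisper = (recordingSeconds / 60) * whisperPerMinute;
  const rate = AVATAR_RATE[model];
  return {
    low: VOICE_CLONE.low + tts + introSeconds * rate.low + whisper,
    high: VOICE_CLONE.high + tts + introSeconds * rate.high + whisper,
  };
}

const aurora = estimateCost(30, 90, "aurora-720p");
const hedra = estimateCost(30, 90, "hedra");
console.log(`Aurora 720p: $${aurora.low.toFixed(2)} to $${aurora.high.toFixed(2)}`);
console.log(`Hedra: $${hedra.low.toFixed(2)} to $${hedra.high.toFixed(2)}`);
```

Under these assumptions the Aurora 720p example comes to about $4.35, and the Hedra example to roughly $0.75 to $1.65. The avatar step dominates the total in both cases.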