Shotmaker Experiments - The Daydreamer Evolution

A sequence of experiments exploring how structured 3D scenes, AI feedback, and music-driven parameter changes interact to produce evolving visual behavior.

These videos are grouped by capability — from basic structure preservation to physics interaction and beat-synchronized hallucination.

Sand Temple Demo
The first moment the system started to come alive — where feedback, hallucination, and music-driven parameter changes produced something visually unexpected.

The original goal of Shotmaker was to build a repeatable AI filmmaking process using structured 3D scene data from game engines.

But generative AI does something unusual. It hallucinates new details — shapes, textures, and motion that were never explicitly programmed. Normally those hallucinations flicker and disappear from frame to frame.

In this system — now called Daydreamer — a feedback mechanism allows the model to remember what it invented and carry those ideas forward. You can watch it explore and play with its own creations over time.
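One way to picture the feedback mechanism is as a per-frame blend: the next init image mixes the clean Unity render with the model's previous output, so invented detail survives into the next generation step. This is a minimal sketch under assumed names and an assumed linear-blend formulation, not the actual Daydreamer implementation.

```python
import numpy as np

def feedback_blend(unity_frame: np.ndarray,
                   previous_output: np.ndarray,
                   feedback: float = 0.35) -> np.ndarray:
    """Blend the renderer's previous output back into the next init image.

    feedback=0.0 reproduces pure Unity structure every frame;
    higher values let the model "remember" more of its own inventions.
    (The function name and the 0.35 default are illustrative assumptions.)
    """
    return (1.0 - feedback) * unity_frame + feedback * previous_output
```

At feedback 0 the system forgets everything each frame; at 1 it sees only its own prior output and structure drifts away entirely. The interesting behavior lives in between.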

The visual shifts are not random. They are driven by the music.

On each beat, rendering parameters like denoising, CFG, and other weights are dynamically adjusted. It is similar to shaking an Etch-A-Sketch on the downbeat — the system clears space and invents something new, then builds on it in the next moments.
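The Etch-A-Sketch behavior can be sketched as an exponentially decaying pulse: each beat kicks a parameter above its base value, and every subsequent frame decays it back toward the baseline. The class name, base value, and decay rate below are illustrative assumptions, not the system's real settings.

```python
class BeatPulse:
    """A parameter that spikes on a beat and settles back each frame.

    kick() on a beat raises the value; step() is called once per frame
    and decays the pulse back toward the base, like the shake of an
    Etch-A-Sketch settling down.
    """

    def __init__(self, base: float, decay: float = 0.85):
        self.base = base
        self.decay = decay
        self.pulse = 0.0

    def kick(self, amount: float) -> None:
        # Larger amounts on downbeats, smaller on intermediate beats.
        self.pulse += amount

    def step(self) -> float:
        value = self.base + self.pulse
        self.pulse *= self.decay
        return value
```

One instance per automated parameter (denoise, CFG, and so on) is enough; the beat detector just decides when to call kick() and how hard.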

These experiments are both artistic exploration and technical research into how controllable AI video systems might evolve.


1. Structure — The System Learns the Scene

These experiments demonstrate how the renderer begins with a stable 3D foundation from Unity and gradually introduces hallucinated detail as parameters increase.

The key question explored here:

How much creative variation can be introduced while preserving spatial structure?

Sand Temple Ramp — Controlled Hallucination
Starting from a clean Unity frame, the CFG scale is gradually increased, with brief flashes on music downbeats (the one count). The scene transitions from structured geometry to imaginative interpretation while maintaining spatial continuity.
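A gradual CFG ramp with downbeat flashes can be sketched as a linear interpolation plus a short additive spike. All numeric values here (start and end CFG, flash size and length) are placeholder assumptions; the experiment's actual settings are not documented.

```python
def cfg_schedule(frame: int, total_frames: int, downbeat_frames: list[int],
                 cfg_start: float = 3.0, cfg_end: float = 12.0,
                 flash: float = 4.0, flash_len: int = 3) -> float:
    """Linear CFG ramp with a brief additive flash on each downbeat frame."""
    t = frame / max(total_frames - 1, 1)
    cfg = cfg_start + t * (cfg_end - cfg_start)
    # Hold the flash for a few frames after each downbeat so it is visible.
    if any(0 <= frame - d < flash_len for d in downbeat_frames):
        cfg += flash
    return cfg
```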

2. Style — Portrait Mode and Composition

These experiments explore vertical composition designed for mobile viewing and social media formats.

The renderer focuses on character presence while maintaining environmental context.

Sand Temple Portrait — Environmental Emergence
Architectural structures emerge beneath the dancer as the system invents spatial detail from depth and motion cues.

Higher Denoising — Emergent Character Variations
With higher denoising values, the renderer begins to explore alternate visual interpretations of the dancer. These variations appear briefly and then stabilize, demonstrating how creative exploration can be introduced without losing motion continuity.

Portrait Variant — Architectural Exploration
In this variation, the system begins to invent architectural structures beneath the dancer, expanding the scene while maintaining spatial stability.

3. Energy — Music Makes It Move

These experiments synchronize rendering behavior to musical structure and sound energy.

Instead of using fixed render settings, the system automates denoising and other parameters across time. Downbeats create larger pulses, intermediate beats create smaller pulses, and additional energy-driven bias can push the renderer harder when the music intensifies.

In the plot below, the denoising curve is built from several components: a base value, beat-driven pulse additions, and an experimental override in measure 7 that deliberately drives the system into a stronger hallucination event.

Plot of beat-driven denoising automation showing downbeat pulses, beat pulses, and a larger override in measure 7
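The curve in the plot can be reconstructed roughly as base value plus per-beat pulses, with a larger additive override during one measure. Tempo, pulse sizes, pulse width, and the override amount below are placeholder assumptions chosen to mirror the plot's shape, not the experiment's actual settings.

```python
import numpy as np

FPS = 30
BPM = 120               # assumed tempo
BEATS_PER_MEASURE = 4

def denoise_curve(n_frames: int, base: float = 0.40,
                  beat_pulse: float = 0.10, downbeat_pulse: float = 0.20,
                  override_measure: int = 7, override: float = 0.35,
                  width: int = 4) -> np.ndarray:
    """Per-frame denoising: base + pulses on beats (larger on downbeats),
    plus an extra override pulse during one chosen measure."""
    curve = np.full(n_frames, base)
    frames_per_beat = FPS * 60.0 / BPM
    beat, f = 0, 0.0
    while f < n_frames:
        i = int(f)
        is_downbeat = beat % BEATS_PER_MEASURE == 0
        measure = beat // BEATS_PER_MEASURE + 1
        pulse = downbeat_pulse if is_downbeat else beat_pulse
        if measure == override_measure:
            pulse += override   # deliberately drive a stronger hallucination
        curve[i:i + width] += pulse
        beat += 1
        f += frames_per_beat
    return np.clip(curve, 0.0, 1.0)
```

Keeping the pulses additive makes the components easy to reason about independently: the base sets the resting look, the beat pulses set the rhythm, and the override is a one-off experiment layered on top.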

Funky Town — Portrait Variation
A refined portrait version of the Funky Town sequence, emphasizing rhythm-driven motion and tighter framing around the dancer.

Funky Town — Beat-Driven Motion
Visual intensity increases on musical beats, producing rhythmic bursts of motion and style changes.

Energy Mapping — Sound Controls Rendering
Audio amplitude is mapped directly to rendering parameters, allowing the system to react to music intensity in real time.
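The amplitude-to-parameter mapping can be sketched as per-frame RMS energy, normalized and rescaled into a denoising range. The function name and the output range are illustrative assumptions.

```python
import numpy as np

def energy_to_denoise(audio: np.ndarray, sr: int, fps: int = 30,
                      lo: float = 0.35, hi: float = 0.75) -> np.ndarray:
    """Map per-video-frame RMS audio energy to a denoise value in [lo, hi]."""
    hop = sr // fps                       # audio samples per video frame
    n = len(audio) // hop
    rms = np.array([np.sqrt(np.mean(audio[i * hop:(i + 1) * hop] ** 2))
                    for i in range(n)])
    # Normalize to [0, 1] over the clip, then rescale into the denoise range.
    norm = (rms - rms.min()) / (np.ptp(rms) + 1e-9)
    return lo + norm * (hi - lo)
```

In practice a little smoothing or attack/release shaping on the RMS signal helps avoid jittery parameter changes between adjacent frames.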

Go-Go Dancers — Beat-Synchronized Motion
This lower-denoising experiment pushes much harder on the downbeats and beats. The result is one of the clearest examples of the dancers syncing their pose and motion to the music while still allowing controlled bursts of hallucination.

Denoising automation driven by musical beats and downbeats

The dancers appear to synchronize with the music because the rendering parameters are actively driven by the beat. This plot shows the denoising signal used in the experiment, with steady baseline values and sharp increases on downbeats that inject motion and variation at precise musical moments.

4. Physics — Interaction with Real Motion

These experiments test how the renderer behaves when interacting with simulated physical objects.

The goal is to preserve motion consistency while still allowing creative hallucination.

Boing Ball Physics — High Hallucination Attempt
This earlier test pushed the hallucination much harder. The tradeoff was that the Boing Ball itself mostly disappeared, only appearing briefly near the end as the renderer prioritized invention over structure.

Boing Ball Physics — Structure vs Imagination
After the more aggressive version largely lost the Boing Ball, this follow-up test lowered denoising, strengthened segmentation, and adjusted prompts to preserve the object more clearly. The result keeps the ball visible for much more of the shot while still allowing brief pulses into more stylized hallucination.

5. How It Works — Inside the System

This video shows the internal signals driving the renderer.

Unity frames, control maps, and parameter changes are displayed in real time, revealing how structure, motion, and music interact to produce the final result.

System Overview — Control Maps and Beat Synchronization
A technical walkthrough showing the underlying data driving the visuals, including depth maps, segmentation, motion signals, and beat synchronization.