TuneJury demo

Internal listening review · three reward-driven applications

Mode 1 — Inference-time best-of-N selection

For each prompt, we draw N=16 candidates from a frozen backbone (only the noise seed differs) and keep the one with the highest TuneJury reward. Each example below compares the first sample (N=1, candidate c0) with the TuneJury-selected best of 16; cK denotes the candidate with index K. All four backbones receive the prompt prefix "high quality instrumental music, "; ACE-Step v1.5 Turbo Continuous additionally receives an empty lyric input. See paper Appendix G for the full N-sweep.
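
The selection step can be sketched in a few lines. This is a minimal illustration, not the TuneJury pipeline: `toy_generate` and `toy_reward` are hypothetical stand-ins for a frozen backbone and the reward model, and they only produce and score a single number per seed. The prompt prefix is the one quoted in the text.

```python
import random

# Prefix quoted from the text; everything else here is a hypothetical stand-in.
PROMPT_PREFIX = "high quality instrumental music, "

def toy_generate(prompt, seed):
    """Stand-in for a frozen backbone: only the noise seed changes the output."""
    return random.Random(seed).gauss(0.0, 1.0)

def toy_reward(prompt, audio):
    """Stand-in for the reward model: scores the toy 'audio' directly."""
    return audio

def best_of_n(prompt, generate, reward, n=16, base_seed=0):
    """Score n candidates and compare candidate c0 with the top-scoring one."""
    full_prompt = PROMPT_PREFIX + prompt
    rewards = []
    for i in range(n):
        audio = generate(full_prompt, seed=base_seed + i)
        rewards.append(reward(full_prompt, audio))
    best_index = max(range(n), key=lambda i: rewards[i])
    return {
        "rewards": rewards,
        "c0_reward": rewards[0],
        "best_index": best_index,
        "best_reward": rewards[best_index],
        "delta": rewards[best_index] - rewards[0],
    }

result = best_of_n("a dark trance track featuring accordion", toy_generate, toy_reward)
print(f"c0: {result['c0_reward']:+.2f}, best: c{result['best_index']} "
      f"{result['best_reward']:+.2f} (delta {result['delta']:+.2f})")
```

Because c0 is itself one of the 16 candidates, the delta in this setup is never negative.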

MusicGen-medium (1.5B)

A dark trance track featuring accordion, blending hypnotic rhythms with melancholic melodies and a pervasive, atmospheric mood.
N=1 (single sample, c0)
TuneJury reward: +0.05
Best-of-16 (TuneJury-selected, c6)
TuneJury reward: +1.71 (Δ +1.66)
A party-loving rocknroll piece centered on piano, full of upbeat energy and lively rhythms, capturing the fun and festive spirit of a joyful gathering.
N=1 (single sample, c0)
TuneJury reward: −0.96
Best-of-16 (TuneJury-selected, c13)
TuneJury reward: +0.64 (Δ +1.60)
A spiritual techno piece featuring a smooth pad instrument, evoking a serene and reflective atmosphere.
N=1 (single sample, c0)
TuneJury reward: +0.13
Best-of-16 (TuneJury-selected, c10)
TuneJury reward: +1.68 (Δ +1.55)

MusicGen-large (3.3B)

A melodic latin piece featuring a sampler, blending rhythmic patterns with smooth, flowing melodies that evoke a warm and inviting atmosphere.
N=1 (single sample, c0)
TuneJury reward: −0.14
Best-of-16 (TuneJury-selected, c15)
TuneJury reward: +1.77 (Δ +1.91)
A dark score featuring a cymbal, evoking a tense, atmospheric tension through its delicate and echoing metallic textures.
N=1 (single sample, c0)
TuneJury reward: −1.15
Best-of-16 (TuneJury-selected, c9)
TuneJury reward: +0.78 (Δ +1.93)
A spiritual techno piece featuring a smooth pad instrument, evoking a serene and reflective atmosphere.
N=1 (single sample, c0)
TuneJury reward: +0.43
Best-of-16 (TuneJury-selected, c1)
TuneJury reward: +2.02 (Δ +1.59)

AudioLDM2-music (1.1B)

A dancefloor-focused chiptune track featuring slideguitar, delivering a retro, energetic vibe with pixelated melodies and crisp, rhythmic pulses.
N=1 (single sample, c0)
TuneJury reward: −2.07
Best-of-16 (TuneJury-selected, c14)
TuneJury reward: +0.32 (Δ +2.38)
A chaotic noise soundscape unfolds through the warm, shimmering Rhodes, layered with deep sub-bass and textured field recordings, creating a dense, atmospheric environment that blends dissonance and ambient texture. The piece floats in a raw, unfiltered tone, emphasizing unpredictable sonic shifts and environmental resonance, evoking a sense of unease and immersive sonic exploration.
N=1 (single sample, c0)
TuneJury reward: −1.26
Best-of-16 (TuneJury-selected, c10)
TuneJury reward: +0.66 (Δ +1.92)
Atmospheric music featuring a soothing vibraphone, enhanced with gentle piano and ambient synth pads, evokes a calm, reflective mood with soft, resonant tones and a spacious, dreamy texture.
N=1 (single sample, c0)
TuneJury reward: −0.66
Best-of-16 (TuneJury-selected, c9)
TuneJury reward: +1.31 (Δ +1.98)

ACE-Step v1.5 Turbo Continuous (2.4B)

A psychedelic soundscape crafted with the warm, resonant Rhodes, evoking a dreamy, otherworldly atmosphere.
N=1 (single sample, c0)
TuneJury reward: −0.85
Best-of-16 (TuneJury-selected, c6)
TuneJury reward: +1.77 (Δ +2.61)
An inspiring rocknroll piece featuring a ukulele, bright and upbeat with a lively rhythm that captures the spirit of classic, joyful tunes.
N=1 (single sample, c0)
TuneJury reward: −1.31
Best-of-16 (TuneJury-selected, c2)
TuneJury reward: +1.12 (Δ +2.43)
A soothing soundtrack featuring acoustic guitar, enriched with gentle cello and soft piano, unfolds in a calm, background mood, perfect for quiet moments or ambient settings. The acoustic guitar provides a warm, natural tone, while the cello and piano add depth and continuity, creating a relaxed, non-intrusive atmosphere that supports without overpowering.
N=1 (single sample, c0)
TuneJury reward: −0.80
Best-of-16 (TuneJury-selected, c6)
TuneJury reward: +1.50 (Δ +2.30)

Mode 2 — Inference-time DITTO reward optimization

SAO-small uses a 30-prompt subset of SDD-100, while TangoFlux uses the full 100-prompt set; the asymmetry stems from a reproducibility constraint of the SAO release (paper Appendix J). AudioLDM2-music is omitted because backpropagating through its 50-step UNet sampler exceeds 24 GB of memory. DITTO runs 5 outer iterations, each backpropagating through the full 8-step sampler to minimize the negative TuneJury reward at a learning rate of 0.05. Base weights stay frozen, and the Mode 1 prompt prefix is applied to the conditioning input.
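
The optimization loop can be illustrated on a toy problem. The sketch below assumes, as in the original DITTO formulation, that the optimized variable is the initial noise latent. It uses a hypothetical scalar "sampler" and quadratic reward with hand-derived gradients in place of a real model and autograd, and it uses plain gradient descent because the text gives a learning rate but does not name an optimizer. Only the step count, iteration count, and learning rate come from the text.

```python
STEPS = 8          # sampler steps, as stated in the text
OUTER_ITERS = 5    # outer optimization iterations, as stated in the text
LR = 0.05          # learning rate, as stated in the text

def velocity(x):
    """Hypothetical frozen model: pulls the latent toward 0.5."""
    return 0.5 - x

def sample(x0):
    """Euler sampler. Returns the final latent and d(final)/d(x0)."""
    dt = 1.0 / STEPS
    x, dx_dx0 = x0, 1.0
    for _ in range(STEPS):
        # x_next = x + dt * (0.5 - x), so d(x_next)/d(x) = 1 - dt
        x = x + dt * velocity(x)
        dx_dx0 *= 1.0 - dt
    return x, dx_dx0

def reward(x):
    """Hypothetical reward, peaked at x = 2.0."""
    return -(x - 2.0) ** 2

def reward_grad(x):
    """Derivative of the toy reward with respect to its input."""
    return -2.0 * (x - 2.0)

def optimize_noise(x0):
    """Gradient descent on loss = -reward with respect to the initial noise."""
    history = [reward(sample(x0)[0])]
    for _ in range(OUTER_ITERS):
        x_final, dx_dx0 = sample(x0)
        # Chain rule through every sampler step; model parameters are untouched.
        grad = -reward_grad(x_final) * dx_dx0
        x0 = x0 - LR * grad
        history.append(reward(sample(x0)[0]))
    return x0, history

x0_opt, history = optimize_noise(0.0)
print([round(r, 3) for r in history])
```

In this toy setting the reward rises at every outer iteration. A real sampler and reward model are non-convex, so no such guarantee carries over.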

SAO-small (340M)

A melancholic rap piece driven by a steady drummachine beat, layered with subtle synth pads and a sparse electric guitar, creating a reflective, introspective atmosphere. The rhythm remains tight and repetitive, grounding the mood in a sense of quiet solitude and emotional depth.
Baseline (N=1 noise)
TuneJury reward: +0.83
DITTO-optimized (5 outer iter)
TuneJury reward: +1.85 (Δ +1.01)
A retro techno track driven by crisp drums, layered with warm synth pads and a steady bassline, evokes a nostalgic, dance-floor-ready vibe with a polished yet authentic 1980s electronic feel.
Baseline (N=1 noise)
TuneJury reward: +0.62
DITTO-optimized (5 outer iter)
TuneJury reward: +1.41 (Δ +0.79)
A vibrant worldfusion piece driven by a bright guitar, pulsing percussion, and a lively drum kit, designed for the dancefloor with rhythmic energy and cross-cultural fusion elements. Enhanced by a vibrant tabla and a resonant synth bass, the arrangement blends global rhythms with modern textures, creating an infectious, celebratory groove that moves the body and energizes the space.
Baseline (N=1 noise)
TuneJury reward: +0.36
DITTO-optimized (5 outer iter)
TuneJury reward: +0.98 (Δ +0.62)

TangoFlux (515M)

A melancholic rap piece driven by a steady drummachine beat, layered with subtle synth pads and a sparse electric guitar, creating a reflective, introspective atmosphere. The rhythm remains tight and repetitive, grounding the mood in a sense of quiet solitude and emotional depth.
Baseline (N=1 noise)
TuneJury reward: −1.11
DITTO-optimized (5 outer iter)
TuneJury reward: +1.12 (Δ +2.23)
Noise music featuring a Hammond organ, characterized by an epic mood.
Baseline (N=1 noise)
TuneJury reward: −0.62
DITTO-optimized (5 outer iter)
TuneJury reward: +0.82 (Δ +1.44)
A vibrant worldfusion piece driven by a bright guitar, pulsing percussion, and a lively drum kit, designed for the dancefloor with rhythmic energy and cross-cultural fusion elements. Enhanced by a vibrant tabla and a resonant synth bass, the arrangement blends global rhythms with modern textures, creating an infectious, celebratory groove that moves the body and energizes the space.
Baseline (N=1 noise)
TuneJury reward: −0.81
DITTO-optimized (5 outer iter)
TuneJury reward: +0.34 (Δ +1.15)

Mode 3 — Expert-iteration post-training

The top-decile candidates by TuneJury reward (90 of 900) are used to fine-tune the FluxAudio-S backbone for 5K iterations (batch 16, AdamW, CFG 4.5, 25 Euler steps). A single-round LR sweep at 1e-6 / 5e-6 / 1e-5 yields aggregate lifts of +0.17 / +0.37 / +0.42 over a −0.262 baseline; the Pareto sweet spot is 5e-6 (+0.369, 73/100 prompts improved). See paper Appendix H for full ablations.
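
The data-selection step of expert iteration can be sketched as a simple filter. This is an illustration under stated assumptions: the candidates are ranked globally rather than per prompt (the text does not say which), the rewards are synthetic, and `select_top_decile` is a hypothetical helper name. Fine-tuning itself is not shown.

```python
import random

def select_top_decile(candidates):
    """Return the top 10% of candidates by reward (global ranking assumed)."""
    keep = max(1, len(candidates) // 10)
    ranked = sorted(candidates, key=lambda c: c["reward"], reverse=True)
    return ranked[:keep]

# Toy pool of 900 scored candidates with made-up rewards.
rng = random.Random(0)
pool = [{"id": i, "reward": rng.gauss(0.0, 1.0)} for i in range(900)]

finetune_set = select_top_decile(pool)
print(len(finetune_set))  # 90
```

The selected subset then serves as the supervised fine-tuning data for the next round.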

FluxAudio-S (120M)

A fast garage track featuring an electric guitar, driven by raw energy and a loose, rhythmic feel.
Baseline FluxAudio-S
TuneJury reward: −2.06
Post-trained (expert iteration)
TuneJury reward: −0.05 (Δ +2.00)
An experimental piece centered on a beat, evoking a fantasy mood.
Baseline FluxAudio-S
TuneJury reward: −1.10
Post-trained (expert iteration)
TuneJury reward: +0.83 (Δ +1.93)
An orchestral piece with a dreamy atmosphere, driven by a delicate sampler, lush strings, and a soft, shimmering piano, evoking a serene, ethereal mood where distant echoes and gentle textures blend into a floating, otherworldly soundscape.
Baseline FluxAudio-S
TuneJury reward: −0.74
Post-trained (expert iteration)
TuneJury reward: +0.77 (Δ +1.52)