aimastering.dev / dev log
Post #022 · 2026-03-08

AB Testing on TikTok and YouTube

20 patterns. 1 minute. With visuals.

And then a decisive dataset arrived.

Tags: AB test · TikTok · YouTube · reels · data · 1min
1.

The Next Question Was "What Specifically Works?"

Beatport 110 proved that the structure works.

But what those 110 buyers responded to was still unknown. The track structure? The title? The genre tags? Algorithmic chance? To secure reproducibility, the variables needed to be identified.

"Apply AB testing to music" — in my day job, this is standard methodology. What is strange, in hindsight, is that I had never applied it to music until now.

2.

Design — 4 Axes, 20 Patterns

I decomposed the variables into 4 axes. The 1-minute cap was chosen because platform retention measurement is most accurate at that length. All patterns included visuals — comparison against text-only posts was deferred to a later phase.

Axis A — Track Structure (assumed highest-impact variable on scroll-exit rate)
- A: Build-up priority (:00–:30)
- B: Drop-first (:00–:10)
- C: Silence open (8 bars silent → explosion)

Axis B — Visual Texture (audio-visual coherence hypothesized to affect retention)
- A: AI-generated abstract psychedelic
- B: Live-action style, nature/water
- C: Japanese aesthetic, brush/ink

Axis C — Title Language (observing impact on each platform's recommendation algorithm)
- A: Japanese only
- B: English only
- C: Japanese + romaji

Axis D — Posting Time (verifying where the psychedelic trance core listener base is located)
- A: Japan time, 20:00–22:00 JST
- B: Europe time, 20:00–22:00 CET
- C: Both platforms, simultaneous
3 × 3 × 3 × 3 = 81 theoretical patterns → narrowed to 20 practical patterns (duplicate and meaningless combinations removed)
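The enumeration above can be sketched in a few lines. This is an illustrative reconstruction, not the actual tooling: the axis keys and level labels are shorthand for the patterns listed in the post, and the narrowing from 81 to 20 was a manual editorial pass, so no pruning rule is coded here.

```python
from itertools import product

# Four axes, three levels each, as described in the design section.
# Labels are shorthand; the real pattern definitions were richer.
AXES = {
    "structure": ["build-up", "drop-first", "silence-open"],
    "visual":    ["abstract", "nature", "ink"],
    "title":     ["ja", "en", "ja+romaji"],
    "timing":    ["jst-evening", "cet-evening", "simultaneous"],
}

# Full factorial: every combination of one level per axis.
all_patterns = [dict(zip(AXES, combo)) for combo in product(*AXES.values())]

print(len(all_patterns))  # → 81 theoretical combinations
```

From these 81, the 20 practical patterns were selected by hand, removing duplicates and combinations judged meaningless.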
3.

Execution — TikTok and YouTube Reels in Parallel

The same patterns were posted in parallel to TikTok and YouTube Reels. Per-platform algorithm differences were included as an observation variable.

| Item    | TikTok                                | YouTube Reels                          |
|---------|---------------------------------------|----------------------------------------|
| Posts   | 20 patterns                           | 20 patterns (identical)                |
| Length  | 60 sec fixed                          | 60 sec fixed                           |
| Visuals | Per-pattern, same source material     | Per-pattern, same source material      |
| Metrics | Full-view rate, save rate, share rate | Retention rate, CTR, impressions       |
| Window  | 72h post-publish                      | 72h post-publish                       |
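Since the two platforms report different metrics, comparing the 20 patterns fairly requires normalizing within each platform first. A minimal sketch of one way to do that, assuming hypothetical pattern IDs and metric values (the real dataset appears in the next post): z-score each metric across patterns, then rank by the average z-score.

```python
import statistics

def zscores(values):
    """Standardize a list of metric values across patterns."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values) or 1.0  # guard against zero spread
    return [(v - mean) / sd for v in values]

def rank_patterns(results):
    """Rank patterns by mean z-score over all metrics.

    results: {pattern_id: {metric_name: value}} for ONE platform,
    so TikTok and YouTube Reels numbers are never mixed directly.
    """
    ids = sorted(results)
    metrics = sorted(next(iter(results.values())))
    per_metric = {m: zscores([results[i][m] for i in ids]) for m in metrics}
    score = {i: sum(per_metric[m][k] for m in metrics) / len(metrics)
             for k, i in enumerate(ids)}
    return sorted(ids, key=score.get, reverse=True)

# Illustrative numbers only -- not the actual experiment data.
demo = {
    "P01": {"full_view": 0.42, "save": 0.031, "share": 0.012},
    "P02": {"full_view": 0.55, "save": 0.048, "share": 0.020},
    "P03": {"full_view": 0.38, "save": 0.022, "share": 0.009},
}
print(rank_patterns(demo))  # → ['P02', 'P01', 'P03']
```

Ranking within a platform and then comparing rank orders across platforms sidesteps the fact that a TikTok save rate and a YouTube retention rate are not on the same scale.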
4.

A Decisive Dataset Arrived

72 hours later, the data was in.

Calling it "decisive" is not an exaggeration. Among the 20 patterns, one produced numbers in a different league from all the others. And the variable combination that pattern represented was not the one I had expected to win.

The full data breakdown will be published in the next post.

Continues in #023