Why does AI struggle with theory and voicing in song analysis?

Sacculus · Apr 8, 2026 at 1:38 AM

I'm trying to figure out P.Y.T. by Michael Jackson, to find the exact Greg Phillinganes Rhodes chords and voicings, but ChatGPT, Gemini, Grok, Hooktheory and YouTube piano tutorials don't agree on the answer!?
It seems difficult to trust anyone if you don't have very good ears...

Djord Emer · Apr 8, 2026 at 1:42 AM

They're all still very bad at that, yeah. I'm not sure why, but I'm dead sure I wouldn't trust them for anything music theory related... maybe music history.

macros mk2 · Apr 8, 2026 at 2:43 AM

An LLM probably thinks Beethoven is a Saint Bernard, then will tell you you're special for asking.

flier0244 · Apr 8, 2026 at 3:53 AM

Can you try splitting stems, then loading into Melodyne?

Try splitting stems, then feed to AI?

Lois Lane · Apr 8, 2026 at 1:43 PM

Perhaps AI LLMs struggle
For theory voicing in a song
Because...
It is not actually down to
Dry mathematical formulae, and
Even though they cannibalize
The physical notes of musicians
They can never digest our souls.

Last edited: Apr 8, 2026 at 1:43 PM

orbitbooster · Apr 8, 2026 at 8:18 AM

Well, I'll be a bit off topic, but I don't much trust tools like Melodyne and similar: they get the basic line, but the nuances are hard to capture, and I don't ask AI for help with that either.

However, I must say I was astonished by a piece of software from Aurally Sound: Prism.
I thought it was yet another transcription tool comparable to all the others, but the results I got with it were jaw-dropping.
I played a pretty complex piano piece, and it got most of it right, apart from some sustain-pedal passages where the notes could get shuffled.
If I can, I'll post the original and the reconstruction.

This tool is only for solo instruments.

PulseWave · Apr 8, 2026 at 8:27 AM

Sacculus said: ↑

I'm trying to figure out P.Y.T. by Michael Jackson, to find the exact Greg Phillinganes Rhodes chords and voicings, but ChatGPT, Gemini, Grok, Hooktheory and YouTube piano tutorials don't agree on the answer!?
It seems difficult to trust anyone if you don't have very good ears...

Exceptionally, here is the AI answer:

What’s tripping you up is that PYT exists in more than one musical “version”: the released Thriller track, the earlier Michael/Greg demo, and then transcription guesses that simplify the keyboard part into generic chords. The demo is credited to Michael Jackson and Greg Phillinganes, while the released song was reworked by Quincy Jones and James Ingram, so the exact Rhodes voicings people share online often don’t match each other.

Why the answers disagree
Most tutorial sites reduce the part to the harmonic shell, not the actual voicing. For example, one common chord transcription gives the song as B7, Em7, G, A, which may describe the progression but not the exact Rhodes shape or inversion. Another analysis notes the song is harmonically more complex than average and centers around B Dorian / B minor regions, which also helps explain why a simple four-chord answer can feel incomplete.

What the Rhodes part likely is
The Rhodes comping in PYT is best understood as stacked voicings and voice-leading, not block chords copied from a chart. In practice, that usually means:

left hand holding root or rootless support,
right hand using 3rds, 7ths, 9ths, and sometimes sus tones,
smooth movement between shapes instead of big jumps.
So when people say “it’s B7 to Em7,” they may be naming the harmony correctly while missing the actual keyboard color that makes Greg’s part sound right.

How to get closer to the real voicing

A reliable way to reconstruct it is:
Identify the bass note at each hit.
Listen for the guide tones first, especially 3rd and 7th.
Add color tones like 9ths or 11ths only after the shell works.
Match the comping rhythm before worrying about note-for-note density.
That approach usually gets closer to the original than copying a piano tutorial with simplified chord names. The reason is that Rhodes parts in records like this are often arranged as voiced textures, not “lead-sheet chords”.

Practical takeaway
If your goal is the exact Greg Phillinganes feel, you probably need to treat the part as a transcription job, not a chord-chart job. Start from the progression B7 / Em7 / G / A as a harmonic skeleton, then build a voicing-by-voicing ear transcription over the recording, because the missing detail is in the inversion, spacing, and rhythmic placement rather than just the chord symbols.

A useful next step is to isolate one short phrase of the song and transcribe only the Rhodes attack points; that will reveal the actual voicing much faster than trying to decode the whole track at once.
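
The guide-tone step above can be sketched in a few lines of Python. This is only an illustration, assuming sharps-only note names and a small hand-written table of chord qualities; `guide_tones` and `GUIDE_TONES` are hypothetical names, and the output is the theoretical 3rd and 7th of each chord symbol, not a transcription of what is played on the record.

```python
# Sharps-only spelling is an assumption made to keep the sketch short.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets of the 3rd and 7th above the root for a few qualities.
GUIDE_TONES = {
    "7": (4, 10),     # dominant 7th: major 3rd, minor 7th
    "m7": (3, 10),    # minor 7th: minor 3rd, minor 7th
    "maj7": (4, 11),  # major 7th: major 3rd, major 7th
}

def guide_tones(root, quality):
    """Return the (3rd, 7th) note names for a root and chord quality."""
    root_pc = NOTE_NAMES.index(root)
    third, seventh = GUIDE_TONES[quality]
    return (NOTE_NAMES[(root_pc + third) % 12],
            NOTE_NAMES[(root_pc + seventh) % 12])

print(guide_tones("B", "7"))   # guide tones of B7
print(guide_tones("E", "m7"))  # guide tones of Em7
```

For B7 and Em7 from the skeleton progression this gives D#/A and G/D, which are the notes to listen for first before adding 9ths or 11ths.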

ArticStorm · Apr 8, 2026 at 8:51 AM

Sacculus said: ↑

I'm trying to figure out P.Y.T. by Michael Jackson, to find the exact Greg Phillinganes Rhodes chords and voicings, but ChatGPT, Gemini, Grok, Hooktheory and YouTube piano tutorials don't agree on the answer!?
It seems difficult to trust anyone if you don't have very good ears...

Usually, as I've noticed, there are tracks that don't really follow traditional music theory. It's the uncommon, surprising ideas that sound fresh to our ears. That's what still makes human creativity superior to these models.

You can however use basics from the models and then figure out your own contribution to the idea ...

orbitbooster · Apr 8, 2026 at 9:15 AM

PulseWave said: ↑

If your goal is the exact Greg Phillinganes feel, you probably need to treat the part as a transcription job, not a chord-chart job

That's why most automatic tools fail to give a full transcription (but Prism came close).

PulseWave said: ↑

A reliable way to reconstruct it is:
Identify the bass note at each hit.
Listen for the guide tones first, especially 3rd and 7th.
Add color tones like 9ths or 11ths only after the shell works.
Match the comping rhythm before worrying about note-for-note density.
That approach usually gets closer to the original than copying a piano tutorial with simplified chord names. The reason is that Rhodes parts in records like this are often arranged as voiced textures, not “lead-sheet chords”.

Those, BTW, are roughly the steps I follow to transcribe (not always in this order):
1 - get the bass, so just from the I and V you'll get the base scale (really!), no matter what type it is;
2 - guess if major or minor;
3 - guess if it's a modal scale (not very important, you'll get it in steps 4 and 5);
4 - guess the melody (usually just monophonically);
5 - guess basic triad chords playing along the bass (like Am, C, etc.);
6 - guess the chord inversions if any (in most cases there are);
7 - guess if there are 7ths, 9ths, 11ths, sus, and so on;
8 - arrange with the right voicing (arpeggio, etc.) and add harmony to the melody if any.

For all these steps, unless you want to hit the keyboard hoping that luck is on your side, you need at least some basics of music theory, aka harmony.

Edit: I forgot: sometimes songs modulate to other keys, which complicates things a little, but not that much.
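
Steps 5 and 6 (basic triad, then inversion against the bass) can be sketched in Python. This is a minimal illustration, assuming sharps-only note names and only major and minor triads; `name_triad` and `TRIADS` are hypothetical names, and real transcription still depends on hearing the notes correctly first.

```python
# Sharps-only spelling is an assumption made to keep the sketch short.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Interval patterns (in semitones above the root) and their chord suffix.
TRIADS = {(0, 4, 7): "", (0, 3, 7): "m"}  # major, minor

def name_triad(notes, bass):
    """Name a triad such as 'Am', or a slash chord such as 'C/E' when the
    bass note is not the root. Returns None if no major/minor triad fits."""
    pcs = sorted({NOTE_NAMES.index(n) for n in notes})
    for root in pcs:
        # Try each note as the root and compare the resulting intervals.
        intervals = tuple(sorted((pc - root) % 12 for pc in pcs))
        if intervals in TRIADS:
            name = NOTE_NAMES[root] + TRIADS[intervals]
            return name if bass == NOTE_NAMES[root] else f"{name}/{bass}"
    return None

print(name_triad(["A", "C", "E"], bass="A"))  # root position
print(name_triad(["E", "G", "C"], bass="E"))  # inversion shown as slash chord
```

The first call names the chord Am; the second recognises the same three notes as C with E in the bass, which is the inversion question from step 6.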

Crinklebumps · Apr 8, 2026 at 9:47 AM

AI gives better responses with specific, detailed questions. If you can break down what you're hearing and convey what you think is being played in words, it will get you closer. Just keep hammering it with questions and telling it that it's wrong, and it will keep analysing. I'm talking about a chat AI, but in my experience they're very good at this kind of interaction. For example, AI will be able to give you a great deal of insight into Greg Phillinganes' style and usual choices. It can't analyse audio but it doesn't need to; it already 'knows' the song structure of PYT and the exact chords Greg plays. For example (the following is Copilot AI):

Here’s a clean, accurate, musician‑level breakdown of Greg Phillinganes’ style — the kind of insight an AI can give, because this is pattern‑based, not audio‑based.

I’ll structure it the way a session player would explain another session player.

Greg Phillinganes: Core Style Traits
1. Ultra‑clean, ultra‑precise voicings
Greg’s Rhodes and synth parts are surgically tidy.
He avoids muddy clusters and prefers:

3‑note shells (rootless when the bass is covered)

tight 4‑note voicings with a clear top voice

spread voicings with 10ths or 6ths in the left hand

upper‑structure triads for colour (especially over dominant chords)

He plays like someone who knows the exact harmonic function of every note.

2. Pop‑funk economy
Greg never overplays.
He’s the opposite of a gospel‑chops keyboardist.

His parts are:

short

rhythmic

intentional

supportive

He plays hooks, not “keyboard parts”.

Listen to any MJ track he’s on — the Rhodes is basically a rhythmic instrument.

3. The “Phillinganes snap” on Rhodes
He has a very recognisable Rhodes attack:

slightly harder velocity

bright tine

tight envelope

almost clav‑like articulation

He often accents the off‑beats with a little extra bite.

This is why his Rhodes parts cut through dense mixes without being loud.

4. Syncopation that locks with the drums
Greg plays with the drummer, not over them.

Typical traits:

anticipations on the “e” or “a”

ghosted left‑hand notes

right‑hand stabs that mirror the guitar

rhythmic displacement (he loves pushing a chord 1/16th early)

On PYT specifically, the Rhodes is basically part of the percussion section.

5. He outlines harmony with minimal notes
Greg often implies full chords with:

3‑note voicings

dyads

octave + colour tone

triads over alternate bass notes

He trusts the bass player and the arrangement.

This is why his parts feel so clean.

6. Signature harmonic tendencies
He gravitates toward:

major 9

minor 11

dominant 13

sus2 / sus4

add9

quartal voicings in passing

chromatic approach chords (very subtle)

He rarely plays plain triads unless the arrangement demands it.

7. Synth parts that are “keyboardist clean,” not “sound designer messy”
Greg’s synth lines are:

monophonic

precise

hook‑driven

rhythmically tight

harmonically simple but placed perfectly

He doesn’t noodle.
He plays statements.

How this applies to PYT
Without analysing the audio directly, we can still say:

the Rhodes part is tight, syncopated, and minimal

the voicings are clean, upper‑mid register, no mud

he uses add9 and 6/9 colours

the left hand is light, mostly outlining roots or 10ths

the right hand plays stabs, not pads

the part is locked to the drums, not floating

This is classic Greg.

⭐ If you want, I can go deeper
I can break down:

typical Greg Phillinganes voicing shapes

his left‑hand habits

his right‑hand rhythmic vocabulary

how he approaches MJ‑era Rhodes comping

what he tends to do on dominant chords

how he voices minor chords in pop‑funk

Or I can give you a Greg‑style voicing dictionary you can use to reverse‑engineer PYT.

Just tell me which direction you want to go.

orbitbooster · Apr 8, 2026 at 12:21 PM

Crinklebumps said: ↑

AI gives better responses with specific, detailed questions. If you can break down what you're hearing and convey what you think is being played in words, it will get you closer. Just keep hammering it with questions and telling it that it's wrong, and it will keep analysing.

But that too requires knowledge of music theory (basic or advanced, it depends).
I guess many want their supper ready-made instead.

Last edited: Apr 8, 2026 at 12:21 PM

Gre89 · Apr 8, 2026 at 4:34 PM

Try to get the original studio stems. Maybe they are even on YT or somewhere. Use surgical EQs for further analysis of the track if needed.

jazzzz · Apr 8, 2026 at 8:38 PM

Why don't you ask Greg?

Sacculus · Apr 11, 2026 at 4:47 AM

This is what Grok gave me, but it didn't make sense... There are errors everywhere!!??

1. Main voicings by section (recommended for piano)
Here are the voicings that most closely match the original sound, adapted for piano (right hand + bass in the left hand or with the pedal). The notes are listed from bass to treble (left to right).
Intro / Vamp (before the verse)
Classic progression:
Bm7 → E7 → Bm7 → F#7sus4 (or variations)
Bm7: B (bass) – F# – A – D – E (open voicing: B-F#-A-D-E)
(very common: left hand B-F#, right hand A-D-E)
E7: E (bass) – G# – B – D – F# (E7(9))
F#7sus4: F# – B – C# – E – A (highly suspended, creates tension)
Verse – the most iconic
Basic progression (Hooktheory + transcriptions):
Bm → Gmaj7 → Bm (with variations)
Typical Rhodes voicings by Greg Phillinganes (bright, percussive):
Bm9 (frequently used instead of plain Bm):
Left hand: B – F#
Right hand: A – C# – D – E → full voicing: B – F# – A – C# – D – E (Bm9)
Gmaj7 / A (very common slash chord):
Bass: A
Chords: G – B – D – F# (or G – B – D – F# – A for Gmaj9/A)
C#m7 / G# (pre-chorus):
Bass: G#
Chords: B – D – E – G# (C#m7/G#)
Bm7 / F#:
Bass: F#
Chords: A – B – D – E (very soft and open)
These voicings are often played with the right hand fairly high (4th–5th octave) to leave room for Louis Johnson’s slap bass and the handclaps.
Pre-Chorus
Ascending, tense progression:
Dmaj7/A → C#m7/G# → Bm7/F# → A(add9)
Recommended voicings:
Dmaj7/A: A (bass) – C# – D – F# – A – C# (very soft)
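
One way to sanity-check voicings like the ones listed above is to compare the notes against the chord symbol they are filed under. A minimal Python sketch, assuming sharps-only note names and a small hand-written interval table (`check_voicing` and `CHORD_INTERVALS` are hypothetical names, not from any library):

```python
# Sharps-only spelling is an assumption made to keep the sketch short.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitones above the root for a few chord qualities.
CHORD_INTERVALS = {
    "7": (0, 4, 7, 10),
    "m7": (0, 3, 7, 10),
    "m9": (0, 3, 7, 10, 14),
}

def check_voicing(root, quality, voicing):
    """Return (extra, missing): notes in the voicing that are not chord
    tones, and chord tones that the voicing leaves out."""
    root_pc = NOTE_NAMES.index(root)
    chord = {NOTE_NAMES[(root_pc + i) % 12] for i in CHORD_INTERVALS[quality]}
    played = set(voicing)
    return sorted(played - chord), sorted(chord - played)

# The "Bm7" voicing from the intro, as listed above.
print(check_voicing("B", "m7", ["B", "F#", "A", "D", "E"]))
# The "C#m7/G#" voicing (bass note included), as listed above.
print(check_voicing("C#", "m7", ["G#", "B", "D", "E", "G#"]))
```

Run on those two lines, it reports E as a note outside Bm7, and D as a note outside C#m7 with the root C# missing. An extra note is not automatically a mistake (E over Bm7 is the 11th, and a missing root can be deliberate), but a note such as D natural under a C#m7 label shows that the label and the listed notes do not agree.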

clone · Apr 11, 2026 at 5:19 AM

I wanted to see how GPT would answer your larger question. Would it blame the user, the song, or itself?

Short answer: the model isn’t “hearing” what you’re hearing—it’s guessing from patterns, and this kind of question punishes guessing hard.

Let’s use P.Y.T. (Pretty Young Thing) as the example. Those Greg Phillinganes Rhodes parts are exactly the kind of thing AI struggles with, for a few concrete reasons:

First problem: voicing ≠ chord name
Most systems are trained to identify chords (Cmaj7, F#min9, etc.), not how they’re voiced. But in real playing—especially Rhodes/funk/pop stuff—the voicing is everything:

spread vs closed

missing roots

extensions on top

inversions that don’t match the bass

Two players can both be “playing Cmaj9” and sound completely different. AI tends to collapse that into one label and call it done.
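
The point about one label covering different voicings can be made concrete with MIDI note numbers. A minimal Python sketch, assuming the usual convention that middle C (C4) is MIDI note 60; the two voicings are invented examples of Cmaj9, not taken from any recording, and `pitch_classes` is a hypothetical helper name.

```python
def pitch_classes(midi_notes):
    """Collapse MIDI notes to pitch classes (0 = C ... 11 = B), which is
    roughly the information a chord label keeps."""
    return {note % 12 for note in midi_notes}

# Close-position Cmaj9 with the root: C4 E4 G4 B4 D5.
close_voicing = [60, 64, 67, 71, 74]

# Rootless, spread Cmaj9 for the keyboard: E3 B3 D4 G4,
# with the root left to the bass player (C2).
rootless_spread = [52, 59, 62, 67]
bass_note = 36

same_label = pitch_classes(close_voicing) == pitch_classes(rootless_spread + [bass_note])
same_notes = sorted(close_voicing) == sorted(rootless_spread + [bass_note])

print(same_label)  # both reduce to the same set of pitch classes
print(same_notes)  # but the actual notes played are different
```

Both voicings reduce to the same five pitch classes, so a chord-labelling system sees no difference, while the notes under the player's hands (and the sound) are not the same.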

Second: dense, layered recordings
Tracks like PYT aren’t a clean piano recording. You’ve got:

Rhodes

synth layers

bass implying different harmony

vocal harmonies adding extensions

So what is “the chord”?
AI (and even humans) can legitimately disagree because the harmony is distributed across instruments.

Third: training data problem
Most AI models learned theory from:

lead sheets

MIDI files

simplified transcriptions

Not from:

isolated multitracks

pro session voicings

nuanced studio performances

So when you ask for exact Greg Phillinganes voicings, you’re asking for something that barely exists in its training data.

Fourth: no true audio grounding (in many cases)
Unless the system is explicitly doing high-quality transcription from audio (and most aren’t doing that deeply), it’s not:

isolating the Rhodes

tracking exact note stacks

resolving inversions precisely

It’s pattern-matching based on “this song is usually described as X harmony.”

Fifth: even humans disagree on this stuff
If you pull 5 legit transcriptions of a complex pop/R&B track, you’ll often get:

different chord labels

different voicings

different interpretations of extensions

So when AI gives inconsistent answers, it’s not just failing—it’s reflecting a real ambiguity.

Here’s the blunt truth:
If you want exact voicings like what Phillinganes played, AI is currently the wrong tool.

What actually works better:

slow the track down (Transcribe!, Logic Flex, etc.)

isolate midrange (EQ out bass/kick)

play along and lock the top notes first

then fill in inner voices

Or go even more direct:

find live footage of Greg Phillinganes playing

or trusted transcribers who specialize in R&B/jazz voicings

If you want, I can take a specific section of PYT (like verse or chorus) and give you a realistic Rhodes-style voicing breakdown instead of generic chord names. That’s where this actually becomes useful.

ok

Alright—let’s actually do this properly instead of hand-waving.

We’ll break down a realistic Rhodes-style voicing approach for the main groove in P.Y.T. (Pretty Young Thing), based on how someone like Greg Phillinganes actually plays—not just chord labels.

I’m going to focus on the core vamp (verse groove feel) because that’s where the signature voicing lives.

What’s really going on (important mindset)
Forget “what are the chords” for a second.
This part is built on:

upper-structure triads

rootless voicings

tight top-note movement

bass defining the harmony more than the keyboard

That’s why AI falls apart—it tries to label instead of hear structure.

Core harmonic idea (simplified)
You can think of the groove roughly orbiting:

B major / G# minor tonal space

But the Rhodes is not playing full chords. It’s playing partial stacks that imply them.

Rhodes voicing approach (this is the useful part)
Shape 1 (home color)
Right hand:

D# – F# – A#

That’s basically a B major triad in 1st inversion (no root)

Left hand:

either nothing OR light B (often implied by bass instead)

Already this tells you something:
He’s not hammering roots. He’s staying out of the bass.

Movement (this is the signature sound)
Now shift that shape down slightly while keeping it smooth:

From:

D# – F# – A#

To:

C# – F# – G#

Now you’ve got:

a suspended / add9-ish flavor

Notice:

top note drops (A# → G#)

middle stays (F# anchor)

bottom moves (D# → C#)

That kind of minimal motion voice-leading is classic.
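
The voice-leading described above can be measured as semitone movement per voice. A minimal Python sketch, assuming both shapes sit in octave 4 (the answer gives no octaves, so that placement is an assumption) and that middle C is MIDI note 60; `to_midi` is a hypothetical helper name.

```python
# Sharps-only spelling is an assumption made to keep the sketch short.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def to_midi(name, octave):
    """Convert a note name and octave to a MIDI number (C4 = 60)."""
    return 12 * (octave + 1) + NOTE_NAMES.index(name)

# The two right-hand shapes from the answer, bottom voice to top voice.
shape_1 = [to_midi(n, 4) for n in ("D#", "F#", "A#")]
shape_2 = [to_midi(n, 4) for n in ("C#", "F#", "G#")]

# Signed movement of each voice in semitones (negative = downward).
motion = [after - before for before, after in zip(shape_1, shape_2)]
total = sum(abs(step) for step in motion)

print(motion)  # per-voice movement: bottom, middle, top
print(total)   # total distance travelled by all voices
```

Under those assumptions the outer voices each drop a whole tone and the middle voice stays put, four semitones of movement in total, which is what "minimal motion" means in practice.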

Funk/Rhodes trick he’s using
Instead of big chords, it’s:

3-note clusters

often in the mid register

with rhythmic stabs, not sustained pads

And very important:

slight timing looseness

not perfectly quantized (huge for feel)

What AI usually gets wrong here
AI will say something like:

“Bmaj7 → C#m7 → F#sus”

That’s not wrong, but it’s useless if you’re trying to play it.

Because the actual playable reality is:

small moving shapes

not full chords

bass + Rhodes = complete harmony

If you want to actually play it
Start here (don’t overcomplicate):

Right hand loop:

D#–F#–A#
→ C#–F#–G#
→ back

Keep it tight, rhythmic, and slightly detached.

Then:

let your left hand barely touch roots (or skip entirely)

imagine the bass is doing the heavy lifting

The honest takeaway
You weren’t wrong to be frustrated.

The question you saw is basically asking:

“Why can’t AI reconstruct a studio musician’s nuanced voicing choices inside a dense mix?”

That’s one of the hardest music tasks there is—even for trained humans.

If you want, next step I’d suggest is:

we zoom into one exact timestamp (like 0:30–0:45)

I’ll map the voicings tighter, note-for-note, including octave placement

That’s how you actually crack these tracks instead of chasing vague chord charts.

orbitbooster · Apr 11, 2026 at 8:23 AM

clone said: ↑

Most systems are trained to identify chords (Cmaj7, F#min9, etc.), not how they’re voiced

Nailed it!

clone said: ↑

What actually works better:

slow the track down (Transcribe!, Logic Flex, etc.)

isolate midrange (EQ out bass/kick)

play along and lock the top notes first

then fill in inner voices

Or go even more direct:

find live footage of Greg Phillinganes playing

or trusted transcribers who specialize in R&B/jazz voicings


Nailed it!

clone said: ↑

That’s one of the hardest music tasks there is—even for trained humans.

Nailed twice!

BTW there is another solution: get the sheet music if available. But that brings us back to staff reading, which is part of music theory.

One last point, something that I find uncomfortable: our personal limits. There are pieces that, no matter our knowledge or skill, are out of reach within a lifespan.

Lois Lane · Apr 11, 2026 at 8:32 AM

orbitbooster said: ↑

our personal limits: there are pieces that no matter our knowledge or skill, are out of reach within a lifespan.

Absolutely!! I await a future lifetime when I shall learn the saxophone!!!
