Headphones and speakers

 

The difference between listening to music over speakers and over headphones is striking.

 

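When listening to speakers, each ear hears both channels: the sound from the far speaker arrives slightly later than the sound from the near one, and the room adds indirect (reflected) sound.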

When listening to headphones, each ear receives one channel and one channel only. No mixing, no delay, no indirect sound.

The latter means that the room acoustics are eliminated.

 

We can localize a sound.

HRTF

If a speaker is off axis, the sound will reach the far ear a fraction of a millisecond later than the near ear.

Graph of interaural time differences.
Interaural Time Differences (ITD)
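
To get a feel for the size of this delay, here is a minimal sketch using Woodworth's spherical-head approximation, ITD = (r / c) * (θ + sin θ). This formula is not taken from the article; the head radius (8.75 cm), the speed of sound (343 m/s) and the function name are assumptions chosen for illustration, and a real head will deviate from the model.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius, not a measured value
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 °C


def itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a distant source.

    azimuth_deg is the angle off the median plane: 0 = straight ahead,
    90 = directly to one side. Uses Woodworth's spherical-head model.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        print(f"{angle:2d} degrees: {itd_seconds(angle) * 1e6:.0f} microseconds")
```

Under these assumptions a speaker 30 degrees off axis gives a delay of roughly 0.26 ms, and a source directly to one side roughly 0.66 ms.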

 

The far ear is in the shadow of the head, so it hears the sound at a slightly lower level.
The shadow is stronger at high frequencies than at low ones, so it also changes the frequency response at the far ear.


Interaural Intensity Difference (IID)

 

These are the two primary cues we use for localization.
They are not enough on their own, as the pinna (the outer ear structure) plays a part too. But one thing is obvious: none of this happens when using headphones.


This is why stereo on headphones sounds like STEREO: the two channels don't mix, and none of the effects described above, the Head Related Transfer Function (HRTF), occur.

Crossfeed

You can simulate the HRTF by using a crossfeed.
Benjamin B. Bauer was one of the pioneers.
A famous article by him: Stereophonic Earphones and Binaural Loudspeakers.
JAES Volume 9 Number 2 pp. 148-151; April 1961.

 

A crossfeed emulates the HRTF by taking the signal from one channel, applying EQ and a delay to it, and feeding it into the other channel, and vice versa.
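
The signal path just described can be sketched in a few lines of Python. This is only an illustration of the idea, not Bauer's circuit or any particular plug-in: the "EQ" is reduced to a one-pole low-pass filter, and the delay (0.3 ms), cutoff (700 Hz) and gain (0.5) are assumed values picked for the example.

```python
import math


def crossfeed(left, right, sample_rate=44100,
              delay_s=0.0003, cutoff_hz=700.0, gain=0.5):
    """Apply a very simple crossfeed to two equal-length lists of samples.

    Each channel is low-pass filtered (a crude stand-in for head shadow),
    delayed (a stand-in for the interaural time difference), attenuated,
    and then added to the opposite channel. All defaults are assumptions.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")

    delay = int(round(delay_s * sample_rate))  # delay in whole samples
    # Coefficient of a one-pole low-pass filter with the given cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)

    def shaped(channel):
        """Return the filtered, delayed and attenuated copy of a channel."""
        filtered = []
        state = 0.0
        for sample in channel:
            state += alpha * (sample - state)
            filtered.append(state)
        delayed = [0.0] * delay + filtered  # prepend silence to delay
        return [gain * s for s in delayed[:len(channel)]]

    from_left = shaped(left)
    from_right = shaped(right)
    left_out = [l + x for l, x in zip(left, from_right)]
    right_out = [r + x for r, x in zip(right, from_left)]
    return left_out, right_out


if __name__ == "__main__":
    # A single click in the left channel only.
    left_in = [1.0] + [0.0] * 99
    right_in = [0.0] * 100
    left_out, right_out = crossfeed(left_in, right_in)
    first = next(i for i, v in enumerate(right_out) if v != 0.0)
    print("click reaches the right channel after", first, "samples")
```

With a click in the left channel only, the right channel stays silent for the length of the delay and then receives a quieter, duller copy, which is the mixing that headphones otherwise lack.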

Your media player might have one built in; if not, there may be a plug-in for it.

A VST plug-in and a lot more about headphones can be found here: BlogOhl

A VST plug-in simulating both the HRTF and the room acoustics is Isone Pro