The timing and synchronization problem
Trim the silent priming samples to preserve correct synchronization.
Overview
If an audio playback system that synchronizes AAC-encoded audio with video does not compensate for encoder delay (that is, does not discard the silent priming samples), the audio and video will be out of synchronization. In the example above, the offset is 2112 samples: the audio lags the video by 2112 samples, because the first real audio sample is the 2113th sample of the decoded PCM data.
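To get a feel for the size of this offset, the sample count can be converted to time. The sketch below is illustrative only: the function name `priming_delay_seconds` is hypothetical, and the 44.1 kHz sample rate is an assumption, since the text gives only the sample count.

```python
def priming_delay_seconds(priming_samples: int, sample_rate_hz: int) -> float:
    """Return the audio lag, in seconds, caused by untrimmed priming samples."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return priming_samples / sample_rate_hz


# 2112 priming samples (from the example above) at an assumed 44.1 kHz rate.
delay = priming_delay_seconds(2112, 44_100)
print(f"{delay * 1000:.1f} ms")  # prints "47.9 ms"
```

At an assumed 48 kHz sample rate, the same 2112 samples correspond to 44 ms.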
Therefore, a playback system must trim the silent priming samples to preserve correct synchronization. The playback system should perform this trimming at two points:
When playback first begins.
When the playback position moves to another location, for example when the user skips ahead or back to another part of the media and begins playback from that new location.
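The two cases above can be sketched as follows. This is a minimal model, not a definitive implementation: it assumes the decoder output is available as one contiguous PCM buffer that starts with the priming samples, and the names `PRIMING_SAMPLES`, `trim_at_start`, and `decoded_index_for` are hypothetical. The value 2112 comes from the example above; other files may use a different count.

```python
from typing import List

# Priming sample count from the example above (assumed constant here).
PRIMING_SAMPLES = 2112


def trim_at_start(decoded: List[float], priming: int = PRIMING_SAMPLES) -> List[float]:
    """Case 1: playback begins. Drop the leading silent priming samples."""
    return decoded[priming:]


def decoded_index_for(media_sample: int, priming: int = PRIMING_SAMPLES) -> int:
    """Case 2: seeking. Map a position on the media timeline to the index of
    the matching sample in the decoded PCM data, which is shifted by the
    priming samples."""
    if media_sample < 0:
        raise ValueError("media_sample must be non-negative")
    return media_sample + priming


# Decoded PCM: 2112 silent priming samples followed by two real samples.
decoded = [0.0] * PRIMING_SAMPLES + [0.5, 0.25]

print(trim_at_start(decoded))          # prints "[0.5, 0.25]"
print(decoded[decoded_index_for(1)])   # prints "0.25"
```

In both cases the same offset is applied, so that media sample 0 always corresponds to the first real audio sample rather than to the first priming sample.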