AES3 (also known as AES/EBU) is a standard for the exchange of digital audio signals between professional audio devices. AES3 was jointly developed by the Audio Engineering Society (AES) and the European Broadcasting Union (EBU). An AES3 signal can carry two channels of PCM audio over several transmission media including balanced lines, unbalanced lines, and optical fiber. The standard was first published in 1985 and has been revised in 1992 and 2003.
The development of standards for digitising analog audio, as used to interconnect both professional and domestic audio equipment, began in the late 1970s in a joint effort between the Audio Engineering Society and the European Broadcasting Union, and culminated in the publishing of AES3 in 1985. Early on, the standard was frequently known as AES/EBU. Both AES and EBU versions of the standard exist. Variants using different physical connections--essentially consumer versions of AES3 for use within the domestic "Hi-Fi" environment using connectors more commonly found in the consumer market--are specified in IEC 60958. These variants are commonly known as S/PDIF.
Worldwide, AES3 is the most commonly used method for digitally interconnecting audio equipment.
The AES3 standard parallels part 4 of the international standard IEC 60958. Of the physical interconnection types defined by IEC 60958, three are in common use.
Type I connections use balanced, 3-conductor, 110-ohm twisted pair cabling with XLR connectors. Type I connections are most often used in professional installations and are considered the AES3 standard connector. The hardware interface is usually implemented using RS-422 line drivers and receivers.
|Direction||Cable end||Device end|
|Input||XLR male plug||XLR female jack|
|Output||XLR female plug||XLR male jack|
Type II connections use unbalanced, 2-conductor, 75-ohm coaxial cable with RCA connectors. Type II connections are most often used in consumer audio installations and are often called coaxial S/PDIF connections.
|Direction||Cable end||Device end|
|Input||RCA male plug||RCA female jack|
|Output||RCA male plug||RCA female jack|
Type III Optical connections use optical fiber--usually plastic, but occasionally glass--with F05 connectors, which are more commonly known by their Toshiba brand name, TOSLINK. Like Type II, Type III Optical connections are also used in consumer audio installations and are often called optical S/PDIF connections.
|Direction||Cable end||Device end|
|Input||F05/TOSLINK male plug||F05/TOSLINK female jack|
|Output||F05/TOSLINK male plug||F05/TOSLINK female jack|
The AES-3id standard defines a 75-ohm BNC electrical variant of AES3. This uses the same cabling, patching and infrastructure as analogue or digital video, and is thus common in the broadcast industry.
The precursor of the IEC 60958 Type II specification was the Sony/Philips Digital Interface, or S/PDIF. S/PDIF and AES3 are similar in many ways and are interchangeable at the protocol level, but at the physical level they specify different electrical signaling levels and impedances, which may be significant in some applications.
AES3 was designed primarily to support stereo PCM encoded audio in either DAT format at 48 kHz or CD format at 44.1 kHz. No attempt was made to use a carrier able to support both rates; instead, AES3 allows the data to be run at any rate and encodes the clock and the data together using biphase mark code (BMC).
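As an illustration of how biphase mark code carries clock and data on one line, the following minimal sketch encodes each bit as two half-cell levels. The function names and the `level` parameter (the line level just before the first bit) are assumptions of this example, not part of the standard.

```python
def bmc_encode(bits, level=0):
    """Biphase mark encode: return two half-cell levels per input bit.

    `level` is the line level just before the first bit (an assumption of
    this sketch; a real stream inherits it from the preceding time slot).
    """
    out = []
    for bit in bits:
        level ^= 1          # every bit cell starts with a transition
        out.append(level)
        if bit:
            level ^= 1      # a 1 adds a second transition mid-cell
        out.append(level)
    return out


def bmc_decode(halves):
    """Recover bits: a cell whose two halves differ is a 1, otherwise a 0."""
    return [int(a != b) for a, b in zip(halves[0::2], halves[1::2])]


print(bmc_encode([0, 0, 0, 0]))  # [1, 1, 0, 0, 1, 1, 0, 0]
print(bmc_encode([1, 1, 1, 1]))  # [1, 0, 1, 0, 1, 0, 1, 0]
```

Because a transition occurs at every cell boundary regardless of the data, a receiver can recover the bit clock from the signal itself, and the decoded data do not depend on the absolute polarity of the line.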
Each bit occupies one time slot.
Each audio sample (of up to 24 bits) is combined with four flag bits and a synchronisation preamble which is four time slots long to make a subframe of 32 time slots.
Two subframes (A and B, normally used for left and right audio channels) make a frame. Frames contain 64 time slots and are produced once per sample time. This determines the clock rate.
At the highest level, each 192 consecutive frames are grouped into an audio block. While samples repeat each frame time, metadata is only transmitted once per audio block.
At the default 48 kHz sample rate, there are 250 audio blocks per second and 3,072 kilobits per second, with a biphase clock of 6.144 MHz.
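The figures above follow directly from the frame structure. A short sketch of the arithmetic, with constant and function names chosen for this example only:

```python
SLOTS_PER_SUBFRAME = 32
SUBFRAMES_PER_FRAME = 2
FRAMES_PER_BLOCK = 192


def aes3_rates(sample_rate_hz):
    """Derive link rates from the sample rate (one frame per sample time)."""
    bits_per_frame = SLOTS_PER_SUBFRAME * SUBFRAMES_PER_FRAME  # 64 time slots
    bit_rate = sample_rate_hz * bits_per_frame
    return {
        "bit_rate_bps": bit_rate,
        "biphase_clock_hz": 2 * bit_rate,  # two half-cells per time slot
        "blocks_per_second": sample_rate_hz / FRAMES_PER_BLOCK,
    }


print(aes3_rates(48_000))
# {'bit_rate_bps': 3072000, 'biphase_clock_hz': 6144000, 'blocks_per_second': 250.0}
```

At 44.1 kHz the same calculation gives 2,822,400 bits per second and a biphase clock of 5.6448 MHz.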
The 32 time slots of each subframe are assigned as follows:
|Time slot||Name||Description|
|0-3||Preamble||A synchronisation preamble (biphase mark code violation) for audio blocks, frames, and subframes.|
|4-7||Auxiliary sample (optional)||A low-quality auxiliary channel used as specified in the channel status word, notably for producer talkback or recording studio-to-studio communication.|
|8-27 (or 4-27)||Audio sample||One sample stored with most significant bit (MSB) last. If the auxiliary sample is used, bits 4-7 are not included. Data with smaller sample bit depths always have MSB at bit 27 and are zero-extended towards the least significant bit (LSB).|
|28||Validity (V)||Unset if the audio data are correct and suitable for D/A conversion. During the presence of defective samples, the receiving equipment may be instructed to mute its output. It is used by most CD players to indicate that concealment rather than error correction is taking place.|
|29||User data (U)||Forms a serial data stream for each channel (with 1 bit per frame), with a format specified in the channel status word.|
|30||Channel status (C)||Bits from each frame of an audio block are collated giving a 192-bit channel status word. Its structure depends on whether AES3 or S/PDIF is used.|
|31||Parity (P)||Even parity bit for detection of errors in data transmission. (I.e. bits 4-31 have an even number of ones.)|
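The slot assignment can be sketched as a packing routine. This is an illustrative example under stated assumptions: the function name is hypothetical, the subframe is modelled as a 32-bit integer in which bit n stands for time slot n, and the preamble slots 0-3 are left as zeros because preambles are not ordinary data bits.

```python
def pack_subframe(sample, bits=24, validity=0, user=0, status=0):
    """Return a 32-bit int whose bit n holds time slot n (slots 0-3 left 0)."""
    if not 0 < bits <= 24:
        raise ValueError("sample width must be 1..24 bits")
    sample &= (1 << bits) - 1        # keep the low `bits` bits (two's complement)
    word = sample << (28 - bits)     # MSB lands in slot 27; unused low slots stay 0
    word |= (validity & 1) << 28
    word |= (user & 1) << 29
    word |= (status & 1) << 30
    parity = bin(word >> 4).count("1") & 1
    word |= parity << 31             # slots 4-31 now hold an even number of ones
    return word


print(hex(pack_subframe(1)))             # 0x80000010
print(hex(pack_subframe(-1, bits=16)))   # 0xffff000
```

Sending bit 0 of the returned word first reproduces the LSB-first order of the table, and a 16-bit sample occupies slots 12-27 with slots 4-11 zero.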
The preamble is a specially coded pattern that identifies the subframe and its position within the audio block. Preambles are not normal BMC-encoded data bits, although they still have zero DC bias.
Three preambles are possible. They are called X, Y, Z in the AES3 standard, and M, W, B in IEC 958 (an AES extension).
The 8-bit preambles are transmitted in time allocated to the first four time slots of each subframe (time slots 0 to 3). Any of the three marks the beginning of a subframe. X or Z marks the beginning of a frame, and Z marks the beginning of an audio block.
The preamble waveforms across time slots 0 to 3 are listed below as the line level in each half time slot (1 = high, 0 = low). Each pattern has two forms, one starting high and its inverse starting low, so that the preamble always begins with a transition from the preceding level. BMC-encoded runs of 0 bits and 1 bits are included for comparison.
|Pattern||Starting high||Starting low|
|Preamble X||11100010||00011101|
|Preamble Y||11100100||00011011|
|Preamble Z||11101000||00010111|
|All 0 bits BMC encoded||11001100||00110011|
|All 1 bits BMC encoded||10101010||01010101|
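To make the "code violation" concrete, the sketch below writes each preamble as eight half-slot levels and checks two properties: the levels are balanced (zero DC bias), and each preamble contains a run of three equal half-slots, which BMC data can never produce because every bit cell starts with a transition. The helper names are assumptions for this example, and the patterns are the half-slot levels of the waveforms shown above.

```python
from itertools import groupby

# Half-slot levels for the form that starts high; the inverse form is equally valid.
PREAMBLES = {"X": "11100010", "Y": "11100100", "Z": "11101000"}
_INVERT = str.maketrans("01", "10")


def identify_preamble(halves):
    """Return 'X', 'Y' or 'Z' for an 8-half-slot pattern of either polarity, else None."""
    for name, pattern in PREAMBLES.items():
        if halves in (pattern, pattern.translate(_INVERT)):
            return name
    return None


def longest_run(halves):
    """Length of the longest run of identical half-slot levels."""
    return max(len(list(group)) for _, group in groupby(halves))


for name, pattern in PREAMBLES.items():
    balanced = pattern.count("1") == pattern.count("0")
    print(name, pattern, "balanced:", balanced, "longest run:", longest_run(pattern))
```

A receiver can therefore find subframe boundaries by looking for a run of three half-slots, then classify the preamble from the remaining levels.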
In two-channel AES3, the preambles form a pattern of ZYXYXYXY..., but it is straightforward to extend this structure to additional channels (more subframes per frame), each with a Y preamble, as is done in the MADI protocol.
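The preamble order for a two-channel stream can be generated with a few lines. This is a sketch only; `preamble_sequence` is a name invented for the example.

```python
def preamble_sequence(num_frames, frames_per_block=192):
    """Preamble letters for `num_frames` two-channel frames, starting at a block boundary."""
    letters = []
    for frame in range(num_frames):
        # Subframe A: Z at the start of an audio block, X otherwise.
        letters.append("Z" if frame % frames_per_block == 0 else "X")
        # Subframe B always carries Y.
        letters.append("Y")
    return "".join(letters)


print(preamble_sequence(4))  # ZYXYXYXY
```

Over two full blocks (384 frames) the sequence contains exactly two Z preambles, one at the start of each block.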
As stated above, there is one channel status bit in each subframe, making one 192-bit word for each channel in each block. This 192-bit word is usually presented as 192/8 = 24 bytes. The contents of the channel status word are completely different between the AES3 and S/PDIF standards, although they agree that the first channel status bit (byte 0 bit 0) distinguishes between the two. In the case of AES3, the standard describes in detail how the bits have to be used. Here is a summary of the channel status word:
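Collating the 192 channel status bits of one block into 24 bytes might look like the following sketch. The function names are hypothetical, and storing bit 0 of each byte as the least significant bit is an assumption of this example that mirrors the "byte 0 bit 0" numbering used above.

```python
def collate_channel_status(c_bits):
    """Pack the 192 C bits of one audio block (one channel) into 24 bytes."""
    if len(c_bits) != 192:
        raise ValueError("expected one C bit per frame for 192 frames")
    status = bytearray(24)
    for index, bit in enumerate(c_bits):
        if bit:
            # Assumption: bit 0 of each byte is stored as the least significant bit.
            status[index // 8] |= 1 << (index % 8)
    return bytes(status)


def first_status_bit(status):
    """Byte 0 bit 0, the bit that distinguishes AES3 from S/PDIF channel status."""
    return status[0] & 1


bits = [0] * 192
bits[0] = 1
status = collate_channel_status(bits)
print(len(status), first_status_bit(status))  # 24 1
```

Which value of that first bit denotes which format is defined by the standards themselves and should be checked there before interpreting the remaining bytes.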
SMPTE timecode timestamp data can be embedded within AES3 digital audio signals. It can be used for synchronization and for logging and identifying audio content. According to John Ratcliff's Timecode: A user's guide, it is embedded as a 32-bit binary word in bytes 18 to 21 of the channel status data.
In 1977, stimulated by the growing need for standards in digital audio, the AES Digital Audio Standards Committee was formed.
Bytes 18 to 21, Bits 0 to 7: Time of day sample address code. Value (each Byte): 32-bit binary value representing the first sample of current block. LSBs are transmitted first. Default value shall be logic "0". Note: This is the time-of-day laid down during the source encoding of the signal and shall remain unchanged during subsequent operations. A value of all zeros for the binary sample address code shall, for the purposes of transcoding to real time, or to time codes in particular, be taken as midnight (i.e., 00 h, 00 mm, 00 s, 00 frame). Transcoding of the binary number to any conventional time code requires accurate sampling frequency information to provide the sample accurate time.
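The transcoding described above can be sketched as follows. The function names are assumptions for this example, and reading byte 18 as the least significant byte is an interpretation of "LSBs are transmitted first" that should be confirmed against the standard. As the quoted text notes, the result is only meaningful if the sampling frequency is known accurately.

```python
def sample_address(status):
    """Read the 32-bit time-of-day sample address from channel status bytes 18-21."""
    # Assumption: byte 18 is the least significant byte.
    return int.from_bytes(status[18:22], "little")


def time_of_day(address, sample_rate_hz=48_000):
    """Convert a sample address to (hours, minutes, seconds, leftover samples)."""
    total_seconds, samples = divmod(address, sample_rate_hz)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds, samples


status = bytearray(24)
status[18:22] = (48_000 * 3661 + 5).to_bytes(4, "little")
print(time_of_day(sample_address(status)))  # (1, 1, 1, 5)
print(time_of_day(0))                       # (0, 0, 0, 0), i.e. midnight
```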