RIFF1994
RIFF1994 is an update to [RIFF1991]. It documents general updates to the RIFF format and these WAVE-related updates:
- [inst] and [smpl] chunks.
- update of fmt to document WAVEFORMATEX.
- fmt.format_tag 0x0000 redefined to "for development purposes". UNKNOWN is now 0xFFFF.
- compare country codes and language and dialect code tables to current list.
- many additional fmt chunk definitions for registered fmt.format_tag values.
- encoding/decoding algorithm descriptions for ADPCM and MPEG-1 audio.
Most of the new fmt chunk definitions are for proprietary compressed formats from the early 90s. With a few exceptions (see fmt for documented tags) I haven't documented them here, because I think they're extremely rare. If you're aware of usage of a particular format_tag that isn't already documented, please let me know and I'll add it to the book.
[RIFF1994] Microsoft. New Multimedia Data Types and Data Techniques 3.0 (1994-04-15). See pages 12-22.
URLs:
- https://www.mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Docs/RIFFNEW.pdf
Learning References
- MIDI Technical Fanatic's Brainwashing Center / tech / WAVE format - opinionated guide to WAV. Very clear writing. Also documents [RIFF1991] chunks.
Fields defined in this spec
Chunk | Name | Name as Specified | Bytes | Type | Type as Specified | condition | Description | WRID | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ISMP | id |
ckID | 4 |
u8[4] |
FOURCC |
id = ISMP
|
WRID>RIFF-WAVE>LIST-INFO>ISMP>id | |||||||||||||||||
ISMP | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>LIST-INFO>ISMP>size | |||||||||||||||||
ISMP | text |
(text) | size |
ZSTR |
ZSTR |
|
SMPTE time code of digitization start point expressed as a NULL terminated text string "HH:MM:SS.FF". If performing MCI capture in AVICAP, this chunk will be automatically set based on the MCI start time. | WRID>RIFF-WAVE>LIST-INFO>ISMP>text | ||||||||||||||||
IDIT | id |
ckID | 4 |
u8[4] |
FOURCC |
id = IDIT
|
WRID>RIFF-WAVE>LIST-INFO>IDIT>id | |||||||||||||||||
IDIT | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>LIST-INFO>IDIT>size | |||||||||||||||||
IDIT | text |
(text) | size |
ZSTR |
ZSTR |
|
Digitization Time. Specifies the time and date that digitization commenced. The digitization time is contained in an ASCII string which contains exactly 26 characters and is in the format "Wed Jan 02 02:03:55 1990\n\0". The ctime() and asctime() functions can be used to create strings in this format. This chunk is automatically added to the capture file based on the current system time at the moment capture is initiated. | WRID>RIFF-WAVE>LIST-INFO>IDIT>text | ||||||||||||||||
PAD | id |
ckID | 4 |
u8[4] |
FOURCC |
id = PAD
|
WRID>RIFF-WAVE>PAD>id | |||||||||||||||||
PAD | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>PAD>size | |||||||||||||||||
PAD | padding |
padding | size |
u8[size] |
BYTE[ckSize] |
|
WRID>RIFF-WAVE>PAD>padding | |||||||||||||||||
inst | id |
ckID | 4 |
u8[4] |
FOURCC |
id = "inst"
|
WRID>RIFF-WAVE>inst>id | |||||||||||||||||
inst | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>inst>size | |||||||||||||||||
inst | unshifted_note |
bUnshiftedNote | 1 |
u8 |
BYTE |
|
MIDI note number that corresponds to the unshifted pitch of the sample. Valid values range from 0 to 127. | WRID>RIFF-WAVE>inst>unshifted_note | ||||||||||||||||
inst | fine_tune |
chFineTune | 1 |
i8 |
CHAR |
|
Pitch shift adjustment in cents (or 100ths of a semitone) needed to hit unshifted_note value exactly. fine_tune can be used to compensate for tuning errors in the sampling process. Valid values range from -50 to 50. |
WRID>RIFF-WAVE>inst>fine_tune | ||||||||||||||||
inst | gain |
chGain | 1 |
i8 |
CHAR |
|
Suggested volume setting for the sample in decibels. A value of zero decibels suggests no change in the volume. A value of -6 decibels suggests reducing the amplitude of the sample by a factor of two. | WRID>RIFF-WAVE>inst>gain | ||||||||||||||||
inst | low_note |
bLowNote | 1 |
u8 |
BYTE |
|
Suggested usable MIDI note number range of the sample. Valid values range from 0 to 127. | WRID>RIFF-WAVE>inst>low_note | ||||||||||||||||
inst | high_note |
bHighNote | 1 |
u8 |
BYTE |
|
Suggested usable MIDI note number range of the sample. Valid values range from 0 to 127. | WRID>RIFF-WAVE>inst>high_note | ||||||||||||||||
inst | low_velocity |
bLowVelocity | 1 |
u8 |
BYTE |
|
Suggested usable MIDI velocity range of the sample. Valid values range from 0 to 127. | WRID>RIFF-WAVE>inst>low_velocity | ||||||||||||||||
inst | high_velocity |
bHighVelocity | 1 |
u8 |
BYTE |
|
Suggested usable MIDI velocity range of the sample. Valid values range from 0 to 127. | WRID>RIFF-WAVE>inst>high_velocity | ||||||||||||||||
smpl | id |
4 |
u8[4] |
FOURCC |
id = "smpl"
|
WRID>RIFF-WAVE>smpl>id | ||||||||||||||||||
smpl | size |
4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>smpl>size | ||||||||||||||||||
smpl | manufacturer |
dwManufacturer | 4 |
u32 |
DWORD |
|
Specifies the MMA Manufacturer code for the intended target device. The high byte indicates the number of low order bytes (1 or 3) that are valid for the manufacturer code. For example, this value will be 0x01000013 for Digidesign (the MMA Manufacturer code is one byte, 0x13); whereas 0x03000041 identifies Microsoft (the MMA Manufacturer code is three bytes, 0x00 0x00 0x41). If the sample is not intended for a specific manufacturer, then this field should be set to zero. | WRID>RIFF-WAVE>smpl>manufacturer | ||||||||||||||||
smpl | product |
dwProduct | 4 |
u32 |
DWORD |
|
Specifies the Product code of the intended target device for the manufacturer. If the sample is not intended for a specific manufacturer's product, then this field should be set to zero. |
WRID>RIFF-WAVE>smpl>product | ||||||||||||||||
smpl | sample_period |
dwSamplePeriod | 4 |
u32 |
DWORD |
|
Specifies the period of one sample in nanoseconds (normally 1 / samples_per_sec from the WAVEFORMAT structure for the RIFF WAVE file; however, this field allows fine tuning). For example, 44.1 kHz would be specified as 22675 (0x00005893). |
WRID>RIFF-WAVE>smpl>sample_period | ||||||||||||||||
smpl | midi_unity_note |
dwMIDIUnityNote | 4 |
u32 |
DWORD |
|
Specifies the MIDI note which will replay the sample at original pitch. This value ranges from 0 to 127 (a value of 60 represents Middle C as defined by the MMA). | WRID>RIFF-WAVE>smpl>midi_unity_note | ||||||||||||||||
smpl | midi_pitch_fraction |
dwMIDIPitchFraction | 4 |
u32 |
DWORD |
|
Specifies the fraction of a semitone up from the specified midi_unity_note. A value of 0x80000000 is 1/2 semitone (50 cents); a value of 0x00000000 represents no fine tuning between semitones. |
WRID>RIFF-WAVE>smpl>midi_pitch_fraction | ||||||||||||||||
smpl | smpte_format |
dwSMPTEFormat | 4 |
u32 |
DWORD |
|
Specifies the SMPTE time format used in the smpte_offset field. Possible values are (unrecognized formats should be ignored):
0 - specifies no SMPTE offset (smpte_offset should also be zero).
24 - specifies 24 frames per second.
25 - specifies 25 frames per second.
29 - specifies 30 frames per second with frame dropping ('30 drop').
30 - specifies 30 frames per second. |
WRID>RIFF-WAVE>smpl>smpte_format | ||||||||||||||||
smpl | smpte_offset |
dwSMPTEOffset | 4 |
u32 |
DWORD |
|
Specifies a time offset for the sample if it is to be synchronized or calibrated according to a start time other than 0. The format of this value is 0xhhmmssff. hh is a signed Hours value [-23..23]. mm is an unsigned Minutes value [0..59]. ss is an unsigned Seconds value [0..59]. ff is an unsigned value [0..(smpte_format - 1)]. |
WRID>RIFF-WAVE>smpl>smpte_offset | ||||||||||||||||
smpl | sample_loop_count |
cSampleLoops | 4 |
u32 |
DWORD |
|
Specifies the number (count) of <sample-loop> records that are contained in the <smpl> chunk. The <sample-loop> records are stored immediately following the sampler_data_size field. |
WRID>RIFF-WAVE>smpl>sample_loop_count | ||||||||||||||||
smpl | sampler_data_size |
cbSamplerData | 4 |
u32 |
DWORD |
|
Specifies the size in bytes of the optional <sampler_data>. Sampler specific data is stored immediately following the <sample-loop> records. The sampler_data_size field will be zero if no extended sampler specific information is stored in the <smpl> chunk. |
WRID>RIFF-WAVE>smpl>sampler_data | ||||||||||||||||
smpl | identifier |
dwIdentifier | 4 |
u32 |
DWORD |
|
Identifies the unique 'name' of the loop. This field may correspond with a name stored in the <cue> chunk. The name data is stored in the <LIST-adtl> chunk. |
WRID>RIFF-WAVE>smpl>identifier | ||||||||||||||||
smpl | type |
dwType | 4 |
u32 |
DWORD |
|
Specifies the loop type: 0 - Loop forward (normal). 1 - Alternating loop (forward/backward). 2 - Loop backward. 3-31 - reserved for future standard types. 32-? - sampler specific types (manufacturer defined). | WRID>RIFF-WAVE>smpl>type | ||||||||||||||||
smpl | start |
dwStart | 4 |
u32 |
DWORD |
|
Specifies the startpoint of the loop in samples. | WRID>RIFF-WAVE>smpl>start | ||||||||||||||||
smpl | end |
dwEnd | 4 |
u32 |
DWORD |
|
Specifies the endpoint of the loop in samples (this sample will also be played). | WRID>RIFF-WAVE>smpl>end | ||||||||||||||||
smpl | fraction |
dwFraction | 4 |
u32 |
DWORD |
|
Allows fine-tuning for loop fractional areas between samples. Values range from 0x00000000 to 0xFFFFFFFF. A value of 0x80000000 represents 1/2 of a sample length. | WRID>RIFF-WAVE>smpl>fraction | ||||||||||||||||
smpl | play_count |
dwPlayCount | 4 |
u32 |
DWORD |
|
Specifies the number of times to play the loop. A value of 0 specifies an infinite sustain loop. | WRID>RIFF-WAVE>smpl>play_count | ||||||||||||||||
fmt | id |
ckID | 4 |
u8[4] |
FOURCC |
id = "fmt "
|
WRID>RIFF-WAVE>fmt >id | |||||||||||||||||
fmt | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>fmt >size | |||||||||||||||||
fmt | format_tag |
wFormatTag | 2 |
u16 |
WORD |
|
A number indicating the WAVE format category of the file. The content of the |
WRID>RIFF-WAVE>fmt >format_tag | ||||||||||||||||
fmt | channels |
wChannels | 2 |
u16 |
WORD |
|
The number of channels represented in the waveform data, such as 1 for mono or 2 for stereo. | WRID>RIFF-WAVE>fmt >channels | ||||||||||||||||
fmt | samples_per_sec |
dwSamplesPerSec | 4 |
u32 |
DWORD |
|
The sampling rate (in samples per second) at which each channel should be played. | WRID>RIFF-WAVE>fmt >samples_per_sec | ||||||||||||||||
fmt | avg_bytes_per_sec |
dwAvgBytesPerSec | 4 |
u32 |
DWORD |
|
The average number of bytes per second at which the waveform data should be transferred. Playback software can estimate the buffer size using this value. | WRID>RIFF-WAVE>fmt >avg_bytes_per_sec | ||||||||||||||||
fmt | block_align |
wBlockAlign | 2 |
u16 |
WORD |
|
The block alignment (in bytes) of the waveform data. Playback software needs to process a multiple of block_align bytes of data at a time, so the value of block_align can be used for buffer alignment. The block_align field should be equal to the following formula, rounded to the next whole number: channels x ( bits_per_sample / 8 ) | WRID>RIFF-WAVE>fmt >block_align | ||||||||||||||||
fmt | bits_per_sample |
wBitsPerSample | 2 |
u16 |
WORD |
|
The bits_per_sample field specifies the number of bits of data used to represent each sample of each channel. If there are multiple channels, the sample size is the same for each channel. The block_align field should be equal to the following formula, rounded to the next whole number: channels x ( bits_per_sample / 8 ) | WRID>RIFF-WAVE>fmt >bits_per_sample | ||||||||||||||||
fmt | extra_size |
cbSize | 2 |
u16 |
WORD |
|
The size in bytes of the extra information in the WAVE format header, not including the size of the WAVEFORMATEX structure itself (the fields from format_tag through extra_size inclusive, that is, all fields except id, size and extra_bytes). |
WRID>RIFF-WAVE>fmt >extra_size | ||||||||||||||||
fmt | extra_bytes |
size-18 |
u8[size-18] |
WORD |
|
The extra information as bytes. For use when format_tag is unknown, so this portion can't be parsed deterministically. |
WRID>RIFF-WAVE>fmt >extra_bytes | |||||||||||||||||
fmt | id |
ckID | 4 |
u8[4] |
FOURCC |
id = "fmt "
|
WRID>RIFF-WAVE>fmt >id | |||||||||||||||||
fmt | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>fmt >size | |||||||||||||||||
fmt | format_tag |
wFormatTag | 2 |
u16 |
WORD |
= 0x0002
|
A number indicating the WAVE format category of the file. The content of the |
WRID>RIFF-WAVE>fmt >format_tag | ||||||||||||||||
fmt | channels |
wChannels | 2 |
u16 |
WORD |
|
The number of channels represented in the waveform data, such as 1 for mono or 2 for stereo. | WRID>RIFF-WAVE>fmt >channels | ||||||||||||||||
fmt | samples_per_sec |
dwSamplesPerSec | 4 |
u32 |
DWORD |
|
Sample rate of the WAVE file. This should be 11025, 22050, or 44100. Other sample rates are allowed, but not encouraged. | WRID>RIFF-WAVE>fmt >samples_per_sec | ||||||||||||||||
fmt | avg_bytes_per_sec |
dwAvgBytesPerSec | 4 |
u32 |
DWORD |
|
The average number of bytes per second at which the waveform data should be transferred. Playback software can estimate the buffer size using this value. ((samples_per_sec / samples_per_block) * block_align). | WRID>RIFF-WAVE>fmt >avg_bytes_per_sec | ||||||||||||||||
fmt | block_align |
wBlockAlign | 2 |
u16 |
WORD |
|
The block alignment (in bytes) of the waveform data. Playback software needs to process a multiple of block_align bytes of data at a time, so the value of block_align can be used for buffer alignment.
|
WRID>RIFF-WAVE>fmt >block_align | ||||||||||||||||
fmt | bits_per_sample |
wBitsPerSample | 2 |
u16 |
WORD |
|
This is the number of bits per sample of ADPCM. Currently only 4 bits per sample is defined. Other values are reserved. | WRID>RIFF-WAVE>fmt >bits_per_sample | ||||||||||||||||
fmt | extra_size |
cbSize | 2 |
u16 |
WORD |
|
The size in bytes of the extended information after the WAVEFORMATEX structure. For the standard WAVE_FORMAT_ADPCM using the standard seven coefficient pairs, this is 32. If extra coefficients are added, then this value will increase. | WRID>RIFF-WAVE>fmt >extra_size | ||||||||||||||||
fmt | samples_per_block |
nSamplesPerBlock | 2 |
u16 |
WORD |
|
Number of samples per block: (((block_align - (7 * channels)) * 8) / (bits_per_sample * channels)) + 2. | WRID>RIFF-WAVE>fmt >samples_per_block | ||||||||||||||||
fmt | coefficient_count |
nNumCoef | 2 |
u16 |
WORD |
|
Count of the number of coefficient sets defined in coefficients. |
WRID>RIFF-WAVE>fmt >coefficient_count | ||||||||||||||||
fmt | coefficients |
aCoeff | 4 * coef_count |
(i16, i16)[coef_count] |
aCoeff[wNumCoef] |
|
These are the coefficients used by the wave to play. They may be interpreted
as fixed point 8.8 signed values. Currently there are 7 preset coefficient sets.
They must appear in the following order.
Note that even if only 1 coefficient set was used to encode the file, all coefficient sets are still included. More coefficients may be added by the encoding software, but the first 7 must always be the same. |
WRID>RIFF-WAVE>fmt >coefficients | ||||||||||||||||
fmt | id |
ckID | 4 |
u8[4] |
FOURCC |
id = "fmt "
|
WRID>RIFF-WAVE>fmt >id | |||||||||||||||||
fmt | size |
ckSize | 4 |
u32 |
DWORD |
|
WRID>RIFF-WAVE>fmt >size | |||||||||||||||||
fmt | format_tag |
wFormatTag | 2 |
u16 |
WORD |
= 0x0011
|
A number indicating the WAVE format category of the file. The content of the |
WRID>RIFF-WAVE>fmt >format_tag | ||||||||||||||||
fmt | channels |
wChannels | 2 |
u16 |
WORD |
|
The number of channels represented in the waveform data, such as 1 for mono or 2 for stereo. | WRID>RIFF-WAVE>fmt >channels | ||||||||||||||||
fmt | samples_per_sec |
dwSamplesPerSec | 4 |
u32 |
DWORD |
|
Sample rate of the WAVE file. This should be 8000, 11025, 22050 or 44100. Other sample rates are allowed. | WRID>RIFF-WAVE>fmt >samples_per_sec | ||||||||||||||||
fmt | avg_bytes_per_sec |
dwAvgBytesPerSec | 4 |
u32 |
DWORD |
|
The average number of bytes per second at which the waveform data should be transferred. Playback software can estimate the buffer size using this value. ((samples_per_sec / samples_per_block) * block_align). | WRID>RIFF-WAVE>fmt >avg_bytes_per_sec | ||||||||||||||||
fmt | block_align |
wBlockAlign | 2 |
u16 |
WORD |
|
The block alignment (in bytes) of the waveform data. Playback software needs to process a multiple of block_align bytes of data at a time, so the value of block_align can be used for buffer alignment.
The recommended block size for coding is 256 * |
WRID>RIFF-WAVE>fmt >block_align | ||||||||||||||||
fmt | bits_per_sample |
wBitsPerSample | 2 |
u16 |
WORD |
|
This is the number of bits per sample of data. DVI ADPCM supports 3 or 4 bits per sample. | WRID>RIFF-WAVE>fmt >bits_per_sample | ||||||||||||||||
fmt | extra_size |
cbSize | 2 |
u16 |
WORD |
|
The size in bytes of the extended information after the WAVEFORMATEX structure. This should be 2. | WRID>RIFF-WAVE>fmt >extra_size | ||||||||||||||||
fmt | samples_per_block |
nSamplesPerBlock | 2 |
u16 |
WORD |
|
Number of samples per block: (((block_align - (4 * channels)) * 8) / (bits_per_sample * channels)) + 1. | WRID>RIFF-WAVE>fmt >samples_per_block |
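The block-size formulas given above for the two ADPCM fmt variants (format_tag 0x0002 and 0x0011) can be checked with a short Python sketch. The function names are hypothetical, and using integer (floor) division wherever the formulas divide is an assumption, since the table does not say how fractions are rounded.

```python
def ms_adpcm_samples_per_block(block_align: int, channels: int,
                               bits_per_sample: int = 4) -> int:
    """samples_per_block for format_tag 0x0002, following the table's formula."""
    # Assumption: floor division; the table does not specify rounding.
    return ((block_align - 7 * channels) * 8) // (bits_per_sample * channels) + 2


def dvi_adpcm_samples_per_block(block_align: int, channels: int,
                                bits_per_sample: int = 4) -> int:
    """samples_per_block for format_tag 0x0011, following the table's formula."""
    # Assumption: floor division; the table does not specify rounding.
    return ((block_align - 4 * channels) * 8) // (bits_per_sample * channels) + 1


def adpcm_avg_bytes_per_sec(samples_per_sec: int, samples_per_block: int,
                            block_align: int) -> int:
    """(samples_per_sec / samples_per_block) * block_align, in integer arithmetic."""
    # Assumption: multiply first so the result is truncated only once.
    return samples_per_sec * block_align // samples_per_block


def ms_adpcm_extra_size(coefficient_count: int = 7) -> int:
    """extra_size for format_tag 0x0002, summed from the field sizes in the table."""
    # samples_per_block (2 bytes) + coefficient_count (2 bytes)
    # + one (i16, i16) pair, 4 bytes, per coefficient set.
    return 2 + 2 + 4 * coefficient_count


if __name__ == "__main__":
    print(ms_adpcm_samples_per_block(256, 1))
    print(dvi_adpcm_samples_per_block(256, 1))
    print(adpcm_avg_bytes_per_sec(22050, 500, 256))
    print(ms_adpcm_extra_size())
```

With the standard seven coefficient pairs, the last function reproduces the extra_size of 32 quoted in the table.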
Related Chunks
- [plst] : Play order for cue points. Very rare.
- [CSET] : Character set information. Code page, language, etc. Very rare.
- [cue] : A series of positions in the waveform data chunk.
- [smpl] : Information needed for use as a sampling instrument.
- [RIFF-WAVE] : Container structure for multimedia data.
- [LIST-adtl] : A LIST of CuePoint annotation chunks.
- [inst] : Pitch, volume, and velocity for playback by sampler.
- [fact] : Number of samples for compressed audio in data chunk.
- [LIST-INFO] : A LIST of descriptive text chunks.
- [data] : The waveform data (audio samples).
- [PAD] : padding, filler or outdated information
- [fmt] : Format of audio samples in data chunk.
- [JUNK] : padding, filler or outdated information
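As an illustration of the [smpl] layout tabulated above, here is a minimal Python sketch that decodes the body of a smpl chunk (the bytes after id and size). It assumes the usual RIFF little-endian byte order, nine u32 header fields followed by sample_loop_count loop records of six u32 fields each, then sampler_data_size bytes of sampler data. The names parse_smpl and SampleLoop are hypothetical and not part of the spec.

```python
import struct
from typing import NamedTuple


class SampleLoop(NamedTuple):
    """One <sample-loop> record, using the field names from the table."""
    identifier: int
    type: int
    start: int
    end: int
    fraction: int
    play_count: int


# Assumption: little-endian u32 fields, as is usual for RIFF files.
_HEADER = struct.Struct("<9I")  # manufacturer .. sampler_data_size
_LOOP = struct.Struct("<6I")    # identifier .. play_count


def parse_smpl(body: bytes) -> dict:
    """Decode a smpl chunk body (everything after the id and size fields)."""
    (manufacturer, product, sample_period, midi_unity_note,
     midi_pitch_fraction, smpte_format, smpte_offset,
     sample_loop_count, sampler_data_size) = _HEADER.unpack_from(body, 0)

    loops = []
    offset = _HEADER.size
    for _ in range(sample_loop_count):
        # unpack_from raises struct.error if the body is shorter than declared.
        loops.append(SampleLoop(*_LOOP.unpack_from(body, offset)))
        offset += _LOOP.size

    return {
        "manufacturer": manufacturer,
        "product": product,
        "sample_period": sample_period,
        "midi_unity_note": midi_unity_note,
        "midi_pitch_fraction": midi_pitch_fraction,
        # 0x80000000 is half a semitone, so scale the full u32 range to 100 cents.
        "pitch_fraction_cents": midi_pitch_fraction / 2**32 * 100,
        "smpte_format": smpte_format,
        "smpte_offset": smpte_offset,
        "loops": loops,
        "sampler_data": body[offset:offset + sampler_data_size],
    }
```

A caller would first locate the smpl chunk in the RIFF-WAVE container and pass only its payload; truncated or inconsistent chunks surface as struct.error rather than being silently accepted.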