RIFF1994

RIFF1994 is an update to [RIFF1991]. It documents general updates to the RIFF format and these WAVE-related updates:

  • [inst] and [smpl] chunks.
  • update fmt to document WAVEFORMATEX
  • fmt .format_tag 0x0000 redefined to "for development purposes". UNKNOWN is now 0xFFFF.
  • compare country codes and language and dialect code tables to current list
  • many additional fmt chunk definitions for registered fmt .format_tags.
  • encoding/decoding algorithm descriptions for ADPCM, MPEG-1 audio

Most of the new fmt chunk definitions are for proprietary compressed formats from the early 90s. With a few exceptions (see fmt for documented tags), I haven't documented them here, because I think they're extremely rare. If you're aware of any use of a particular format_tag that isn't already documented, please let me know and I'll add it to the book.


[RIFF1994] New Multimedia Data Types and Data Techniques 3.0 (1994). See pages 12-22.


|    id: RIFF1994
|    title: New Multimedia Data Types and Data Techniques
|    publication_year: 1994
|    authority: Microsoft
|    urls: [https://www.mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Docs/RIFFNEW.pdf]
|    publication_date: 1994-04-15
|    version: 3.0
|    see: pages 12-22

Learning References

Fields defined in this spec

Chunk | Name | Name as Specified | Bytes | Type | Type as Specified | Condition | Description | WRID
ISMP id ckID 4 u8[4] FOURCC id = ISMP WRID>RIFF-WAVE>LIST-INFO>ISMP>id
ISMP size ckSize 4 u32 DWORD WRID>RIFF-WAVE>LIST-INFO>ISMP>size
ISMP text (text) size ZSTR ZSTR SMPTE time code of digitization start point expressed as a NULL terminated text string "HH:MM:SS.FF". If performing MCI capture in AVICAP, this chunk will be automatically set based on the MCI start time. WRID>RIFF-WAVE>LIST-INFO>ISMP>text
IDIT id ckID 4 u8[4] FOURCC id = IDIT WRID>RIFF-WAVE>LIST-INFO>IDIT>id
IDIT size ckSize 4 u32 DWORD WRID>RIFF-WAVE>LIST-INFO>IDIT>size
IDIT text (text) size ZSTR ZSTR Digitization Time. Specifies the time and date that digitization commenced. The digitization time is contained in an ASCII string which contains exactly 26 characters and is in the format "Wed Jan 02 02:03:55 1990\n\0". The ctime() and asctime() functions can be used to create strings in this format. This chunk is automatically added to the capture file based on the current system time at the moment capture is initiated. WRID>RIFF-WAVE>LIST-INFO>IDIT>text
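As an illustration of the 26-character IDIT layout described above, here is a minimal Python sketch. It is not taken from the spec: the helper name `idit_text` is hypothetical, and it zero-pads the day of the month to match the example string (C's asctime() pads the day with a space instead).

```python
from datetime import datetime

_DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def idit_text(when: datetime) -> bytes:
    """Build a 26-byte IDIT payload: 24 text characters, newline, NUL."""
    text = "%s %s %02d %02d:%02d:%02d %04d\n\0" % (
        _DAYS[when.weekday()], _MONTHS[when.month - 1], when.day,
        when.hour, when.minute, when.second, when.year)
    return text.encode("ascii")


payload = idit_text(datetime(1990, 1, 2, 2, 3, 55))
print(len(payload))
```

The day and month names are built from fixed tables rather than strftime() so the result does not depend on the process locale.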
PAD id ckID 4 u8[4] FOURCC id = PAD WRID>RIFF-WAVE>PAD>id
PAD size ckSize 4 u32 DWORD WRID>RIFF-WAVE>PAD>size
PAD padding padding size u8[size] BYTE[ckSize] WRID>RIFF-WAVE>PAD>padding
inst id ckID 4 u8[4] FOURCC id = "inst" WRID>RIFF-WAVE>inst>id
inst size ckSize 4 u32 DWORD WRID>RIFF-WAVE>inst>size
inst unshifted_note bUnshiftedNote 1 u8 BYTE MIDI note number that corresponds to the unshifted pitch of the sample. Valid values range from 0 to 127. WRID>RIFF-WAVE>inst>unshifted_note
inst fine_tune chFineTune 1 i8 CHAR Pitch shift adjustment in cents (or 100ths of a semitone) needed to hit unshifted_note value exactly. fine_tune can be used to compensate for tuning errors in the sampling process. Valid values range from -50 to 50. WRID>RIFF-WAVE>inst>fine_tune
inst gain chGain 1 i8 CHAR Suggested volume setting for the sample in decibels. A value of zero decibels suggests no change in the volume. A value of -6 decibels suggests reducing the amplitude of the sample by a factor of two. WRID>RIFF-WAVE>inst>gain
inst low_note bLowNote 1 u8 BYTE Suggested usable MIDI note number range of the sample. Valid values range from 0 to 127. WRID>RIFF-WAVE>inst>low_note
inst high_note bHighNote 1 u8 BYTE Suggested usable MIDI note number range of the sample. Valid values range from 0 to 127. WRID>RIFF-WAVE>inst>high_note
inst low_velocity bLowVelocity 1 u8 BYTE Suggested usable MIDI velocity range of the sample. Valid values range from 0 to 127. WRID>RIFF-WAVE>inst>low_velocity
inst high_velocity bHighVelocity 1 u8 BYTE Suggested usable MIDI velocity range of the sample. Valid values range from 0 to 127. WRID>RIFF-WAVE>inst>high_velocity
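The seven one-byte inst fields above can be decoded in a single unpack call. The following is a minimal Python sketch, not a reference implementation; `parse_inst` and `gain_to_amplitude` are hypothetical helper names, and the payload is assumed to be the chunk body after id and size.

```python
import struct


def parse_inst(payload: bytes) -> dict:
    """Decode the 7-byte body of an 'inst' chunk (after id and size)."""
    names = ("unshifted_note", "fine_tune", "gain", "low_note",
             "high_note", "low_velocity", "high_velocity")
    # B = u8 (BYTE), b = i8 (CHAR); single bytes have no byte-order concerns.
    values = struct.unpack("<BbbBBBB", payload[:7])
    return dict(zip(names, values))


def gain_to_amplitude(gain_db: int) -> float:
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


inst = parse_inst(bytes([60, 0xFB, 0xFA, 0, 127, 1, 127]))
print(inst["fine_tune"], inst["gain"], round(gain_to_amplitude(inst["gain"]), 3))
```

A gain of -6 dB maps to a multiplier of roughly 0.5, which matches the description of the gain field.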
smpl id 4 u8[4] FOURCC id = "smpl" WRID>RIFF-WAVE>smpl>id
smpl size 4 u32 DWORD WRID>RIFF-WAVE>smpl>size
smpl manufacturer dwManufacturer 4 u32 DWORD Specifies the MMA Manufacturer code for the intended target device. The high byte indicates the number of low order bytes (1 or 3) that are valid for the manufacturer code. For example, this value will be 0x01000013 for Digidesign (the MMA Manufacturer code is one byte, 0x13); whereas 0x03000041 identifies Microsoft (the MMA Manufacturer code is three bytes, 0x00 0x00 0x41). If the sample is not intended for a specific manufacturer, then this field should be set to zero. WRID>RIFF-WAVE>smpl>manufacturer
smpl product dwProduct 4 u32 DWORD Specifies the Product code of the intended target device for the manufacturer. If the sample is not intended for a specific manufacturer's product, then this field should be set to zero. WRID>RIFF-WAVE>smpl>product
smpl sample_period dwSamplePeriod 4 u32 DWORD Specifies the period of one sample in nanoseconds (normally 1/ samples_per_second from the WAVEFORMAT structure for the RIFF WAVE file -- however, this field allows fine tuning). For example, 44.1 kHz would be specified as 22675 (0x00005893). WRID>RIFF-WAVE>smpl>sample_period
smpl midi_unity_note dwMIDIUnityNote 4 u32 DWORD Specifies the MIDI note which will replay the sample at original pitch. This value ranges from 0 to 127 (a value of 60 represents Middle C as defined by the MMA). WRID>RIFF-WAVE>smpl>midi_unity_note
smpl midi_pitch_fraction dwMIDIPitchFraction 4 u32 DWORD Specifies the fraction of a semitone up from the specified midi_unity_note. A value of 0x80000000 is 1/2 semitone (50 cents); a value of 0x00000000 represents no fine tuning between semitones. WRID>RIFF-WAVE>smpl>midi_pitch_fraction
smpl smpte_format dwSMPTEFormat 4 u32 DWORD Specifies the SMPTE time format used in the smpte_offset field. Possible values are (unrecognized formats should be ignored): 0 - specifies no SMPTE offset (smpte_offset should also be zero). 24 - specifies 24 frames per second. 25 - specifies 25 frames per second. 29 - specifies 30 frames per second with frame dropping ('30 drop'). 30 - specifies 30 frames per second. WRID>RIFF-WAVE>smpl>smpte_format
smpl smpte_offset dwSMPTEOffset 4 u32 DWORD Specifies a time offset for the sample if it is to be synchronized or calibrated according to a start time other than 0. The format of this value is 0xhhmmssff. hh is a signed Hours value [-23..23]. mm is an unsigned Minutes value [0..59]. ss is an unsigned Seconds value [0..59]. ff is an unsigned value [0..(smpte_format - 1)]. WRID>RIFF-WAVE>smpl>smpte_offset
smpl sample_loop_count cSampleLoops 4 u32 DWORD Specifies the number (count) of <sample-loop> records that are contained in the <smpl> chunk. The <sample-loop> records are stored immediately following the sampler_data_size field. WRID>RIFF-WAVE>smpl>sample_loop_count
smpl sampler_data_size cbSamplerData 4 u32 DWORD Specifies the size in bytes of the optional <sampler_data>. Sampler-specific data is stored immediately following the <sample-loop> records. The sampler_data_size field will be zero if no extended sampler-specific information is stored in the <smpl> chunk. WRID>RIFF-WAVE>smpl>sampler_data
smpl identifier dwIdentifier 4 u32 DWORD Identifies the unique 'name' of the loop. This field may correspond with a name stored in the <cue > chunk. The name data is stored in the <LIST-adtl> chunk. WRID>RIFF-WAVE>smpl>identifier
smpl type dwType 4 u32 DWORD Specifies the loop type: 0 - Loop forward (normal). 1 - Alternating loop (forward/backward). 2 - Loop backward. 3-31 - reserved for future standard types. 32-? - sampler specific types (manufacturer defined). WRID>RIFF-WAVE>smpl>type
smpl start dwStart 4 u32 DWORD Specifies the startpoint of the loop in samples. WRID>RIFF-WAVE>smpl>start
smpl end dwEnd 4 u32 DWORD Specifies the endpoint of the loop in samples (this sample will also be played). WRID>RIFF-WAVE>smpl>end
smpl fraction dwFraction 4 u32 DWORD Allows fine-tuning for loop fractional areas between samples. Values range from 0x00000000 to 0xFFFFFFFF. A value of 0x80000000 represents 1/2 of a sample length. WRID>RIFF-WAVE>smpl>fraction
smpl play_count dwPlayCount 4 u32 DWORD Specifies the number of times to play the loop. A value of 0 specifies an infinite sustain loop. WRID>RIFF-WAVE>smpl>play_count
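To show how the smpl fields fit together, here is a minimal Python sketch that decodes the nine header fields, the <sample-loop> records and the trailing sampler data, and derives a few values from the descriptions above. The function name `parse_smpl` and the keys of the returned dict are hypothetical; the payload is assumed to be the chunk body after id and size, in little-endian order.

```python
import struct


def parse_smpl(payload: bytes) -> dict:
    """Decode the body of a 'smpl' chunk (everything after id and size)."""
    (manufacturer, product, sample_period, unity_note, pitch_fraction,
     smpte_format, smpte_offset, loop_count,
     sampler_data_size) = struct.unpack_from("<9I", payload, 0)

    # <sample-loop> records follow the nine u32 header fields (36 bytes).
    loop_fields = ("identifier", "type", "start", "end",
                   "fraction", "play_count")
    loops = []
    offset = 36
    for _ in range(loop_count):
        record = struct.unpack_from("<6I", payload, offset)
        loops.append(dict(zip(loop_fields, record)))
        offset += 24  # six u32 fields per record

    # High byte = how many low-order bytes (1 or 3) form the MMA code.
    id_len = manufacturer >> 24
    low3 = (manufacturer & 0xFFFFFF).to_bytes(3, "big")
    manufacturer_id = low3[3 - id_len:] if id_len in (1, 3) else b""

    # smpte_offset is 0xhhmmssff; only hh is signed.
    hh = smpte_offset >> 24
    if hh >= 0x80:
        hh -= 0x100
    smpte = (hh, (smpte_offset >> 16) & 0xFF,
             (smpte_offset >> 8) & 0xFF, smpte_offset & 0xFF)

    return {
        "manufacturer_id": manufacturer_id,
        "product": product,
        # The period is in nanoseconds, so the implied rate is 1e9 / period.
        "sample_rate_hz": 1e9 / sample_period if sample_period else None,
        "midi_unity_note": unity_note,
        # 0x80000000 is 50 cents, so the full u32 range spans 100 cents.
        "pitch_cents": pitch_fraction * 100.0 / 2**32,
        "smpte_format": smpte_format,
        "smpte_offset": smpte,
        "loops": loops,
        "sampler_data": payload[offset:offset + sampler_data_size],
    }
```

The sketch does no validation; a real reader would also check that the declared loop count and sampler data size fit inside the chunk.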
fmt id ckID 4 u8[4] FOURCC id = "fmt " WRID>RIFF-WAVE>fmt >id
fmt size ckSize 4 u32 DWORD WRID>RIFF-WAVE>fmt >size
fmt format_tag wFormatTag 2 u16 WORD A number indicating the WAVE format category of the file. The content of the format-specific portion of the ‘fmt’ chunk, and the interpretation of the waveform data, depend on this value. WRID>RIFF-WAVE>fmt >format_tag
fmt channels wChannels 2 u16 WORD The number of channels represented in the waveform data, such as 1 for mono or 2 for stereo. WRID>RIFF-WAVE>fmt >channels
fmt samples_per_sec dwSamplesPerSec 4 u32 DWORD The sampling rate (in samples per second) at which each channel should be played. WRID>RIFF-WAVE>fmt >samples_per_sec
fmt avg_bytes_per_sec dwAvgBytesPerSec 4 u32 DWORD The average number of bytes per second at which the waveform data should be transferred. Playback software can estimate the buffer size using this value. WRID>RIFF-WAVE>fmt >avg_bytes_per_sec
fmt block_align wBlockAlign 2 u16 WORD The block alignment (in bytes) of the waveform data. Playback software needs to process a multiple of block_align bytes of data at a time, so the value of block_align can be used for buffer alignment. The block_align field should be equal to the following formula, rounded to the next whole number: channels x ( bits_per_sample / 8 ) WRID>RIFF-WAVE>fmt >block_align
fmt bits_per_sample wBitsPerSample 2 u16 WORD The bits_per_sample field specifies the number of bits of data used to represent each sample of each channel. If there are multiple channels, the sample size is the same for each channel. The block_align field should be equal to the following formula, rounded to the next whole number: channels x ( bits_per_sample / 8 ) WRID>RIFF-WAVE>fmt >bits_per_sample
fmt extra_size cbSize 2 u16 WORD The size in bytes of the extra information appended to the WAVE format header, not including the WAVEFORMATEX structure itself. WAVEFORMATEX covers the 18 bytes from format_tag through extra_size inclusive (every field except id, size and extra_bytes). WRID>RIFF-WAVE>fmt >extra_size
fmt extra_bytes size-18 u8[size-18] WORD The extra information as bytes. For use when format_tag is unknown, so this portion can't be parsed deterministically. WRID>RIFF-WAVE>fmt >extra_bytes
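The generic fmt layout above can be read with a short Python sketch like the one below. It is a sketch under stated assumptions, not a definitive parser: `parse_fmt` and `expected_block_align` are hypothetical names, a 16-byte body is assumed to carry no extra_size field, and the block_align formula is implemented as a literal reading of "channels x (bits_per_sample / 8), rounded to the next whole number".

```python
import struct


def parse_fmt(payload: bytes) -> dict:
    """Decode the common fields of a 'fmt ' chunk body (after id and size)."""
    (format_tag, channels, samples_per_sec, avg_bytes_per_sec,
     block_align, bits_per_sample) = struct.unpack_from("<HHIIHH", payload, 0)
    fmt = {
        "format_tag": format_tag,
        "channels": channels,
        "samples_per_sec": samples_per_sec,
        "avg_bytes_per_sec": avg_bytes_per_sec,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
    }
    # Assumption: bodies shorter than 18 bytes have no extra_size field.
    if len(payload) >= 18:
        (extra_size,) = struct.unpack_from("<H", payload, 16)
        fmt["extra_size"] = extra_size
        fmt["extra_bytes"] = payload[18:18 + extra_size]
    return fmt


def expected_block_align(channels: int, bits_per_sample: int) -> int:
    """channels x (bits_per_sample / 8), rounded up to a whole number."""
    return (channels * bits_per_sample + 7) // 8


body = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
fmt = parse_fmt(body)
print(fmt["block_align"] == expected_block_align(2, 16))
```

Comparing the stored block_align against the formula is a cheap sanity check for uncompressed data; compressed formats define block_align differently.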
fmt id ckID 4 u8[4] FOURCC id = "fmt " WRID>RIFF-WAVE>fmt >id
fmt size ckSize 4 u32 DWORD WRID>RIFF-WAVE>fmt >size
fmt format_tag wFormatTag 2 u16 WORD = 0x0002 A number indicating the WAVE format category of the file. The content of the format-specific portion of the ‘fmt’ chunk, and the interpretation of the waveform data, depend on this value. This must be set to WAVE_FORMAT_ADPCM. WRID>RIFF-WAVE>fmt >format_tag
fmt channels wChannels 2 u16 WORD The number of channels represented in the waveform data, such as 1 for mono or 2 for stereo. WRID>RIFF-WAVE>fmt >channels
fmt samples_per_sec dwSamplesPerSec 4 u32 DWORD Frequency of the sample rate of the wave file. This should be 11025, 22050, or 44100. Other sample rates are allowed, but not encouraged. WRID>RIFF-WAVE>fmt >samples_per_sec
fmt avg_bytes_per_sec dwAvgBytesPerSec 4 u32 DWORD The average number of bytes per second at which the waveform data should be transferred. Playback software can estimate the buffer size using this value. ((samples_per_sec / samples_per_block) * block_align). WRID>RIFF-WAVE>fmt >avg_bytes_per_sec
fmt block_align wBlockAlign 2 u16 WORD The block alignment (in bytes) of the waveform data. Playback software needs to process a multiple of block_align bytes of data at a time, so the value of block_align can be used for buffer alignment.

(samples_per_sec x channels)    block_align
8k                              256
11k                             256
22k                             512
44k                             1024
WRID>RIFF-WAVE>fmt >block_align
fmt bits_per_sample wBitsPerSample 2 u16 WORD This is the number of bits per sample of ADPCM. Currently only 4 bits per sample is defined. Other values are reserved. WRID>RIFF-WAVE>fmt >bits_per_sample
fmt extra_size cbSize 2 u16 WORD The size in bytes of the extended information after the WAVEFORMATEX structure. For the standard WAVE_FORMAT_ADPCM using the standard seven coefficient pairs, this is 32. If extra coefficients are added, then this value will increase. WRID>RIFF-WAVE>fmt >extra_size
fmt samples_per_block nSamplesPerBlock 2 u16 WORD Count of number of samples per block. (((block_align - (7 * channels)) * 8) / (bits_per_sample * channels)) + 2. WRID>RIFF-WAVE>fmt >samples_per_block
fmt coefficient_count nNumCoef 2 u16 WORD Count of the number of coefficient sets defined in coefficients. WRID>RIFF-WAVE>fmt >coefficient_count
fmt coefficients aCoeff 4 * coefficient_count (i16, i16)[coefficient_count] aCoeff[wNumCoef] These are the coefficients used by the wave to play. They may be interpreted as fixed point 8.8 signed values. Currently there are 7 preset coefficient sets. They must appear in the following order.

Coef1    Coef2
256      0
512      -256
0        0
192      64
240      0
460      -208
392      -232

Note that even if only one coefficient set was used to encode the file, all coefficient sets are still included. More coefficients may be added by the encoding software, but the first 7 must always be the same.

WRID>RIFF-WAVE>fmt >coefficients
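The derived WAVE_FORMAT_ADPCM fields above follow directly from block_align, channels and the sample rate. Here is a minimal Python sketch of those formulas and of the extended fmt data. All function names are hypothetical, and truncating avg_bytes_per_sec to an integer is an assumption rather than something the text specifies.

```python
import struct

# The seven preset (Coef1, Coef2) pairs, in the required order.
PRESET_COEFFICIENTS = [(256, 0), (512, -256), (0, 0), (192, 64),
                       (240, 0), (460, -208), (392, -232)]


def adpcm_samples_per_block(block_align, channels, bits_per_sample=4):
    """(((block_align - 7*channels) * 8) / (bits_per_sample*channels)) + 2"""
    return ((block_align - 7 * channels) * 8) // (bits_per_sample * channels) + 2


def adpcm_avg_bytes_per_sec(samples_per_sec, samples_per_block, block_align):
    """Integer form of (samples_per_sec / samples_per_block) * block_align."""
    return samples_per_sec * block_align // samples_per_block


def adpcm_extra_bytes(block_align, channels, coefficients=PRESET_COEFFICIENTS):
    """Pack samples_per_block, coefficient_count and the coefficient pairs."""
    body = struct.pack("<HH", adpcm_samples_per_block(block_align, channels),
                       len(coefficients))
    for coef1, coef2 in coefficients:
        body += struct.pack("<hh", coef1, coef2)  # two signed 16-bit values
    return body


def coef_to_float(raw: int) -> float:
    """Interpret a coefficient as a signed 8.8 fixed-point value."""
    return raw / 256.0


spb = adpcm_samples_per_block(256, 1)
print(spb, adpcm_avg_bytes_per_sec(11025, spb, 256), len(adpcm_extra_bytes(256, 1)))
```

With the seven preset pairs the packed extension is 2 + 2 + 7 x 4 = 32 bytes, which agrees with the extra_size value given for the standard coefficient set.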
fmt id ckID 4 u8[4] FOURCC id = "fmt " WRID>RIFF-WAVE>fmt >id
fmt size ckSize 4 u32 DWORD WRID>RIFF-WAVE>fmt >size
fmt format_tag wFormatTag 2 u16 WORD = 0x0011 A number indicating the WAVE format category of the file. The content of the format-specific portion of the ‘fmt’ chunk, and the interpretation of the waveform data, depend on this value. This must be set to WAVE_FORMAT_DVI_ADPCM. WRID>RIFF-WAVE>fmt >format_tag
fmt channels wChannels 2 u16 WORD The number of channels represented in the waveform data, such as 1 for mono or 2 for stereo. WRID>RIFF-WAVE>fmt >channels
fmt samples_per_sec dwSamplesPerSec 4 u32 DWORD Sample rate of the WAVE file. This should be 8000, 11025, 22050 or 44100. Other sample rates are allowed. WRID>RIFF-WAVE>fmt >samples_per_sec
fmt avg_bytes_per_sec dwAvgBytesPerSec 4 u32 DWORD The average number of bytes per second at which the waveform data should be transferred. Playback software can estimate the buffer size using this value. ((samples_per_sec / samples_per_block) * block_align). WRID>RIFF-WAVE>fmt >avg_bytes_per_sec
fmt block_align wBlockAlign 2 u16 WORD The block alignment (in bytes) of the waveform data. Playback software needs to process a multiple of block_align bytes of data at a time, so the value of block_align can be used for buffer alignment.

bits_per_sample    block_align
3                  ((N * 3) + 1) * 4 * channels
4                  (N + 1) * 4 * channels
where N = 0, 1, 2, 3, ...

The recommended block size for coding is 256 * <channels> bytes * min(1, (<samples_per_sec> / 11 kHz)). Smaller values cause the block header to become a more significant storage overhead. However, it is up to the coding portion of the algorithm to choose the optimal value for <block_align> within the constraints above. The decoding portion of the algorithm must be able to handle any valid block size. Playback software needs to process a multiple of <block_align> bytes of data at a time, so the value of <block_align> can be used for allocating buffers.

WRID>RIFF-WAVE>fmt >block_align
fmt bits_per_sample wBitsPerSample 2 u16 WORD This is the number of bits per sample of data. DVI ADPCM supports 3 or 4 bits per sample. WRID>RIFF-WAVE>fmt >bits_per_sample
fmt extra_size cbSize 2 u16 WORD The size in bytes of the extended information after the WAVEFORMATEX structure. This should be 2. WRID>RIFF-WAVE>fmt >extra_size
fmt samples_per_block nSamplesPerBlock 2 u16 WORD Count of number of samples per block. (((block_align - (4 * channels)) * 8) / (bits_per_sample * channels)) + 1. WRID>RIFF-WAVE>fmt >samples_per_block
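The WAVE_FORMAT_DVI_ADPCM constraints on block_align and the samples_per_block formula can be checked with a short Python sketch. Both function names are hypothetical, and the validity test is simply the block_align rules for 3 and 4 bits per sample rearranged to solve for a whole number N.

```python
def dvi_block_align_is_valid(block_align, channels, bits_per_sample):
    """True if block_align fits the 3-bit or 4-bit rule for some N >= 0."""
    unit = 4 * channels
    if block_align < unit or block_align % unit != 0:
        return False
    words = block_align // unit  # (N * 3) + 1 for 3 bits, N + 1 for 4 bits
    if bits_per_sample == 3:
        return words % 3 == 1
    return bits_per_sample == 4


def dvi_samples_per_block(block_align, channels, bits_per_sample):
    """(((block_align - 4*channels) * 8) / (bits_per_sample*channels)) + 1"""
    return ((block_align - 4 * channels) * 8) // (bits_per_sample * channels) + 1


for bits in (3, 4):
    print(bits, dvi_block_align_is_valid(256, 1, bits),
          dvi_samples_per_block(256, 1, bits))
```

The extra sample in each formula result comes from the block header, which stores one uncompressed sample per channel ahead of the packed data.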

Related Chunks

  • [LIST-adtl] : A LIST of CuePoint annotation chunks.
  • [JUNK] : padding, filler or outdated information
  • [cue] : A series of positions in the waveform data chunk.
  • [fact] : Number of samples for compressed audio in data chunk.
  • [inst] : Pitch, volume, and velocity for playback by sampler.
  • [plst] : Play order for cue points. Very rare.
  • [smpl] : Information needed for use as a sampling instrument.
  • [LIST-INFO] : A LIST of descriptive text chunks.
  • [PAD] : padding, filler or outdated information
  • [CSET] : Character set information. Code page, language, etc. Very Rare.
  • [RIFF-WAVE] : Container structure for multimedia data.
  • [fmt] : Format of audio samples in data chunk.
  • [data] : TODO: description of data