WRID>RIFF-WAVE>iXML

iXML chunk

"All iXML parameters and objects are OPTIONAL. Readers must not assume that any particular parameters exist."

[IXML]

"... to ensure that no products are developed which enforce schema compatibility, it has been decided NOT to create a schema for iXML."

[IXML]

Buckle up, folks: you're going to see a lot of variation in iXML chunks.

iXML root tags (WRID>RIFF-WAVE>iXML)

TODO: overview

Chunk | Name | Bytes | Type | Condition / Description | WRID
--- | --- | --- | --- | --- | ---
iXML | id | 4 | u8[4] | id = "iXML" | ...>id
iXML | size | 4 | u32 | The raw bytes of this chunk should be interpreted as XML, encoded in any Unicode encoding. | ...>size
iXML | ixml_version | | String | The version number of the iXML specification used to prepare the iXML audio file. This version appears in the front page at http://www.ixml.info, and takes the form of x.y where x and y are whole numbers, for example 1.51 | ...>ixml_version
iXML | project | | String | The name of the project to which this file belongs. This might typically be the name of the motion picture or program which is in production. | ...>project
iXML | scene | | String | The name of the scene / slate being recorded. For the US system this might typically be 32, 32A, 32B, A32B, 32AB etc. For the UK system this might typically be an incrementing number with no letters. | ...>scene
iXML | tape | | String | The SoundRoll which identifies a group of recordings. Normally, the SoundRoll is a vital component of workflow to differentiate audio recorded with time of day on different days. In other words, for 2 (completely different) recordings each covering a period around 11am, the soundroll would differentiate them by (typically) telling you which shooting day this recording applies to. Some projects may turn over sound more than once per day, and increment the soundroll at this point. In any event, the soundroll should change at least once in any 24 hour period. Some systems change the soundroll for every recording, which is also a valid option, in effect using the soundroll as a unique file identifier (although this function is explicitly provided with the iXML FILE_UID parameter). | ...>tape
iXML | take | | String | The number of the take in the current scene or slate. Usually this will be a simple number, although variations for things like wild tracks may yield takes like 1, 2, 3, WT1, WT2 etc. | ...>take
iXML | take_type | | [Enum] | (New in iXML v2.0) A dictionary based tag allowing selection from a defined list of values to explicitly categorise the type/purpose/function of the current take. This tag overlaps with the existing NO_GOOD / FALSE_START / WILD_TRACK, which are deprecated in iXML 2.0. This tag can contain multiple entries, separated by commas, and can be expanded in the future with additional dictionary entries, detailed in the TAKE_TYPE dictionary. | ...>take_type
iXML | (map to take_type) | | - | (Deprecated in iXML v2.0, superseded by TAKE_TYPE) This parameter allows a recorder to mark this recording as "no good" (i.e., of no use whatsoever, and in effect to be deleted). The value should be TRUE or FALSE. If absent, this should be assumed FALSE. | ...>no_good
iXML | (map to take_type) | | - | (Deprecated in iXML v2.0, superseded by TAKE_TYPE) This parameter allows a recorder to mark this recording as a false start; this may indicate that another file could exist with the same take number. Typically this file might also be marked as <NO_GOOD>. The value should be TRUE or FALSE. If absent, this should be assumed FALSE. | ...>false_start
iXML | (map to take_type) | | - | (Deprecated in iXML v2.0, superseded by TAKE_TYPE) This parameter allows a recorder to mark this recording as a wild track, with no specific relationship to any take, although it might be marked with a specific scene, for example when recording ambience for a given location. The value should be TRUE or FALSE. If absent, this should be assumed FALSE. | ...>wild_track
iXML | circled | | bool | This parameter allows a recorder to mark this recording as a circle-take. The value should be TRUE or FALSE. If absent, this should be assumed FALSE. | ...>circled
iXML | file_uid | | String | A unique number which identifies this physical FILE, regardless of the number of channels etc. If your system employs a unique SoundRoll per recording, your FILE_UID and TAPE parameters should be the same. | ...>file_uid
iXML | ubits | | String | The userbits associated with this recording. This may have been extracted from incoming timecode when the file was recorded, or generated by the recorder from the date, or any other metadata. Typically the userbits are rarely used now because other more explicit metadata supersedes this function. | ...>ubits
iXML | note | | String | A free text note to add user metadata to the recording. This might typically be used to communicate information such as TAIL SLATE, NO SLATE, or to warn of noise interruptions - PLANE OVERHEAD etc. | ...>note
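
As a rough illustration of how a reader might treat every tag as optional, the Python sketch below walks the RIFF chunks, pulls out the iXML payload, and reads a few root tags. It assumes the upper-case element names (IXML_VERSION, SCENE, CIRCLED, and so on) and a single enclosing root element as used in the iXML specification; the function names find_ixml_chunk and read_root_tags are made up for this example. It also leans on Python's XML parser to detect the encoding, which will not cover every possible Unicode encoding.

```python
import struct
import xml.etree.ElementTree as ET
from typing import Optional


def find_ixml_chunk(wav_bytes: bytes) -> Optional[bytes]:
    """Return the raw payload of the first iXML chunk, or None if absent."""
    if len(wav_bytes) < 12 or wav_bytes[0:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        return None
    pos = 12  # first chunk starts right after "RIFF" + size + "WAVE"
    while pos + 8 <= len(wav_bytes):
        chunk_id = wav_bytes[pos:pos + 4]
        (size,) = struct.unpack_from("<I", wav_bytes, pos + 4)  # u32, little-endian
        if chunk_id == b"iXML":
            return wav_bytes[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # RIFF chunks are padded to an even length
    return None


def read_root_tags(payload: bytes) -> dict:
    """Read a handful of root tags; every one of them may be missing."""
    root = ET.fromstring(payload)

    def text(tag: str) -> Optional[str]:
        node = root.find(tag)
        if node is None or node.text is None:
            return None
        return node.text.strip()

    return {
        "ixml_version": text("IXML_VERSION"),
        "project": text("PROJECT"),
        "scene": text("SCENE"),
        "tape": text("TAPE"),
        "take": text("TAKE"),
        "file_uid": text("FILE_UID"),
        "ubits": text("UBITS"),
        "note": text("NOTE"),
        # Absent CIRCLED is assumed FALSE, as the table above states.
        "circled": (text("CIRCLED") or "FALSE").upper() == "TRUE",
    }
```

Missing tags come back as None rather than raising, so callers have to decide per field what an absent value means for them.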

SYNC_POINT_LIST tags (WRID>RIFF-WAVE>iXML>SYNC_POINT_LIST)

TODO: overview

Example:

<SYNC_POINT_LIST>
  <SYNC_POINT_COUNT>2</SYNC_POINT_COUNT>
  <SYNC_POINT>
    <SYNC_POINT_TYPE>RELATIVE</SYNC_POINT_TYPE>
    <SYNC_POINT_FUNCTION>PRE_RECORD_SAMPLECOUNT</SYNC_POINT_FUNCTION>
    <SYNC_POINT_LOW>930000</SYNC_POINT_LOW>
    <SYNC_POINT_HIGH>0</SYNC_POINT_HIGH>
    <SYNC_POINT_EVENT_DURATION>0</SYNC_POINT_EVENT_DURATION>
  </SYNC_POINT>
  <SYNC_POINT>
    <SYNC_POINT_TYPE>ABSOLUTE</SYNC_POINT_TYPE>
    <SYNC_POINT_FUNCTION>SLATE_GENERIC</SYNC_POINT_FUNCTION>
    <SYNC_POINT_COMMENT>Near start</SYNC_POINT_COMMENT>
    <SYNC_POINT_LOW>1242</SYNC_POINT_LOW>
    <SYNC_POINT_HIGH>0</SYNC_POINT_HIGH>
    <SYNC_POINT_EVENT_DURATION>0</SYNC_POINT_EVENT_DURATION>
  </SYNC_POINT>
</SYNC_POINT_LIST>

Chunk | Group | Name | Bytes | Type | Description | WRID
--- | --- | --- | --- | --- | --- | ---
iXML | SYNC_POINT_LIST | sync_point_count | | String | Should appear at the start of the SYNC POINT LIST to allow readers to prepare memory for the following list. | ...>sync_point_count
iXML | SYNC_POINT_LIST | sync_point | | String | Multiple sample based counts, each of which represents a sync point for this recording. | ...>sync_point

Chunk | Group | Name | Bytes | Type | Description | WRID
--- | --- | --- | --- | --- | --- | ---
iXML | SYNC_POINT | point_type | | String | Either RELATIVE (a sample frame count from the start of the file) or ABSOLUTE (a sample count since midnight). Note the relative sample count needs multiplying by the wordsize and number of channels to be translated into a byte count from the start of the file. For 32 bit sample counts, only the _LOW parameter is needed, with _HIGH set to zero. For high sample rates, the _HIGH parameter is also needed to communicate 24 hour sample counts. | ...>point_type
iXML | SYNC_POINT | function | | String | Determines the function of this sync point. There are a number of defined functions for sync points, indicating things like the Pre-Record Sample Count, or the primary slate. See the Sync Point Function dictionary for a list of defined functions. | ...>function
iXML | SYNC_POINT | comment | | String | Allows a note for each sync point to be entered, for example "camera a", "camera b", etc. | ...>comment
iXML | SYNC_POINT | low | | String | For 32 bit sample counts, only the _LOW parameter is needed, with _HIGH set to zero. | ...>low
iXML | SYNC_POINT | high | | String | For high sample rates, the _HIGH parameter is also needed to communicate 24 hour sample counts. | ...>high
iXML | SYNC_POINT | event_duration | | String | Allows a sync point to be a region with a defined start and stop point/duration. This can be useful for playbacks where you would like the play to stop automatically at the end of the sync event. A value of 0 in the SYNC_POINT_EVENT_DURATION implies a non-duration (marker) or unknown duration event. (Note: units for this value are not specified in the iXML spec.) | ...>event_duration
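
The _LOW/_HIGH split and the relative byte-count rule can be sketched in Python as follows. This is only an illustration: it assumes _HIGH holds the upper 32 bits of a 64-bit count, and the function name sync_points plus the bytes_per_sample and channels parameters are hypothetical (in practice those two values would come from the fmt chunk).

```python
import xml.etree.ElementTree as ET


def sync_points(list_xml: str, bytes_per_sample: int, channels: int) -> list:
    """Parse a <SYNC_POINT_LIST> fragment into a list of dicts."""
    root = ET.fromstring(list_xml)
    points = []
    for sp in root.findall("SYNC_POINT"):
        # Missing or empty values are treated as 0, since every tag is optional.
        low = int((sp.findtext("SYNC_POINT_LOW") or "0").strip())
        high = int((sp.findtext("SYNC_POINT_HIGH") or "0").strip())
        sample_count = (high << 32) | low  # assumption: HIGH is the upper 32 bits
        kind = (sp.findtext("SYNC_POINT_TYPE") or "").strip()
        # Only RELATIVE counts translate to a byte count within the file.
        byte_offset = None
        if kind == "RELATIVE":
            byte_offset = sample_count * bytes_per_sample * channels
        points.append({
            "type": kind,
            "function": (sp.findtext("SYNC_POINT_FUNCTION") or "").strip(),
            "comment": sp.findtext("SYNC_POINT_COMMENT"),
            "sample_count": sample_count,
            "byte_offset": byte_offset,
        })
    return points
```

For the example list above in a 24-bit stereo file (3 bytes per sample, 2 channels), the RELATIVE point at 930000 sample frames works out to 930000 × 3 × 2 = 5580000 bytes.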

ASWG tags (WRID>RIFF-WAVE>iXML>ASWG)

TODO: overview

Chunk | Group | Name | Bytes | Type | Description | WRID
--- | --- | --- | --- | --- | --- | ---
iXML | ASWG | content_type | | String | Content Type (sfx/music/dialog/haptic/impulse/mixed), category:General | ...>content_type
iXML | ASWG | project | | String | Project name asset was developed for, category:General | ...>project
iXML | ASWG | originator | | String | Designer, category:General | ...>originator
iXML | ASWG | originator_studio | | String | Name of originating studio, category:General | ...>originator_studio
iXML | ASWG | notes | | String | General information not covered in other fields, category:General | ...>notes
iXML | ASWG | session | | String | Application (Pro Tools/Reaper etc.) session name, category:General | ...>session
iXML | ASWG | state | | String | File version: mastered, processed, raw, placeholder, category:General | ...>state
iXML | ASWG | editor | | String | Name of editor, category:General | ...>editor
iXML | ASWG | mixer | | String | Mix engineer, category:General | ...>mixer
iXML | ASWG | fx_chain_name | | String | Name of FX chain used on file, Reaper chain name, for example, category:General | ...>fx_chain_name
iXML | ASWG | is_generated | | String | Content is AI generated, or contains elements/sections that are AI generated. true/false, category:General | ...>is_generated
iXML | ASWG | mastering_engineer | | String | Name of the mastering engineer, category:General | ...>mastering_engineer
iXML | ASWG | origination_date | | String | Date of original upload of asset in format yyyy-MM-dd, category:General | ...>origination_date
iXML | ASWG | channel_config | | String | Channel configuration of the file: mono, stereo, LCR, Quad, 5.0, 5.1, 7.0, 7.1, 12.2, ambisonic, category:Format | ...>channel_config
iXML | ASWG | ambisonic_format | | String | Ambisonic format: #p, #h#p, #h#v. eg: 5p, 3h1v, 4h2p, category:Format | ...>ambisonic_format
iXML | ASWG | ambisonic_chn_order | | String | Ambisonic channel order: fuma, acn, category:Format | ...>ambisonic_chn_order
iXML | ASWG | ambisonic_norm | | String | Ambisonic normalization: snd3, maxn, n3d, category:Format | ...>ambisonic_norm
iXML | ASWG | mic_type | | String | Microphone(s) used. Where multiple mics used, prefix with channel number: 1-Neumann U87i, 2-AKG C414, category:Recording | ...>mic_type
iXML | ASWG | mic_config | | String | Microphone configuration: Mono, AB, XY, ORTF, MS, category:Recording | ...>mic_config
iXML | ASWG | mic_distance | | String | Microphone distance in meters OR headmounted - 1m, 2m, 0.3m, head, category:Recording | ...>mic_distance
iXML | ASWG | recording_loc | | String | Recording location, category:Recording | ...>recording_loc
iXML | ASWG | is_designed | | String | SFX: Is the sound designed, or is it a raw recording - true if designed, false if raw recording, category:Recording | ...>is_designed
iXML | ASWG | rec_engineer | | String | Name of the recording engineer, category:Recording | ...>rec_engineer
iXML | ASWG | rec_studio | | String | Music: Recording Studio, category:Recording | ...>rec_studio
iXML | ASWG | impulse_location | | String | Impulse: Location of impulse, category:Impulse | ...>impulse_location
iXML | ASWG | category | | String | UCS compliant SFX category, category:Sound Effects | ...>category
iXML | ASWG | sub_category | | String | UCS compliant SFX sub-category, category:Sound Effects | ...>sub_category
iXML | ASWG | cat_id | | String | UCS compliant SFX category ID, category:Sound Effects | ...>cat_id
iXML | ASWG | user_category | | String | UCS compliant user category, category:Sound Effects | ...>user_category
iXML | ASWG | user_data | | String | UCS compliant user data, category:Sound Effects | ...>user_data
iXML | ASWG | vendor_category | | String | UCS compliant vendor category, category:Sound Effects | ...>vendor_category
iXML | ASWG | fx_name | | String | UCS compliant FX name, category:Sound Effects | ...>fx_name
iXML | ASWG | library | | String | UCS compliant library, category:Sound Effects | ...>library
iXML | ASWG | creator_id | | String | UCS compliant SFX creator/publisher, category:Sound Effects | ...>creator_id
iXML | ASWG | source_id | | String | UCS compliant SFX SourceID, category:Sound Effects | ...>source_id
iXML | ASWG | rms_power | | String | RMS power of file, category:Audio Features | ...>rms_power
iXML | ASWG | loudness | | String | Integrated loudness of file, measured with ITU-R BS1770-3 compliant metering, category:Audio Features | ...>loudness
iXML | ASWG | loudness_range | | String | Loudness Range - EBU 3342 compliant, category:Audio Features | ...>loudness_range
iXML | ASWG | max_peak | | String | Maximum sample value, in dBFS, category:Audio Features | ...>max_peak
iXML | ASWG | spec_density | | String | Spectral density of file - amount of power at a standard set of frequency ranges. Freq ranges to be defined***, category:Audio Features | ...>spec_density
iXML | ASWG | zero_cross_rate | | String | Zero Cross Rate, average frequency of entire file, category:Audio Features | ...>zero_cross_rate
iXML | ASWG | papr | | String | Peak to average power ratio, category:Audio Features | ...>papr
iXML | ASWG | text | | String | Dialogue: Transcript of the dialogue file, category:Dialogue | ...>text
iXML | ASWG | efforts | | String | Dialogue: Whether the file contains efforts, dialogue or a mix of the two - True, False, Mixed, category:Dialogue | ...>efforts
iXML | ASWG | effort_type | | String | Effort type - strain, pain, category:Dialogue | ...>effort_type
iXML | ASWG | projection | | String | Dialogue projection level. 1- whispered, 2- spoken, 3- raised, 4- projected, 5- shouted, category:Dialogue | ...>projection
iXML | ASWG | language | | String | Dialogue language - ISO639-1 Language Code, format ## e.g. 'en', category:Dialogue | ...>language
iXML | ASWG | timing_restriction | | String | Dialogue timing restriction: wild, time, lip, na (not applicable), category:Dialogue | ...>timing_restriction
iXML | ASWG | character_name | | String | Dialogue: Character name for dialogue files, category:Dialogue | ...>character_name
iXML | ASWG | character_gender | | String | Dialogue: Sex/gender of character, category:Dialogue | ...>character_gender
iXML | ASWG | character_age | | String | Dialogue: Age of (human) character, category:Dialogue | ...>character_age
iXML | ASWG | character_role | | String | Dialogue: Whether the character is a main (significant) character or a background character: significant, background, category:Dialogue | ...>character_role
iXML | ASWG | actor_name | | String | Dialogue: Name of actor, category:Dialogue | ...>actor_name
iXML | ASWG | actor_gender | | String | Dialogue: Sex/gender of actor: male, female, category:Dialogue | ...>actor_gender
iXML | ASWG | director | | String | Dialogue: Name of director, category:Dialogue | ...>director
iXML | ASWG | direction | | String | Director’s notes, for context; explaining the scene and character motivation, category:Dialogue | ...>direction
iXML | ASWG | fx_used | | String | Effects used on file eg. Radio, category:Dialogue | ...>fx_used
iXML | ASWG | usage_rights | | String | Dialogue: Code for usage rights of content: *Internal, category:Dialogue | ...>usage_rights
iXML | ASWG | is_union | | String | Dialogue: Was recording done under a union contract: true, false, category:Dialogue | ...>is_union
iXML | ASWG | accent | | String | Regional accent of the spoken dialogue, if applicable, category:Dialogue | ...>accent
iXML | ASWG | emotion | | String | Emotional content present in the delivery of the dialogue, category:Dialogue | ...>emotion
iXML | ASWG | addressee_gender | | String | Gender of addressee; male/female/malegroup/femalegroup/mixedgroup, category:Dialogue | ...>addressee_gender
iXML | ASWG | is_formal | | String | Either formal or informal, depending on the relationship between the speaker and the addressee. formal/informal, category:Dialogue | ...>is_formal
iXML | ASWG | dev_language | | String | Original language used by developer, category:Dialogue | ...>dev_language
iXML | ASWG | billing_code | | String | Music: project billing code, category:Music | ...>billing_code
iXML | ASWG | composer | | String | Music: Composer, category:Music | ...>composer
iXML | ASWG | artist | | String | Music: Name of artist, category:Music | ...>artist
iXML | ASWG | song_title | | String | Music: Song title, category:Music | ...>song_title
iXML | ASWG | genre | | String | Music: Genre, category:Music | ...>genre
iXML | ASWG | sub_genre | | String | Music: Sub-genre, category:Music | ...>sub_genre
iXML | ASWG | producer | | String | Music: Producer name, category:Music | ...>producer
iXML | ASWG | music_sup | | String | Music: Music supervisor, category:Music | ...>music_sup
iXML | ASWG | instrument | | String | Music: Instrument on track/stem, category:Music | ...>instrument
iXML | ASWG | music_publisher | | String | Music: Publisher, category:Music | ...>music_publisher
iXML | ASWG | rights_owner | | String | Music: Owner of the recorded work, category:Music | ...>rights_owner
iXML | ASWG | is_source | | String | Music: Is this an asset as the composer delivered (source) or an edit of that source? true, false, category:Music | ...>is_source
iXML | ASWG | is_loop | | String | Is the content loopable - true, false, category:Music | ...>is_loop
iXML | ASWG | intensity | | String | Music: intensity, category:Music | ...>intensity
iXML | ASWG | is_final | | String | Music: Is cue temp or final, category:Music | ...>is_final
iXML | ASWG | order_ref | | String | Order reference of cue, if applicable *Internal, category:Music | ...>order_ref
iXML | ASWG | is_ost | | String | Music: Is part of the Original Soundtrack, category:Music | ...>is_ost
iXML | ASWG | is_cinematic | | String | Music: Asset is associated with a cinematic, category:Music | ...>is_cinematic
iXML | ASWG | is_licensed | | String | Music: Asset is licensed and owned by 3rd party, category:Music | ...>is_licensed
iXML | ASWG | is_diegetic | | String | Music: Track is diegetic in game, category:Music | ...>is_diegetic
iXML | ASWG | music_version | | String | Music: Version number, category:Music | ...>music_version
iXML | ASWG | isrc_id | | String | Music: ISRC code, format ## ### ## ##### e.g. 'UK AAA 05 00001', category:Music | ...>isrc_id
iXML | ASWG | tempo | | String | Music: Tempo in bpm, category:Music | ...>tempo
iXML | ASWG | time_sig | | String | Music: Time Signature, format A:B e.g. '3:4', category:Music | ...>time_sig
iXML | ASWG | in_key | | String | Music: In key, category:Music | ...>in_key

BEXT and LOUDNESS tags

"If this object appears (redundantly) in a WAVE file (for continuity of format) it is ESSENTIAL that the values match those in the official bext chunk in the same file."

[IXML]

The iXML spec includes duplicate fields from the bext chunk in the <BEXT> and <LOUDNESS> tags. This gives a place to store BEXT data in non-WAV formats where other chunks may not exist. However, in WAV files it creates the possibility for confusion if the values differ. On top of that, the spec gives contradictory information on which to use if they're different.

For users, be aware that this is a tricky area between specifications, and different applications will sometimes display different data when values differ between these duplicated fields. If you see applications showing different data, check the metadata with a tool which will show you both.

For implementors focusing on maximum correctness:

  • Warn users if both versions of fields exist and they have different values.
  • Do write BEXT data into the bext chunk.
  • Do NOT write <BEXT> and <LOUDNESS> tags when writing the iXML chunk.

However, I've heard of software which doesn't read from bext at all if iXML is present... so...

For implementors focusing on maximum compatibility:

  • Warn users if both versions of fields exist and they have different values.
  • Do write BEXT data into the bext chunk.
  • Do write <BEXT> and <LOUDNESS> tags when writing the iXML chunk. Abort and give an error if values differ between the duplicate fields.
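
The two policies above could be sketched in Python roughly like this. Everything here is illustrative: the function names, the "correctness"/"compatibility" mode strings, and the idea of passing both sides as plain dicts keyed by a shared field name are assumptions for the example, not something either specification defines.

```python
def bext_mismatches(bext_chunk: dict, ixml_bext: dict) -> list:
    """List (field, bext_value, ixml_value) for fields present in both but unequal."""
    mismatches = []
    for field in sorted(bext_chunk.keys() & ixml_bext.keys()):
        a = str(bext_chunk[field]).strip()
        b = str(ixml_bext[field]).strip()
        if a != b:
            mismatches.append((field, a, b))
    return mismatches


def ixml_bext_to_write(bext_chunk: dict, ixml_bext: dict, mode: str):
    """Decide what goes into the iXML <BEXT> tags for the chosen policy.

    mode="correctness": never write the duplicate tags (returns None).
    mode="compatibility": write duplicates, but refuse if the inputs disagree.
    """
    diffs = bext_mismatches(bext_chunk, ixml_bext)
    for field, a, b in diffs:
        # Both policies warn the user about differing duplicates.
        print(f"warning: {field!r} differs: bext={a!r} iXML={b!r}")
    if mode == "correctness":
        return None
    if mode == "compatibility":
        if diffs:
            raise ValueError("bext chunk and iXML <BEXT> values differ")
        return dict(bext_chunk)  # mirror the bext chunk values
    raise ValueError(f"unknown mode: {mode!r}")
```

In both modes the bext chunk itself is still written from bext_chunk; the function only decides whether the redundant iXML copy is emitted.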

Learning References

Specifications

  • [IXML] Standard for embedded metadata in production media files (2004).
  • [ASWG-G006] iXML-Extension from Sony PlayStation Studios' Audio Standards Working Group (2021).