MSA / MAS / AMAS
Hyper-Dimensional Data File Specification


Discussion Document
October 2014




Prologue

The MSA / MAS / AMAS HyperDimensional Data File (HMSA, for short) is intended to be a common format to permit the exchange of hyper-dimensional microscopy and microanalytical data between different software applications. The expected applications include:

In addition to storing hyperdimensional data, the HMSA file format is suitable for storing conventional microscopy and microanalysis data such as spectra and images, as well as experimental conditions and other metadata.

I. Current status

This document presents a working draft of the HMSA specification, which should be used as the basis for discussion and feedback. This specification should be sufficient for the development of experimental software, on the understanding that the format is not finalised and major sections may still be subject to change. This version of the specification should not be used as the basis for production software.

Further work

The following issues are identified as requiring further work:

Change log

Changes in October 2014

The October 2014 update included the following changes:

Changes in March 2014

The March 2014 update included the following changes:

Changes in July 2013

The July 2013 update included the following changes:

Changes in February 2013

The February 2013 update included the following changes:

II. Contributors

The MSA/MAS/AMAS HyperDimensional Data File format specification is being developed by the Standards Committee of the Microscopy Society of America (MSA), chaired by Nestor J. Zaluzec, with contributions from members of the MSA, the Microanalysis Society (MAS), and the Australian Microbeam Analysis Society (AMAS). The draft specification presented in this document is based on contributions from:

To comment on or contribute to the development of the MSA/MAS/AMAS HyperDimensional Data File, please contact the Standards Committee chair, Nestor J. Zaluzec.




1. Overview

1.1 Design considerations

The following requirements were considered in the design of this file format:

  1. Modern experimental apparatus produce data with high dimensionality, such as spectral maps and 3D serial section maps. Therefore, this file format must store data of high dimensionality.
  2. High dimensionality data is necessarily very large, and consequently difficult and time consuming to store or transfer over networks. The file format must therefore be as compact as is reasonably practical.
  3. Many microanalytical techniques produce structurally similar hyperdimensional data. To simplify implementation of common tools, this file format must use a common format to store data produced by different analytical techniques.
  4. The data format must preserve the scientific accuracy and meaning of the data. Therefore, the file format must store data without loss of precision, and include sufficient experimental parameters to permit the correct interpretation of the data.
  5. To achieve the intended mission of being a widely-supported exchange format, the file format must achieve acceptance from instrument and software vendors, and from the microanalysis community. Consequently, the file format must be useful, easy to understand, and easy to implement.
  6. Furthermore, as the file format is intended for exchange, it must be readable (and implementable) in any commonly available programming language and environment. The format must therefore be platform independent, and not require any proprietary or special software or hardware.

1.2 Binary and XML file pair

To satisfy the above requirements, the proposed MSA/MAS/AMAS Hyperdimensional Data File format uses a pair of files: a simple binary file to efficiently store the experimental data, and a text-based XML file to store the experimental conditions. The advantages of this dual format are:

1.2.1 HMSA general structure

The HMSA file is a binary file format consisting of an 8 byte (64 bit) unique identifier (See Section 2.4.3: UID attribute), followed by one or more dataset objects. The location, size and layout of the binary dataset objects are described in the dataset definitions within the XML file (See Section 5: The <Data> list element), and are not described within the HMSA file. The values contained within the HMSA file datasets cannot therefore be read or interpreted without the corresponding dataset definition within the XML file.

The byte ordering of the binary file shall be little-endian (Intel/Windows style).

1.2.2 XML general structure

The XML file consists of human-readable hierarchical text, using a subset of the XML version 1.0 format (see Section 2.2: XML specification).

The structures within the XML file are strictly defined and self-descriptive, so that the XML file can be read and interpreted correctly without a finely detailed study of the specification. This strict definition does, however, require software that writes the XML files to diligently adhere to the specification.

The structure of the XML file is described in detail in Section 2: XML file specification.

1.2.3 HMSA-XML association

Because the XML file is required to interpret the HMSA file, the HMSA/XML files must be associated in such a way that software that loads the HMSA file can readily and unambiguously locate its associated XML file. The principal method by which the HMSA and XML files are associated together is by file name. The HMSA/XML file pairs shall share the same file name, except for their file extensions, shall be transferred together, and stored in the same directory.

Users may inadvertently rename or move one member of the file pair, which would prevent software from finding the correct experimental conditions or binary data. To reduce this risk, the XML and HMSA files each contain an identifier that is, for all intents and purposes, unique to each individual pair of files. By comparing the unique identifiers (UIDs) given in the XML and HMSA file, software can be assured that binary data matches the description in the XML file, and vice versa. Furthermore, by searching the file system for XML or HMSA files containing the UID, software may automatically find renamed or relocated files. This pseudo-unique identifier is a 64-bit code, providing a possible 2^64 (~1.84 × 10^19) unique values. The UID is described further in Section 2.4.3: UID attribute.
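The pairing check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the ".xml" extension and the function names are hypothetical, and the sketch assumes that the 8 header bytes of the binary file, read as a little-endian unsigned 64-bit integer, correspond to the 16 hexadecimal characters of the XML UID attribute. This section does not confirm that byte order for the UID itself.

```python
import struct
import xml.etree.ElementTree as ET
from pathlib import Path


def read_binary_uid(hmsa_path):
    """Return the 8-byte UID at the start of the binary file as 16 hex characters.

    Assumption: the UID is stored as a little-endian unsigned 64-bit integer.
    """
    with open(hmsa_path, "rb") as f:
        header = f.read(8)
    if len(header) != 8:
        raise ValueError("file is too short to contain a UID")
    (value,) = struct.unpack("<Q", header)
    return f"{value:016X}"


def read_xml_uid(xml_path):
    """Return the UID attribute of the document root element, upper-cased."""
    root = ET.parse(xml_path).getroot()
    return root.attrib["UID"].upper()


def pair_matches(hmsa_path):
    """Check that a same-named XML file exists and carries the same UID."""
    xml_path = Path(hmsa_path).with_suffix(".xml")  # assumed extension
    return xml_path.exists() and read_xml_uid(xml_path) == read_binary_uid(hmsa_path)
```

If pair_matches() returns False, software could fall back to scanning nearby XML files with read_xml_uid() to look for a renamed partner.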

1.3 HyperDimensional data

The HMSA file distinguishes between two forms of dataset dimensionality:

The HMSA format supports any combination of collection and datum dimensionality. However, this specification does not require software to implement support for all combinations of collection and datum dimensions. The principal combinations of collection and datum dimensionality envisaged for this file format are summarised in the table below:

|               | 0D datum | 1D datum | 2D datum |
| 0D collection | N/A * | A single spectrum acquisition (e.g. spectrometer dark noise.) | A single 2D image acquisition (e.g. diffraction pattern) ** |
| 1D collection | A linescan or time sequence of single-valued data (e.g. Ti Kα counts, BSE yield, vacuum pressure.) | A linescan or time sequence of spectra. | A linescan or time sequence of 2D data. |
| 2D collection | An X/Y map of single-valued data (e.g. a BSE image) ** | An X/Y hyperspectral map (i.e. one spectrum per pixel) | An X/Y 'hyperimage' map (i.e. one image per pixel) |
| 3D collection | An X/Y/Z serial section map of single-valued data. | An X/Y/Z hyperspectral serial section map. | An X/Y/Z hyperimage serial section map. |

* Data with 0 collection dimensions and 0 datum dimensions implies a dataset comprising one single-valued measurement. Single-valued data should be stored in the XML file in preference to the HMSA file to maximise readability.

** There is potential for ambiguity when storing a 2D image such as a BSE image or an EBSD pattern as to whether there should be 2 collection dimensions and 0 datum dimensions, or vice versa. The following principles should be followed:

Further templates for specific cases such as spectra, XY rastered spectral maps, etc. are defined in Appendix A.
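To make the collection/datum distinction concrete, the short Python sketch below counts the values in a dataset from its two sets of dimensions. The function name, the example shapes and the two-byte value size are hypothetical and are used only for illustration; the actual dataset layout is defined elsewhere in this specification.

```python
from math import prod


def value_count(collection_shape, datum_shape):
    """Number of stored values: one datum per collection point.

    Both arguments are tuples of dimension lengths; an empty tuple means 0D.
    """
    return prod(collection_shape) * prod(datum_shape)


# 2D collection of 1D data: a 512 x 512 map with one 2048-channel spectrum per pixel
hyperspectral_map = value_count((512, 512), (2048,))

# 0D collection of 1D data: a single 2048-channel spectrum
single_spectrum = value_count((), (2048,))

# Size in bytes if every value were a 2-byte integer (hypothetical choice)
map_bytes = hyperspectral_map * 2
```

Under these assumed sizes the map holds 536,870,912 values, or 1 GiB at two bytes per value, which illustrates why compact binary storage is a design requirement.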

1.4 Unicode and internationalisation

The HMSA XML file format requires the use of the UTF-8 Unicode character encoding, permitting native-language representations of the non-English names for authors, organisations, specimens, locations, etc. However, for maximum interoperability, the names of XML elements and attributes shall be given in US English using the ASCII character set. Furthermore, the values of elements shall be given in US English where possible, with non-English text provided as an alternative translation to the English text using an alt-lang-[xx][-YY] attribute (see Section 2.5.5: Alternative language attributes.)

In addition to supporting non-English scripts, the use of Unicode for the HMSA XML file allows the use of scientifically meaningful non-Latin characters such as α, μ, and Å. However, these characters may be un-typeable on many standard keyboards, and so they should only be used when no unambiguous Latin character equivalent is available. Please refer to Appendix C for a list of permitted Unicode characters in units and unit prefixes.

In cases where the Unicode character set includes multiple code points for visually indistinguishable glyphs, HMSA XML files shall consistently use one code point in preference to any alternatives (see Appendix D).

1.5 Minimalism

The raison d'être of the HMSA file format is to enable the convenient exchange of files between different software packages. To succeed in this goal, the HMSA file format must be unambiguous in its specification, and easy to implement. To this end, the HMSA XML file format has been designed with a minimalist core of mandatory features that are necessary only to properly determine the layout of the hyperdimensional dataset(s) in the HMSA binary data file. The structure of the dataset definition in the XML file is strictly defined, with neither descriptive nor optional features (see Section 5: The <Data> list element).

All useful experimental conditions (such as spectrometer gain and offset) and other metadata (such as author or date) are recommended, but optional. Nevertheless, to ensure compatibility, the structure and format of these optional conditions and metadata elements are defined in this document (see Section 3: The <Header> list element and Section 4: The <Conditions> list element).

The absolute minimum effort possible to produce a conformant HMSA XML file is demonstrated in the 'baseline' HMSA XML example file in Appendix E. This file contains no optional elements such as conditions or metadata. Important conditions such as microscope settings and spectrometer calibration are not included, meaning that the spectrum can only be interpreted as raw channels, and the user is responsible for determining energy calibration and accelerating voltage. For reference, the same file is also provided in the 'typical' profile (ibid), which includes all common experimental conditions and metadata.

1.6 Extensibility

In addition to being simple and easy to implement (See Section 1.5: Minimalism), a key feature of the HMSA file format is that it is extensible. Although this specification enumerates a number of common condition objects (See Appendix B), the specification permits any number of additional, unspecified experimental conditions to be stored in the HMSA XML file (See Section 4: The <Conditions> list element). Critically, the well-formed, hierarchical and self-descriptive nature of XML allows these additional conditions to be included without imposing an additional burden on applications to support any or all of these conditions. In effect, applications are not required to read, write or interpret any conditions, but may elect to provide additional scientific meaning or interpretation to the data by including additional conditions to any degree of detail.

For example, consider the case of a typical XEDS spectral map collected in an SEM. A 'typical' HMSA file would include conditions for spectrometer calibration and beam accelerating voltage. This information is sufficient for a basic interpretation of the map data, such as peak identification in spectra and generating elemental ROI images. A more detailed file may also include a Faraday cup beam current measurement, and even intensity measurements from standard reference materials so as to allow quantification of elemental compositions. An extreme example may also include all electron gun conditions, lens currents, and the like, so as to allow the comparison or monitoring of microscope and detector performance between instruments or over time. However, not all SEMs have Faraday cups, and nor do all experiments require quantification or performance monitoring, and thus these elements are purely optional.

In addition to supporting unlimited experimental conditions, the HMSA specification also supports the inclusion of multiple binary datasets in a single HMSA/XML file pair. Typical usage cases for multiple dataset files are:

Again, support for multiple datasets is provided in such a way as to impose no additional burden on applications that expect only single-dataset files. Applications are not required to support multiple datasets.

1.7 What HMSA does not do

To reduce the complexity of implementing HMSA support, certain features or usage cases have been excluded:


2. XML file specification

2.1 XML general structure

The XML file consists of human-readable hierarchical text, using a subset of the XML version 1.0 format (see Section 2.2: XML specification). The structures within the XML file are strictly defined and self-descriptive, so that the XML file can be read and interpreted correctly without a finely detailed study of the specification. This strict definition does, however, require software that writes the XML files to diligently adhere to the specification.

The XML files have the following general structure:

In XML, this looks like:

<?xml version="1.0" [...] ?>
<MSAHyperDimensionalDataFile [...] > 
	<Header>
		[...]
	</Header>
	<Conditions>
		[...]
	</Conditions>
	<Data>
		[...]
	</Data>
</MSAHyperDimensionalDataFile>

The XML declaration, <MSAHyperDimensionalDataFile> document root element, <Header>, <Conditions> and <Data> elements are described in the following sections:

2.2 XML Specification

The HMSA XML file specification follows the W3C Extensible Markup Language (XML) 1.0 Recommendation (Fifth Edition) [*], except where noted below.

2.2.1 XML features not supported

To simplify the tasks of reading, writing and interpreting HMSA XML files, this specification excludes certain XML features that may complicate implementation for no benefit in this application. HMSA XML files shall not contain the following XML features declared in the XML 1.0 recommendation (section numbers in parentheses):

The HMSA XML format also explicitly does not support the following associated W3C XML specifications:

2.2.2 XML conformance and validation

The W3C XML specification defines two levels of compliance; conformant, and valid. Conformant XML files satisfy all requirements of the XML specification, such as well-formedness. Valid XML files are conformant XML files, and also contain document type definitions (DTDs) that specify the structure and range of all elements in the XML file. Valid XML files can therefore be validated for completeness and correctness by a generic validating XML parser, without reference to an external specification of the file format. In effect, valid XML files are self-specifying.

In the interests of minimising the size and complexity of HMSA XML files, XML document and element type definitions were excluded from the HMSA XML specification (See Section 2.2.1: XML features not supported). Consequently, HMSA XML documents are conformant XML files, but not valid XML files.

2.2.3 Character encodings

HMSA XML files shall only be encoded in the Unicode UTF-8 character encoding. To provide backwards compatibility with the ASCII character set, HMSA XML files should use the basic Latin characters and symbols in the range of U+0000 to U+007F in preference to visually similar Unicode characters when it is customary to do so, and whenever such substitution does not change the meaning or introduce ambiguity. For example, 'Ka' should be used to represent the Kα x-ray in the Siegbahn notation, and 'um' should be used to represent μm. Further character substitutions are specified in Appendix D.

2.2.4 Byte order markers

Byte order markers (BOM) are not required for UTF-8 encoded text files, but may be automatically inserted at the start of the file stream by certain text editors. Thus, HMSA XML files may, but should not, contain the UTF-8 BOM (0xEFBBBF), and shall not contain byte order markers for other character encodings (e.g. 0xFFFE for UTF-16LE on Windows, or 0xFEFF for UTF-16BE on Unix/Linux/Mac). HMSA XML parsers shall process and ignore a UTF-8 BOM, if present.
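The BOM rule above can be sketched in Python as follows. The function name is hypothetical; the BOM constants come from the standard codecs module. This is one possible reading of the rule, not a reference implementation.

```python
import codecs


def strip_bom(raw):
    """Remove a leading UTF-8 BOM from raw file bytes; reject UTF-16 BOMs."""
    if raw.startswith(codecs.BOM_UTF8):
        # Tolerated but discouraged: skip the three BOM bytes (EF BB BF).
        return raw[len(codecs.BOM_UTF8):]
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # FF FE or FE FF: the file is not UTF-8 and is therefore not permitted.
        raise ValueError("UTF-16 byte order marker found; HMSA XML must be UTF-8")
    return raw
```

A reader would call strip_bom() on the bytes read from disk before handing them to its XML parser.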

2.2.5 Case sensitivity

As defined in the XML standard, the structure of an XML file is case sensitive. The names of all elements and attributes shall be written with the case specified in this document. The values of attributes and elements are also assumed to be case sensitive, unless specified otherwise in this document.

To avoid confusion, identifier attributes such as Name and ID shall have unique values in case-insensitive comparison.

2.3 XML declaration

The HMSA XML file shall begin with an XML declaration of the form:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

The attributes of the XML declaration are described below.

2.3.1 XML version attribute

The version attribute of the XML declaration shall have the value "1.0". XML version 1.1 or subsequent versions are not supported by this version of the HMSA/XML specification.

2.3.2 XML character encoding attribute

The encoding attribute of the XML declaration shall have the value "UTF-8". No other character encoding is permitted for HMSA XML files.

2.3.3 XML standalone attribute

The standalone attribute of the XML declaration shall have the value "yes". HMSA XML files do not support external document type definitions.

2.4 Document root element

The root element of the HMSA XML file shall be named <MSAHyperDimensionalDataFile> and be declared in the following form:

<MSAHyperDimensionalDataFile Version="1.0" xml:lang="en-US" UID="193581B9DD220ABB">

The attributes of the root element are described below.

2.4.1 Version attribute

The HMSA version shall be declared as "1.0" in the Version attribute.

2.4.2 xml:lang attribute

The default language of the document shall be US English, which shall be declared using an xml:lang attribute of the document root element with a value of "en-US".

2.4.3 UID attribute

A pseudo-unique identifier shall be provided in the UID attribute in the form of 16 hexadecimal characters (0-9, A-F), representing a 64-bit binary value.
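As an illustration, the three root element attributes described in this section could be checked with Python's standard xml.etree.ElementTree module as sketched below. The function name and the wording of the messages are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
UID_PATTERN = re.compile(r"[0-9A-F]{16}")


def check_root(xml_text):
    """Return a list of problems found in the document root element."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "MSAHyperDimensionalDataFile":
        problems.append("unexpected root element: " + root.tag)
    if root.get("Version") != "1.0":
        problems.append("Version must be 1.0")
    if root.get(XML_LANG) != "en-US":
        problems.append("xml:lang must be en-US")
    if not UID_PATTERN.fullmatch(root.get("UID", "")):
        problems.append("UID must be 16 hexadecimal characters (0-9, A-F)")
    return problems
```

An empty list means the root element matches the form shown in Section 2.4.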

The 64-bit unique identifier, which is stored in both the XML and binary HMSA files, serves two purposes:

  1. To verify that a HMSA file and XML file match. This is required because HMSA files cannot be decoded without the XML description, and using the wrong XML description could result in corrupted results or undefined software behaviour.
  2. To allow software to search for a missing component of the file pair such as a renamed or moved file.

To ensure maximum efficacy of the UID mechanism, software that writes or modifies HMSA files shall create new UIDs when:

The UID may be retained unchanged when:

To further guarantee the integrity of HMSA UIDs, the following is required of UID generation algorithms:

The recommended method of generating a UID is to use a one-way cryptographic hash function, such as the NIST-published SHA-1 algorithm, with a diverse set of inputs to ensure sufficient hash entropy.
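A minimal Python sketch of the recommended hash-based approach follows. The choice of inputs (random bytes, clock time, host name and optional caller-supplied values) and the truncation of the 160-bit SHA-1 digest to its first 64 bits are assumptions made for illustration; this section does not prescribe either.

```python
import hashlib
import os
import socket
import time


def generate_uid(*extra_inputs):
    """Return a 16-character upper-case hexadecimal UID derived from SHA-1."""
    h = hashlib.sha1()
    h.update(os.urandom(16))                          # operating system randomness
    h.update(str(time.time_ns()).encode("utf-8"))     # current time in nanoseconds
    h.update(socket.gethostname().encode("utf-8"))    # machine name
    for item in extra_inputs:                         # e.g. file name, user name
        h.update(str(item).encode("utf-8"))
    # Assumption: keep the first 64 bits (16 hex characters) of the digest.
    return h.hexdigest()[:16].upper()
```

The same string would be written to the UID attribute of the XML file and, in binary form, to the first 8 bytes of the HMSA file.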

2.5 XML Parameter element formats

To maximise compatibility and prevent data misinterpretation, the format of elements and attributes used to store arbitrary parameters in the HMSA XML are strictly defined below.

2.5.1 Numerical data types

The data types of numerical parameters shall be explicitly declared using a DataType attribute to ensure XML readers can properly load numerical parameters in the appropriate data types without requiring tedious type-guessing code or risking data truncation. The DataType attributes are not required for strings of text, or for list elements containing nested elements. The DataType attribute, if provided, shall take one of the following values:

| DataType | C equivalent | Description | Example |
| "byte" | unsigned char | Unsigned 8 bit integer | <GreyscaleLevel DataType="byte">255</GreyscaleLevel> |
| "int16" | short | Signed 16 bit integer | <PeakCounts DataType="int16">-32768</PeakCounts> |
| "uint16" | unsigned short | Unsigned 16 bit integer | <RoiIntegral DataType="uint16">65535</RoiIntegral> |
| "int32" | long | Signed 32 bit integer | <AccountBalance DataType="int32">-2147483648</AccountBalance> |
| "uint32" | unsigned long | Unsigned 32 bit integer | <PixelCount DataType="uint32">4294967295</PixelCount> |
| "int64" | long long | Signed 64 bit integer | <MaxRangeOfInt64 DataType="int64">9223372036854775807</MaxRangeOfInt64> |
| "float" | float | 32-bit IEEE 754 single-precision floating point number | <FloatExample DataType="float">1.0</FloatExample> |
| "double" | double | 64-bit IEEE 754 double-precision floating point number | <DoubleExample DataType="double">1.00</DoubleExample> |
| "array:xyz" | xyz[N] | An array of values, where 'xyz' is one of the above data types | <Fibonacci DataType="array:byte" Count="6">1, 1, 2, 3, 5, 8</Fibonacci> |

The values of the DataType attributes shall be written in lower case.

HMSA XML parsers shall load parameters using a data type of equal or greater precision to that specified by the DataType attribute.

If no data type is provided, and the element contains no child elements, HMSA XML parsers shall interpret the value to be a text string.

If the parameter is a member of a dataset template defined in Appendix A, or a condition template defined in Appendix B, the data type shall be equal to the type defined in the template.

2.5.2 Arrays of values

Arrays of values shall be specified using a DataType attribute of "array:xyz", where xyz is one of the data types specified in Section 2.5.1: Numerical data types. The number of values in the array shall be specified using a Count attribute, which is assumed to be a decimal text representation of an unsigned 32 bit integer. Array values shall be written as comma separated values. For example:

<Fibonacci DataType="array:byte" Count="6">1, 1, 2, 3, 5, 8</Fibonacci>

The use of the Count attribute name is reserved for this purpose and shall not be used for other purposes.
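The DataType and Count rules of Sections 2.5.1 and 2.5.2 could be applied by a reader roughly as in the Python sketch below. The function names are hypothetical. Python's int and float types are at least as precise as every listed type, so the sketch performs no range checking, and it does not handle empty arrays.

```python
import xml.etree.ElementTree as ET

INTEGER_TYPES = {"byte", "int16", "uint16", "int32", "uint32", "int64"}
FLOAT_TYPES = {"float", "double"}


def _convert(text, data_type):
    """Convert one text value according to a scalar DataType name."""
    if data_type in INTEGER_TYPES:
        return int(text)
    if data_type in FLOAT_TYPES:
        return float(text)
    raise ValueError("unknown DataType: " + data_type)


def parse_parameter(element):
    """Return the value of a parameter element as str, int, float or list."""
    data_type = element.get("DataType")
    text = (element.text or "").strip()
    if data_type is None:
        return text  # no DataType and no children: treat as a text string
    if data_type.startswith("array:"):
        item_type = data_type[len("array:"):]
        values = [_convert(v.strip(), item_type) for v in text.split(",")]
        count = element.get("Count")
        if count is None or int(count) != len(values):
            raise ValueError("Count is missing or does not match the array length")
        return values
    return _convert(text, data_type)


fib = ET.fromstring('<Fibonacci DataType="array:byte" Count="6">1, 1, 2, 3, 5, 8</Fibonacci>')
fib_values = parse_parameter(fib)
```

Here fib_values holds the six integers of the array example above.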

2.5.3 Numerical values

Numerical values shall not contain digit grouping markers such as commas or spaces.

Text encoding of floating point values shall follow the IEEE 754-1985 standard for binary <-> decimal conversion. Furthermore:

2.5.4 Physical units

For numerical values with physical units, the units should be defined using a Unit attribute. Units shall be provided in SI units, SI derived units (e.g. "Pa", "Å"), or one of the customary technique-specific units defined in Appendix C (e.g. "counts", "wt%"). Units shall be declared in abbreviated form, with optional single-character SI prefix codes (e.g. "kV", for kilovolt). The list of permitted prefixes is also included in Appendix C.

Dataset and condition objects defined in appendices A and B specify the physical units that must be used for parameters within those objects. The precise formats of the unit text shall be consistent with the definitions in the appendices.

To preserve scientific accuracy, it is critical that HMSA files use a consistent scheme of defining compound units that is readable and writeable by both humans and computers. Aesthetically pleasing representations such as kg·m·s⁻² are difficult to type and are prone to display or interpretation errors when moving between software packages. To avoid confusion, HMSA files shall therefore use only the full stop '.' (U+002E), solidus '/' (U+002F) and numerals 0-9 (U+0030 - U+0039) to represent compound units such as "kg.m/s2". The use of the hyphen-minus sign '-' (U+002D) to indicate negative exponents is permitted only for inverse singular units, such as inverse centimetres (cm-1), but not compound units (e.g. "m/s2", not "m.s-2"). Other informal methods of superscript markup such as the circumflex accent ^ (U+005E) shall not be used. The use of brackets in unit definitions is not permitted.
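The formatting rules in the preceding paragraph can be checked mechanically. The Python sketch below tests only the character-level rules stated there (no circumflex, no brackets, no middle dot, and no negative exponent inside a compound unit); it does not verify unit symbols or prefixes against Appendix C. The function name and message texts are hypothetical.

```python
# Circumflex, brackets of any kind, and the middle dot (U+00B7).
FORBIDDEN = set("^()[]{}\u00B7")


def check_unit_text(unit):
    """Return a list of formatting problems found in a Unit attribute value."""
    problems = []
    if any(ch in FORBIDDEN for ch in unit):
        problems.append("contains a forbidden character")
    is_compound = "." in unit or "/" in unit
    if "-" in unit and is_compound:
        problems.append("negative exponent used in a compound unit")
    return problems
```

Under these rules "kg.m/s2" and "cm-1" pass, while "m.s-2" and "m^2" are reported.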

The Unicode character set defines a number of specific code points for scientific symbols, which are visually identical to non-scientific code points. For example, the Unicode micro sign 'µ' (U+00B5) is visually indistinguishable from the Greek small letter mu 'μ' (U+03BC). The casual use of one or the other symbol for the same quantity poses a risk to software compatibility. Consequently, to avoid confusion and maximise compatibility, the lowest code point shall be used in cases where a unit symbol could be written in two or more visually indistinguishable characters. Required character substitutions are provided in Appendix D.

When defining concentrations, it is mandatory to specify whether the measurement is molar or atomic (mol%), volumetric (vol%) or mass or weight (wt%). Similarly, when using parts per million or parts per billion notations for concentration, the nature of the measurement shall be specified (e.g. mol_ppm, vol_ppm, wt_ppm.)

2.5.5 Alternative language attributes

In addition to the US English text, values in other languages may be specified using alt-lang-xx[-YY...] attributes, where 'xx' is the language code and 'YY...' the locale, as in the form of IETF language tags (e.g. 'en-US'). For example, the author may be specified as:

<Author alt-lang-ru="Фёдор Миха́йлович Достое́вский">Fyodor Dostoyevsky</Author>

This method should be used only to provide proper nouns in appropriate native languages, such as the names of authors, organisations, or places.

The use of the prefix alt-lang- in attribute names is reserved for this purpose and shall not be used in other attribute names.
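Because the prefix is reserved, a reader can collect every alternative-language value of an element by attribute name alone. The Python sketch below shows one way to do this; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

PREFIX = "alt-lang-"


def alternative_texts(element):
    """Map language tag (e.g. 'ru' or 'pt-BR') to the alternative text."""
    return {
        name[len(PREFIX):]: value
        for name, value in element.attrib.items()
        if name.startswith(PREFIX)
    }


author = ET.fromstring(
    '<Author alt-lang-ru="Фёдор Достоевский">Fyodor Dostoyevsky</Author>'
)
english_name = author.text
other_names = alternative_texts(author)
```

The element text remains the US English value; other_names holds the native-language alternatives keyed by language tag.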

2.5.6 Special characters

In accordance with the XML specification, the following characters shall not be used in the names or values of elements or attributes:

When writing XML files, occurrences of these characters in value strings shall be converted to their respective XML entities:

Upon loading of XML files, following structural parsing, occurrences of these XML entities in strings shall be converted back to their corresponding character values before being presented to users or other software.
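The write-time and load-time conversions can be sketched with Python's standard xml.sax.saxutils helpers. This sketch assumes the characters concerned are the five with predefined XML entities (&, <, >, " and '); the wrapper function names are hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# escape() always handles &, < and >; quotes are added explicitly.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARACTERS = {entity: char for char, entity in QUOTE_ENTITIES.items()}


def to_xml_text(value):
    """Replace special characters with XML entities before writing."""
    return escape(value, QUOTE_ENTITIES)


def from_xml_text(value):
    """Restore the original characters after structural parsing."""
    return unescape(value, QUOTE_CHARACTERS)


original = 'Fe < 5 wt% & "trace" Mn'
encoded = to_xml_text(original)
decoded = from_xml_text(encoded)
```

Most XML libraries, including xml.etree.ElementTree, perform both conversions automatically; explicit calls such as these are only needed when XML text is assembled or read by hand.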

2.5.7 Ordering of elements

The order in which elements are listed within the XML file is not specified in general, with some exceptions as defined below:

Within the <MSAHyperDimensionalDataFile> document root element, the child elements must be in the following order:

Other than the cases explicitly declared above, applications shall not require elements in the XML file to be in a specific order.


3. The <Header> list element

The <Header> list element contains metadata that principally identifies the title of the document, the author/ownership of the data, and the date/time of collection. Header information shall not contain parameters that are required for the interpretation of the experimental data.

3.1 Header items are optional

In keeping with the principle of minimalism (see Section 1.5: Minimalism), all items in the <Header> list element are optional. Some elements, such as the <Checksum>, should be included, but are not mandatory. Software that reads HMSA XML files should not require the presence of any items in the <Header> list to open, display or process files.

If no items are defined within the <Header> list, the empty header list shall be specified as either an empty element (<Header />), or as a conventional matched pair of elements with no contents (<Header></Header>). XML parsers for HMSA XML files shall support both styles of empty element declaration.

3.2 <Checksum>

The <Header> list should include a <Checksum> element to allow software to verify that the binary HMSA file exactly matches that specified in the XML file. The <Checksum> element, if provided, shall take the following form:

<Checksum Algorithm="SHA-1">53AAD59C05D59A40AD746D6928EA6D2D526865FD</Checksum>

The contents of the <Checksum> element shall be the hexadecimal-encoded (A-F, 0-9) checksum digest of the entire HMSA file. The algorithm used to generate the checksum shall be declared using the Algorithm attribute. The checksum algorithm should be one of the following algorithms:

The 'SUM32' algorithm is provided for basic protection against single-bit and some multiple-bit errors, but does not protect against multiple-bit errors with zero sum change. For this reason, the 'SHA-1' algorithm is recommended, as it provides strong detection of any form of modification, and is furthermore a widely supported standard with libraries and implementations available in most programming languages and platforms.
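For the recommended 'SHA-1' option, the digest of the entire binary file can be computed with Python's standard hashlib module, as in the sketch below. The function name and the chunk size are arbitrary choices for illustration.

```python
import hashlib


def sha1_checksum(hmsa_path, chunk_size=1 << 20):
    """Return the SHA-1 digest of the whole file as upper-case hexadecimal."""
    digest = hashlib.sha1()
    with open(hmsa_path, "rb") as f:
        # Read in chunks so that very large datasets need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()
```

A reader would compare the returned string with the contents of the <Checksum> element, ignoring letter case.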

3.3 <Title>, <Author> and <Owner>

The title, author, and legal owner of the document should be specified within the <Header> list like so:

<Title>Beep Beep</Title>
<Author>Wyle E. Coyote</Author>
<Owner>Acme Inc.</Owner>

These elements may be provided in languages other than US English using an alternative language attribute alt-lang-xx[-YY] (see Section 2.5.5: Alternative language attributes). For example, the name of the author Leo Tolstoy may be provided in his native Russian Cyrillic script as:

<Author alt-lang-ru="Лев Никола́евич Толсто́й">Leo Tolstoy</Author>

3.4 <Date>, <Time> and <Timezone>

The date and time of the original data acquisition should be stored in <Date>, <Time> and <Timezone> elements, of the following format:

<Date>1985-10-26</Date>
<Time>20:04:00</Time>
<Timezone>US Pacific Standard Time</Timezone>

The Date and Time values shall be written in the ISO 8601 date/time format, with the date as YYYY-MM-DD, the time as HH:MM:SS in 24 hour format, and the Timezone specified by country code and full formal timezone name.

Dates shall be encoded according to the Gregorian calendar, and in the common era (CE / AD).
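The <Date> and <Time> formats above map directly onto Python's datetime.strptime, as the following sketch shows. The function name is hypothetical, and the <Timezone> value is left as text because it is a descriptive name, not a numeric offset.

```python
from datetime import datetime


def parse_acquisition(date_text, time_text):
    """Combine YYYY-MM-DD and HH:MM:SS strings into a naive datetime."""
    return datetime.strptime(date_text + " " + time_text, "%Y-%m-%d %H:%M:%S")


acquired = parse_acquisition("1985-10-26", "20:04:00")
```

Converting the timezone name to an offset would require a lookup table that this section does not define.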

3.5 Other optional header elements

The header may optionally include any number of other metadata elements, such as:

The formats and conventions of these optional elements are not defined, and these values shall not be required for the proper display or interpretation of the experimental data or conditions. Any scientifically meaningful metadata shall be stored within an appropriate element within the <Conditions> list (See Section 4: The <Conditions> list element.)


4. The <Conditions> list element

The <Conditions> element is a list of condition entries that may assist in the interpretation of the experimental data, such as spectrometer gains and offsets. Conditions are technique-specific, and so there will be a diverse range of possible condition elements. Templates for common conditions are discussed in Section 4.2: Conditions templates and classes, and examples are given in Appendix B.

All condition templates shall have the following base structure:

<TemplateName Class="ClassName" ID="UniqueStringOfText">
	[...]
</TemplateName>

The Class and ID attributes are optional, and need not be present for all elements in the <Conditions> list.

The templates and class names are further described in Section 4.2: Conditions templates and classes, and the ID attribute is described in Section 4.3: Condition identifiers. Note that the <Conditions> list may contain any number of entries with the same template name and/or class name. However, the ID attribute, if present, shall be unique for each condition entry.

4.1 Conditions are optional

Because of the limitless number of potentially useful condition objects, it is not reasonable to assume that all software must read or understand all condition types. Consequently, the HMSA/XML file format has been designed such that all conditions are optional. Software that reads HMSA files shall be able to read and display datasets without having to parse and understand any or all of the associated conditions (albeit without calibration or further interpretation.) Conditions therefore shall not contain any information that is required to load the dataset from the file, as the position and layout of the dataset object in the HMSA file is completely defined in the relevant dataset object (see Section 5: The <Data> list element).

This requirement is intended to ensure a universal base level of support for common dataset types, so that, for example, a program that can read and display any 2D rastered spectral map dataset can work with all 2D rastered spectral maps, from any technique (EELS, XEDS, CL, etc.)

4.2 Conditions templates and classes

The name of the condition object is called the 'template'. HMSA defines a number of condition templates to accommodate a range of common experimental techniques:

The Class attribute is used to define sub-variants of templates. For instance, the <Probe> template supports a class named "EM", which defines general electron column conditions for electron microscopes. This class may be further extended using a subclass such as "EM/TEM" for transmission electron microscopes (which may include lens modes, etc.).

Each subclass inherits the required and optional parameters of the parent template/class, as well as any restrictions on parameter values. Required parameters shall not be removed by subclasses, nor shall any restrictions on parameter ranges be violated. Consequently, an object of type <Foo Class = "Bar/Baz"> is both a valid <Foo Class = "Bar"> and a valid <Foo> object. This class hierarchy system is intended to ensure that software that can interpret an object <Foo> can validly interpret any derived sub-classes, even if not all additional parameters are read or understood.

To ensure class names are unambiguous and universally typeable, class names shall contain only Latin characters and digits from the ASCII subset of the Unicode character set (A-Z, a-z, 0-9), and the hyphen-minus '-' (U+002D). The solidus '/' (U+002F) shall only be used to delimit class/sub-class names.
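
The character rule and the class hierarchy described above can be checked mechanically. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that empty parts (e.g. "EM//TEM") are not permitted, which this specification does not state explicitly.

```python
import re

# Permitted characters per the rule above: A-Z, a-z, 0-9 and hyphen-minus.
# The solidus '/' appears only as the class/sub-class delimiter.
# Assumption: each part must be non-empty.
_CLASS_PART = r"[A-Za-z0-9-]+"
_CLASS_NAME = re.compile(rf"{_CLASS_PART}(?:/{_CLASS_PART})*")


def is_valid_class_name(name: str) -> bool:
    """Return True if every '/'-separated part uses only permitted characters."""
    return _CLASS_NAME.fullmatch(name) is not None


def class_matches(actual: str, wanted: str) -> bool:
    """Return True if class `actual` is `wanted` or one of its sub-classes.

    An empty `wanted` stands for the bare template (no Class attribute),
    which every class of that template satisfies.
    """
    if wanted == "":
        return True
    actual_parts = actual.split("/")
    wanted_parts = wanted.split("/")
    # Compare whole parts, so that "EMX" is not mistaken for a sub-class of "EM".
    return actual_parts[: len(wanted_parts)] == wanted_parts
```

With these helpers, software that understands <Probe Class="EM"> could accept a <Probe Class="EM/TEM"> condition by testing class_matches("EM/TEM", "EM").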

A list of supported templates, which is not exhaustive, is provided in Appendix B. It is expected that users of different techniques, or different vendors, may extend these templates/classes to suit their particular needs.

4.3 Condition identifiers

Each top-level element in the <Conditions> list may have a unique identifier string, declared using the ID attribute. The purpose of this attribute, in conjunction with the dataset <IncludeConditions> list, is to permit disambiguation of multiple condition elements with the same template, such as may occur in a multi-dataset map, where one condition may apply to one dataset, and another may apply to a second dataset. If the ID attribute is specified for a condition element, it shall not be shared with any other item in the <Conditions> list, regardless of template or class type. The ID string may contain any character that is permitted in XML attribute values. However, for maximum compatibility, ID strings should contain only the ASCII subset of the Unicode character set (i.e. U+0000 to U+007F).
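
As an illustration of the uniqueness requirement, the sketch below reports any ID value that is used by more than one entry in a <Conditions> list. It uses Python's standard xml.etree.ElementTree module; the function name is hypothetical and the approach is only one possible way of performing this check.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_condition_ids(conditions: ET.Element) -> list[str]:
    """Return the ID values shared by two or more top-level condition entries.

    Entries without an ID attribute are ignored, since the attribute is optional.
    The comparison is made across all templates and classes.
    """
    ids = [child.get("ID") for child in conditions if child.get("ID") is not None]
    return sorted(value for value, count in Counter(ids).items() if count > 1)


example = ET.fromstring(
    '<Conditions>'
    '<Detector Class="Spectrometer/XEDS" ID="A"/>'
    '<Probe Class="EM" ID="A"/>'
    '<Probe Class="EM"/>'
    '</Conditions>'
)
print(duplicate_condition_ids(example))
```

An empty result indicates that the list satisfies the uniqueness requirement.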


5. The <Data> list element

The <Data> element is a list of the binary datasets stored in the HMSA file. The <Data> element shall contain one or more dataset entries, which describe the address, size, and layout of the binary data within the associated HMSA file. Applications are not required to parse more than the first dataset in the HMSA XML file, but should provide warning to the user that additional unparsed datasets are present in the file.

By design, dataset definitions contain no extraneous data that is unrelated to the format of the binary data, such as experimental parameters to assist with the interpretation or display of the data. This arrangement ensures that common dataset types can be used across a range of techniques. For instance, the dataset definition for a spectral map will be identical regardless of whether the dataset was collected via XEDS, CL, EELS, Raman, etc.

By default, it is assumed that all conditions in the <Conditions> list apply to every dataset declared in the <Data> list. Optionally, datasets may explicitly specify a subset of conditions that apply using the <IncludeConditions> list, which may be necessary in multi-dataset files with multiple instances of the same condition template (see Section 5.6: <IncludeConditions>).

All dataset templates have the following base structure:

<TemplateName Class="ClassName" Name="Example">
   <DataOffset DataType="int64">123</DataOffset>
   <DataLength DataType="int64">456</DataLength>
   <DatumType SizeInBytes="2">uint16</DatumType>
   <DatumDimensions>
      [ zero or more dimension definitions ]		
   </DatumDimensions>
   <CollectionDimensions>
      [ zero or more dimension definitions ]		
   </CollectionDimensions>
   <IncludeConditions>
      [ zero or more references to conditions ]
   </IncludeConditions>
</TemplateName>

The elements of the base dataset object are defined below:

5.1 Dataset templates and classes

Datasets use the same template/class hierarchy scheme as defined for condition objects in Section 4.2: Conditions templates and classes. However, unlike conditions, the range of dataset templates and classes is strictly limited. HMSA defines only three dataset templates to accommodate a range of common experimental data types:

Each template supports a number of classes, which are defined in Appendix A.

5.2 <DataOffset> and <DataLength>

The location of the beginning of the dataset's binary data within the HMSA file is given in the <DataOffset> element, and is measured in bytes from the start of the file, in 64-bit integer precision (C type = long long). The first byte of the file has an offset of 0.

The location of the first dataset in the file shall be 8 bytes from the start, meaning there is no padding between the 8-byte UID and the first dataset. The length of the dataset's binary data within the HMSA file is given in the <DataLength> element, and is measured in bytes, in 64-bit integer precision.

Note that the <DataOffset> and <DataLength> integers are signed for compatibility with common file seeking functions (e.g. _fseeki64), which use signed integers.
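
The following sketch shows one way a reader might use the <DataOffset> and <DataLength> values to extract the raw bytes of a dataset. The function name and the error handling are illustrative assumptions rather than requirements of this specification.

```python
import io
from typing import BinaryIO


def read_dataset_bytes(stream: BinaryIO, data_offset: int, data_length: int) -> bytes:
    """Read `data_length` bytes starting `data_offset` bytes from the file start."""
    if data_offset < 0 or data_length < 0:
        raise ValueError("DataOffset and DataLength must be non-negative")
    stream.seek(data_offset)  # offset 0 is the first byte of the file
    payload = stream.read(data_length)
    if len(payload) != data_length:
        raise EOFError("file ends before the declared end of the dataset")
    return payload


# Stand-in for a binary HMSA file: an 8-byte UID followed by 4 bytes of data.
demo_file = io.BytesIO(b"UID12345" + b"\x01\x02\x03\x04")
first_dataset = read_dataset_bytes(demo_file, 8, 4)
```

Python integers are unbounded, so the 64-bit range of the two values needs no special handling in this sketch.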

If more than one dataset is present in the file, the location of subsequent datasets shall not overlap other datasets in the file, and may be:

5.3 <DatumType>

The data type of an individual iota of measurement within the data set shall be declared using the <DatumType> element, like so:

<DatumType SizeInBytes="4">int32</DatumType>

For spectra and spectral maps, this element declares the data type of a spectrum channel. For image planes and hyperimage maps, this is the type of an image pixel.

The <DatumType> element shall take one of the following values:

DatumType   C equivalent     Size (B)   Description
"byte"      unsigned char    1          Unsigned 8 bit integer
"int16"     short            2          Signed 16 bit integer
"uint16"    unsigned short   2          Unsigned 16 bit integer
"int32"     long             4          Signed 32 bit integer
"uint32"    unsigned long    4          Unsigned 32 bit integer
"int64"     long long        8          Signed 64 bit integer
"float"     float            4          32-bit IEEE 754 single-precision floating point number
"double"    double           8          64-bit IEEE 754 double-precision floating point number

The size of the individual datum, in bytes, should be declared in the SizeInBytes attribute. This value is provided for documentary reference for human readers, as HMSA-loading software shall rely upon the enumerated <DatumType> value to determine data storage.
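
A reader could hold the enumerated values in a lookup table, as in the sketch below. The names are hypothetical. Consistent with the paragraph above, the size is always derived from the <DatumType> value, and the SizeInBytes attribute is only compared against it.

```python
# Size in bytes of each permitted <DatumType> value, taken from the table above.
DATUM_SIZES = {
    "byte": 1,
    "int16": 2,
    "uint16": 2,
    "int32": 4,
    "uint32": 4,
    "int64": 8,
    "float": 4,
    "double": 8,
}


def datum_size(datum_type: str) -> int:
    """Return the storage size in bytes for an enumerated <DatumType> value."""
    try:
        return DATUM_SIZES[datum_type]
    except KeyError:
        raise ValueError(f"unknown DatumType: {datum_type!r}") from None


def size_attribute_agrees(datum_type: str, size_in_bytes: str) -> bool:
    """Report whether a SizeInBytes attribute matches the enumerated type."""
    return int(size_in_bytes) == datum_size(datum_type)
```

A mismatch reported by size_attribute_agrees might be logged as a warning, but should not change how the data are read.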

5.4 <DatumDimensions>

Datasets may consist of a single value per datum (e.g. a pixel in an image), a one-dimensional array of values per datum (e.g. a spectrum per pixel in a hyperspectral map), or a two-dimensional array of values per datum (e.g. a diffraction pattern image per pixel in a hyperimage map). The dimensionality and ordering of the datum values is defined in the <DatumDimensions> element, which shall contain zero or more <Dimension> elements, as defined below:

5.4.1 The <Dimension> element

Each <Dimension> element shall define the length of the dimension (e.g. the number of channels in a spectrum), and be of the form:

<Dimension DataType="uint32" Name="Channel">1024</Dimension>

The data type of the value of the <Dimension> element shall be explicitly declared using a DataType attribute, with the value "uint32" (C type = unsigned long integer).

The <Dimension> element shall also contain a Name attribute, which is necessary to disambiguate the order of datum dimensions in multi-dimensional data, such as with 2D images per datum. The required values of the Name attribute are defined in the cases below.

5.4.2 Datum as single values

For simple image maps, for which there is only a single value per datum (i.e. one value per pixel), the datum dimensionality is zero, and hence the <DatumDimensions> element shall be empty:

<DatumDimensions />
or, equivalently:
<DatumDimensions></DatumDimensions>

5.4.3 Datum as arrays

For datum consisting of a single array of values (e.g. a spectrum per pixel in a spectral map), the datum dimensionality is one, and the <DatumDimensions> element shall contain one <Dimension> element of the form:

<DatumDimensions>
  <Dimension DataType="uint32" Name="Channel">1024</Dimension>
</DatumDimensions>

For spectral or linear diffraction datum, the Name attribute of the datum <Dimension> element shall be "Channel". Energy, wavelength, voltage or other physically meaningful concepts are not permitted in the dimension name, as these are defined in the relevant <Detector> element in the <Conditions> list.

5.4.4 Datum as 2D arrays

For datum consisting of a 2D array of values (e.g. a diffraction pattern in a hyperimage), the datum dimensionality is two, and the <DatumDimensions> element shall contain two <Dimension> elements of the form:

<DatumDimensions>
  <Dimension DataType="uint32" Name="U">512</Dimension>
  <Dimension DataType="uint32" Name="V">400</Dimension>
</DatumDimensions>

To reduce confusion with dimension names, the names "X", "Y" and "Z" shall only be used to refer to the specimen coordinate system. For dimensions that relate to the detector, such as the horizontal and vertical axes on an EBSD camera, the dimension names should be "U", "V" and "W".

For 2D image datum, the datum dimensions shall be written in row-first order, with the U dimension preceding the V dimension.
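
As an illustration, the sketch below reads the <Dimension> entries of a dimensions list into ordered (Name, length) pairs using Python's standard xml.etree.ElementTree module. The function name and the choice of exceptions are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def read_dimensions(parent: ET.Element) -> list[tuple[str, int]]:
    """Return (Name, length) pairs, in document order, for a dimensions list."""
    dims = []
    for dim in parent.findall("Dimension"):
        if dim.get("DataType") != "uint32":
            raise ValueError("Dimension DataType must be 'uint32'")
        name = dim.get("Name")
        if name is None:
            raise ValueError("Dimension requires a Name attribute")
        dims.append((name, int((dim.text or "").strip())))
    return dims


pattern = ET.fromstring(
    '<DatumDimensions>'
    '<Dimension DataType="uint32" Name="U">512</Dimension>'
    '<Dimension DataType="uint32" Name="V">400</Dimension>'
    '</DatumDimensions>'
)
print(read_dimensions(pattern))
```

Because the list preserves document order, a reader can confirm that "U" precedes "V" before interpreting a 2D datum.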

5.4.5 Datum as 3D arrays and higher dimensionality

Higher dimensionality datum (3D, etc.) are possible, but are outside the scope of this version of this specification.

5.4.6 Order of datum in binary HMSA files

The ordering of <Dimension> elements within the <DatumDimensions> list defines the order in which the datum values are stored in the binary HMSA file. The following algorithm should be followed when reading or writing datum values from HMSA binary data files:

  1. Begin at the origin in all datum dimensions.
  2. Read/write the value of the first datum point.
  3. If no datum dimensions are defined (e.g. for single-value-per-pixel images), stop now.
  4. Step to the next datum coordinate in the first datum dimension, and read/write the datum value.
  5. Iterate steps 2-4 until reaching the end of the first datum dimension.
  6. If only one datum dimension is defined (e.g. for spectrum maps), stop now.
  7. Step to the next datum coordinate in the second datum dimension, and return to the origin of the first datum dimension.
  8. Repeat steps 2-6 until reaching the end of the second datum dimension.

Datum dimensionality greater than two is not defined by this document, but may be supported by extension of the above process for 3rd and higher dimensions.
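
The storage order produced by the steps above can be illustrated with a short generator that yields datum coordinates in the order their values appear in the file, with the first declared dimension varying fastest. This is a sketch with a hypothetical function name; it happens to generalise to any number of dimensions, although only zero, one and two are defined here.

```python
from itertools import product
from typing import Iterator, Sequence


def datum_coordinates(sizes: Sequence[int]) -> Iterator[tuple[int, ...]]:
    """Yield datum coordinates in storage order for the given dimension lengths.

    `sizes` lists the lengths in the order declared in <DatumDimensions>.
    itertools.product varies its last argument fastest, so the dimensions are
    reversed on the way in and each coordinate is reversed on the way out.
    """
    for reversed_coord in product(*(range(n) for n in reversed(sizes))):
        yield tuple(reversed(reversed_coord))


# A 3 x 2 (U x V) datum: all U positions of row V = 0 come first.
order = list(datum_coordinates([3, 2]))
```

For an empty <DatumDimensions> list the generator yields a single empty coordinate, corresponding to the one value stored per datum.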

5.5 <CollectionDimensions>

The <CollectionDimensions> list element functions analogously to the <DatumDimensions> element (see Section 5.4: <DatumDimensions>), and defines the dimensionality and order of the collection of datum across, or through, the specimen. The <CollectionDimensions> list shall contain zero or more <Dimension> elements, depending on the type of dataset:

The example below shows the <CollectionDimensions> element for a 3-dimensional serial section image raster using the <ImageRaster Class="3D"> dataset template:

<CollectionDimensions>
    <Dimension DataType="uint32" Name="X">512</Dimension>
    <Dimension DataType="uint32" Name="Y">400</Dimension>
    <Dimension DataType="uint32" Name="Z">256</Dimension>
</CollectionDimensions>

5.5.1 Order of data points in binary HMSA files

The ordering of <Dimension> elements within the <CollectionDimensions> list defines the order in which the point data is stored in the binary HMSA file. The following algorithm should be followed when reading or writing data from HMSA binary data files:

  1. Begin at the origin in all collection dimensions (e.g. x = y = 0 in an XY raster map)
  2. Read/write the datum of the first point, in accordance with the <DatumDimensions> definition.
  3. Step to the next coordinate in the first collection dimension, and read/write the point datum.
  4. Iterate steps 2-3 until reaching the end of the first collection dimension.
  5. If only one collection dimension is defined, stop now.
  6. Step to the next coordinate in the second collection dimension, and return to the origin of the first collection dimension.
  7. Repeat steps 2-5 until reaching the end of the second collection dimension.
  8. If only two collection dimensions are defined, stop now.
  9. Step to the next coordinate in the third collection dimension, and return to the origin of the first and second collection dimensions.
  10. Repeat steps 2-7 until reaching the end of the third collection dimension.

Collection dimensionality greater than three is not defined by this document, but may be supported by extension of the above process for 4th and higher dimensions.

The address offset of collection point (i, j, k) relative to the start of the dataset object, in a dataset with a size in 3 dimensions of (n_i, n_j, n_k), may be expressed as:

Offset = PointSize × ( k·n_i·n_j + j·n_i + i )

...where PointSize is the size, in bytes, of all the datum of a single collection point.

For a 1 dimensional collection (e.g. linescan), the j and k terms are zero, and the offset reduces to PointSize × i. For a 2 dimensional collection (e.g. a map), the k term is zero.
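
The offset expression can be written as a short function, sketched below with hypothetical names. PointSize is taken here to be the datum size in bytes multiplied by the lengths of all datum dimensions, which follows from the definition above. The result is relative to the start of the dataset, so the <DataOffset> value must be added to obtain a position in the file.

```python
from math import prod
from typing import Sequence


def point_size(datum_size_bytes: int, datum_dims: Sequence[int]) -> int:
    """Bytes occupied by all datum values of one collection point."""
    return datum_size_bytes * prod(datum_dims)  # prod([]) == 1 for single values


def collection_offset(size: int, index: Sequence[int], dims: Sequence[int]) -> int:
    """Byte offset of a collection point relative to the start of the dataset.

    `index` and `dims` follow the order declared in <CollectionDimensions>,
    so the first entry varies fastest in the file.
    """
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same number of entries")
    linear, stride = 0, 1
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError("collection index out of range")
        linear += i * stride
        stride *= n
    return size * linear


# uint16 spectra of 1024 channels in a 512 x 400 x 256 raster, point (3, 2, 1).
size = point_size(2, [1024])
offset = collection_offset(size, (3, 2, 1), (512, 400, 256))
```

The same function covers linescans and maps by passing one or two dimensions respectively.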

5.6 <IncludeConditions>

The <IncludeConditions> element contains zero or more references to the conditions that can be used to interpret the data in the dataset. If the <IncludeConditions> list is empty, all conditions specified in the <Conditions> list are assumed to apply to the dataset.

Condition references in the <IncludeConditions> list, if used, shall take the following form:

<ConditionTemplateName>ConditionIdentifier</ConditionTemplateName>

...where <ConditionTemplateName> matches the template name for the condition (e.g. <Probe>, <Detector>, etc.). The ConditionIdentifier value shall match the ID attribute of the element referenced in the <Conditions> list.

For example, to reference a condition defined in the <Conditions> list as follows:

<Detector Class = "Spectrometer/XEDS" ID="XFLASH 5010">
[...]
</Detector>
...the entry in the dataset's <IncludeConditions> list would be:
<Detector>XFLASH 5010</Detector>
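
The lookup described in this section might be implemented along the lines of the following sketch, which uses Python's standard xml.etree.ElementTree module. The function name is hypothetical. Treating a missing <IncludeConditions> element in the same way as an empty one, and raising an error for an unresolved reference, are assumptions of the example.

```python
import xml.etree.ElementTree as ET


def applicable_conditions(conditions: ET.Element, dataset: ET.Element) -> list[ET.Element]:
    """Return the condition entries that apply to a dataset."""
    include = dataset.find("IncludeConditions")
    if include is None or len(include) == 0:
        return list(conditions)  # default: every condition applies
    selected = []
    for ref in include:
        wanted_id = ref.text or ""
        for cond in conditions:
            # The reference must match both the template name and the ID.
            if cond.tag == ref.tag and cond.get("ID") == wanted_id:
                selected.append(cond)
                break
        else:
            raise LookupError(f"no <{ref.tag}> condition with ID {wanted_id!r}")
    return selected


conditions = ET.fromstring(
    '<Conditions>'
    '<Detector Class="Spectrometer/XEDS" ID="XFLASH 5010"/>'
    '<Detector Class="Spectrometer/XEDS" ID="Other"/>'
    '<Probe Class="EM"/>'
    '</Conditions>'
)
dataset = ET.fromstring(
    '<ImageRaster Class="2D/Spectral">'
    '<IncludeConditions><Detector>XFLASH 5010</Detector></IncludeConditions>'
    '</ImageRaster>'
)
chosen = applicable_conditions(conditions, dataset)
```

Here only the first detector is selected. The sketch compares the reference text with the ID attribute exactly, without trimming whitespace.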

Appendix A - Data set templates and classes

Note: Where data types or units are defined in the below templates, they shall be used as defined.

<Analysis>

The <Analysis> dataset template is used to store a single measurement of a specimen at a single point in space or time. This template does not specify the datum dimensionality.

Restrictions:

The <CollectionDimensions> element shall contain no entries, like so:

<CollectionDimensions />

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<Analysis Class="1D">

The <Analysis Class="1D"> dataset template is used to store a measurement of a specimen at a single point in space or time with one datum dimension, such as a spectrum.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly one <Dimension> item, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="Channel">4096</Dimension>
</DatumDimensions>

The Name attribute for the dimension shall take the value "Channel".

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<Analysis Class="2D">

The <Analysis Class="2D"> dataset template is used to store a single measurement of the specimen at a single point in space or time with two datum dimensions, such as a diffraction pattern.

This dataset type shall not be used to store 2 dimensional images rastered over the specimen, such as a conventional TEM or SEM image. Instead, such data shall be stored using the <ImageRaster Class="2D"> dataset template.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly two <Dimension> items, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="U">123</Dimension>
<Dimension DataType="uint32" Name="V">456789</Dimension>
</DatumDimensions>

The Name attributes for the dimensions shall take the values "U" and "V", in that order.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<AnalysisList>

The <AnalysisList> dataset template represents a sequence of point measurements collected under the same conditions but in an irregular pattern, such as a line scan, a time sequence, or sparsely scanned images. The data in the HMSA file is stored analysis-by-analysis, without padding. This template does not specify the datum dimensionality.

Restrictions:

The <CollectionDimensions> element shall contain exactly one <Dimension> item, like so:

<CollectionDimensions>
<Dimension DataType="uint32" Name="Analysis">12568</Dimension>
</CollectionDimensions>

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<AnalysisList Class="1D">

The <AnalysisList Class="1D"> dataset template represents a sequence of point measurements with one datum dimension, such as a spectrum.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly one <Dimension> item, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="Channel">4096</Dimension>
</DatumDimensions>

The Name attribute for the dimension shall take the value "Channel".

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<AnalysisList Class="2D">

The <AnalysisList Class="2D"> dataset template represents a sequence of point measurements with two datum dimensions, such as a diffraction pattern.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly two <Dimension> items, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="U">123</Dimension>
<Dimension DataType="uint32" Name="V">456</Dimension>
</DatumDimensions>

The Name attributes for the dimensions shall take the values "U" and "V", in that order.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster>

The <ImageRaster> dataset template represents a dataset that has been rastered over regularly spaced intervals in one or more dimensions, such as a 1D linescan, a 2D image, or a 3D serial section. This template does not specify the datum dimensionality.

Restrictions:

The <CollectionDimensions> list shall contain one or more <Dimension> elements, which shall be of the form:

<Dimension DataType="uint32" Name="ABC">314159</Dimension>

The Name attribute for each dimension shall be unique.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster Class="2D">

The <ImageRaster Class="2D"> dataset template represents a dataset that has been raster mapped in 2D (X/Y dimensions). This template does not specify the datum dimensionality.

Inheritance:

Restrictions:

The <CollectionDimensions> element shall contain exactly two <Dimension> items, like so:

<CollectionDimensions>
<Dimension DataType="uint32" Name="X">105</Dimension>
<Dimension DataType="uint32" Name="Y">98</Dimension>
</CollectionDimensions>

For X/Y spatially rastered data, the Name attributes for the dimensions shall be "X" and "Y", in that order.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster Class="2D/Spectral">

The <ImageRaster Class="2D/Spectral"> dataset template represents a dataset that has been raster mapped in 2D (X/Y dimensions), where for each raster coordinate, the datum collected was a 1D array (channel dimension). An example of this type of dataset is a SEM-XEDS map.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly one <Dimension> item, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="Channel">4096</Dimension>
</DatumDimensions>

The Name attribute for the dimension shall take the value "Channel".

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster Class="2D/Hyperimage">

The <ImageRaster Class="2D/Hyperimage"> dataset template represents a dataset that has been raster mapped in 2D (X/Y dimensions), where for each raster coordinate, the datum collected was a 2D image (U/V dimensions). An example of this type of dataset is a SEM-EBSD map.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly two <Dimension> items, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="U">123</Dimension>
<Dimension DataType="uint32" Name="V">456</Dimension>
</DatumDimensions>

The Name attributes for the dimensions shall take the values "U" and "V", in that order.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster Class="3D">

The <ImageRaster Class="3D"> dataset template represents a dataset that has been raster mapped in 3D (X/Y/Z dimensions). This template does not specify the datum dimensionality.

Inheritance:

Restrictions:

The <CollectionDimensions> element shall contain exactly three <Dimension> items, like so:

<CollectionDimensions>
<Dimension DataType="uint32" Name="X">105</Dimension>
<Dimension DataType="uint32" Name="Y">98</Dimension>
<Dimension DataType="uint32" Name="Z">591</Dimension>
</CollectionDimensions>

For X/Y/Z spatially rastered data, the Name attributes for the dimensions shall be "X", "Y" and "Z", in that order.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster Class="3D/Spectral">

The <ImageRaster Class="3D/Spectral"> dataset template represents a dataset that has been raster mapped in 3D (X/Y/Z dimensions), where for each raster coordinate, the datum collected was a 1D array (channel dimension). An example of this type of dataset is a 3D serial section XEDS map.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly one <Dimension> item, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="Channel">4096</Dimension>
</DatumDimensions>

The Name attribute for the dimension shall take the value "Channel".

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):

<ImageRaster Class="3D/Hyperimage">

The <ImageRaster Class="3D/Hyperimage"> dataset template represents a dataset that has been raster mapped in 3D (X/Y/Z dimensions), where for each raster coordinate, the datum collected was a 2D image (U/V dimensions). An example of this type of dataset is a 3D serial section EBSD map.

Inheritance:

Restrictions:

The <DatumDimensions> element shall contain exactly two <Dimension> items, like so:

<DatumDimensions>
<Dimension DataType="uint32" Name="U">123</Dimension>
<Dimension DataType="uint32" Name="V">456</Dimension>
</DatumDimensions>

The Name attributes for the dimensions shall take the values "U" and "V", in that order.

Recommended conditions:

The following conditions should be present in the <Conditions> list, and referenced in the dataset's <IncludeConditions> list (if used):


Appendix B - Condition templates and classes

This list of condition templates is incomplete. If you would like to recommend changes to a template, or contribute a new template, please contact the HMSA working group via the Chair, Nestor J. Zaluzec.

Note: Where data types or units are defined in the below templates, they shall be used as defined.

<Acquisition>

The <Acquisition> condition template is a generic object that describes the position and duration of one or more measurements of the specimen. This template should not be used directly. Instead, use a subclass appropriate for the type of acquisition, such as:

Optional elements:

<DwellTime Unit="s" DataType="float">35.0</DwellTime>
<TotalTime Unit="s" DataType="float">14400.0</TotalTime>
<DwellTime_Live Unit="s" DataType="float">35.0</DwellTime_Live>

The <DwellTime> element defines the uniform real time taken for each individual measurement, such as a point spectrum acquisition, a single point in a linescan, or a pixel in a map. The <DwellTime_Live> element defines the analogous detector live time for each individual measurement, if known.

If the acquisition includes multiple measurements (such as a linescan or map), the <TotalTime> element defines the total real time taken to collect all measurements in the acquisition set.

<Acquisition Class="Point">

The <Acquisition Class="Point"> condition template defines the position and duration for a singular measurement of the specimen, such as may be used with an <Analysis> dataset.

Inheritance:

Required elements:

<SpecimenPosition>...</SpecimenPosition>

<Acquisition Class="Multipoint">

The <Acquisition Class="Multipoint"> condition template defines the position and duration of an irregular sequence of measurements of the specimen, such as may be used with an <AnalysisList> dataset.

Inheritance:

Required elements:

<PointCount DataType="uint32">256</PointCount>

Optional elements:

<Positions>
   <SpecimenPosition>...</SpecimenPosition>
   <SpecimenPosition>...</SpecimenPosition>
   [...]
</Positions>  

If the <Positions> list element is defined, the number of <SpecimenPosition> elements within the list shall be equal to the value of <PointCount>.
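
A reader or validator could check this constraint as sketched below, using Python's standard xml.etree.ElementTree module. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def positions_match_point_count(acquisition: ET.Element) -> bool:
    """Check that <Positions>, if present, holds <PointCount> position entries."""
    count_text = acquisition.findtext("PointCount")
    if count_text is None:
        raise ValueError("<PointCount> is required for the Multipoint class")
    positions = acquisition.find("Positions")
    if positions is None:
        return True  # the <Positions> list is optional
    return len(positions.findall("SpecimenPosition")) == int(count_text)


multipoint = ET.fromstring(
    '<Acquisition Class="Multipoint">'
    '<PointCount DataType="uint32">2</PointCount>'
    '<Positions><SpecimenPosition/><SpecimenPosition/></Positions>'
    '</Acquisition>'
)
consistent = positions_match_point_count(multipoint)
```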

<Acquisition Class="Raster">

The <Acquisition Class="Raster"> condition template is a generic object that defines the position and duration of a regular raster over the specimen. This template should not be used directly. Instead, use a subclass appropriate for the type of raster, such as:

Inheritance:

Optional elements:

<RasterMode>Stage|Beam</RasterMode>
<SpecimenPosition Name="Start">...</SpecimenPosition>

One <SpecimenPosition> element may be provided to reference the raster to a physical location on (or inside) the specimen. The Name attribute is required, and may take the value of "Start" (meaning the 1st ordinal in all raster axes) or "Center" (meaning the mid-point in all raster axes).

<Acquisition Class="Raster/Linescan">

The <Acquisition Class="Raster/Linescan"> condition template defines the position and duration of a one-dimensional raster over the specimen, such as may be used with an <AnalysisList> dataset. This template applies only to a linear sequence of steps, using equal step sizes and dwell times for each measurement. For irregular step sizes, refer to the <Acquisition Class="Multipoint"> template.

Inheritance:

Required elements:

<StepCount DataType="uint32">1024</StepCount>

Optional elements:

<StepSize Unit="um" DataType="float">10.</StepSize>
<SpecimenPosition Name="Start">
   <X Unit="mm" DataType="float">0.0</X>
   <Y Unit="mm" DataType="float">0.0</Y>
   <Z Unit="mm" DataType="float">10.0</Z>
</SpecimenPosition>
<SpecimenPosition Name="End">
   <X Unit="mm" DataType="float">10.24</X>
   <Y Unit="mm" DataType="float">0.0</Y>
   <Z Unit="mm" DataType="float">10.0</Z>
</SpecimenPosition>
<FrameCount DataType="uint32">40</FrameCount>

If the two optional <SpecimenPosition> elements are defined in the <Acquisition Class="Raster/Linescan"> element, the Name attributes must be "Start" and "End", respectively, so as to uniquely define the linescan direction.

<Acquisition Class="Raster/XY">

The <Acquisition Class="Raster/XY"> condition template defines the position and duration of a two-dimensional X/Y raster over the specimen, such as may be used with an <ImageRaster Class="2D"> dataset.

Inheritance:

Required elements:

<XStepCount DataType="uint32">158</XStepCount>
<YStepCount DataType="uint32">98</YStepCount>

Optional elements:

<XStepSize Unit="um" DataType="float">1.</XStepSize>
<YStepSize Unit="um" DataType="float">1.</YStepSize>
<FrameCount DataType="uint32">40</FrameCount>

The units for X and Y step sizes shall be length, using any SI prefix from pm to m ('um' is default; see Appendix C - Units and prefixes).

<Acquisition Class="Raster/XYZ">

The <Acquisition Class="Raster/XYZ"> condition template defines the position and duration of a three-dimensional X/Y/Z raster over the specimen, such as may be used with an <ImageRaster Class="3D"> dataset.

Inheritance:

Required elements:

<XStepCount DataType="uint32">158</XStepCount>
<YStepCount DataType="uint32">98</YStepCount>
<ZStepCount DataType="uint32">185</ZStepCount>

Optional elements:

<XStepSize Unit="um" DataType="float">1.</XStepSize>
<YStepSize Unit="um" DataType="float">1.</YStepSize>
<ZStepSize Unit="um" DataType="float">1.</ZStepSize>
<ZRasterMode>FIB|etc</ZRasterMode>

The units for X and Y step sizes shall be length, using any SI prefix from pm to m ('um' is default).

If the units for the Z dimension are length, the Unit attribute for Z shall use any SI prefix from pm to m ('um' is default). If the units for the Z dimension are not length (e.g. etch time), appropriate units and prefixes should be used (See Appendix C).

<Calibration>

The <Calibration> condition template describes the calibration of a set of measurement ordinals with respect to a physical quantity, such as converting channels in an EELS spectrum to energy, or steps in a WDS peak scan to position, angle, wavelength or energy. This template should not be used directly. Instead, use a subclass appropriate for the calibration relationship, such as:

Note that calibration condition objects are embedded within other conditions, such as the <Detector Class = "Spectrometer"> condition template.

Required elements:

<Quantity>Energy</Quantity>
<Unit>eV</Unit>

The <Quantity> element defines the physical quantity of the calibration object, such as "Energy", "Wavelength", "Position", etc.

The list of permitted values of the <Unit> element is defined in Appendix C - Units and prefixes.

<Calibration Class="Constant">

The <Calibration Class="Constant"> element defines the energy/wavelength/etc calibration of a spectrometer or other measurement device operating at a fixed position, such as a CL monochromator.

Inheritance:

Required elements:

<Value DataType="float">-237.098251</Value>

The <Value> element does not include a declaration of physical units using a Unit attribute. Instead, the physical unit is declared in the separate <Unit> element, as required by the base <Calibration> template.

Example:

<Calibration Class="Constant">
  <Quantity>Energy</Quantity>
  <Unit>eV</Unit>
  <Value DataType="float">-237.098251</Value>
</Calibration>

<Calibration Class="Linear">

The <Calibration Class="Linear"> element defines the energy/wavelength/etc calibration of a spectrometer or other measurement device, for which the measurement ordinals (e.g. channel numbers) have a linear relationship to the physical quantity (e.g. nm), with a constant offset and gain.

Inheritance:

Required elements:

<Gain DataType="float">2.49985</Gain>
<Offset DataType="float">-237.098251</Offset>

The <Gain> and <Offset> elements do not include declarations of physical units using a Unit attribute. Instead, the physical unit is declared for both parameters in the separate <Unit> element, as required by the base <Calibration> template.

The value of <Offset> shall be the calibration value (energy, wavelength, position, etc.) corresponding to the first measurement ordinal. For example, with an XEDS detector, the value of the <Offset> element is the energy of channel 0.

Example:

<Calibration Class="Linear">
  <Quantity>Energy</Quantity>
  <Unit>eV</Unit>
  <Gain DataType="float">2.49985</Gain>
  <Offset DataType="float">-237.098251</Offset>
</Calibration>
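
As an illustration of the linear relationship, the following Python sketch reads the example above and converts an ordinal to a calibrated value as Offset + Gain × ordinal. It assumes that the first measurement ordinal is numbered 0, in line with the XEDS case where <Offset> is the energy of channel 0. The function name linear_calibration is hypothetical and is not defined by this specification.

```python
import xml.etree.ElementTree as ET

LINEAR_XML = """
<Calibration Class="Linear">
  <Quantity>Energy</Quantity>
  <Unit>eV</Unit>
  <Gain DataType="float">2.49985</Gain>
  <Offset DataType="float">-237.098251</Offset>
</Calibration>
"""

def linear_calibration(xml_text):
    """Return (quantity, unit, converter) for a linear calibration object."""
    node = ET.fromstring(xml_text)
    gain = float(node.findtext("Gain"))
    offset = float(node.findtext("Offset"))

    def to_value(ordinal):
        # ASSUMPTION: ordinals are numbered from 0, so ordinal 0 maps to Offset.
        return offset + gain * ordinal

    return node.findtext("Quantity"), node.findtext("Unit"), to_value

quantity, unit, channel_to_energy = linear_calibration(LINEAR_XML)
print(quantity, unit, channel_to_energy(0), channel_to_energy(100))
```

Both <Gain> and <Offset> are read in the unit declared by the <Unit> element, so the converted values here are in eV.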

<Calibration Class="Polynomial">

The <Calibration Class="Polynomial"> element defines the energy/wavelength/etc calibration of a spectrometer or other measurement device, for which the measurement ordinals (e.g. channel numbers) have a relationship to the physical quantity (e.g. nm) that may be modelled by an nth order polynomial. The coefficients are expressed as an array of floating point values (see Section 2.5.2: Arrays of values).

Inheritance:

Required elements:

<Coefficients DataType="array:float" Count="4">-2.225, 0.677, 0.134, -0.018</Coefficients>

The coefficients do not include declarations of physical units using a Unit attribute. Instead, the physical unit is declared for all coefficients in the separate <Unit> element, as required by the base <Calibration> template.

Example:

<Calibration Class="Polynomial">
  <Quantity>Energy</Quantity>
  <Unit>eV</Unit>
  <Coefficients DataType="array:float" Count="4">-2.225, 0.677, 0.134, -0.018</Coefficients>
</Calibration>
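
The following Python sketch shows one way the coefficient array might be read and evaluated. Note that this section does not state the order of the coefficients; the sketch ASSUMES ascending powers of the ordinal (c0 + c1·n + c2·n² + ...), and a reader should confirm the ordering before relying on it. The function names read_coefficients and evaluate are hypothetical.

```python
import xml.etree.ElementTree as ET

POLYNOMIAL_XML = """
<Calibration Class="Polynomial">
  <Quantity>Energy</Quantity>
  <Unit>eV</Unit>
  <Coefficients DataType="array:float" Count="4">-2.225, 0.677, 0.134, -0.018</Coefficients>
</Calibration>
"""

def read_coefficients(xml_text):
    """Parse the comma separated coefficient array and check its Count."""
    node = ET.fromstring(xml_text).find("Coefficients")
    coefficients = [float(item) for item in node.text.split(",")]
    if len(coefficients) != int(node.get("Count")):
        raise ValueError("Count attribute does not match the number of coefficients")
    return coefficients

def evaluate(coefficients, ordinal):
    """Evaluate the polynomial at one ordinal using Horner's scheme.

    ASSUMPTION: coefficients are in ascending order of power; the
    specification text does not say which order is intended.
    """
    value = 0.0
    for coefficient in reversed(coefficients):
        value = value * ordinal + coefficient
    return value

coefficients = read_coefficients(POLYNOMIAL_XML)
print([evaluate(coefficients, n) for n in range(3)])
```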

<Calibration Class="Explicit">

The <Calibration Class="Explicit"> element defines the energy/wavelength/etc calibration of a spectrometer or other measurement device, for which the relationship between the measurement ordinals (e.g. channel numbers) and the physical quantity (e.g. nm) cannot be adequately modelled by linear or polynomial functions, and therefore must be declared explicitly for each ordinal as an array of floating point values (see Section 2.5.2: Arrays of values).

Inheritance:

Required elements:

<Values DataType="array:float" Count="1024">
198.557114, 199.364639, 200.172089, 200.979446, 201.786743, 202.593948, ...
</Values>

The array values do not include declarations of physical units using a Unit attribute. Instead, the physical unit is declared in the separate <Unit> element, as required by the base <Calibration> template.

Optional elements:

<Labels>Peak, BgndLow, BgndHigh</Labels>

The <Labels> element may be used to provide a comma-separated list of text labels for each of the calibration points. If used, the number of text strings in the <Labels> element shall be equal to the number of items in the <Values> array.

Example:

<Calibration Class="Explicit">
  <Quantity>Wavelength</Quantity>
  <Unit>nm</Unit>
  <Values DataType="array:float" Count="1024">
    198.557114, 199.364639, 200.172089, 200.979446, 201.786743, 202.593948, ...
  </Values>
</Calibration>
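
The following Python sketch shows how a reader might check the two count rules for an explicit calibration: the Count attribute against the number of values, and the number of labels against the number of values. The three-point calibration used here is a hypothetical, shortened example built for illustration; it is not taken from a real instrument, and the function name read_explicit is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-point calibration (e.g. a peak and two background positions).
EXPLICIT_XML = """
<Calibration Class="Explicit">
  <Quantity>Wavelength</Quantity>
  <Unit>nm</Unit>
  <Values DataType="array:float" Count="3">198.557114, 199.364639, 200.172089</Values>
  <Labels>Peak, BgndLow, BgndHigh</Labels>
</Calibration>
"""

def read_explicit(xml_text):
    """Return (values, labels); labels is None when <Labels> is absent."""
    node = ET.fromstring(xml_text)
    values_node = node.find("Values")
    values = [float(item) for item in values_node.text.split(",")]
    if len(values) != int(values_node.get("Count")):
        raise ValueError("Count attribute does not match the number of values")

    labels = None
    labels_text = node.findtext("Labels")
    if labels_text is not None:
        labels = [item.strip() for item in labels_text.split(",")]
        if len(labels) != len(values):
            raise ValueError("number of labels differs from number of values")
    return values, labels

values, labels = read_explicit(EXPLICIT_XML)
print(dict(zip(labels, values)))
```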

<Composition>

The <Composition> conditions template is a generic object that describes the composition of a material. This template should not be used directly. Instead, use a subclass appropriate for the composition measurement, such as <Composition Class="Elemental">.

Required elements:

<Unit>wt%</Unit>

Compositions shall be specified in units of "atoms", "mol%", "wt%", "vol%" (or equivalent ppm/ppb units) using the <Unit> element. The concentrations of all constituents (e.g. elements, molecules, etc.) shall be defined using the same unit.

Optional elements:

<Components>
   [...]
</Components>

The <Components> list may be used to store the abundance of each component of the composition. The formatting of the items in the <Components> list is determined by the subclass, e.g. <Composition Class="Elemental">.

The measurement units for the composition are defined by the <Unit> element, and therefore separate Unit attributes for each individual component shall not be used.

<Composition Class="Elemental">

The <Composition Class = "Elemental"> conditions template defines the composition of a material in terms of its constituent elements.

Inheritance:

Required elements:

Each <Composition Class="Elemental"> element shall contain one or more <Element> elements within the <Components> list, defined like so:

<Element Z="11" DataType="float">3.</Element>

Example:

<Composition Class="Elemental">
   <Unit>atoms</Unit>
   <Components>
      <Element Z="11" DataType="float">3.</Element>
      <Element Z="13" DataType="float">1.</Element>
      <Element Z="9" DataType="float">6.</Element>
   </Components>
</Composition>
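
As a sketch of how this example might be consumed, the following Python code reads the cryolite composition into a mapping from atomic number to abundance, and then derives atomic percentages from the atom counts. The conversion is ordinary arithmetic and is not a procedure defined by this specification; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

CRYOLITE_XML = """
<Composition Class="Elemental">
   <Unit>atoms</Unit>
   <Components>
      <Element Z="11" DataType="float">3.</Element>
      <Element Z="13" DataType="float">1.</Element>
      <Element Z="9" DataType="float">6.</Element>
   </Components>
</Composition>
"""

def read_elemental_composition(xml_text):
    """Return (unit, {atomic number: abundance}) for an elemental composition."""
    node = ET.fromstring(xml_text)
    amounts = {
        int(element.get("Z")): float(element.text)
        for element in node.findall("Components/Element")
    }
    return node.findtext("Unit"), amounts

def atoms_to_atomic_percent(amounts):
    """Convert atom counts to atomic percent (illustrative arithmetic only)."""
    total = sum(amounts.values())
    return {z: 100.0 * count / total for z, count in amounts.items()}

unit, amounts = read_elemental_composition(CRYOLITE_XML)
percent = atoms_to_atomic_percent(amounts)
print(unit, amounts, percent)
```

The single <Unit> element applies to every component, which is why no per-element Unit attribute is read.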

<Detector>

The <Detector> condition template is a generic object that describes the type and configuration of a detector used to collect a HMSA dataset. This template should not be used directly. Instead, use a subclass appropriate for the type of detector, such as <Detector Class="Spectrometer">.

Optional elements:

<SignalType>EDS|WDS|ELS|AES|PES|XRF|CLS|GAM</SignalType>
<Manufacturer>Example Inc.</Manufacturer>
<Model>Example Model 123</Model>
<SerialNumber>12345-abc-67890</SerialNumber>
<MeasurementUnit>counts</MeasurementUnit>
<Elevation Unit="degrees" DataType = "float">45.</Elevation>
<Azimuth Unit="degrees" DataType = "float">0.</Azimuth>
<Distance Unit="mm" DataType = "float">50</Distance>
<Area Unit="mm2" DataType = "float">20</Area>
<SolidAngle Unit="sr" DataType = "float">1.</SolidAngle>
<SemiAngle Unit="mrad" DataType = "float">3.4</SemiAngle>
<Temperature Unit="degreesC" DataType="float">-20.0</Temperature>

If the <MeasurementUnit> element is not specified, a default value of "counts" should be assumed.

The recognised values of the <SignalType> element are the same as those defined in the EMSA/MAS spectrum file specification, namely:

Furthermore, this specification defines additional values for <SignalType> not included in the EMSA/MAS spectrum file specification:

<Detector Class="Camera">

The <Detector Class="Camera"> condition template describes the calibration and collection mode of a camera used to collect a HMSA dataset, such as an EBSD or TEM camera. The camera detector is expected to have two datum axes (U and V) which are, in general, assumed to be independent of the specimen coordinate dimensions (X/Y/Z). In instances where the camera axes may be considered horizontal and vertical, the U axis shall be horizontal, and the V axis shall be vertical.

Inheritance:

Required elements:

<UPixelCount DataType="uint32">512</UPixelCount>
<VPixelCount DataType="uint32">400</VPixelCount>

Optional elements:

<ExposureTime Unit="ms" DataType="float">200.</ExposureTime>
<Magnification DataType="float">4.5</Magnification>
<FocalLength Unit="mm" DataType="float">80.</FocalLength>

<Detector Class="Spectrometer">

The <Detector Class="Spectrometer"> condition template describes the calibration and collection mode of a spectrometer used to collect a HMSA dataset.

Inheritance:

Required elements:

<ChannelCount DataType="uint32">4096</ChannelCount>
<Calibration Class="...">[...]</Calibration>

Optional elements:

<CollectionMode>Parallel|Serial</CollectionMode>

<Detector Class="Spectrometer/CL">

The <Detector Class="Spectrometer/CL"> condition template describes the type and configuration of a cathodoluminescence spectrometer.

Inheritance:

Optional elements:

<Grating-d Unit="mm-1" DataType="float">800</Grating-d>

Restrictions:

If the spectrometer is operating as a monochromator (e.g. monochromatic CL mapping), the calibration definition — as inherited from the <Detector Class="Spectrometer"> base template — shall be of type <Calibration Class="Constant">.

<Detector Class="Spectrometer/WDS">

The <Detector Class="Spectrometer/WDS"> condition template describes the type and configuration of a wavelength dispersive x-ray spectrometer.

Inheritance:

Optional elements:

<DispersionElement>TAP|LIF|PET|PETJ|PETH|LDE1|etc</DispersionElement> 
<Crystal-2d Unit="Å" DataType="float">8.742</Crystal-2d> 
<RowlandCircleDiameter Unit="mm" DataType="float">140.</RowlandCircleDiameter>
<PulseHeightAnalyser> 
   <Bias Unit="V" DataType="float">1700.</Bias> 
   <Gain DataType="float">16.</Gain> 
   <BaseLevel Unit="V" DataType="float">0.7</BaseLevel> 
   <Window Unit="V" DataType="float">9.3</Window> 
   <Mode>Integral|Differential</Mode> 
</PulseHeightAnalyser>
<Window>
    <Layer Material="Al|Be|etc" Unit="um" DataType="float">1.</Layer>
    [ multiple layers are supported ]
</Window>

Restrictions:

If the spectrometer is operating as a monochromator (e.g. WDS mapping), the calibration definition — as inherited from the <Detector Class="Spectrometer"> base template — shall be of type <Calibration Class="Constant">.

<Detector Class="Spectrometer/XEDS">

The <Detector Class="Spectrometer/XEDS"> condition template describes the type and configuration of an energy dispersive x-ray spectrometer.

Inheritance:

Optional elements:

<Technology>Ge|SiLi|SDD|microcalorimeter</Technology>
<NominalThroughput Unit="kcounts/s" DataType="float">180.</NominalThroughput>
<TimeConstant Unit="us" DataType="float">11.1</TimeConstant>
<StrobeRate Unit="Hz" DataType="float">2000</StrobeRate>
<Window>
    <Layer Material="Al|Be|etc" Unit="um" DataType="float">1.</Layer>
    [ multiple layers are supported ]
</Window>

<ElementalID>

The <ElementalID> condition template defines an elemental identification, as may be useful for region of interest images, XAFS spectral maps, and the like.

Required elements:

<Element DataType="uint32" Symbol="Na">11</Element>

Note that the Symbol attribute is optional, and is only provided for human readability. The elemental identification shall only be determined from the atomic number provided in the <Element> element.

<ElementalID Class="X-ray">

The <ElementalID Class="X-ray"> condition template defines an elemental identification based on an x-ray peak, as may be useful for region of interest images and the like.

Inheritance:

Required elements:

<Line>Ma</Line>

X-ray line names may be given in Siegbahn or IUPAC naming conventions. In either case, the notation should be specified using the Notation attribute, and alternative notations may be provided using the alt-IUPAC and alt-Siegbahn attributes, as shown below.

<Line Notation = "IUPAC" alt-Siegbahn = "Ma">M5-N6,7</Line>
<Line Notation = "Siegbahn" alt-IUPAC = "L2-M4">Lb1</Line>

For compatibility reasons, the Siegbahn notation shall use the Latin characters a, b, g, z and n in place of the Greek α, β, γ, ζ and η. Similarly, for the IUPAC notation, numerals shall not be given in subscripts.
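
The substitution rule can be sketched in Python as a single character translation. The helper name normalise_line_name is hypothetical, and the handling of Unicode subscript digits (U+2080 to U+2089) is an assumption about how subscripts might appear in input text.

```python
# Greek letters used in Siegbahn names, followed by Unicode subscript digits,
# each mapped to the plain Latin character required in HMSA files.
_SUBSTITUTIONS = str.maketrans(
    "αβγζη₀₁₂₃₄₅₆₇₈₉",
    "abgzn0123456789",
)

def normalise_line_name(name):
    """Replace Greek letters and subscript digits in an x-ray line name."""
    return name.translate(_SUBSTITUTIONS)

print(normalise_line_name("Kα₁"))      # Siegbahn name with Greek letter and subscript
print(normalise_line_name("L₃-M₄,₅"))  # IUPAC name with subscript numerals
```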

A list of principal x-ray lines in both IUPAC and Siegbahn notations is given below. For a complete list of corresponding transition names between IUPAC and Siegbahn notations, please refer to:

K-series              L-series              M-series
Siegbahn  IUPAC       Siegbahn  IUPAC       Siegbahn  IUPAC
Ka        K-L2,3      La        L3-M4,5     Ma        M5-N6,7
Ka1       K-L3        La1       L3-M5       Ma1       M5-N7
Ka2       K-L2        La2       L3-M4       Ma2       M5-N6
Kb        K-M,N       Lb1       L2-M4       Mb        M4-N6
Kb1       K-M3        Lb2       L3-N5       Mg        M3-N5
Kb2       K-N2,3      Lb3       L1-M3       Mz        M4,5-N2,3
                      Lb4       L1-M2
                      Lg1       L2-N4
                      Lg2       L1-N2
                      Lg3       L1-N3
                      Ll        L3-M1
                      Ln        L2-M1

Optional element:

<Energy Unit="eV" DataType="float">1234</Energy>

<Instrument>

The <Instrument> condition template is a generic object that describes the type of instrument used to collect a HMSA dataset.

Required elements:

<Manufacturer>Example Inc.</Manufacturer>
<Model>Example Model 123</Model>

Optional elements:

<SerialNumber>12345-abc-67890</SerialNumber>

<Probe>

The <Probe> condition template is a generic object that describes the type and conditions of the analytical probe used to collect a HMSA dataset, such as settings for electron or ion columns, lasers, etc. This template should not be used directly. Instead, use a subclass appropriate for the type of probe, such as <Probe Class="EM">.

This template has neither required nor optional elements.

<Probe Class="EM">

The <Probe Class="EM"> condition template describes the electron column conditions of the electron microscope used to collect a HMSA dataset.

Inheritance:

Required elements:

<BeamVoltage DataType="float" Unit="kV">15.</BeamVoltage>

Optional elements:

<BeamCurrent DataType="float" Unit="nA">47.59</BeamCurrent>
<GunType>W filament|LaB6|Cold FEG|Schottky FEG</GunType>
<EmissionCurrent DataType="float" Unit="uA">12345</EmissionCurrent>
<FilamentCurrent DataType="float" Unit="A">1.234</FilamentCurrent>
<ExtractorBias DataType="float" Unit="V">4200</ExtractorBias>
<BeamDiameter DataType="float" Unit="nm">12345</BeamDiameter>
<ChamberPressure DataType="float" Unit="Pa">3.14E-6</ChamberPressure>
<GunPressure DataType="float" Unit="Pa">3.14E-10</GunPressure>
<ScanMagnification DataType="float">2500.</ScanMagnification>
<WorkingDistance DataType="float" Unit="mm">10</WorkingDistance>

<Probe Class="EM/TEM">

The <Probe Class="EM/TEM"> condition template describes the electron column conditions of the transmission electron microscope used to collect a HMSA dataset.

Inheritance:

Required elements:

<LensMode>IMAGE|DIFFR|SCIMG|SCDIF</LensMode>

Optional elements:

<CameraMagnification DataType="float">2</CameraMagnification>
<ConvergenceAngle Unit = "mrad" DataType="float">1.5</ConvergenceAngle>

As with the EMSA/MAS spectrum file format, the <ConvergenceAngle> element refers to the semi-angle of the incident beam, in milliradians.

<RegionOfInterest>

The <RegionOfInterest> condition template defines a region of a spectrum (or other one-dimensional datum), as may be useful for defining start and end channels used for a region of interest image.

Required elements:

<StartChannel DataType="uint32">556</StartChannel>
<EndChannel DataType="uint32">636</EndChannel>

Restrictions:

The value of <StartChannel> must be equal to or greater than 0, and smaller than or equal to the value of <EndChannel>.
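
The restriction can be illustrated with a short Python sketch that validates the channel range and sums the counts within it. Whether <EndChannel> is inclusive is not stated in this section; the sketch ASSUMES that it is. The function name roi_sum and the test spectrum are hypothetical.

```python
def roi_sum(spectrum, start_channel, end_channel):
    """Sum the counts of a spectrum over a region of interest.

    ASSUMPTION: end_channel is inclusive; the specification text does not
    say whether the end channel belongs to the region.
    """
    if not 0 <= start_channel <= end_channel:
        raise ValueError("StartChannel must satisfy 0 <= StartChannel <= EndChannel")
    if end_channel >= len(spectrum):
        raise ValueError("EndChannel lies beyond the last channel of the spectrum")
    return sum(spectrum[start_channel:end_channel + 1])

# Hypothetical 10-channel spectrum whose counts equal the channel number.
spectrum = list(range(10))
print(roi_sum(spectrum, 2, 4))
```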

<Specimen>

The <Specimen> conditions template defines a physical specimen, including the name, origin, composition, etc.

Required elements:

<Name>Cryolite</Name>

Optional elements:

<Description>Natural cryolite standard</Description>
<Origin>Kitaa, Greenland</Origin>
<Formula>Na3AlF6</Formula>
<Composition>
   [...]
</Composition>
<Temperature Unit="degreesC" DataType="float">-20.0</Temperature>

The contents of the <Composition> element shall follow the definition of the <Composition> condition template, or a sub-class thereof.

<Specimen Class="Multilayer">

The <Specimen Class="Multilayer"> conditions template defines a multi-layered physical specimen.

Inheritance:

Required elements:

<Layers>
   <Layer Name="Carbon coat">
      <Thickness Unit="nm" DataType="float">50</Thickness>
      <Formula>C</Formula>
      <Composition>
         [...]
      </Composition>
   </Layer>
   [...]
</Layers>

Multiple <Layer> elements are permitted. The first layer is assumed to be the top surface. If <Thickness> is not specified, a bulk layer is assumed. In multi-layer specimens, a bulk layer may only be defined for the last <Layer> element.

The contents of the <Composition> elements of each layer shall follow the definition of the <Composition> condition template, or a sub-class thereof.
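
The layer ordering rule (only the last layer may omit <Thickness>) can be checked as in the following Python sketch. The specimen shown, a carbon coat over a bulk cryolite substrate, is a hypothetical example, and the function name layer_names is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-layer specimen: a 50 nm carbon coat on a bulk substrate.
MULTILAYER_XML = """
<Specimen Class="Multilayer">
   <Name>Coated cryolite</Name>
   <Layers>
      <Layer Name="Carbon coat">
         <Thickness Unit="nm" DataType="float">50</Thickness>
         <Formula>C</Formula>
      </Layer>
      <Layer Name="Substrate">
         <Formula>Na3AlF6</Formula>
      </Layer>
   </Layers>
</Specimen>
"""

def layer_names(xml_text):
    """Return layer names from top to bottom, rejecting misplaced bulk layers."""
    layers = ET.fromstring(xml_text).findall("Layers/Layer")
    for index, layer in enumerate(layers):
        is_last = index == len(layers) - 1
        # A layer without <Thickness> is a bulk layer, allowed only at the end.
        if layer.find("Thickness") is None and not is_last:
            raise ValueError("bulk layer %r is not the last layer" % layer.get("Name"))
    return [layer.get("Name") for layer in layers]

print(layer_names(MULTILAYER_XML))
```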

<SpecimenPosition>

The <SpecimenPosition> condition template defines a physical location on (or in) the specimen. The position shall be defined in the coordinate system of the instrument. This version of the HMSA standard does not specify a template or definition of coordinate systems.

Optional elements:

<X Unit="mm" DataType="float">0.0</X>
<Y Unit="mm" DataType="float">0.0</Y>
<Z Unit="mm" DataType="float">10.0</Z>
<R Unit="degrees" DataType="float">90.0</R>
<T Unit="degrees" DataType="float">70.0</T>

Appendix C - Units and prefixes

Parameters in HMSA XML files shall use SI units, SI derived units, or a limited set of non-SI units defined below. Except where noted below, all SI magnitude prefixes are permitted for all units (e.g. mm, keV).

Units and prefixes are case sensitive, and shall be written with appropriate capitalisation as given below.

SI units

Symbol Unit Quantity
m Metre Length
kg Kilogram Mass. Prefixes may be used for values smaller than one kilogram (e.g. mg, ng), but shall not be used for values larger than one kilogram (no Gg, etc.)
s Second Time. Prefixes may be used for values smaller than one second (e.g. ns, ms), but shall not be used for values larger than one second (no ks, Ms, etc.)
A Ampere Current
K Kelvin Temperature
mol Mole Amount of substance
cd Candela Luminous intensity

SI-derived units

Symbol Unit Quantity
Å Ångström Length, equivalent to 10^-10 m. Prefixes are not permitted (e.g. no kÅ). Note the character used is the Latin letter A with ring above (U+00C5). If 'Å' cannot be typed, convert the value to 'nm' or 'pm'; never substitute 'A'.
Bq Becquerel Radioactivity, equivalent to s-1.
C Coulomb Electrical charge, equivalent to A.s.
Da Dalton Atomic mass, equivalent to 1.66053886 × 10^-27 kg.
degreesC Degree Celsius Temperature, equivalent to K - 273.15. For compatibility reasons, 'degreesC' should be used in place of the Unicode degree symbol (U+00B0), as in '°C'.
F Farad Capacitance, equivalent to C/V.
Gy Gray Absorbed dose, equivalent to J/kg.
H Henry Inductance, equivalent to Wb/A.
Hz Hertz Frequency, equivalent to s-1.
J Joule Energy, equivalent to kg.m2/s2.
L Litre Volume, equivalent to 10^-3 m3.
lm Lumen Luminous flux, equivalent to cd.sr.
lx Lux Illuminance, equivalent to lm/m2.
N Newton Force, equivalent to kg.m/s2.
Ohm Ohm Electrical resistance, equivalent to V/A. For compatibility reasons, 'Ohm' should be used in place of the Unicode Greek capital letter omega 'Ω' (U+03A9).
Pa Pascal Pressure, equivalent to N/m2.
rad Radian Angle.
S Siemens Electrical conductance, equivalent to Ohm-1.
Sv Sievert Equivalent dose.
sr Steradian Solid angle.
T Tesla Magnetic flux density, equivalent to Wb/m2.
V Volt Electrical potential, equivalent to J/C.
W Watt Power, equivalent to J/s.
Wb Weber Magnetic flux, equivalent to J/A.

Non-SI units

Symbol Unit Quantity
degrees Degree Angle. For compatibility reasons, 'degrees' should be used in place of the Unicode degree symbol '°' (U+00B0).
atoms Number of atoms
counts Counts Dimensionless number of events.
counts/s Counts per second Number of events per second, equivalent to Hz. Customary equivalents "cps" or "c/s" shall not be used.
eV electron volt Energy, equivalent to 1.60217646 × 10^-19 J.
% Percent Dimensionless ratio. Not to be used for concentrations (use mol%, vol%, wt%, etc.)
mol% Molar/atomic percent
vol% Volumetric percent
wt% Weight percent
mol_ppm Molar parts per million
vol_ppm Volumetric parts per million
wt_ppm Weight parts per million
mol_ppb Molar parts per billion [*]
vol_ppb Volumetric parts per billion [*]
wt_ppb Weight parts per billion [*]

* Parts per billion shall refer to short billions (10^9), never long billions (10^12).

SI prefixes

Except where noted above, each unit supports the range of SI prefix codes, excluding centi (c), deci (d), deca (da) and hecto (h). These four prefixes should only be used in cases where the prefix forms part of the widely accepted unit of measure for a particular quantity, such as the use of inverse centimetres (cm-1) for measuring the wavenumber of light.

The supported prefix codes are:

Symbol Magnitude Note
Y 10^24
Z 10^21
E 10^18
P 10^15
T 10^12
G 10^9
M 10^6
k 10^3
m 10^-3
u 10^-6 For compatibility reasons, the Latin character 'u' (U+0075) should be used in place of the Unicode micro sign 'µ' (U+00B5).
n 10^-9
p 10^-12
f 10^-15
a 10^-18
z 10^-21
y 10^-24
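
Because several unit symbols begin with a letter that is also a prefix code (for example 'm', 'mol', 'Pa' and 'T'), a reader needs some rule for separating prefix from unit. The following Python sketch shows one possible approach, in which an exact match against the known unit symbols is tried before a prefix is split off. The function names and the small BASE_UNITS set are hypothetical, and special cases such as 'kg' are deliberately left out.

```python
# Prefix codes and their powers of ten, as listed in the table above.
PREFIX_EXPONENT = {
    "Y": 24, "Z": 21, "E": 18, "P": 15, "T": 12, "G": 9, "M": 6, "k": 3,
    "m": -3, "u": -6, "n": -9, "p": -12, "f": -15, "a": -18, "z": -21, "y": -24,
}

# Illustrative subset of unit symbols; a real reader would cover all of Appendix C.
BASE_UNITS = {"m", "s", "A", "V", "Pa", "Hz", "eV", "mol", "T"}

def split_unit(unit):
    """Return (power of ten, unprefixed unit) for a unit string such as 'um'."""
    if unit in BASE_UNITS:
        # Exact matches win, so 'm' is the metre and 'Pa' is the pascal.
        return 0, unit
    prefix, base = unit[:1], unit[1:]
    if prefix in PREFIX_EXPONENT and base in BASE_UNITS:
        return PREFIX_EXPONENT[prefix], base
    raise ValueError("unrecognised unit: %r" % unit)

def to_unprefixed(value, unit):
    """Scale a value to the unprefixed unit, e.g. 3.36 um -> 3.36e-6 m."""
    exponent, base = split_unit(unit)
    return value * 10.0 ** exponent, base

print(split_unit("keV"), to_unprefixed(3.36, "um"))
```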

Appendix D - Unicode character substitutions

In the Unicode character set, there are several code points that produce visually indistinguishable glyphs. Consequently, to avoid confusion and maximise compatibility, the lowest code point shall be used in these cases. A non-exhaustive list of the required character substitutions is provided below:


Appendix E - Example HMSA XML files

Example: SEM-XEDS hyperspectral map

This example represents a typical XEDS spectral map, as captured on an SEM. A baseline example of the same map, excluding all optional conditions and metadata, is provided thereafter.

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<MSAHyperDimensionalDataFile Version="1.0" UID="7FE6B4B91EB3B81E" xml:lang="en-US">
    <Header>
        <Title>Gneiss</Title>
        <Date>2012-08-15</Date>
        <Time>16:15:16</Time>
        <Timezone>AUS Eastern Standard Time</Timezone>
        <Author>Clayton Microbeam Laboratory; CSIRO Process Science and Engineering.</Author>
        <Owner>CSIRO Process Science and Engineering</Owner>
        <AuthorSoftware Version="12.2.0.0" libhmsaVersion="12.2.0.0">EpmxToHmsa</AuthorSoftware>
        <SplitFrom UID="59E71C1137E64CAE" Software="hmsaSplitter">Gneiss.hmsa</SplitFrom>
        <Checksum Algorithm="SHA-1">79C5C30510A4F515E62F9F8BC9762BB8F59CF6ED</Checksum>
    </Header>
    <Conditions>
        <Instrument>
            <Manufacturer>FEI Company</Manufacturer>
            <Model>Quanta400F FEG-ESEM</Model>
        </Instrument>
        <Probe Class="EM">
            <BeamVoltage DataType="float" Unit="kV">15.</BeamVoltage>
        </Probe>
        <Raster Class="XY">
            <XStepCount DataType="uint32">512</XStepCount>
            <YStepCount DataType="uint32">400</YStepCount>
            <XStepSize DataType="float" Unit="um">3.36</XStepSize>
            <YStepSize DataType="float" Unit="um">3.36</YStepSize>
            <RasterMode>Stage</RasterMode>
            <DwellTime DataType="float" Unit="ms">272.</DwellTime>
        </Raster>
        <Detector Class="Spectrometer/XEDS">
            <Manufacturer>Bruker AXS</Manufacturer>
            <Model>XFLASH 5010</Model>
            <MeasurementUnit>counts</MeasurementUnit>
            <Technology>SDD</Technology>
            <ChannelCount DataType="uint32">2047</ChannelCount>
            <Calibration Class="Linear">
                <Quantity>Energy</Quantity>
                <Unit>eV</Unit>
                <Gain DataType="float">10.</Gain>
                <Offset DataType="float">-475.</Offset>
            </Calibration>
            <NominalThroughput DataType="float" Unit="kcounts/s">60.</NominalThroughput>
            <TimeConstant DataType="float" Unit="us">16.700001</TimeConstant>
            <StrobeRate DataType="float" Unit="Hz">1000.</StrobeRate>
            <Area DataType="float" Unit="mm2">10.</Area>
            <Elevation DataType="float" Unit="degrees">45.</Elevation>
        </Detector>
    </Conditions>
    <Data>
        <ImageRaster Class="2D/Spectral" Name="EDS map">
            <DataOffset DataType="int64">8</DataOffset>
            <DataLength DataType="int64">419225600</DataLength>
            <DatumType SizeInBytes="1">byte</DatumType>
            <DatumDimensions>
                <Dimension DataType="uint32" Name="Channel">2047</Dimension>
            </DatumDimensions>
            <CollectionDimensions>
                <Dimension DataType="uint32" Name="X">512</Dimension>
                <Dimension DataType="uint32" Name="Y">400</Dimension>
            </CollectionDimensions>
            <IncludeConditions />
        </ImageRaster>
    </Data>
</MSAHyperDimensionalDataFile>

The same file, stripped of all conditions and header metadata, produces the following baseline file. Note that to use this file, the user must manually keep track of EDS gain & offset, beam current & voltage, etc.

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<MSAHyperDimensionalDataFile Version="1.0" UID="1801E95BD3570275" xml:lang="en-US">
    <Header />
    <Conditions />
    <Data>
        <ImageRaster Class="2D/Spectral" Name="EDS map">
            <DataOffset DataType="int64">8</DataOffset>
            <DataLength DataType="int64">419225600</DataLength>
            <DatumType SizeInBytes="1">byte</DatumType>
            <DatumDimensions>
                <Dimension DataType="uint32" Name="Channel">2047</Dimension>
            </DatumDimensions>
            <CollectionDimensions>
                <Dimension DataType="uint32" Name="X">512</Dimension>
                <Dimension DataType="uint32" Name="Y">400</Dimension>
            </CollectionDimensions>
            <IncludeConditions />
        </ImageRaster>
    </Data>
</MSAHyperDimensionalDataFile>
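
As a consistency check on the baseline example, the following Python sketch compares the declared <DataLength> with the product of the datum size and all dimensions (1 × 2047 × 512 × 400 = 419225600 bytes). That <DataLength> always equals this product is an ASSUMPTION made for illustration; it holds for this example, but the rule itself is not stated in this part of the document. The function name expected_data_length is hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# The <Data> content of the baseline example above (XML declaration omitted).
BASELINE_XML = """
<MSAHyperDimensionalDataFile Version="1.0" UID="1801E95BD3570275" xml:lang="en-US">
    <Header />
    <Conditions />
    <Data>
        <ImageRaster Class="2D/Spectral" Name="EDS map">
            <DataOffset DataType="int64">8</DataOffset>
            <DataLength DataType="int64">419225600</DataLength>
            <DatumType SizeInBytes="1">byte</DatumType>
            <DatumDimensions>
                <Dimension DataType="uint32" Name="Channel">2047</Dimension>
            </DatumDimensions>
            <CollectionDimensions>
                <Dimension DataType="uint32" Name="X">512</Dimension>
                <Dimension DataType="uint32" Name="Y">400</Dimension>
            </CollectionDimensions>
            <IncludeConditions />
        </ImageRaster>
    </Data>
</MSAHyperDimensionalDataFile>
"""

def expected_data_length(dataset):
    """Bytes per datum value multiplied by every datum and collection dimension.

    ASSUMPTION: the binary data is a dense array with no padding.
    """
    size_in_bytes = int(dataset.find("DatumType").get("SizeInBytes"))
    dimensions = [int(node.text) for node in dataset.iter("Dimension")]
    return size_in_bytes * math.prod(dimensions)

dataset = ET.fromstring(BASELINE_XML).find("Data/ImageRaster")
declared = int(dataset.findtext("DataLength"))
print(declared, expected_data_length(dataset))
```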