YAML Data File Specification
This file constitutes the official specification for the YAML data files used to store the data for an instrument.
Spec
The data must be stored in a YAML format file with the following structure:
name: "instrument_name"
default_version: "version1"
version:
  version1:
    default_model: "model1"
    models:
      model1: "model1_v1"
      model1_v1:
        function: "model_function_name"
        citation: ["citation1", "citation2"]
        parameters:
          defaults:
            setting1: 1
            setting2: "value1"
          restrictions:
            setting1: [1, 100, 5]
            setting2: !!set {"value1", "value2", "value3"}
            setting3: [0, 2000]
          parameter1: "value1"
          parameter2: 2
          parameter3: 3.14
          parameter4: [1, 2, 3, 4]
        configurations:
          configuration1:
            default_option: "option1"
            option1:
              parameter5: "value2"
              parameter6: 56.1687
The keys with preset names (name, default_version, version, default_model,
models, function, citation, parameters, defaults, restrictions, configurations
and default_option) must not be changed. The names of the remaining keys, as
well as all the values, can and should be changed as appropriate. These freeform
keys may also appear multiple times at the same level (e.g. version1, version2;
see Example), provided that their substructure is kept.
name
This key (see Spec) specifies the name of the
instrument, set as a string ("instrument_name"). This should be the
public, official name that is used everywhere else in ResINS. It is
the name that resins.instrument.Instrument.name is set
to and that is shown when printing an Instrument, i.e.
print(instrument).
This name is not required to be unique, but because it should match the other uses of the same instrument in ResINS, it should in practice be globally unique.
default_version
This key (see Spec) specifies the name
of the version that will be used by default for this instrument
when the user does not specify which version they want to use, i.e. when calling
from_default() with only
one argument, e.g. Instrument.from_default('TOSCA').
The value of this key, specified as a string, must match one of the version keys (see version).
version
This key (see Spec) contains the data for all the versions. It must be a (YAML) dictionary in which each key is the name of an instrument version and the corresponding value is another dictionary with the associated data.
Warning
All of the entries in this dictionary will be interpreted as versions - no other data is permissible in this section. Anything placed in the dictionary that does not follow the guidelines below will lead to errors.
All the subkeys (version names) must be unique within this dictionary. They do not have to be globally unique, though that is recommended where possible. Regardless, a subkey must not be arbitrary: it should be an official name for the given version.
Each value in the dictionary must be correctly formatted data for an instrument version, in the form of a (YAML) dictionary. This inner dictionary has a less strict specification: the only requirement is that it contains a key called models. In fact, using the remaining space for storing shared data is encouraged (see YAML magic).
default_model
This key (see Spec), found inside the
(YAML) dictionary corresponding to a particular instrument
version (see the version key), specifies the name
of the model that will be used by default when the user does not specify
which model they want to use, e.g. when calling
resins.instrument.Instrument.get_resolution_function().
The value of this key, specified as a string, must match one of the model keys (see models).
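The two default_* rules above can be checked mechanically. The following is a minimal sketch, assuming the data file has already been parsed into nested Python dictionaries; the function name check_defaults and the trimmed-down data are hypothetical and are not part of ResINS.

```python
# Hypothetical parsed data file, mirroring the skeleton shown in Spec.
data = {
    "name": "instrument_name",
    "default_version": "version1",
    "version": {
        "version1": {
            "default_model": "model1",
            "models": {
                "model1": "model1_v1",
                "model1_v1": {},  # model body omitted for brevity
            },
        },
    },
}


def check_defaults(data: dict) -> list:
    """Return a list of problems with the default_* references (empty if none)."""
    problems = []
    versions = data["version"]
    # default_version must name one of the keys of the version dictionary.
    if data["default_version"] not in versions:
        problems.append("default_version does not match any version key")
    for version_name, version in versions.items():
        models = version.get("models")
        if models is None:
            problems.append(f"version {version_name!r} has no 'models' key")
            continue
        # default_model must name one of the keys of this version's models.
        if version.get("default_model") not in models:
            problems.append(f"default_model of {version_name!r} is not a model key")
    return problems


print(check_defaults(data))
```

An empty list means both references resolve; each string in a non-empty list describes one broken reference.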
models
This key (see Spec), found inside the (YAML) dictionary corresponding to a particular instrument version (see the version key), contains the data for all the models. Its value must be a (YAML) dictionary in which each key is the name of a model and the corresponding value is either:
Another dictionary with the associated data.
    In this case, the key (model name) must include a version number in the form {model_name}_v{version_number}, e.g. PyChop_fit_v1, where version_number is an integer.
A string that matches one of the keys whose value is a dictionary. The string must point at such a key directly; chaining will lead to errors.
    In this case, the key (model name) must not include a version number.
Warning
All of the entries in this inner dictionary will be interpreted as models - no other data is permissible in this section. Anything placed in the dictionary that does not follow the guidelines below will lead to errors.
All the subkeys (model names) must be unique within this dictionary, but none has to be globally unique. In fact, if a model is applicable to multiple instruments or versions, it is recommended that the same name is used for that model in each YAML file. Regardless, a subkey must not be arbitrary: it should be an official name for the given model.
Each dictionary value (i.e. each versioned model) must be correctly formatted data for a model. This inner dictionary has a less strict specification: the only requirement is that it must contain the following keys:
function
citation
parameters
configurations
Otherwise, other entries for the dictionary are not defined and may similarly be used for storing shared data (see YAML magic), so long as they do not clash with the names above.
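The naming rules for model keys lend themselves to a short check. The sketch below assumes that "chaining" means one string entry pointing at another string entry, and it works on an already-parsed models dictionary; resolve_model and the regular expression are illustrative only and do not reflect how ResINS implements the lookup.

```python
import re

# Versioned model keys follow {model_name}_v{version_number}.
VERSIONED = re.compile(r"^(?P<name>.+)_v(?P<number>\d+)$")


def resolve_model(models: dict, name: str) -> str:
    """Return the versioned key that `name` refers to, following at most one alias."""
    if name not in models:
        raise KeyError(f"unknown model {name!r}")
    value = models[name]
    if isinstance(value, dict):
        # Entries holding data must carry a version number in their key.
        if VERSIONED.match(name) is None:
            raise ValueError(f"{name!r} holds data, so its key needs a _v<number> suffix")
        return name
    if not isinstance(value, str):
        raise ValueError(f"{name!r} must be either a dictionary or a string")
    # String entries are aliases: unversioned keys pointing at a versioned one.
    if VERSIONED.match(name) is not None:
        raise ValueError(f"alias {name!r} must not include a version number")
    if not isinstance(models.get(value), dict):
        raise ValueError(f"alias {name!r} must point directly at a model dictionary")
    return value


models = {
    "model1": "model1_v3",
    "model1_v1": {"function": "model1_function"},
    "model1_v3": {"function": "model1_function_modified"},
}
print(resolve_model(models, "model1"))     # follows the alias
print(resolve_model(models, "model1_v1"))  # already a versioned key
```

Keeping the unversioned name as a pure alias means that a new model version can be introduced by adding one dictionary and repointing one string.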
function
This key (see Spec), found inside the (YAML) dictionary corresponding to a particular model (see the models key), specifies the exact ResINS model object that will be instantiated when a user wants to use that model. The value for this key is a string.
Important
The value for this key must correspond to one of the keys in
resins.models.MODELS (and therefore must be
globally unique). For creating a new model, see How To Add A New Model.
citation
This key (see Spec), found inside the (YAML)
dictionary corresponding to a particular model (see the
models key), specifies the citations/references associated
with the particular model of the particular instrument. These
are exposed to the user as-is via ModelData.citation and
InstrumentModel.citation.
The value corresponding to this key must be a list of strings, where each string is a shortened citation (only initials and last name, no paper title, etc.). There is no requirement for citation style beyond that, though the DOI should be included if there is one.
parameters
This key (see Spec), found inside the (YAML) dictionary corresponding to a particular model (see the models key), specifies all the parameters required by that model. Its value must be a (YAML) dictionary in which each key is the name of a parameter of that model, and the value is a valid value for that parameter.
The only intrinsic restrictions on this dictionary are that it must contain the
defaults and restrictions
key-value pairs. Otherwise, the only requirement is that it must contain
exactly the parameters required by the ResINS model specified by the
function value. There can be no missing or extra
parameters, though please note that some of the parameters required by the model
may be stored in the configurations dictionary. The
values must match the arguments expected by the associated ModelData
subclass, which means that the type of each parameter could be anything -
int, float, string, list, dict - as long as the
ModelData expects it. In fact, when
creating new models, it is encouraged to further
structure the data if there are many parameters.
defaults
This key (see Spec), found inside the parameters (YAML) dictionary, specifies the default values for the settings of a particular model. This key is required and its value must be a (YAML) dictionary in which each key is the name of a setting of that model, and the value is the default value that will be used if the user does not provide a value for that setting.
Note
This (defaults) key is allowed to be an empty dictionary and also may specify only some of the settings for the model. I.e., it is allowed to have settings with no default values.
Each key inside the dictionary must correspond to a setting of that model, and its value must match the type of that setting. Additionally, if a default value is provided, it must be a valid value for that model:
It must be within the associated restrictions.
If the model has other failure states (e.g. the PyChop model raises a NoTransmissionError at certain values), the use of the default values must not result in any of those failure states.
restrictions
This key (see Spec), found inside the parameters (YAML) dictionary, specifies the restrictions on the values for the settings of a particular model. This key is required and its value must be a (YAML) dictionary in which each key is the name of a setting of that model, and the value is the specification of the restrictions on the values for that setting. If the user provides a value that lies outside the restrictions (allowed values), an exception will be raised.
Note
This (restrictions) key is allowed to be an empty dictionary and also may specify only some of the settings for the model. I.e., it is allowed to have settings with no restrictions.
Each key inside the dictionary must correspond to a setting of that model, and its value must be one of the following:
A set (!!set {}): all the allowed values must be listed, and a value not in the set will raise an error.
A list ([]), in one of two forms:
    Length-2 list (e.g. [1, 100]): the two values specify the lower and upper bounds of the allowed values, both inclusive (i.e. the example above is the closed interval [1, 100]).
    Length-3 list (e.g. [1, 100, 10]): the three values are the arguments to the range function (i.e. range(1, 100, 10)), the result of which is treated as in the set case (a list of all allowed values).
Any other value for a key inside the dictionary is not valid and will be treated as a bug.
Note
If a setting is unbounded on exactly one side, the !!float inf
construct may be used to specify infinity as that bound. However,
if there is no restriction on a setting at all, please leave out the key rather
than specifying the bounds as -inf and +inf.
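The three restriction forms can be illustrated with a small checker. This is only a sketch under stated assumptions: it assumes the YAML parser turns !!set into a Python set and YAML lists into Python lists, and the helper names is_allowed and defaults_outside_restrictions are hypothetical, not ResINS functions.

```python
def is_allowed(value, restriction) -> bool:
    """Check one setting value against one restriction entry."""
    if isinstance(restriction, (set, frozenset)):
        # Set form: every allowed value is listed explicitly.
        return value in restriction
    if isinstance(restriction, list) and len(restriction) == 2:
        # Length-2 form: inclusive lower and upper bounds.
        lower, upper = restriction
        return lower <= value <= upper
    if isinstance(restriction, list) and len(restriction) == 3:
        # Length-3 form: arguments to range(), so the stop value is excluded.
        return value in range(*restriction)
    raise ValueError(f"invalid restriction: {restriction!r}")


def defaults_outside_restrictions(parameters: dict) -> list:
    """Return the names of settings whose default violates its restriction."""
    restrictions = parameters["restrictions"]
    return [
        name
        for name, default in parameters["defaults"].items()
        if name in restrictions and not is_allowed(default, restrictions[name])
    ]


# The defaults and restrictions from Spec, as parsed Python objects.
parameters = {
    "defaults": {"setting1": 1, "setting2": "value1"},
    "restrictions": {
        "setting1": [1, 100, 5],
        "setting2": {"value1", "value2", "value3"},
        "setting3": [0, 2000],
    },
}
print(defaults_outside_restrictions(parameters))
print(is_allowed(100, [1, 100]), is_allowed(100, [1, 100, 5]))
```

Note the difference between the two list forms: 100 is allowed by [1, 100] because the bounds are inclusive, but not by [1, 100, 5] because range excludes its stop value (and 100 is not on the step grid in any case).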
configurations
This key (see Spec), found inside the (YAML) dictionary corresponding to a particular model (see the models key), specifies all the configurations available to that model. Its value must be a (YAML) dictionary in which each key is the name of a configuration and the corresponding value is the data associated with that configuration. This data consists of two things:
The default_option key
The various options associated with the configuration
Besides the special default_option entry, all the other entries in this inner dictionary will be interpreted as options - no other data is permissible in this section. Anything placed in the dictionary that does not follow the guidelines below will lead to errors.
All the subkeys (option names) must be unique within a configuration, but none needs to be globally unique. They must not be arbitrary, however: each subkey should be an official name for the given option.
Each value in the dictionary must be correctly formatted data for an option, in the form of a (YAML) dictionary. Each key in this dictionary must be a parameter of the associated model, and its value a valid value for that parameter. Each option must contain all the parameters that the configuration can change; shared values should be handled via YAML magic.
Similar to parameters, there are no restrictions on the
values for the entries in this dictionary except those placed by the relevant
ModelData. The parameters in the parameters section
and those in this section must together make up exactly the parameters
required by the ModelData.
Important
While, in the
get_resolution_function()
method, the configurations override the
parameters, relying on this is strongly discouraged
because the behaviour is not guaranteed.
default_option
This key (see Spec), found inside the
(YAML) dictionary corresponding to a particular configuration (see the
configurations key), specifies the name of the
option that will be used by default for this configuration
when the user does not specify which option they want to use, i.e. when calling
get_resolution_function()
without specifying the configuration, e.g.
maps.get_resolution_function('PyChop_fit').
The value of this key, specified as a string, must match one of the option keys (see configurations).
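The interplay between parameters, configurations and default_option can be sketched as follows. This is an illustration of the rules above rather than the ResINS implementation: collect_parameters is a hypothetical helper, option2 is an invented second option, and rejecting overlapping keys is a design choice made here because overriding is not guaranteed.

```python
def collect_parameters(model: dict, chosen=None) -> dict:
    """Combine a model's parameters with one option per configuration.

    `chosen` maps configuration names to option names; configurations that
    are not mentioned fall back to their default_option.
    """
    chosen = chosen or {}
    combined = dict(model["parameters"])
    for config_name, config in model["configurations"].items():
        option_name = chosen.get(config_name, config["default_option"])
        if option_name == "default_option" or option_name not in config:
            raise KeyError(f"{option_name!r} is not an option of {config_name!r}")
        option = config[option_name]
        # Refuse overlaps instead of relying on options overriding parameters.
        clash = combined.keys() & option.keys()
        if clash:
            raise ValueError(f"parameters set in two places: {sorted(clash)}")
        combined.update(option)
    return combined


# Model from Spec; option2 is hypothetical and added for illustration.
model = {
    "parameters": {
        "defaults": {},
        "restrictions": {},
        "parameter1": "value1",
        "parameter2": 2,
    },
    "configurations": {
        "configuration1": {
            "default_option": "option1",
            "option1": {"parameter5": "value2", "parameter6": 56.1687},
            "option2": {"parameter5": "value3", "parameter6": 12.5},
        },
    },
}
print(collect_parameters(model)["parameter5"])
print(collect_parameters(model, {"configuration1": "option2"})["parameter6"])
```

The first call uses option1 because nothing was chosen for configuration1; the second call selects option2 explicitly.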
YAML magic
To avoid repetition and prevent errors, the use of YAML anchors and aliases is encouraged. This allows data to be set only once and used in multiple places, keeping the files smaller and hopefully avoiding bugs. That said, the shared data has to be placed somewhere where it will not clash with the expectations that ResINS has, because it still remains in its original location when expanded by the YAML parser. Such places include:
The dictionary of a particular version, next to its models key (see version)
The dictionary of a particular model, next to its required keys (see models)
Example
name: "instrument"
default_version: "new_version"
version:
  old_version:
    default_model: "model3"
    models:
      model3: "model3_v1"
      model3_v1:
        function: "model3_function"
        citation: ["https://mantid.org/docs/relevant-page.html"]
        parameters:
          fit: [0.6546, 2.10548, -9.5, -0.00004]
        configurations: {}
      old_model: "old_model_v1"
      old_model_v1:
        function: "old_function"
        citation: ["A. Doof et. al., Sci. Mag., 1975, 1, 1-6."]
        parameters:
          distance: 1.5
          length: 2e-2
        configurations:
          chopper_package:
            default_option: "G"
            G:
              value1: 1
            H:
              value1: 2
          analyzer:
            default_option: "Forward"
            Forward:
              value2: 3
            Backward:
              value2: 4
  new_version:
    constants: &version1_constants
      distance: 2.0
      length: 1e-3
      allowed_e_init: [10, 1000]
      kind: "kind1"
      matrix:
        [[1, 0],
         [0, 1]]
      sample:
        width: 1.0
        height: 2.0
    choppers: &version1_choppers
      chopper: &version1_chopper
        chopper1:
          number: 2
          size: 2.25
        chopper2:
          number: 1
          size: 9.1
        chopper3: &version1_chopper3
          number: 4
          size: 0.2
    configurations: &version1_configurations
      chopper_package:
        default_option: "A"
        A:
          slit: 3.14e-3
          <<: *version1_choppers
        B:
          slit: 1.88e-3
          <<: *version1_choppers
        C:
          slit: 1.88e-3
          chopper:
            <<: *version1_chopper
            chopper3:
              <<: *version1_chopper3
              size: 0.3
    default_model: "model1"
    models:
      model1: "model1_v3"
      model1_v1:
        function: "model1_function"
        citation: ["A. Yi, H. Wells, and Y. Li, Sci. Mag., 2009, 42, 700-706. https://doi.org/164648"]
        parameters: *version1_constants
        configurations: *version1_configurations
      model1_v2:
        function: "model1_function_modified"
        citation: ["A. Yi, H. Wells, and Y. Li, Sci. Mag., 2010, 44, 700-706. https://doi.org/164648"]
        parameters: *version1_constants
        configurations: *version1_configurations
      model1_v3:
        function: "model1_function_modified"
        citation: ["A. Yi, H. Wells, and Y. Li, Sci. Mag., 2015, 69, 700-706. https://doi.org/164648"]
        parameters:
          <<: *version1_constants
          kind: "kind2"
        configurations: *version1_configurations
      model2: "model2_v1"
      model2_v1:
        function: "model2_function"
        citation: ["Z. Zun et. al., Book On The Topic, Publisher, 1999. ISBN 000-000-000-0", "J. Adams et. al., Sci. Mag., 2000, 27, 1-12."]
        parameters: *version1_constants
        configurations: {}
      model3: "model3_v1"
      model3_v1:
        function: "model3_function"
        citation: ["https://mantid.org/docs/relevant-page.html"]
        parameters:
          fit: [1.6546, 0.10548, -99.5, 0.00004]
        configurations: {}
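As a rough Python analogy for the merge key (<<) used by model1_v3 in the example, the merged mapping behaves like a dictionary copy in which explicitly written keys take precedence. The sketch assumes a YAML parser that supports merge keys and uses only a subset of the constants for brevity.

```python
# Subset of the anchored &version1_constants mapping from the example.
constants = {"distance": 2.0, "length": 1e-3, "kind": "kind1"}

# Equivalent of:
#   parameters:
#     <<: *version1_constants
#     kind: "kind2"
# Keys written explicitly next to the merge key win over merged ones.
parameters = {**constants, "kind": "kind2"}

print(parameters["kind"])  # value used by model1_v3
print(constants["kind"])   # the anchored data itself is unchanged
```

Because the anchored mapping stays intact, model1_v1 and model1_v2 in the example still see kind: "kind1".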
Validation
Validation of data files can be performed using a script found in the GitHub
repository at resins/dev/validate_data_file.py.