How To Add A New Model
ResINS strives to be a repository for all models of INS instruments, but inevitably some will not have been included yet. Fortunately, ResINS was designed to make adding new models as straightforward as possible, though depending on how unique the model is, it may require a significant amount of work.
How to add a new parameter-only model
Adding a new model which only alters the parameters of another, existing model is the simpler task; all that is required is to edit the YAML data file of the corresponding instrument. The details differ slightly between creating a model for personal use only and contributing to ResINS, but the overall process is identical to adding a new version, so see that guide for details. From here on, it is assumed that you have a YAML data file ready for editing.
A model is a “property” of a version of an instrument, and so it must be added to a particular version (though YAML magic can be used to avoid repetition). Therefore, to add a new model, up to two new entries must be added inside the models key of a particular version of an instrument:
If adding a new version of an existing model:
A new entry must be added whose key has the same name as the previous versions, but whose version number is incremented by one. E.g., given a model called model1 which has versions model1_v1 and model1_v2, the new model should be called model1_v3.
The corresponding value should be a dictionary containing the data for the model; see below for details.
If this new version should become the new default version for the model (for example in the case of a bugfix), please edit the associated alias entry to point to the new version. For example, using the above names, there should be an entry model1: "model1_v2" which should be changed to model1: "model1_v3".
In this case, the key-value pair should already exist and only needs to be changed.
The value of this entry must be kept as a string which must correspond to a valid key in the models dictionary.
If adding a new model, unrelated to the others:
A new entry must be added whose key follows the schema {model_name}_v1 (since this is a new model, its first version should have the number 1), e.g. model1_v1.
The corresponding value should be a dictionary containing the data for the model; see below for details.
A new entry must be added whose key is only the model_name from above.
The corresponding value must be a string whose value must be the key added previously, so (using the above example) model1: "model1_v1".
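The naming rules above can be checked mechanically. The following is a minimal sketch that is not part of ResINS: next_version_key and check_aliases are hypothetical helpers, and the models dictionary is an assumed stand-in for the models key of one instrument version after the YAML file has been loaded into Python.

```python
import re

# Assumed stand-in for the `models` mapping of one instrument version.
models = {
    "model1": "model1_v2",  # alias entry: points at the default version
    "model1_v1": {"function": "test_model"},
    "model1_v2": {"function": "test_model"},
}

# Matches versioned keys such as "model1_v2".
_VERSIONED = re.compile(r"^(?P<name>.+)_v(?P<number>\d+)$")


def next_version_key(models: dict, model_name: str) -> str:
    """Return the key that a new version of `model_name` should use."""
    numbers = [
        int(match["number"])
        for key in models
        if (match := _VERSIONED.match(key)) and match["name"] == model_name
    ]
    # A model with no versions yet starts at 1.
    return f"{model_name}_v{max(numbers, default=0) + 1}"


def check_aliases(models: dict) -> list[str]:
    """Return the alias keys whose string value is not a key of `models`."""
    return [
        key
        for key, value in models.items()
        if isinstance(value, str) and value not in models
    ]


print(next_version_key(models, "model1"))     # prints model1_v3
print(next_version_key(models, "new_model"))  # prints new_model_v1
print(check_aliases(models))                  # prints []
```

Running check_aliases after editing an alias is a cheap way to catch an alias that still points at a key that was never added.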
How to specify model parameters
With the model keys added, the next step is to add the data associated with the versioned model. The data must follow the YAML file spec, where the guidance on what belongs where can also be found, but the general points to keep in mind are as follows:
The function entry must have a value that corresponds to one of the existing models. See resins.models.MODELS for a dictionary that maps these function values to ResINS model objects. The created model will use the corresponding object.
To use a function not listed in MODELS, new code will have to be written; see How to add a new algorithm for a new model.
The parameters dictionary must contain all the parameters that the relevant model expects (see the associated API documentation, especially that of the ModelData subclass) and that are not included in configurations.
It might be useful to look at the existing YAML files that use the same model, as the configurations are likely to be the same. Normally, only the values of some of the parameters are likely to differ between different use-cases of the same model.
Ultimately, the parameters and configurations depend on the mathematics/physics of the model and the physical INS instrument. If unsure, and especially when contributing to ResINS, do not hesitate to contact us on our GitHub.
How to add a new algorithm for a new model
Creating a new model that uses new physics/mathematics (ones that are not yet implemented in ResINS) is significantly more work than the case above, since Python code will have to be written. Before starting, note how the procedure differs depending on the intended outcome:
For personal use, the new code can be placed anywhere; it will have to be registered with ResINS as explained below.
If contributing to ResINS, please use a new file in resins/src/resins/models.
How to add a new model data class
The first step should be creating a new class that will hold all the data accessible to the new model. It does not have to be completed immediately - it is ok to continue adding to it as the model is being developed - but it is a good starting point since the model class will need this class.
The data class must be a subclass of
resins.models.model_base.ModelData (please read its
documentation for details on how it and its subclasses are supposed to work)
and it must be decorated with the dataclasses.dataclass() decorator
with the following arguments:
init=True
repr=True
frozen=True
slots=True
kw_only=True
Thus, for example:
from dataclasses import dataclass

from resins.models.model_base import ModelData

@dataclass(init=True, repr=True, frozen=True, slots=True, kw_only=True)
class TestModelData(ModelData):
    param1: int
    param2: bool
The parameters in this class should be specified as is normal for a
dataclass and should represent all the parameters required by the model.
I.e., these should be the combination of the values from the YAML file
parameters and from the chosen
options. How these required values will be split
between the two places does not matter for the Python code as
get_resolution_function()
combines the two before creating the data object (i.e. TestModelData).
Note
If the YAML file is going to contain default values and/or restrictions on
some arguments for the model (e.g. the model takes e_init and the YAML file
specifies that the default value is 100 and that only values between
10 and 1000 are allowed), you will need to reimplement the
resins.models.model_base.ModelData.defaults and/or
resins.models.model_base.ModelData.restrictions
properties (the documentation does not need to be overwritten, just the
code). For example:
from dataclasses import dataclass
from typing import Any

from resins.models.model_base import ModelData

@dataclass(init=True, repr=True, frozen=True, slots=True, kw_only=True)
class TestModelData(ModelData):
    default_e_init: int
    e_init_restrictions: list[int]

    @property
    def defaults(self) -> dict[str, Any]:
        # Maps each argument name to its default value.
        return {'e_init': self.default_e_init}

    @property
    def restrictions(self) -> dict[str, Any]:
        # Maps each argument name to its allowed values.
        return {'e_init': self.e_init_restrictions}
How to add a new model class
With the data class in place, it is possible to create the model, which is a
subclass of
resins.models.model_base.InstrumentModel (see its
documentation for detailed specification of how to inherit from it). This
must specify three class-level variables:
input: an integer specifying the number of arguments that the __call__ method takes
output: an integer specifying the number of outputs the __call__ method returns
data_class: a reference to the data class created above
For example:
from resins.models.model_base import InstrumentModel

class TestModel(InstrumentModel):
    input = 1
    output = 1
    data_class = TestModelData
Next, the __init__ method should be defined. The function signature:
Must take model_data (an instance of the data class above) as its first argument.
May take any number of other arguments (usually representing settings, but may be others).
Must not expect the independent variables (i.e. the variables that the model is a function of, such as energy transfer or momentum).
Must take **kwargs.
The body of __init__():
Must call super().__init__.
Should (if necessary) perform any validation of the arguments, e.g. that e_init is within the allowed range.
Should perform as much of the calculation as possible without the independent variables. These pre-computed, intermediate values should be stored as instance variables.
For more complex calculations, it might be advisable to break them up into multiple methods. Any such methods should be private (i.e. start with _) and, if possible, should be made @staticmethod or @classmethod.
It should not keep a reference to model_data.
For example:
from resins.models.model_base import InstrumentModel, InvalidInputError

class TestModel(InstrumentModel):
    def __init__(self, model_data: TestModelData, e_init: float | None = None, **_):
        super().__init__(model_data)
        if e_init is None:
            e_init = model_data.default_e_init
        elif not model_data.e_init_restrictions[0] <= e_init <= model_data.e_init_restrictions[1]:
            raise InvalidInputError('Good message')

        # Pre-compute everything that does not need the independent variables.
        self.useful_value = 0.5 * e_init ** 3
Lastly, the __call__ method must be implemented:
It must take as arguments all the independent variables that the resolution is modelled as a function of.
It must accept *args and **kwargs.
It should perform the remaining computation of the resolution, using the instance variables.
It must return the resolution at the values of the independent variables provided via the arguments.
For example:
from jaxtyping import Float
import numpy as np

from resins.models.model_base import InstrumentModel

class TestModel(InstrumentModel):
    def __call__(self, frequencies: Float[np.ndarray, 'frequencies'], *args, **kwargs
                 ) -> Float[np.ndarray, 'sigma']:
        return frequencies * self.useful_value
Then, with the model code complete, all that remains is to register it with ResINS:
If the above code is outside the ResINS repo (i.e. for personal use), somewhere in your program (before the model is intended to be used) a new key-value pair has to be inserted into
resins.models.MODELS:
from resins.models import MODELS
from resins import Instrument

from custom_model_source import TestModel

# Register the custom model before it is requested.
MODELS['test_model'] = TestModel

instr = Instrument.from_file('path/to/data.yaml', 'version')
model = instr.get_resolution_function(model_name='model1', e_init=200)
assert isinstance(model, TestModel)
If the above code is inside the ResINS repo (i.e. to be submitted to the code-base), the resins.models.MODELS dictionary (found at resins/src/resins/models/__init__.py) has to be modified by adding a new key-value pair, where the key is the “name” of the function and the value is a reference to the above-created model.
The key can be anything as it is not exposed to the user - it is only present here and in the YAML data files, but it should be somewhat relevant. The only thing that matters is that it is globally unique.
from resins.models.test_model import TestModel

MODELS = {
    ...
    'test_model': TestModel,
}
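Because the key has to be unique, a guard against silently overwriting an existing entry can be useful, particularly when registering from outside the repository. The sketch below is an assumption-laden illustration: register_model is a hypothetical helper that ResINS does not provide, and the MODELS dictionary and TestModel class here are stand-ins so that the code runs on its own.

```python
# Stand-in for resins.models.MODELS so that the sketch runs without ResINS.
MODELS = {"existing_model": object}


class TestModel:
    """Placeholder for the custom model class."""


def register_model(registry: dict, name: str, model_class: type) -> None:
    """Add `model_class` under `name`, refusing to overwrite an existing key."""
    if name in registry:
        raise KeyError(f"A model named {name!r} is already registered")
    registry[name] = model_class


register_model(MODELS, "test_model", TestModel)
print(sorted(MODELS))  # prints ['existing_model', 'test_model']
```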
How to add the data
Now, with all the code in place, only the data that the model will use has to be added. Since the code is written, the case has effectively become the same as adding a parameter-only model (see the guide for more details). The only difference is that, in this case, it is possible to tweak the code (if necessary) to make everything nicer. Along with that, though, comes the responsibility of structuring the data appropriately - there is no example among the other YAML files to look to. How to do this is up to you, but some points of advice are:
The configurations section should reflect the physical INS instrument, see configuration.
All parameters that change depending on which option for a configuration is chosen should be in the configurations section. There should not be any parameters that are present in both the parameters and configurations sections.
There is no advice for parameters that depend on a combination of multiple different configurations; contact the maintainers.
If the parameters section contains a large number of parameters, it can be a good idea to group some of these parameters into dictionaries.
This is especially recommended if there is a logical reason for the grouping, for example grouping all the moderator parameters together.
To maintain type hinting, the corresponding fields in the Python code (the model data class) can be typed with a typing.TypedDict:
model1_v1:
  parameters:
    param1: 1
    param2: 2
    param3: 3
    param4: 4
and
class TestModelData(ModelData):
    param1: int
    param2: int
    param3: int
    param4: int
can be changed into:
model1_v1:
  parameters:
    param1: 1
    group1:
      param2: 2
      param3: 3
      param4: 4
and
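from typing import TypedDict

# Group1 is defined first so that the annotation in TestModelData resolves.
class Group1(TypedDict):
    param2: int
    param3: int
    param4: int

class TestModelData(ModelData):
    param1: int
    group1: Group1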