#59: Revise DataObjects

This MR makes the following changes and tackles the problems mentioned in #59 (closed).

In domain_def.py

  • Add an abstract base class Domain to act as the superclass of Options and Interval
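
To illustrate the idea of a shared superclass, here is a minimal sketch, not the actual code in domain_def.py. The class names Domain, Options and Interval come from this MR; the contains(...) method and the constructor arguments are assumptions made only for the example.

```python
from abc import ABC, abstractmethod


class Domain(ABC):
    """Common interface for all domain types (sketch)."""

    @abstractmethod
    def contains(self, value) -> bool:
        """Return True if `value` lies inside the domain (hypothetical method)."""


class Interval(Domain):
    """Closed interval [vmin, vmax]."""

    def __init__(self, vmin: float, vmax: float):
        if vmin > vmax:
            raise ValueError("vmin must not exceed vmax")
        self.vmin = vmin
        self.vmax = vmax

    def contains(self, value) -> bool:
        return self.vmin <= value <= self.vmax


class Options(Domain):
    """Finite set of allowed values."""

    def __init__(self, options):
        self.options = list(options)

    def contains(self, value) -> bool:
        return value in self.options
```

With a common base class, code that handles a DataObject can check isinstance(domain, Domain) once instead of special-casing each domain type.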

New normalization.py

  • Add a class DataObjectNormalization, which can be subclassed to implement per-DataObject normalizations
  • Implement each existing normalization as its own subclass
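
The subclassing pattern could look roughly like the sketch below. It is not the code in normalization.py: the names DataObjectNormalization, ZeroToOne and per_column come from this MR, while the method names normalize(...) and denormalize(...), the list-of-rows data layout, and the handling of constant columns are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from typing import List

Matrix = List[List[float]]  # rows are samples, columns are features


class DataObjectNormalization(ABC):
    """Base class for normalizations (sketch, method names are assumptions)."""

    @abstractmethod
    def normalize(self, data: Matrix) -> Matrix: ...

    @abstractmethod
    def denormalize(self, data: Matrix) -> Matrix: ...


class ZeroToOne(DataObjectNormalization):
    """Min-max scaling to [0, 1], either globally or per column."""

    def __init__(self, per_column: bool = False):
        self.per_column = per_column
        self._bounds = []  # one (min, max) pair per column, set by normalize()

    def normalize(self, data: Matrix) -> Matrix:
        n_cols = len(data[0])
        if self.per_column:
            cols = [[row[j] for row in data] for j in range(n_cols)]
            self._bounds = [(min(c), max(c)) for c in cols]
        else:
            flat = [v for row in data for v in row]
            self._bounds = [(min(flat), max(flat))] * n_cols
        return [
            [self._scale(v, *self._bounds[j]) for j, v in enumerate(row)]
            for row in data
        ]

    def denormalize(self, data: Matrix) -> Matrix:
        # Invert the scaling with the bounds remembered from normalize().
        return [
            [lo + v * (hi - lo) for v, (lo, hi) in zip(row, self._bounds)]
            for row in data
        ]

    @staticmethod
    def _scale(v: float, lo: float, hi: float) -> float:
        # A constant column has zero range; map it to 0.0 to avoid dividing by zero.
        return 0.0 if hi == lo else (v - lo) / (hi - lo)
```

With per_column=True each feature is scaled by its own minimum and maximum; otherwise a single global minimum and maximum is used for all columns.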

New transform.py

  • Add a class DataObjectTransform, which can be subclassed to implement per-DataObject transformations
  • Implement each existing transformation as its own subclass
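
As a rough sketch of how such a transformation subclass might pair a forward and an inverse mapping: DataObjectTransform is the class name from this MR, but the method names on the transform class, the list-based data layout, and the Log10Transform subclass are hypothetical and chosen only for the example.

```python
import math
from abc import ABC, abstractmethod
from typing import List


class DataObjectTransform(ABC):
    """Base class for invertible transformations (sketch)."""

    @abstractmethod
    def transform(self, values: List[float]) -> List[float]: ...

    @abstractmethod
    def inverse_transform(self, values: List[float]) -> List[float]: ...


class Log10Transform(DataObjectTransform):
    """Hypothetical example subclass: base-10 logarithm and its inverse."""

    def transform(self, values: List[float]) -> List[float]:
        return [math.log10(v) for v in values]

    def inverse_transform(self, values: List[float]) -> List[float]:
        return [10.0 ** v for v in values]
```

Keeping both directions on one object is what makes back-transforming predictions straightforward: whatever was applied with transform(...) can be undone with inverse_transform(...) on the same instance.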

In data_types.py

  • Use the new classes for transformations and normalization
  • Replace _apply_transf(...) by transform(...) and inverse_transform(...)
  • Add a getter and a setter for the DataObject name
  • Add proper type annotations and documentation
  • Remove the property value(); data can no longer be set on the DataObject
  • Add TODOs for further revision and simplification

New test_data_objects.py

  • Add tests to validate implementation

Other

  • Adjust the code base to the new DataObject interface, including the toy example.

Breaking changes:

  • The DataObject no longer has the flag flag_norm_perfeat. Use the per_column flag on the normalization instead. To initialize a DataObject with per-feature normalization, use one of the following:
dobj_a = DataObject("test_a", dim=1, domain=Interval(0,1), normalization="norm_0to1", norm_arg_dict={"per_column": True})
dobj_b = DataObject("test_b", dim=1, domain=Interval(0,1), normalization=ZeroToOne(per_column=True))

dobj_c = DataObject("test_c", dim=1, domain=Interval(0,1))
dobj_c.normalization = ZeroToOne(per_column=True)
  • There is no DataObject._apply_transf(...) anymore; use transform(...) and inverse_transform(...) instead.
  • The masked normalization, used for instance when the domain is an IntervalMasked, must be set explicitly:
dobj_a = DataReal("test_a", dim=1, domain=IntervalMasked(1,2), normalization="masked_norm_0to1")
dobj_b = DataReal("test_b", dim=1, domain=IntervalMasked(1,2), normalization=MaskedZeroToOne())

dobj_c = DataReal("test_c", dim=1, domain=IntervalMasked(1,2))
dobj_c.normalization = "masked_norm_0to1"
  • Previously, a DataReal could be initialized either as dobj = DataReal("name", domain=Interval(0,1)) or as dobj = DataReal("name", range=[0,1]). The latter is no longer possible. Use the new classmethod instead, i.e., dobj = DataReal.from_range("name", vmin=0, vmax=1). Likewise, to initialize a categorical variable from an option list, use DataCategorical.from_options(...).
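
The alternative-constructor pattern behind from_range can be sketched as follows. These are simplified stand-ins, not the real classes in data_types.py: only the names DataReal, Interval, from_range, vmin, vmax, domain and dim appear in this MR, and the default dim=1 and the attribute layout are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Stand-in for the real Interval domain."""
    vmin: float
    vmax: float


class DataReal:
    """Stand-in for the real DataReal; only what the example needs."""

    def __init__(self, name: str, domain: Interval, dim: int = 1):
        self.name = name
        self.domain = domain
        self.dim = dim

    @classmethod
    def from_range(cls, name: str, vmin: float, vmax: float, dim: int = 1) -> "DataReal":
        # Build the domain from the bounds, then defer to the regular constructor.
        return cls(name, domain=Interval(vmin, vmax), dim=dim)


a = DataReal("name", domain=Interval(0, 1))
b = DataReal.from_range("name", vmin=0, vmax=1)
```

Both objects end up with an equal domain; the classmethod keeps __init__ down to a single way of specifying it.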

In a nutshell, this MR simplifies the back-transformation of predictions, resolves problems when copying DataObjects, and partially tackles #49 (closed) by documenting the per-DataObject normalizations.

Edited by Alessandro Maissen
