Separate input/output ML from Dataset class
Motivation
Currently, the selection of which data objects serve as input and output of the ML model is set and stored in the Dataset object. This means the selection is also saved in the serialized dataset file (both pickle and JSON).
At the same time, we also say that the role of the Dataset class is to manage the data samples, which does not have to (and should not) extend to setting up the ML model.
Consider a scenario where the user has a given dataset and wants to carry out various experiments with different inputML and outputML settings. They should be able to do this using the exact same copy of the dataset, and should also be able to save these settings.
A possible solution we discussed is to delegate the input/output ML settings to the DataModule class.
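To make the proposal concrete, here is a minimal sketch of how the split could look. It is not the current API: the `Dataset` and `DataModule` classes below are simplified stand-ins, and `MLRoles`, `input_ml`, `output_ml`, and `pairs()` are hypothetical names chosen only for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in: holds data samples only, with no ML roles."""
    samples: list = field(default_factory=list)


@dataclass
class MLRoles:
    """Hypothetical settings object: which data objects feed the model."""
    input_ml: list
    output_ml: list

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "MLRoles":
        return cls(**json.loads(text))


class DataModule:
    """Hypothetical stand-in: pairs an unchanged dataset with one role selection."""

    def __init__(self, dataset: Dataset, roles: MLRoles):
        self.dataset = dataset
        self.roles = roles

    def pairs(self):
        """Return one (inputs, outputs) pair of dicts per sample."""
        return [
            (
                {name: sample[name] for name in self.roles.input_ml},
                {name: sample[name] for name in self.roles.output_ml},
            )
            for sample in self.dataset.samples
        ]


dataset = Dataset(samples=[{"x": 1.0, "y": 2.0, "z": 3.0}])

# Two experiments share the same dataset object but use different roles.
exp_a = DataModule(dataset, MLRoles(input_ml=["x"], output_ml=["z"]))
exp_b = DataModule(dataset, MLRoles(input_ml=["x", "y"], output_ml=["z"]))

# The role settings are serialized on their own, not inside the dataset file.
restored = MLRoles.from_json(exp_b.roles.to_json())
```

With this kind of split, the serialized dataset would carry no model-specific state, and each experiment would only need to store its own small settings file alongside a reference to the shared dataset.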