Convenience functions for data import from CSV files and DataFrames
There are two new convenience functions to import data from a pandas DataFrame or from a CSV file.
The import follows these rules for mapping the columns of the DataFrame to the DataObjects in the dataset:
- If a custom mapping is provided for a DataObject, it takes precedence over all other rules.
- If the name of a DataObject is found in the columns of the DataFrame, it is mapped to that column. Note that for multidimensional DataObjects, columns can contain lists.
- If a DataObject is multidimensional and its name is not found in the columns of the DataFrame, the importer searches for DataObject.columns_df in the columns of the DataFrame.
- If none of the above rules apply, the import fails, and the user can either provide a custom mapping or rename the columns of the DataFrame.
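The rules above can be read as a simple lookup chain. The following is only a minimal sketch of that chain, not the library's implementation: `ToyDataObject`, its `dim` field, and `resolve_columns` are hypothetical names introduced for illustration, and `columns_df` is assumed to hold one column name per dimension. The `columns` argument stands in for `list(df.columns)`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Union

ColumnSpec = Union[str, List[str]]


@dataclass
class ToyDataObject:
    """Hypothetical stand-in for a DataObject (not the library's class)."""
    name: str
    dim: int = 1  # assumption: dim > 1 means "multidimensional"
    columns_df: List[str] = field(default_factory=list)  # assumption: one name per dimension


def resolve_columns(
    obj: ToyDataObject,
    columns: Sequence[str],
    custom_mapping: Optional[Dict[str, ColumnSpec]] = None,
) -> ColumnSpec:
    """Return the column name(s) that `obj` would be read from."""
    mapping = custom_mapping or {}
    # Rule 1: a custom mapping takes precedence over everything else.
    if obj.name in mapping:
        return mapping[obj.name]
    # Rule 2: a column with the same name as the DataObject.
    if obj.name in columns:
        return obj.name
    # Rule 3: for multidimensional objects, fall back to columns_df.
    if obj.dim > 1 and obj.columns_df and all(c in columns for c in obj.columns_df):
        return list(obj.columns_df)
    # Rule 4: nothing matched, so the import fails.
    raise KeyError(
        f"No column found for '{obj.name}'; provide a custom mapping or rename the columns."
    )


columns = ["x", "y_0", "y_1", "col_a"]  # e.g. list(df.columns)
objects = [
    ToyDataObject("x"),
    ToyDataObject("y", dim=2, columns_df=["y_0", "y_1"]),
    ToyDataObject("z"),
]
resolved = {o.name: resolve_columns(o, columns, {"z": "col_a"}) for o in objects}
print(resolved)
```

Here `x` is matched by name, `y` falls back to its `columns_df` entries, and `z` is only resolvable because of the custom mapping.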
For the performance attributes, the importer also checks whether a column named "error" is present in the DataFrame. If so, it is added to the performance attributes.
Example
# design_parameters, performance_attributes and df are assumed to be defined beforehand
dataset = Dataset(
    name="toy_data_set",
    design_par=design_parameters,
    perf_attributes=performance_attributes,
)

# A custom mapping is only needed if the column names in the DataFrame
# do not match the names of the DataObjects.
# Keys are DataObject names, values are the DataFrame column(s) to read from.
custom_mapping = {
    "score_0": "col_0",
    "multi_score_0": "col_1",
    "multi_score_1": ["col_4", "col_5"],
    "score_1": ["col_2", "col_3"],
}

dataset.import_data_from_df(df, custom_mapping=custom_mapping)
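
Since a custom mapping refers to columns by name, it can help to check it against the DataFrame before importing. The helper below is a hypothetical pre-flight check, not part of the library: `missing_columns` is a name introduced here, and `columns` stands in for `list(df.columns)`.

```python
from typing import Dict, Iterable, List, Union


def missing_columns(
    custom_mapping: Dict[str, Union[str, List[str]]],
    columns: Iterable[str],
) -> List[str]:
    """List the columns named in the mapping that are absent from `columns`."""
    available = set(columns)
    missing: List[str] = []
    for target in custom_mapping.values():
        # A mapping value is either a single column name or a list of names.
        names = [target] if isinstance(target, str) else list(target)
        for name in names:
            if name not in available and name not in missing:
                missing.append(name)
    return missing


custom_mapping = {
    "score_0": "col_0",
    "multi_score_0": "col_1",
    "multi_score_1": ["col_4", "col_5"],
    "score_1": ["col_2", "col_3"],
}
columns = ["col_0", "col_1", "col_2", "col_3", "col_4"]  # "col_5" is deliberately absent
not_found = missing_columns(custom_mapping, columns)
print(not_found)
```

An empty result means every column referenced by the mapping exists in the DataFrame.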
Edited by Alessandro Maissen