
Convenience function for data import from CSV and DataFrames

Alessandro Maissen requested to merge 67-add-csv-importer into master

There are two new convenience functions to import data from a pandas DataFrame or from a CSV file.

The importer applies the following rules to map the columns of the DataFrame to the DataObjects in the dataset:

  • If a custom mapping is provided for a DataObject, it takes precedence over all other rules.
  • If the name of a DataObject is found among the columns of the DataFrame, it is mapped to that column. Note that for multidimensional DataObjects, a column can contain lists.
  • If a DataObject is multidimensional and its name is not found among the columns of the DataFrame, the importer searches for DataObject.columns_df in the columns of the DataFrame.
  • If none of the above rules apply, the import fails, and the user can either provide a custom mapping or rename the columns of the DataFrame.
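
The precedence of the rules above can be sketched roughly as follows. This is only an illustration, not the code in this merge request: the `DataObject` stand-in, its `dim` field, and the `resolve_columns` helper are hypothetical, and `columns_df` is assumed to be a list with one column name per dimension.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Union


@dataclass
class DataObject:
    """Hypothetical stand-in for the library's DataObject."""
    name: str
    dim: int = 1  # assumed: number of dimensions of the object
    columns_df: List[str] = field(default_factory=list)  # assumed: one name per dimension


def resolve_columns(
    obj: DataObject,
    df_columns: Sequence[str],
    custom_mapping: Optional[Dict[str, Union[str, List[str]]]] = None,
) -> Union[str, List[str]]:
    """Return the DataFrame column(s) that `obj` maps to, or raise KeyError."""
    custom_mapping = custom_mapping or {}
    # Rule 1: a custom mapping takes precedence over everything else.
    if obj.name in custom_mapping:
        return custom_mapping[obj.name]
    # Rule 2: the DataObject's name matches a column directly.
    if obj.name in df_columns:
        return obj.name
    # Rule 3: multidimensional objects fall back to their per-dimension columns.
    if obj.dim > 1 and obj.columns_df and all(c in df_columns for c in obj.columns_df):
        return list(obj.columns_df)
    # Rule 4: nothing applies, so the import fails.
    raise KeyError(
        f"No column found for '{obj.name}'; provide a custom mapping "
        "or rename the DataFrame columns."
    )


columns = ["score_0", "col_1", "multi_0", "multi_1"]
by_name = resolve_columns(DataObject("score_0"), columns)
by_mapping = resolve_columns(DataObject("score_1"), columns, {"score_1": "col_1"})
by_dims = resolve_columns(DataObject("multi", dim=2, columns_df=["multi_0", "multi_1"]), columns)
```

Raising an error in the last step mirrors the description above: the user is expected to fix the mapping rather than have columns silently skipped.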

For the performance attributes, the importer also checks whether a column named "error" is present in the DataFrame. If so, that column is added to the performance attributes.
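
As a rough illustration of this check (the helper name `with_error_column` is hypothetical, and how the importer stores the column internally is not shown here):

```python
from typing import List, Sequence


def with_error_column(df_columns: Sequence[str], mapped_columns: Sequence[str]) -> List[str]:
    """Append "error" to the performance columns if the DataFrame has such a column."""
    columns = list(mapped_columns)
    if "error" in df_columns and "error" not in columns:
        columns.append("error")
    return columns


with_error = with_error_column(["score_0", "error"], ["score_0"])
without_error = with_error_column(["score_0"], ["score_0"])
```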

Example

dataset = Dataset(
    name="toy_data_set",
    design_par=design_parameters,
    perf_attributes=performance_attributes,
)

# A custom mapping is only needed if the column names in the DataFrame do not match the names of the DataObjects
custom_mapping = {
    "score_0": "col_0",
    "multi_score_0": "col_1",
    "multi_score_1": ["col_4", "col_5"],
    "score_1": ["col_2", "col_3"],
}
dataset.import_data_from_df(df, custom_mapping=custom_mapping)
Edited by Alessandro Maissen
