Requests
TrainIIDReqBody
Fields:
- training_data: the raw bytes of the training data, a CSV file encoded as a string. Should not be accessed directly.
- feature_types: the columns in the training data.
- target: the target Column.
- folds: the number of folds used during cross-validation.
- validation_data: optional; the raw bytes of the validation data. Should not be accessed directly.
- holdout_data: optional; the raw bytes of the holdout data. Should not be accessed directly.
- fold_assignment_column: optional; denotes a column that gives the fold each row belongs to. If empty, exodusutils assigns the cross-validation folds by taking each row index modulo folds.
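The default fold assignment described for fold_assignment_column can be sketched in a few lines. This is only an illustration of the "row index modulo folds" rule, not the exodusutils implementation; the function names assign_folds_by_modulo and split_fold are hypothetical.

```python
from typing import List, Tuple


def assign_folds_by_modulo(n_rows: int, folds: int) -> List[int]:
    """Return a fold index per row: row i goes to fold i % folds."""
    if folds < 1:
        raise ValueError("folds must be a positive integer")
    return [i % folds for i in range(n_rows)]


def split_fold(fold_ids: List[int], held_out: int) -> Tuple[List[int], List[int]]:
    """Split row indices into (train, validation) for one held-out fold."""
    train = [i for i, f in enumerate(fold_ids) if f != held_out]
    valid = [i for i, f in enumerate(fold_ids) if f == held_out]
    return train, valid


# Seven rows cut into three folds.
fold_ids = assign_folds_by_modulo(n_rows=7, folds=3)
print(fold_ids)  # [0, 1, 2, 0, 1, 2, 0]

# Hold out fold 0 and train on the rest.
train_rows, valid_rows = split_fold(fold_ids, held_out=0)
print(valid_rows)  # [0, 3, 6]
```

Because the assignment depends only on row order, the folds are deterministic; supplying fold_assignment_column is the way to control the split explicitly.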
Methods:
- get_feature_names: returns the names of the columns.
- get_training_df: returns the parsed pandas DataFrame generated from the bytes in training_data. Note that the data types of the columns will correspond to the ones specified in feature_types.
- get_validation_data: returns the parsed pandas DataFrame generated from the bytes in validation_data, or None if there's no validation data.
- get_holdout_data: returns the parsed pandas DataFrame generated from the bytes in holdout_data, or None if there's no holdout data.
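To make the behaviour of these getters concrete, here is a rough sketch of turning CSV bytes into a typed DataFrame, with None returned for absent optional data. It assumes the column types can be expressed as a plain name-to-dtype mapping; the helpers parse_csv_bytes and parse_optional_csv_bytes and the sample data are hypothetical and are not part of exodusutils.

```python
import io
from typing import Dict, Optional

import pandas as pd

# Hypothetical stand-ins for the request fields. The real feature_types
# holds Column objects; a name -> dtype mapping is assumed here.
training_data = b"age,city,income\n31,Taipei,52000.5\n45,Tokyo,61000.0\n"
dtypes = {"age": "int64", "city": "object", "income": "float64"}


def parse_csv_bytes(raw: bytes, dtypes: Dict[str, str]) -> pd.DataFrame:
    """Decode CSV bytes into a DataFrame, casting columns to the given dtypes."""
    return pd.read_csv(io.BytesIO(raw), dtype=dtypes)


def parse_optional_csv_bytes(
    raw: Optional[bytes], dtypes: Dict[str, str]
) -> Optional[pd.DataFrame]:
    """Mirror the optional getters: return None when no data was supplied."""
    if not raw:
        return None
    return parse_csv_bytes(raw, dtypes)


training_df = parse_csv_bytes(training_data, dtypes)
validation_df = parse_optional_csv_bytes(None, dtypes)

print(training_df.dtypes)
print(validation_df is None)
```

Going through a getter rather than reading the raw bytes keeps the dtype handling in one place, which is presumably why the byte fields should not be accessed directly.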
PredictReqBody
Fields:
- model_id: the ID of the model you want to predict with.
- data: the raw bytes representing the prediction data. Is a JSON string. Should not be accessed directly.
- threshold: the threshold for classification predictions.
- keep_columns: the columns to keep in the prediction results.
Methods:
- get_prediction_df: takes in a list of Columns, and returns a pandas DataFrame representing the prediction data. The list of Columns should be stored in MongoDB once training is completed, so that it is available at prediction time.
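The idea behind passing the stored columns at prediction time can be sketched as follows. This is a minimal illustration under stated assumptions, not the exodusutils implementation: it assumes the JSON payload is a list of row records, and ColumnSpec and prediction_df_from_json are hypothetical stand-ins for the real Column type and method.

```python
import json
from dataclasses import dataclass
from typing import List

import pandas as pd


@dataclass
class ColumnSpec:
    """Hypothetical stand-in for a stored Column: a name plus a pandas dtype."""

    name: str
    dtype: str


def prediction_df_from_json(raw: bytes, columns: List[ColumnSpec]) -> pd.DataFrame:
    """Parse JSON records and align them with the columns saved at training time."""
    records = json.loads(raw.decode("utf-8"))
    df = pd.DataFrame.from_records(records)

    names = [c.name for c in columns]
    missing = [n for n in names if n not in df.columns]
    if missing:
        raise ValueError(f"prediction data is missing columns: {missing}")

    # Keep only the training columns, in training order, with training dtypes.
    return df[names].astype({c.name: c.dtype for c in columns})


# Columns as they might be loaded back from storage after training.
stored_columns = [
    ColumnSpec("age", "int64"),
    ColumnSpec("city", "object"),
    ColumnSpec("income", "float64"),
]

# Assumed payload shape: a JSON list of records, with one extra field.
data = (
    b'[{"age": 31, "city": "Taipei", "income": 52000.5, "extra": 1},'
    b' {"age": 45, "city": "Tokyo", "income": 61000, "extra": 2}]'
)

prediction_df = prediction_df_from_json(data, stored_columns)
print(prediction_df.dtypes)
```

Reusing the columns recorded at training time means the prediction frame has the same column order and types the model was fitted on, regardless of what extra fields the request carries.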