Train the machine learning model
Let's take a look at the train_model
function, where we actually train a machine learning model:
def train_model(train_frames: TrainFrames, target: Column):
"""
Trains a model.
Parameters
----------
train_frames : TrainFrames
The frames required during training. Includes the training dataframe, the validation dataframe, and the testing dataframe.
target : Column
The target column, tells you the name and the type of the column.
Returns
-------
model
The trained model.
"""
train = train_frames.train
features = train.loc[:, train.columns != target.name]
targets = train[target.name]
if target.data_type == DataType.double:
model = LinearRegression()
else:
model = LogisticRegression()
model.fit(X=features, y=targets)
return model
Here, we extract the sub-frame that we will train the model with:
train = train_frames.train
And then split the sub-frame into one dataframe with features to train with, and another one with only the target values to fit the machine learning model:
features = train.loc[:, train.columns != target.name]
targets = train[target.name]
If we are dealing with a regression problem, where the target column consists of numeric values, we use LinearRegression
algorithm. Otherwise we are dealing with a classification problem, and we use LogisticRegression
algorithm.
if target.data_type == DataType.double:
model = LinearRegression()
else:
model = LogisticRegression()
After we decide which algorithm to use, we can fit the model, and then return it:
model.fit(X=features, y=targets)
return model
Implementing your own thing
Most of the code here should be changed by you. In particular these are the things you need to consider:
- Perhaps your machine learning algorithm is able to handle both classification and regression problems
- Your model might need to filter out some more features
- Instead of
fit
, your model might need to do something else - Your model might require a validation dataframe, which can be retreived via
train_frames.validation
Either way, the content of this function should only serve as an example, it is up to you to decide what should actually be the machine learning model.