jorgehcb
4/15/2018 - 1:56 AM

[Python] GridSearchCV sample

'''
A common way to tune hyperparameters is grid search. It tries every
combination of the values listed in a parameter grid, scores each
combination with cross-validation, and keeps the best-scoring one.
We also use stratified k-fold cross-validation, which keeps the class
proportions roughly equal in every fold so that no class ends up
concentrated in a single subset.
'''

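from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# all_inputs (feature matrix) and all_classes (labels) are assumed to be
# defined earlier; they are not part of this snippet.
dtc = DecisionTreeClassifier()

# Candidate values for each hyperparameter; every combination is tried.
# max_features up to 4 assumes the data has at least 4 feature columns.
parameter_grid = {'criterion': ['gini', 'entropy'],
                  'splitter': ['best', 'random'],
                  'max_depth': [1, 2, 3, 4, 5],
                  'max_features': [1, 2, 3, 4]}

# Stratified 10-fold CV: each fold keeps roughly the same class proportions.
# The labels are supplied later, when fit() is called.
cross_validation = StratifiedKFold(n_splits=10)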

grid_search = GridSearchCV(dtc, param_grid=parameter_grid, cv=cross_validation)

grid_search.fit(all_inputs, all_classes)
print('Best score: {}'.format(grid_search.best_score_))
print('Best parameters: {}'.format(grid_search.best_params_))

dtc = grid_search.best_estimator_
dtc
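
# Illustrative sketch: a rough picture of how many settings a grid like the
# one above contains, using only the standard library. `example_grid` is a
# hypothetical stand-alone copy of the grid and `n_folds` mirrors the 10-fold
# setting; scikit-learn's own internals may differ in detail.
from itertools import product

example_grid = {'criterion': ['gini', 'entropy'],
                'splitter': ['best', 'random'],
                'max_depth': [1, 2, 3, 4, 5],
                'max_features': [1, 2, 3, 4]}
n_folds = 10

# One value per parameter makes one candidate setting, so the candidates are
# the Cartesian product of the value lists.
names = sorted(example_grid)
candidates = [dict(zip(names, values))
              for values in product(*(example_grid[name] for name in names))]

# Each candidate is fit once per fold, so the cost grows multiplicatively
# with every value added to the grid.
print('Candidate settings: {}'.format(len(candidates)))                # 2 * 2 * 5 * 4 = 80
print('Cross-validation fits: {}'.format(len(candidates) * n_folds))   # 80 * 10 = 800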