I'm attempting to create, train, and test sklearn models iteratively:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

for min_samples_leaf in [1, 2, 3, 5]:
    for min_samples_split in [2, 3, 4, 10]:
        for n_estimators in [200, 500, 1000, 1500]:
            classifier = RandomForestClassifier(bootstrap=True, min_samples_leaf=min_samples_leaf, min_samples_split=min_samples_split, n_estimators=n_estimators, random_state=6, n_jobs=4)
            classifier.fit(X_train, y_train)
            print(accuracy_score(y_validate, classifier.predict(X_validate)))

However, the accuracy score is the same every time a classifier is trained and evaluated against the validation set.

My questions are: (1) why is this happening, and (2) what is the correct way to approach this?

Edit: It may be relevant that I'm also measuring accuracy in other ways besides the accuracy score, and the results are truly identical on every iteration.

Question from: https://stackoverflow.com/questions/65943860/instantiating-sklearn-models-iteratively


1 Answer

That's because you're only training and scoring the last value held by the classifier variable, which corresponds to the last configuration of the loops.

To solve this, I suggest two approaches (a minimal sketch of the first one follows this list):

  1. call classifier.fit() right after the classifier assignment and store the result in a list or dictionary, in whatever fashion suits you best.
  2. create a list (classifiers = []) above the first loop and append every new classifier you configure; later, iterate over that list and fit each classifier.
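
For illustration, here is a minimal sketch of the first approach, assuming the same X_train, y_train, X_validate and y_validate variables from the question; the results dictionary and best_params variable are just example names:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Store each configuration's validation accuracy so no result gets overwritten
results = {}
for min_samples_leaf in [1, 2, 3, 5]:
    for min_samples_split in [2, 3, 4, 10]:
        for n_estimators in [200, 500, 1000, 1500]:
            classifier = RandomForestClassifier(bootstrap=True, min_samples_leaf=min_samples_leaf, min_samples_split=min_samples_split, n_estimators=n_estimators, random_state=6, n_jobs=4)
            # Fit right after the assignment so every configuration is actually trained
            classifier.fit(X_train, y_train)
            score = accuracy_score(y_validate, classifier.predict(X_validate))
            results[(min_samples_leaf, min_samples_split, n_estimators)] = score
            print(min_samples_leaf, min_samples_split, n_estimators, score)

# Best configuration by validation accuracy
best_params = max(results, key=results.get)
print(best_params, results[best_params])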

Going further

What you're trying to do is a hyperparameter search, and nested loops are not the most scalable way to do it.
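
For example, if you want to stay within scikit-learn rather than use the Bayesian approach covered in the post below, GridSearchCV can search the same grid with cross-validation; this is only a minimal sketch, assuming the X_train and y_train variables from the question:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "min_samples_leaf": [1, 2, 3, 5],
    "min_samples_split": [2, 3, 4, 10],
    "n_estimators": [200, 500, 1000, 1500],
}
# 5-fold cross-validated grid search over the same parameter ranges as the loops above
search = GridSearchCV(RandomForestClassifier(bootstrap=True, random_state=6, n_jobs=4), param_grid, scoring="accuracy", cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)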

For a more advanced approach, take a look at this blog post on Bayesian hyperparameter optimization: https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f

If you are in a hurry and want to see how to implement hyperparameter tuning right away, see the notebook that accompanies the blog post above.

https://github.com/WillKoehrsen/hyperparameter-optimization/blob/master/Bayesian%20Hyperparameter%20Optimization%20of%20Gradient%20Boosting%20Machine.ipynb

