When we train a machine learning model, there are two types of parameters: those that are learned from the data (like the weights in a linear regression), and those that are set by the data scientist before training begins. These external, user-set parameters are called hyperparameters.
Examples of hyperparameters include:
- The learning rate (alpha) or the regularization strength (lambda) in regularized models.
- The number of trees (n_estimators) in a Random Forest.
- The number of clusters (k) in K-Means.
- The number of layers and neurons in a neural network.
The performance of a model can be critically dependent on the choice of these hyperparameters. The process of finding the optimal combination of hyperparameters is called hyperparameter tuning.
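To make the distinction concrete, here is a minimal sketch, using Ridge regression as an arbitrary example on synthetic data: the regularization strength alpha is a hyperparameter we choose up front, while the coefficients are parameters the model learns during fitting.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
# Synthetic regression data, purely for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
# alpha is a hyperparameter: we set it before training begins
model = Ridge(alpha=1.0)
model.fit(X, y)
# coef_ and intercept_ are parameters: the model learns them from the data
print(model.coef_)
print(model.intercept_)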
Common Tuning Strategies
How do we find the best settings? We can’t know them ahead of time, so the practical solution is to try many different combinations and see what works best, using a validation set or cross-validation to evaluate each one.
Grid Search
Grid Search is the most traditional method. You define a “grid” of specific hyperparameter values you want to try. The algorithm then exhaustively trains and evaluates a model for every possible combination of these values.
- Pros: It’s guaranteed to find the best combination within the grid you specify.
- Cons: It can be incredibly slow and computationally expensive, because the number of combinations grows multiplicatively with every hyperparameter and every value you add. This combinatorial blow-up is often described as the “curse of dimensionality.”
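To make the exhaustive enumeration concrete, here is a rough sketch of the loop that Grid Search automates, written with scikit-learn’s ParameterGrid helper and a simple held-out validation split; the parameter values and dataset are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterGrid, train_test_split
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
param_grid = {'n_estimators': [100, 200], 'max_depth': [10, 20, None]}
best_score, best_params = -1.0, None
for params in ParameterGrid(param_grid):  # 2 * 3 = 6 combinations, tried exhaustively
    model = RandomForestClassifier(random_state=0, **params).fit(X_train, y_train)
    score = model.score(X_val, y_val)  # accuracy on the held-out validation split
    if score > best_score:
        best_score, best_params = score, params
print(best_params, best_score)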
Random Search
Random Search offers a more efficient alternative. Instead of trying every single combination, it randomly samples a fixed number of combinations from the hyperparameter space.
- Pros: It’s much faster than Grid Search for the same number of model fits. Research (notably Bergstra and Bengio, 2012) has shown that Random Search is often more effective in practice, because a few hyperparameters usually matter far more than the rest; by sampling randomly, you try many more distinct values of those important parameters than a coarse grid would.
- Cons: It’s not guaranteed to find the absolute best combination.
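The corresponding sketch for Random Search swaps the exhaustive loop for a fixed number of randomly drawn combinations; scikit-learn’s ParameterSampler helper makes the sampling explicit (the distributions and n_iter below are illustrative only).
from scipy.stats import randint
from sklearn.model_selection import ParameterSampler
param_dist = {'n_estimators': randint(50, 250), 'max_depth': randint(5, 30)}
# Draw 6 random combinations instead of enumerating every possible one
for params in ParameterSampler(param_dist, n_iter=6, random_state=0):
    print(params)  # in a real search, you would fit and score a model here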
More advanced techniques such as Bayesian Optimization also exist; these use the results of previous trials to choose the next set of hyperparameters to try more intelligently.
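As a pointer only, the sketch below shows one way to run such a search with the separate scikit-optimize package, whose BayesSearchCV class mirrors scikit-learn’s search API; it assumes scikit-optimize is installed and is not part of scikit-learn itself.
# Assumes scikit-optimize is installed: pip install scikit-optimize
from skopt import BayesSearchCV
from skopt.space import Integer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
X, y = make_classification(n_samples=500, random_state=0)
bayes_search = BayesSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    search_spaces={'n_estimators': Integer(50, 250), 'max_depth': Integer(5, 30)},
    n_iter=18, cv=3, random_state=42)
# Each iteration uses the scores observed so far to pick the next combination
bayes_search.fit(X, y)
print(bayes_search.best_params_)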
Hyperparameter Tuning with Scikit-Learn
scikit-learn provides excellent tools for both Grid Search (GridSearchCV) and Random Search (RandomizedSearchCV). Let’s see how to use them to tune a RandomForestClassifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint
# --- 1. Generate Data ---
X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=5, n_redundant=0,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# --- 2. Grid Search Example ---
print("--- Starting Grid Search ---")
# Define the grid of parameters to search
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [10, 20, None],
    'min_samples_leaf': [1, 2, 4]
}
# Total combinations: 2 * 3 * 3 = 18
# Instantiate the grid search model
# cv=3 means 3-fold cross-validation
grid_search = GridSearchCV(estimator=RandomForestClassifier(random_state=42),
                           param_grid=param_grid,
                           cv=3, n_jobs=-1, verbose=1)
grid_search.fit(X_train, y_train)
print(f"Best parameters found by Grid Search: {grid_search.best_params_}")
# --- 3. Random Search Example ---
print("\n--- Starting Random Search ---")
# Define the distribution of parameters to sample from
param_dist = {
    'n_estimators': randint(50, 250),
    'max_depth': randint(5, 30),
    'min_samples_leaf': randint(1, 5)
}
# n_iter=18 means we will try 18 random combinations
random_search = RandomizedSearchCV(estimator=RandomForestClassifier(random_state=42),
                                   param_distributions=param_dist,
                                   n_iter=18, cv=3, n_jobs=-1, verbose=1, random_state=42)
random_search.fit(X_train, y_train)
print(f"Best parameters found by Random Search: {random_search.best_params_}")
# You can access the best model directly
best_model = random_search.best_estimator_
print(f"\nBest model accuracy on test set: {best_model.score(X_test, y_test):.4f}")
What the Code Does
- Grid Search: We define a param_grid with specific values for n_estimators, max_depth, and min_samples_leaf. GridSearchCV trains a model for all 18 possible combinations, using 3-fold cross-validation for each one to ensure robustness.
- Random Search: We define a param_dist using probability distributions from scipy.stats. For example, randint(50, 250) randomly samples integers from 50 to 249 (the upper bound is exclusive). We set n_iter=18 to make it comparable to the grid search in terms of the number of models trained.
- Best Model: Both search objects have a best_params_ attribute that shows the best combination found, and a best_estimator_ attribute that gives you a model refit on the full training set with those parameters, which you can then use for prediction. The full set of combinations that were tried is also available, as shown below.
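Both search objects also expose a cv_results_ dictionary containing the cross-validated score of every combination that was tried. Loading it into a DataFrame (this sketch assumes pandas is available) is a convenient way to compare them:
import pandas as pd
# Each row is one hyperparameter combination evaluated during the search
results = pd.DataFrame(random_search.cv_results_)
print(results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
      .sort_values('rank_test_score')
      .head())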
Conclusion
Hyperparameter tuning is a critical step for maximizing the performance of your machine learning models. While Grid Search is a solid, exhaustive approach, Random Search often provides a better balance between computation time and performance. By systematically exploring different hyperparameter settings, you can move from a good baseline model to a highly optimized one that is tailored to the specific characteristics of your dataset.



