Parallelize mlrose randomized optimizations#
Faster experimentation using the new parallel script.
Tutorial by Nikhil Kapila - 21.02.2025
Warning
21.02.2025: I do not have access to the PyPI project so the code can be copied from GitHub into your local environment, i.e. this will not work if you install it through pip.
The source can be viewed here.
import mlrose_ky as mlrose
from mlrose_ky.utils import parallel
import numpy as np
import time
from pprint import pprint
Creating the parameter grid#
In this section, we initialize all the different parameters required by our algorithms.
We use an example of Four Peaks for this notebook to explain the parallelized FASTER runs.
# We use 3 seeds
seeds = [1, 2, 3]
# for Four Peaks problem
t_pct = 0.05
# for all algorithms
max_attempt = [10, 25, 50]
max_iter = [5000]
# RHC
restart = [1, 2, 5, 10, 15, 25, 50, 100]
# pop_size for GA and MIMIC
# mutation_prob for GA
# keep_pct for MIMIC
pop_size = [50, 100, 200, 400, 800]
mutation_prob = [0.05, 0.1, 0.2, 0.3, 0.4]
keep_pct = [0.1, 0.2, 0.3, 0.4]
# decays for SA, you can pick the other types too --> refer docs
decays = []
for t in [1, 2, 5]:
for decay in [0.0001, 0.00025, 0.0005]:
decays.append(mlrose.ArithDecay(init_temp=t, decay=decay, min_temp=0.001))
RHC#
We use a single sized RHC, pass in all our previous parameters, and generate the output results.
view_params displays how your input parameters are expanded into all possible combinations of their values, generating a full set of parameter permutations.
With view_params as False#
for size in [20]: #, 60, 100]:
print("fourpeaks of size:", size)
problem = mlrose.DiscreteOpt(
length=size, fitness_fn=mlrose.FourPeaks(t_pct=t_pct), maximize=True
)
problem.set_mimic_fast_mode(fast_mode=True)
# Hyperparameter tunning section
print("rhc")
rhc_grid = {
"problem": [problem],
"max_attempt": max_attempt,
"max_iter": max_iter,
"restart": restart,
"seeds": [seeds],
}
rhc_results = parallel.get_results(rhc_grid, parallel.rhc_run, verbose=True, view_params=False)
fourpeaks of size: 20
rhc
Number of params: 24
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 24 out of 24 | elapsed: 3.8s finished
With view_params as True#
You can see how the different combinations come out.
for size in [20]: #, 60, 100]:
print("fourpeaks of size:", size)
problem = mlrose.DiscreteOpt(
length=size, fitness_fn=mlrose.FourPeaks(t_pct=t_pct), maximize=True
)
problem.set_mimic_fast_mode(fast_mode=True)
# Hyperparameter tunning section
print("rhc")
rhc_grid = {
"problem": [problem],
"max_attempt": max_attempt,
"max_iter": max_iter,
"restart": restart,
"seeds": [seeds],
}
rhc_results = parallel.get_results(rhc_grid, parallel.rhc_run, verbose=True, view_params=True)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
fourpeaks of size: 20
rhc
[{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 1,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 2,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 5,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 10,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 15,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 25,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 50,
'seeds': [1, 2, 3]},
{'max_attempt': 10,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 100,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 1,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 2,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 5,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 10,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 15,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 25,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 50,
'seeds': [1, 2, 3]},
{'max_attempt': 25,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 100,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 1,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 2,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 5,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 10,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 15,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 25,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 50,
'seeds': [1, 2, 3]},
{'max_attempt': 50,
'max_iter': 5000,
'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
'restart': 100,
'seeds': [1, 2, 3]}]
Number of params: 24
[Parallel(n_jobs=-1)]: Done 24 out of 24 | elapsed: 0.9s finished
Understanding the output#
The output is a list of DataFrames, we had 24 different combinations and hence, there are 24 different output dataframes wherein each row in each output refers to a specific seed.
len(rhc_results)
24
Looking at the first dataframe, we can see our hyperparams of Restart=1 and the time taken per seed.
rhc_results[0]
| Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Restart | Problem | Time | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, ... | 29.0 | [4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 5.0, 8.0, ... | [18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 26.... | 10 | 5000 | 1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.002735 |
| 1 | 2 | [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, ... | 6.0 | [6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, ... | [25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.... | 10 | 5000 | 1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.002735 |
| 2 | 3 | [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, ... | 3.0 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, ... | 10 | 5000 | 1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.002735 |
To access fitness value of Restart=1 for Seed=1, we can do the following.
rhc_results[1].loc[0, 'Fitness Value']
# or rhc_results[1]['Fitness Value'][0]
# view more ways here: https://pandas.pydata.org/docs/reference/api/pandas.Series.html
[4.0,
4.0,
4.0,
4.0,
4.0,
4.0,
4.0,
5.0,
8.0,
28.0,
28.0,
29.0,
29.0,
29.0,
29.0,
29.0,
29.0,
29.0,
29.0,
29.0,
29.0,
29.0]
Similarly, for SA#
sa_grid = {
"problem": [problem],
"max_attempt": max_attempt,
"max_iter": max_iter,
"decay": decays,
"seeds": [seeds],
}
sa_results = parallel.get_results(sa_grid, parallel.sa_run, verbose=True, view_params=False)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
Number of params: 27
[Parallel(n_jobs=-1)]: Done 27 out of 27 | elapsed: 1.2s finished
sa_results[0]
| Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Decay | Problem | Time | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, ... | 31.0 | [22.0, 22.0, 22.0, 22.0, 22.0, 22.0, 22.0, 26.... | [2.0, 4.0, 6.0, 8.0, 9.0, 11.0, 12.0, 14.0, 16... | 10 | 5000 | ArithDecay(init_temp=1, decay=0.0001, min_temp... | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.132708 |
| 1 | 2 | [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... | 34.0 | [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 0.0, 0.0, 0.0, ... | [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 1... | 10 | 5000 | ArithDecay(init_temp=1, decay=0.0001, min_temp... | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.132708 |
| 2 | 3 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | 36.0 | [3.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, ... | [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 1... | 10 | 5000 | ArithDecay(init_temp=1, decay=0.0001, min_temp... | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.132708 |
GA#
ga_grid = {
"problem": [problem],
"pop_size": pop_size,
"mutation_prob": mutation_prob,
"max_attempt": max_attempt,
"max_iter": max_iter,
"seeds": [seeds],
}
ga_results = parallel.get_results(ga_grid, parallel.ga_run, verbose=True)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
Number of params: 75
[Parallel(n_jobs=-1)]: Done 52 tasks | elapsed: 9.1s
[Parallel(n_jobs=-1)]: Done 60 out of 75 | elapsed: 11.0s remaining: 2.8s
[Parallel(n_jobs=-1)]: Done 75 out of 75 | elapsed: 20.6s finished
ga_results[0]
| Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Pop Size | Mutation Prob | Problem | Time | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... | 38.0 | [27.0, 27.0, 27.0, 27.0, 28.0, 29.0, 29.0, 29.... | [102.0, 153.0, 204.0, 255.0, 307.0, 359.0, 410... | 10 | 5000 | 50 | 0.05 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.162858 |
| 1 | 2 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | 37.0 | [26.0, 27.0, 27.0, 28.0, 28.0, 30.0, 30.0, 33.... | [102.0, 154.0, 205.0, 257.0, 308.0, 360.0, 411... | 10 | 5000 | 50 | 0.05 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.162858 |
| 2 | 3 | [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... | 38.0 | [28.0, 29.0, 31.0, 31.0, 33.0, 34.0, 34.0, 34.... | [102.0, 154.0, 206.0, 257.0, 309.0, 361.0, 412... | 10 | 5000 | 50 | 0.05 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.162858 |
MIMIC#
mimic_grid = {
"problem": [problem],
"pop_size": pop_size,
"keep_pct": keep_pct,
"max_attempt": max_attempt,
"max_iter": max_iter,
"seeds": [seeds],
}
mimic_results = parallel.get_results(mimic_grid, parallel.mimic_run, verbose=True)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
Number of params: 60
[Parallel(n_jobs=-1)]: Done 34 tasks | elapsed: 2.6s
[Parallel(n_jobs=-1)]: Done 60 out of 60 | elapsed: 9.4s finished
mimic_results[0]
| Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Pop Size | Keep Pct | Problem | Time | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, ... | 25.0 | [25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.... | [102.0, 153.0, 204.0, 255.0, 306.0, 357.0, 408... | 10 | 5000 | 50 | 0.1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.295165 |
| 1 | 2 | [1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, ... | 29.0 | [27.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.... | [102.0, 154.0, 205.0, 256.0, 307.0, 358.0, 409... | 10 | 5000 | 50 | 0.1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.295165 |
| 2 | 3 | [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, ... | 26.0 | [26.0, 26.0, 26.0, 26.0, 26.0, 26.0, 26.0, 26.... | [102.0, 153.0, 204.0, 255.0, 306.0, 357.0, 408... | 10 | 5000 | 50 | 0.1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.295165 |