Parallelize mlrose randomized optimizations#

Faster experimentation using the new parallel script.
Tutorial by Nikhil Kapila - 21.02.2025

Warning

21.02.2025: I do not have access to the PyPI project, so the code has to be copied from GitHub into your local environment; it will not work if you install the package through pip.

The source can be viewed here.

import mlrose_ky as mlrose
from mlrose_ky.utils import parallel
import numpy as np
import time
from pprint import pprint

Creating the parameter grid#

In this section, we initialize all the parameters required by our algorithms.
This notebook uses the Four Peaks problem as the running example for the faster, parallelized runs.

# We use 3 seeds
seeds = [1, 2, 3]

# for Four Peaks problem
t_pct = 0.05

# for all algorithms
max_attempt = [10, 25, 50]
max_iter = [5000]

# RHC 
restart = [1, 2, 5, 10, 15, 25, 50, 100]

# pop_size for GA and MIMIC
# mutation_prob for GA
# keep_pct for MIMIC
pop_size = [50, 100, 200, 400, 800]
mutation_prob = [0.05, 0.1, 0.2, 0.3, 0.4]
keep_pct = [0.1, 0.2, 0.3, 0.4]

# decay schedules for SA; other schedule types are available too (refer to the docs)
decays = []
for t in [1, 2, 5]:
    for decay in [0.0001, 0.00025, 0.0005]:
        decays.append(mlrose.ArithDecay(init_temp=t, decay=decay, min_temp=0.001))
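
To get a feel for what these nine schedules do over a 5000-iteration run, here is a minimal sketch in plain Python. It assumes the arithmetic schedule follows T(t) = max(init_temp - decay * t, min_temp), as described in the mlrose documentation, and that t counts iterations. The helpers `arith_temperature` and `steps_to_floor` are hypothetical names used only for illustration; they are not part of mlrose.

```python
def arith_temperature(t, init_temp, decay, min_temp=0.001):
    """Temperature at step t, assuming T(t) = max(T0 - r*t, Tmin)."""
    return max(init_temp - decay * t, min_temp)


def steps_to_floor(init_temp, decay, min_temp=0.001):
    """Step count (possibly fractional) at which the schedule reaches min_temp."""
    return (init_temp - min_temp) / decay


# Same (init_temp, decay) pairs as the nested loop above, kept as plain tuples.
pairs = [(t0, r) for t0 in [1, 2, 5] for r in [0.0001, 0.00025, 0.0005]]

for t0, r in pairs:
    temp_at_end = arith_temperature(5000, t0, r)
    print(f"init_temp={t0} decay={r} T(5000)={temp_at_end:.4f} "
          f"floor reached after ~{steps_to_floor(t0, r):.0f} steps")
```

Under those assumptions, only three of the nine schedules reach min_temp within 5000 iterations; the rest are still fairly warm when the run ends, which is worth keeping in mind when comparing SA results across decays.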

RHC#

We run RHC on a single problem size, pass in the parameters defined above, and collect the results.
view_params controls whether get_results prints how the input parameters expand into every combination of their values.
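
The expansion itself is a Cartesian product over the lists in the grid. The sketch below reproduces the idea with itertools; it is an illustration under my own assumptions, not necessarily how parallel.get_results builds its list. `expand_grid` is a hypothetical helper, and the problem object is replaced by a placeholder string so the snippet runs without mlrose.

```python
from itertools import product


def expand_grid(grid):
    """Return one dict per combination of the values in `grid`."""
    keys = sorted(grid)  # fixed key order so the result is deterministic
    value_lists = [grid[k] for k in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


seeds = [1, 2, 3]
rhc_grid = {
    "problem": ["<problem placeholder>"],  # stand-in for the DiscreteOpt object
    "max_attempt": [10, 25, 50],
    "max_iter": [5000],
    "restart": [1, 2, 5, 10, 15, 25, 50, 100],
    "seeds": [seeds],  # wrapped in a list, so the whole seed list is ONE value
}

combos = expand_grid(rhc_grid)
print(len(combos))   # 3 max_attempt values * 8 restart values
print(combos[0])
```

Note the `[seeds]` wrapping: because the seed list is a single grid value, every combination receives all three seeds instead of the seeds multiplying the number of combinations.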

With view_params as False#

for size in [20]: #, 60, 100]:
    print("fourpeaks of size:", size)
    problem = mlrose.DiscreteOpt(
        length=size, fitness_fn=mlrose.FourPeaks(t_pct=t_pct), maximize=True
    )
    problem.set_mimic_fast_mode(fast_mode=True)

    # Hyperparameter tuning section
    print("rhc")
    rhc_grid = {
        "problem": [problem],
        "max_attempt": max_attempt,
        "max_iter": max_iter,
        "restart": restart,
        "seeds": [seeds],
    }

    rhc_results = parallel.get_results(rhc_grid, parallel.rhc_run, verbose=True, view_params=False)
fourpeaks of size: 20
rhc
Number of params: 24


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  24 out of  24 | elapsed:    3.8s finished
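
The `[Parallel(n_jobs=-1)]` lines are joblib's progress log: each parameter combination is handed to a pool of workers. As a rough sketch of that fan-out pattern, the snippet below uses the standard library's concurrent.futures as a stand-in for joblib. `run_one` is a hypothetical placeholder for a single optimization run and only returns a dummy score.

```python
from concurrent.futures import ThreadPoolExecutor


def run_one(params):
    """Hypothetical stand-in for one optimization run; returns a dummy score."""
    score = params["restart"] * params["max_attempt"]
    return {"restart": params["restart"], "score": score}


# A small list of parameter dicts, one per combination.
param_list = [
    {"max_attempt": a, "restart": r}
    for a in [10, 25, 50]
    for r in [1, 2, 5]
]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves input order: results[i] belongs to param_list[i].
    results = list(pool.map(run_one, param_list))

print(len(results))
```

Threads are used here only to keep the sketch portable. The LokyBackend shown in the log is process-based, which is what actually helps for CPU-bound runs like these.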

With view_params as True#

With view_params=True, the expanded parameter combinations are printed in full, so you can check exactly what will be run.

for size in [20]: #, 60, 100]:
    print("fourpeaks of size:", size)
    problem = mlrose.DiscreteOpt(
        length=size, fitness_fn=mlrose.FourPeaks(t_pct=t_pct), maximize=True
    )
    problem.set_mimic_fast_mode(fast_mode=True)

    # Hyperparameter tuning section
    print("rhc")
    rhc_grid = {
        "problem": [problem],
        "max_attempt": max_attempt,
        "max_iter": max_iter,
        "restart": restart,
        "seeds": [seeds],
    }

    rhc_results = parallel.get_results(rhc_grid, parallel.rhc_run, verbose=True, view_params=True)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


fourpeaks of size: 20
rhc
[{'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 1,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 2,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 5,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 10,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 15,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 25,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 50,
  'seeds': [1, 2, 3]},
 {'max_attempt': 10,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 100,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 1,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 2,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 5,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 10,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 15,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 25,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 50,
  'seeds': [1, 2, 3]},
 {'max_attempt': 25,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 100,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 1,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 2,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 5,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 10,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 15,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 25,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 50,
  'seeds': [1, 2, 3]},
 {'max_attempt': 50,
  'max_iter': 5000,
  'problem': <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt object at 0x11cf53a70>,
  'restart': 100,
  'seeds': [1, 2, 3]}]
Number of params: 24


[Parallel(n_jobs=-1)]: Done  24 out of  24 | elapsed:    0.9s finished

Understanding the output#

The output is a list of DataFrames, one per parameter combination. We had 24 combinations, so there are 24 DataFrames, and each row of a DataFrame corresponds to one seed.

len(rhc_results)
24
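
One way to use this structure is to collapse each DataFrame into a single summary row and rank the combinations by their mean Best Fitness across seeds. The sketch below uses two small hand-built DataFrames as stand-ins for entries of rhc_results. The column names match the output in this notebook, but the values in the second frame are made up for illustration.

```python
import pandas as pd

# Hand-built stand-ins for two entries of a results list.
results = [
    pd.DataFrame({
        "Seed": [1, 2, 3],
        "Best Fitness": [29.0, 6.0, 3.0],
        "Max Attempt": [10, 10, 10],
        "Restart": [1, 1, 1],
    }),
    pd.DataFrame({
        "Seed": [1, 2, 3],
        "Best Fitness": [29.0, 20.0, 11.0],  # made-up values
        "Max Attempt": [10, 10, 10],
        "Restart": [2, 2, 2],
    }),
]

# One summary row per parameter combination, aggregated over seeds.
rows = [
    {
        "Max Attempt": df["Max Attempt"].iloc[0],
        "Restart": df["Restart"].iloc[0],
        "Mean Best Fitness": df["Best Fitness"].mean(),
        "Std Best Fitness": df["Best Fitness"].std(),
    }
    for df in results
]
summary = (
    pd.DataFrame(rows)
    .sort_values("Mean Best Fitness", ascending=False)
    .reset_index(drop=True)
)
print(summary)
```

With the real output you would iterate over rhc_results instead of the stand-in list; the top row of summary is then the combination with the highest mean Best Fitness across seeds.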

The first DataFrame corresponds to the first combination (Max Attempt=10, Restart=1). It has one row per seed, along with the recorded Time.

rhc_results[0]
|   | Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Restart | Problem | Time |
|---|------|------------|--------------|---------------|--------|-------------|-----------|---------|---------|------|
| 0 | 1 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, ... | 29.0 | [4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 5.0, 8.0, ... | [18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 26.... | 10 | 5000 | 1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.002735 |
| 1 | 2 | [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, ... | 6.0 | [6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, 6.0, ... | [25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.... | 10 | 5000 | 1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.002735 |
| 2 | 3 | [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, ... | 3.0 | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, ... | 10 | 5000 | 1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.002735 |

To access the fitness curve (the Fitness Value column) for Restart=1 and Seed=1, index the first DataFrame and select row 0.

rhc_results[0].loc[0, 'Fitness Value']
# or rhc_results[0]['Fitness Value'][0]
# view more ways here: https://pandas.pydata.org/docs/reference/api/pandas.Series.html
[4.0,
 4.0,
 4.0,
 4.0,
 4.0,
 4.0,
 4.0,
 5.0,
 8.0,
 28.0,
 28.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0,
 29.0]
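
Each seed's curve can have a different length, since a run stops once it hits max_attempt without improvement. One simple way to compare seeds is to pad every curve to the longest length by repeating its final value and then average position by position. The sketch below does this on plain lists with made-up numbers. `pad_curve` and `mean_curve` are hypothetical helpers, and the padding assumes the fitness would have stayed flat had the run continued.

```python
def pad_curve(curve, length):
    """Extend a fitness curve to `length` by repeating its final value."""
    return list(curve) + [curve[-1]] * (length - len(curve))


def mean_curve(curves):
    """Average several curves position by position after padding."""
    longest = max(len(c) for c in curves)
    padded = [pad_curve(c, longest) for c in curves]
    return [sum(vals) / len(vals) for vals in zip(*padded)]


# Made-up curves of different lengths, one per seed.
curves = [
    [4.0, 5.0, 8.0, 29.0],
    [6.0, 6.0],
    [0.0, 1.0, 3.0],
]
avg = mean_curve(curves)
print(avg)

# With real results the curves would come from one DataFrame, e.g.:
# curves = rhc_results[0]['Fitness Value'].tolist()
```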

Similarly, for SA#

sa_grid = {
    "problem": [problem],
    "max_attempt": max_attempt,
    "max_iter": max_iter,
    "decay": decays,
    "seeds": [seeds],
}
sa_results = parallel.get_results(sa_grid, parallel.sa_run, verbose=True, view_params=False)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


Number of params: 27


[Parallel(n_jobs=-1)]: Done  27 out of  27 | elapsed:    1.2s finished
sa_results[0]
|   | Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Decay | Problem | Time |
|---|------|------------|--------------|---------------|--------|-------------|-----------|-------|---------|------|
| 0 | 1 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, ... | 31.0 | [22.0, 22.0, 22.0, 22.0, 22.0, 22.0, 22.0, 26.... | [2.0, 4.0, 6.0, 8.0, 9.0, 11.0, 12.0, 14.0, 16... | 10 | 5000 | ArithDecay(init_temp=1, decay=0.0001, min_temp... | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.132708 |
| 1 | 2 | [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... | 34.0 | [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 0.0, 0.0, 0.0, ... | [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 1... | 10 | 5000 | ArithDecay(init_temp=1, decay=0.0001, min_temp... | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.132708 |
| 2 | 3 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | 36.0 | [3.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, ... | [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 1... | 10 | 5000 | ArithDecay(init_temp=1, decay=0.0001, min_temp... | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.132708 |

GA#

ga_grid = {
    "problem": [problem],
    "pop_size": pop_size,
    "mutation_prob": mutation_prob,
    "max_attempt": max_attempt,
    "max_iter": max_iter,
    "seeds": [seeds],
}
ga_results = parallel.get_results(ga_grid, parallel.ga_run, verbose=True)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


Number of params: 75


[Parallel(n_jobs=-1)]: Done  52 tasks      | elapsed:    9.1s
[Parallel(n_jobs=-1)]: Done  60 out of  75 | elapsed:   11.0s remaining:    2.8s
[Parallel(n_jobs=-1)]: Done  75 out of  75 | elapsed:   20.6s finished
ga_results[0]
|   | Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Pop Size | Mutation Prob | Problem | Time |
|---|------|------------|--------------|---------------|--------|-------------|-----------|----------|---------------|---------|------|
| 0 | 1 | [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... | 38.0 | [27.0, 27.0, 27.0, 27.0, 28.0, 29.0, 29.0, 29.... | [102.0, 153.0, 204.0, 255.0, 307.0, 359.0, 410... | 10 | 5000 | 50 | 0.05 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.162858 |
| 1 | 2 | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | 37.0 | [26.0, 27.0, 27.0, 28.0, 28.0, 30.0, 30.0, 33.... | [102.0, 154.0, 205.0, 257.0, 308.0, 360.0, 411... | 10 | 5000 | 50 | 0.05 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.162858 |
| 2 | 3 | [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... | 38.0 | [28.0, 29.0, 31.0, 31.0, 33.0, 34.0, 34.0, 34.... | [102.0, 154.0, 206.0, 257.0, 309.0, 361.0, 412... | 10 | 5000 | 50 | 0.05 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.162858 |

MIMIC#

mimic_grid = {
    "problem": [problem],
    "pop_size": pop_size,
    "keep_pct": keep_pct,
    "max_attempt": max_attempt,
    "max_iter": max_iter,
    "seeds": [seeds],
}

mimic_results = parallel.get_results(mimic_grid, parallel.mimic_run, verbose=True)
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


Number of params: 60


[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    2.6s
[Parallel(n_jobs=-1)]: Done  60 out of  60 | elapsed:    9.4s finished
mimic_results[0]
|   | Seed | Best State | Best Fitness | Fitness Value | Fevals | Max Attempt | Max Iters | Pop Size | Keep Pct | Problem | Time |
|---|------|------------|--------------|---------------|--------|-------------|-----------|----------|----------|---------|------|
| 0 | 1 | [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, ... | 25.0 | [25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.0, 25.... | [102.0, 153.0, 204.0, 255.0, 306.0, 357.0, 408... | 10 | 5000 | 50 | 0.1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.295165 |
| 1 | 2 | [1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, ... | 29.0 | [27.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.0, 29.... | [102.0, 154.0, 205.0, 256.0, 307.0, 358.0, 409... | 10 | 5000 | 50 | 0.1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.295165 |
| 2 | 3 | [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, ... | 26.0 | [26.0, 26.0, 26.0, 26.0, 26.0, 26.0, 26.0, 26.... | [102.0, 153.0, 204.0, 255.0, 306.0, 357.0, 408... | 10 | 5000 | 50 | 0.1 | <mlrose_ky.opt_probs.discrete_opt.DiscreteOpt ... | 0.295165 |