mlrose_ky Generator and Runner Usage Examples - Andrew Rollings#

Modified by Kyle Nakamura

Overview#

These examples will not solve assignment 2 for you, but they will give you some idea on how to use the problem generator and runner classes.

Hopefully this will result in slightly fewer "How do I <insert basic usage here>" questions every semester...

Also, and in case it hasn't been made clear enough by the TAs, using any of the visualizations from this tutorial for your report is a bad idea for two reasons:
1. It provides nothing useful as far as the assignment goes, and
2. The TAs will undoubtedly frown upon it.

Visualization is part of the analysis and, for the most part, you're supposed to do that by yourself. Just including
images of the before/after state of a problem really isn't useful in terms of what you're supposed to be analyzing.

Import Libraries#

%pip install chess IPython
from IPython.display import HTML

import numpy as np
import logging
import networkx as nx
import matplotlib.pyplot as plt
import string

from ast import literal_eval
import chess

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.metrics import accuracy_score

import mlrose_ky as mlrose
from mlrose_ky.generators import QueensGenerator, MaxKColorGenerator, TSPGenerator
from mlrose_ky.runners import SARunner, GARunner, NNGSRunner

# Hide warnings from libraries
logging.basicConfig(level=logging.WARNING)

Example 1: Solving the 8-Queens problem using the SA algorithm#

Initializing and viewing the problem#

First, we'll use the QueensGenerator to create an instance of the 8-Queens problem.

# Generate a new 8-Queens optimization problem using a fixed seed
problem = QueensGenerator.generate(seed=123456, size=8)

The initial, un-optimized state can be seen below, both as a list and as a chess board.

# View the initial state as a list
state = problem.get_state()
print('Initial state:', state)

# View the initial state as a chess board
board_layout = "/".join(["".join(([str(s)] if s > 0 else []) + ["Q"] + ([str((7 - s))] if s < 7 else [])) for s in state])
chess.Board(board_layout)  # You may need to "trust" this notebook for the board visualization to work
Initial state: [2 3 3 2 7 0 1 1]

(SVG output: chess board rendering of the initial state)

Runners are used to execute "grid search" experiments on optimization problems.

We'll use the Simulated Annealing Runner (SARunner) to perform a grid search on the 8-Queens problem and then extract the optimal SA hyperparameters.

Here is a brief explanation of the SARunner parameters used in the example below:
- max_attempts: A list of maximum attempts to try improving the fitness score before terminating a run
- temperature_list: A list of temperatures to try when initializing the SA algorithm's decay function (e.g., GeomDecay(init_temp=1.0))
- decay_list: A list of decay schedules to try as the SA algorithm's decay function (e.g., GeomDecay, ExpDecay, etc.; see the sketch after this list)
- iteration_list: A list of iterations to snapshot the state of the algorithm at (only determines the rows that the Runner will output)
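For reference, a schedule from the decay_list can also be instantiated directly. Below is a minimal sketch (assuming mlrose_ky keeps the original mlrose GeomDecay API); the parameter values shown simply mirror the schedule_decay and schedule_min_temp columns that appear in the run output further below.

# Build a geometric decay schedule by hand; the Runner constructs one of these
# per temperature in temperature_list (sketch only, not required for the Runner)
schedule = mlrose.GeomDecay(init_temp=1.0, decay=0.99, min_temp=0.001)
print(schedule.evaluate(10))  # temperature after 10 iterations: 1.0 * 0.99**10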

Disclaimer: the values used here are just toy values picked specifically for this example.
You will have to choose your own range of values for your experiments.
I strongly recommend you don't just copy these, or you will find that the grading is unlikely to go the way you would like.

# Create an SA Runner instance to solve the problem
sa = SARunner(
    problem=problem,
    experiment_name="queens_8_sa",
    seed=123456,
    output_directory=None,  # Note: specify an output directory (str) to have these results saved to disk
    max_attempts=100,
    temperature_list=[0.1, 0.5, 0.75, 1.0, 2.0, 5.0],
    decay_list=[mlrose.GeomDecay],
    iteration_list=2 ** np.arange(11),  # list of 11 integers from 2^0 to 2^10
)

# Run the SA Runner and retrieve its results
df_run_stats, df_run_curves = sa.run()
# Calculate some simple stats about the experiment
temperatures_per_run = max(1, len(sa.temperature_list))
decays_per_run = max(1, len(sa.decay_list))
iters_per_run = len(sa.iteration_list) + 1
total_runs = temperatures_per_run * decays_per_run

print(f"The experiment executed {total_runs} runs, each with {iters_per_run} snapshots at different iterations.")
print(f"In total, the output dataframe should contain {total_runs * iters_per_run} rows.")
The experiment executed 6 runs, each with 12 snapshots at different iterations.
In total, the output dataframe should contain 72 rows.

The df_run_stats dataframe contains snapshots of the state of the algorithm at the iterations specified in the iteration_list.

Since iteration_list contains 11 numbers, and iteration 0 is always included in the results, each run will take up 12 rows of the dataframe.

The first 12 rows (i.e., results from the first run) are shown below:

HTML(df_run_stats[["Iteration", "Fitness", "FEvals", "Time", "State"]][:iters_per_run].to_html(index=False))
Iteration Fitness FEvals Time State
0 11.0 0 0.000856 [1, 2, 2, 1, 0, 3, 7, 3]
1 9.0 2 0.002331 [1, 2, 2, 0, 0, 3, 7, 3]
2 8.0 4 0.002864 [1, 2, 2, 0, 0, 3, 7, 5]
4 8.0 7 0.003399 [1, 2, 2, 5, 0, 3, 7, 5]
8 5.0 13 0.004127 [1, 2, 7, 5, 0, 3, 5, 5]
16 4.0 24 0.005042 [1, 2, 7, 5, 3, 0, 5, 5]
32 4.0 47 0.006737 [1, 5, 7, 5, 0, 0, 3, 4]
64 1.0 86 0.009472 [1, 5, 2, 6, 3, 0, 7, 4]
128 1.0 155 0.014421 [1, 5, 2, 6, 3, 0, 4, 7]
256 1.0 295 0.025960 [1, 7, 2, 6, 3, 5, 0, 4]
512 0.0 461 0.044061 [1, 5, 0, 6, 3, 7, 2, 4]
1024 0.0 461 0.044061 [1, 5, 0, 6, 3, 7, 2, 4]

Some information was intentionally excluded from the previous output. Let's now preview the entirety of the first run:

HTML(df_run_stats[:iters_per_run].to_html(index=False))
Iteration Fitness FEvals Time State schedule_type schedule_init_temp schedule_decay schedule_min_temp schedule_current_value Temperature max_iters
0 11.0 0 0.000856 [1, 2, 2, 1, 0, 3, 7, 3] geometric 0.1 0.99 0.001 0.099999 0.1 1024
1 9.0 2 0.002331 [1, 2, 2, 0, 0, 3, 7, 3] geometric 0.1 0.99 0.001 0.099998 0.1 1024
2 8.0 4 0.002864 [1, 2, 2, 0, 0, 3, 7, 5] geometric 0.1 0.99 0.001 0.099997 0.1 1024
4 8.0 7 0.003399 [1, 2, 2, 5, 0, 3, 7, 5] geometric 0.1 0.99 0.001 0.099997 0.1 1024
8 5.0 13 0.004127 [1, 2, 7, 5, 0, 3, 5, 5] geometric 0.1 0.99 0.001 0.099996 0.1 1024
16 4.0 24 0.005042 [1, 2, 7, 5, 3, 0, 5, 5] geometric 0.1 0.99 0.001 0.099995 0.1 1024
32 4.0 47 0.006737 [1, 5, 7, 5, 0, 0, 3, 4] geometric 0.1 0.99 0.001 0.099993 0.1 1024
64 1.0 86 0.009472 [1, 5, 2, 6, 3, 0, 7, 4] geometric 0.1 0.99 0.001 0.099990 0.1 1024
128 1.0 155 0.014421 [1, 5, 2, 6, 3, 0, 4, 7] geometric 0.1 0.99 0.001 0.099986 0.1 1024
256 1.0 295 0.025960 [1, 7, 2, 6, 3, 5, 0, 4] geometric 0.1 0.99 0.001 0.099974 0.1 1024
512 0.0 461 0.044061 [1, 5, 0, 6, 3, 7, 2, 4] geometric 0.1 0.99 0.001 0.099956 0.1 1024
1024 0.0 461 0.044061 [1, 5, 0, 6, 3, 7, 2, 4] geometric 0.1 0.99 0.001 0.099956 0.1 1024

What does all this information tell us about the first run of our experiment?

  1. Iteration shows the index of the snapshot, with 12 snapshots per run in this example.
  2. Fitness shows the fitness score of the state at the corresponding iteration, where 0.0 is optimal for minimization problems.
  3. FEvals shows the number of fitness function evaluations performed by the algorithm at the corresponding iteration.
  4. Time shows the time elapsed (calculated using time.perf_counter()) up to the corresponding iteration.
  5. State shows the state of the algorithm at the corresponding iteration (see mlrose_ky.fitness.queens for more details).
  6. Temperature shows the decay schedule (and its parameters) that was used for this run. This column will contain 6 unique values, one for each run (see the sketch after this list).
  7. max_iters shows the maximum number of iterations allowed for the algorithm to run. It defaults to max(iteration_list) for all Runners, which is 1024 in this case.
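As a quick confirmation of points 1 and 6, grouping df_run_stats by the schedule's initial temperature shows one group per run, each with 12 snapshot rows (a minimal sketch using the dataframe produced above):

# Each schedule_init_temp value identifies one run; expect 6 groups of 12 rows
print(df_run_stats.groupby("schedule_init_temp")["Iteration"].count())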

To pick out the most performant run from the dataframe, we need to find the row with the best fitness.
Since Queens is a minimization problem, we're looking for the row with minimal fitness (i.e., zero).

It's likely that multiple runs will achieve the same fitness, so we need to find the run that achieved the best Fitness in the fewest FEvals. (Note: we could make this selection using Iteration or Time if we so desired.)

best_fitness = df_run_stats["Fitness"].min()  # Should be 0.0 in this case

# Get all runs with the best fitness value
best_runs = df_run_stats[df_run_stats["Fitness"] == best_fitness]
best_runs
Iteration Fitness FEvals Time State schedule_type schedule_init_temp schedule_decay schedule_min_temp schedule_current_value Temperature max_iters
10 512 0.0 461 0.044061 [1, 5, 0, 6, 3, 7, 2, 4] geometric 0.1 0.99 0.001 0.099956 0.1 1024
11 1024 0.0 461 0.044061 [1, 5, 0, 6, 3, 7, 2, 4] geometric 0.1 0.99 0.001 0.099956 0.1 1024
22 512 0.0 461 0.040817 [1, 5, 0, 6, 3, 7, 2, 4] geometric 0.5 0.99 0.001 0.499795 0.5 1024
23 1024 0.0 461 0.040817 [1, 5, 0, 6, 3, 7, 2, 4] geometric 0.5 0.99 0.001 0.499795 0.5 1024
58 512 0.0 427 0.044455 [7, 1, 3, 0, 6, 4, 2, 5] geometric 2.0 0.99 0.001 1.999107 2.0 1024
59 1024 0.0 427 0.044455 [7, 1, 3, 0, 6, 4, 2, 5] geometric 2.0 0.99 0.001 1.999107 2.0 1024
70 512 0.0 583 0.061001 [6, 0, 2, 7, 5, 3, 1, 4] geometric 5.0 0.99 0.001 4.996936 5.0 1024
71 1024 0.0 583 0.061001 [6, 0, 2, 7, 5, 3, 1, 4] geometric 5.0 0.99 0.001 4.996936 5.0 1024

This gives us our candidates for the best run.

The best run will be the one that achieved the best fitness in the fewest evaluations.

minimum_evaluations = best_runs["FEvals"].min()  # Should be 427 in this case

# Extract the best run with the minimum number of evaluations
best_run = best_runs[best_runs["FEvals"] == minimum_evaluations]

The best run using these criteria is as follows:

print(best_run.iloc[0])
Iteration                                      512
Fitness                                        0.0
FEvals                                         427
Time                                      0.044455
State                     [7, 1, 3, 0, 6, 4, 2, 5]
schedule_type                            geometric
schedule_init_temp                             2.0
schedule_decay                                0.99
schedule_min_temp                            0.001
schedule_current_value                    1.999107
Temperature                                    2.0
max_iters                                     1024
Name: 58, dtype: object

Which has the following identifying hyperparameter information:

best_temperature_param = best_run["Temperature"].iloc[0].init_temp
best_temperature_param
2.0

To map this result back to the original output of the Runner, we are looking for all rows in df_run_stats where the temperature is equal to 2.

run_stats_best_run = df_run_stats[df_run_stats["schedule_init_temp"] == best_temperature_param]
run_stats_best_run[["Iteration", "Fitness", "FEvals", "Time", "State"]]
Iteration Fitness FEvals Time State
48 0 11.0 0 0.000136 [1, 2, 2, 1, 0, 3, 7, 3]
49 1 9.0 2 0.001460 [1, 2, 2, 0, 0, 3, 7, 3]
50 2 8.0 4 0.002754 [1, 2, 2, 0, 0, 3, 7, 5]
51 4 8.0 7 0.004162 [1, 2, 2, 5, 0, 3, 7, 5]
52 8 7.0 14 0.005876 [1, 2, 2, 5, 0, 3, 5, 5]
53 16 6.0 27 0.007869 [3, 2, 3, 5, 0, 1, 5, 5]
54 32 4.0 57 0.011020 [3, 5, 6, 5, 5, 0, 4, 7]
55 64 5.0 114 0.015731 [2, 0, 3, 6, 1, 2, 1, 7]
56 128 3.0 205 0.022950 [2, 0, 6, 3, 5, 0, 4, 3]
57 256 2.0 358 0.036341 [7, 1, 3, 6, 6, 4, 0, 5]
58 512 0.0 427 0.044455 [7, 1, 3, 0, 6, 4, 2, 5]
59 1024 0.0 427 0.044455 [7, 1, 3, 0, 6, 4, 2, 5]

And the final schedule values associated with this run are:

final_schedule = run_stats_best_run[["schedule_current_value", "schedule_init_temp", "schedule_min_temp"]].tail(1)
final_schedule
schedule_current_value schedule_init_temp schedule_min_temp
59 1.999107 2.0 0.001

The final state is as follows:

state = literal_eval(run_stats_best_run["State"].tail(1).values[0])
print(state)

board_layout = "/".join(["".join(([str(s)] if s > 0 else []) + ["Q"] + ([str((7 - s))] if s < 7 else [])) for s in state])
board = chess.Board(board_layout)
board
[7, 1, 3, 0, 6, 4, 2, 5]

(SVG output: chess board rendering of the solved state)
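Since the Queens fitness counts pairs of mutually-attacking queens, it's easy to sanity-check the final state directly (a minimal sketch; 0 attacking pairs confirms a valid solution):

# Count pairs of queens sharing a row or a diagonal; 0 means the board is solved
attacks = sum(
    1
    for i in range(len(state))
    for j in range(i + 1, len(state))
    if state[i] == state[j] or abs(state[i] - state[j]) == j - i
)
print(f"Attacking pairs: {attacks}")  # expect 0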

Example 2: Generating and Running Max K Color using the GA algorithm#

# Generate a new Max-K-Color problem using a fixed seed.
problem = MaxKColorGenerator().generate(seed=123456, number_of_nodes=10, max_connections_per_node=3, max_colors=3)

The input graph generated for the problem looks like this:

nx.draw(problem.source_graph, pos=nx.spring_layout(problem.source_graph, seed=3))
plt.show()

(figure: the generated Max-K-Color input graph)

# create a runner class and solve the problem
ga = GARunner(
    problem=problem,
    experiment_name="max_k_ga",
    output_directory=None,  # note: specify an output directory to have results saved to disk
    seed=123456,
    iteration_list=2 ** np.arange(11),
    population_sizes=[10, 20, 50],
    mutation_rates=[0.1, 0.2, 0.5],
)

# the two data frames will contain the results
df_run_stats, df_run_curves = ga.run()

The preceding code will run the GA algorithm nine times (once for each combination of the three population_sizes and three mutation_rates), for at most 1024 iterations per run.
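As with the SA example, you can sanity-check the experiment size before digging into the results (a minimal sketch, assuming GARunner exposes its parameter lists as attributes the same way SARunner does above):

# One run per (population size, mutation rate) combination
total_runs = len(ga.population_sizes) * len(ga.mutation_rates)
print(f"Total runs: {total_runs}")  # 3 population sizes * 3 mutation rates = 9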

Note that the initial state parameters here are just toy values picked specifically
for this example. You will have to choose your own range of values for your
assignment. I strongly recommend you don't just copy these, or you will find
that the grading is unlikely to go the way you would like.

Really. I mean it... A mutation rate of 0.5 is little better than a pure random search.

The output in the df_run_stats dataframe contains snapshots of the state of the algorithm at the iterations
specified in the iteration_list passed into the runner class.

The first row (corresponding to the first run of this algorithm) is as follows:

df_run_stats[["Iteration", "Fitness", "FEvals", "Time", "State"]][0:1]
Iteration Fitness FEvals Time State
0 0 3.0 10 0.000702 [1, 2, 2, 1, 0, 0, 0, 0, 2, 2]

The run-identifying hyperparameter information was excluded from the previous output.

A sample of this is below:

state_sample = df_run_stats[["Population Size", "Mutation Rate"]][:1]
state_sample
Population Size Mutation Rate
0 10 0.1

So, to pick out the most performant run from the dataframe, you need to find the row with the best fitness.
As Max-K-Color is a minimization problem, you'd pick the row with the minimum fitness.

However, I'm going to look in the run_curves (which stores minimal basic information every iteration) to
find out which input state achieved the best fitness in the fewest fitness evaluations.

best_fitness = df_run_curves["Fitness"].min()
best_runs = df_run_curves[df_run_curves["Fitness"] == best_fitness]
best_runs
Iteration Time Fitness FEvals Population Size Mutation Rate max_iters
4 4 0.003967 0.0 57.0 10 0.1 1024
9 4 0.003967 0.0 57.0 10 0.2 1024
18 8 0.001445 0.0 101.0 10 0.5 1024
24 5 0.000106 0.0 127.0 20 0.1 1024
27 2 0.003304 0.0 64.0 20 0.2 1024
31 3 0.003797 0.0 85.0 20 0.5 1024
41 9 0.001615 0.0 511.0 50 0.1 1024
45 3 0.003797 0.0 205.0 50 0.2 1024
50 4 0.003967 0.0 256.0 50 0.5 1024

This gives us nine candidates for the best run. We are going to pick the one that reached the best fitness value in the fewest number of evaluations.

(We could also have chosen to use Iteration as our criterion.)

minimum_evaluations = best_runs["FEvals"].min()

best_run = best_runs[best_runs["FEvals"] == minimum_evaluations]

The best runs using these criteria are as follows:

best_run
Iteration Time Fitness FEvals Population Size Mutation Rate max_iters
4 4 0.003967 0.0 57.0 10 0.1 1024
9 4 0.003967 0.0 57.0 10 0.2 1024

We will arbitrarily pick the first row for this example,
which has the following identifying hyperparameter information:

best_mr = best_run["Mutation Rate"].iloc()[0]
best_pop_size = best_run["Population Size"].iloc()[0]
print(f"Best Mutation Rate: {best_mr}, best Population Size: {best_pop_size}")
Best Mutation Rate: 0.1, best Population Size: 10

To map this back to the run_stats we look at the configuration data included in
the curve data. The curve data includes at least the minimum identifying information
to determine which run each row came from.

In this case, the values we are looking for are the Mutation Rate and Population Size.

So, we are looking for all rows in df_run_stats where the mutation rate and population size are equal to our best values.

run_stats_best_run = df_run_stats[(df_run_stats["Mutation Rate"] == best_mr) & (df_run_stats["Population Size"] == best_pop_size)]
run_stats_best_run[["Iteration", "Fitness", "FEvals", "Time"]]
Iteration Fitness FEvals Time
0 0 3.0 10 0.000702
1 1 3.0 21 0.002124
2 2 2.0 33 0.003304
3 4 0.0 57 0.003967
4 8 0.0 57 0.003967
5 16 0.0 57 0.003967
6 32 0.0 57 0.003967
7 64 0.0 57 0.003967
8 128 0.0 57 0.003967
9 256 0.0 57 0.003967
10 512 0.0 57 0.003967
11 1024 0.0 57 0.003967

And the best state associated with this is:

best_state = run_stats_best_run[["State"]].tail(1)
best_state
State
11 [0, 1, 2, 0, 0, 2, 0, 1, 2, 1]

For the following node ordering:

print([n for n in problem.source_graph.nodes])
[0, 2, 8, 1, 3, 4, 6, 7, 9, 5]

Reordering the state to match the graph's node iteration order (the order in which nx.draw colors the nodes) gives the following:

color_indexes = literal_eval(run_stats_best_run["State"].tail(1).values[0])
ordered_state = [color_indexes[n] for n in problem.source_graph.nodes]
print(ordered_state)
[0, 2, 2, 1, 0, 0, 0, 1, 1, 2]

Which results in a graph looking like this:

colors = ["lightcoral", "lightgreen", "yellow"]
node_color_map = [colors[s] for s in ordered_state]

nx.draw(problem.source_graph, pos=nx.spring_layout(problem.source_graph, seed=3), with_labels=True, node_color=node_color_map)
plt.show()

(figure: the Max-K-Color graph with the final coloring applied)
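A fitness of 0 for Max-K-Color means that no edge joins two nodes of the same color, which we can verify directly (a minimal sketch, assuming as above that the state is indexed by node id):

# Count edges whose endpoints share a color; 0 confirms a valid coloring
conflicts = sum(1 for u, v in problem.source_graph.edges if color_indexes[u] == color_indexes[v])
print(f"Conflicting edges: {conflicts}")  # expect 0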

Example 3: Generating and Running TSP using the GA algorithm#

# Generate a new TSP problem using a fixed seed.
problem = TSPGenerator().generate(seed=123456, number_of_cities=20)

The input graph generated for the problem looks like this:

fig, ax = plt.subplots()
ax.set_yticklabels([])
ax.set_xticklabels([])

# Plot each city as a point
for i, (x, y) in enumerate(problem.coords):
    ax.scatter(x, y, s=200, c="cornflowerblue")

# Label each city with a letter (A, B, C, ...)
node_labels = {k: str(v) for k, v in enumerate(string.ascii_uppercase) if k < len(problem.source_graph.nodes)}
for i in node_labels.keys():
    x, y = problem.coords[i]
    plt.text(
        x,
        y,
        node_labels[i],
        ha="center",
        va="center",
        c="white",
        fontweight="bold",
        bbox=dict(boxstyle="circle,pad=0.15", fc="cornflowerblue"),
    )

plt.tight_layout()
plt.show()

(figure: the generated TSP city layout)

# create a runner class and solve the problem
ga = GARunner(
    problem=problem,
    experiment_name="tsp_ga",
    output_directory=None,  # note: specify an output directory to have results saved to disk
    seed=123456,
    iteration_list=2 ** np.arange(11),
    population_sizes=[10, 20],
    mutation_rates=[0.1, 0.25, 0.5],
)

# the two data frames will contain the results
df_run_stats, df_run_curves = ga.run()

The preceding code will run the GA algorithm six times (once for each combination of the two population_sizes and three mutation_rates), for at most 1024 iterations per run.

Note that the initial state parameters here are just toy values picked specifically
for this example. You will have to choose your own range of values for your
assignment. I strongly recommend you don't just copy these, or you will find
that the grading is unlikely to go the way you would like.

Really. I mean it... A mutation rate of 0.5 is little better than a pure random search.

The output in the df_run_stats dataframe contains snapshots of the state of the algorithm at the iterations
specified in the iteration_list passed into the runner class.

The first row (corresponding to the first run of this algorithm) is as follows:

df_run_stats[["Iteration", "Fitness", "FEvals", "Time", "State"]][:1]
Iteration Fitness FEvals Time State
0 0 2722.031402 10 0.00062 [19, 13, 12, 9, 5, 6, 2, 3, 18, 8, 4, 7, 0, 14...

The run-identifying hyperparameter information was excluded from the previous output.

A sample of this is below:

state_sample = df_run_stats[["Population Size", "Mutation Rate"]][:1]
state_sample
Population Size Mutation Rate
0 10 0.1

So, to pick out the most performant run from the dataframe, you need to find the row with the best fitness.
As TSP is a minimization problem, you'd pick the row with the minimum fitness.

However, I'm going to look in the run_curves (which stores minimal basic information every iteration) to
find out which input state achieved the best fitness in the fewest fitness evaluations.

best_fitness = df_run_curves["Fitness"].min()
best_runs = df_run_curves[df_run_curves["Fitness"] == best_fitness]
best_runs[:10]
Iteration Time Fitness FEvals Population Size Mutation Rate max_iters
3709 707 0.296142 941.582778 14903.0 20 0.1 1024
3710 708 0.296608 941.582778 14924.0 20 0.1 1024
3711 709 0.297102 941.582778 14945.0 20 0.1 1024
3712 710 0.297565 941.582778 14966.0 20 0.1 1024
3713 711 0.298021 941.582778 14987.0 20 0.1 1024
3714 712 0.298481 941.582778 15008.0 20 0.1 1024
3715 713 0.298940 941.582778 15029.0 20 0.1 1024
3716 714 0.299380 941.582778 15050.0 20 0.1 1024
3717 715 0.299820 941.582778 15071.0 20 0.1 1024
3718 716 0.300268 941.582778 15092.0 20 0.1 1024

This gives us many candidate rows (the run_curves log every iteration, so a run that stays at the best fitness contributes one row per iteration). We are going to pick the one that reached the best fitness value in the fewest evaluations.

(We could also have chosen to use Iteration as our criterion.)

minimum_evaluations = best_runs["FEvals"].min()
best_run = best_runs[best_runs["FEvals"] == minimum_evaluations]

The best run using these criteria is as follows:

best_run
Iteration Time Fitness FEvals Population Size Mutation Rate max_iters
3709 707 0.296142 941.582778 14903.0 20 0.1 1024

This has the following identifying hyperparameter information:

best_mr = best_run["Mutation Rate"].iloc()[0]
best_pop_size = best_run["Population Size"].iloc()[0]
print(f"Best Mutation Rate: {best_mr}, best Population Size: {best_pop_size}")
Best Mutation Rate: 0.1, best Population Size: 20

To map this back to the run_stats we look at the configuration data included in
the curve data. The curve data includes at least the minimum identifying information
to determine which run each row came from.

In this case, the values we are looking for are the Mutation Rate and Population Size.

So, we are looking for all rows in df_run_stats where the mutation rate and population size are equal to our best values.

run_stats_best_run = df_run_stats[(df_run_stats["Mutation Rate"] == best_mr) & (df_run_stats["Population Size"] == best_pop_size)]
run_stats_best_run[["Iteration", "Fitness", "FEvals", "Time"]]
Iteration Fitness FEvals Time
36 0 2722.031402 20 0.000232
37 1 2141.868537 42 0.002583
38 2 2141.868537 63 0.004905
39 4 2141.868537 105 0.007919
40 8 2044.319536 190 0.012401
41 16 1807.055513 360 0.019740
42 32 1785.714484 697 0.032535
43 64 1479.922718 1376 0.059770
44 128 1151.614761 2731 0.114027
45 256 1081.456468 5421 0.211686
46 512 970.009048 10803 0.410439
47 1024 941.582778 21560 0.831667

And the best state associated with this is:

best_state = run_stats_best_run[["State"]].tail(1)
best_state
State
47 [9, 0, 19, 17, 14, 15, 16, 8, 12, 5, 1, 6, 18,...

Which results in a graph looking like this:

ordered_state = literal_eval(run_stats_best_run["State"].tail(1).values[0])
# Label each leg of the route with its tour order
edge_labels = {(ordered_state[i], ordered_state[(i + 1) % len(ordered_state)]): f"{i + 1}➜" for i in range(len(ordered_state))}

fig, ax = plt.subplots()
ax.set_yticklabels([])
ax.set_xticklabels([])

# Plot each city as a point
for i, (x, y) in enumerate(problem.coords):
    ax.scatter(x, y, s=1, c="cornflowerblue")

# Draw an arrow for each leg of the route
for i in range(len(ordered_state)):
    start_node = ordered_state[i]
    end_node = ordered_state[(i + 1) % len(ordered_state)]
    start_pos = problem.coords[start_node]
    end_pos = problem.coords[end_node]
    ax.annotate(
        "",
        xy=start_pos,
        xycoords="data",
        xytext=end_pos,
        textcoords="data",
        arrowprops=dict(arrowstyle="->", ec="red", connectionstyle="arc3"),
    )

# Label each city, highlighting the starting city in green
node_labels = {k: str(v) for k, v in enumerate(string.ascii_uppercase) if k < len(problem.source_graph.nodes)}
for i in node_labels.keys():
    x, y = problem.coords[i]
    plt.text(
        x,
        y,
        node_labels[i],
        ha="center",
        va="center",
        c="white",
        fontweight="bold",
        bbox=dict(boxstyle="circle,pad=0.15", fc="green" if i == ordered_state[0] else "cornflowerblue"),
    )

plt.tight_layout()
plt.show()

(figure: the best TSP route found, with the starting city highlighted in green)

And, to verify that the route is correct (or at least, the shortest one found):

all_edge_lengths = {(x, y): d for x, y, d in problem.distances}
all_edge_lengths.update({(y, x): d for x, y, d in problem.distances})

route_length = sum([all_edge_lengths[k] for k in edge_labels.keys()])
print(f"route_length: ({round(route_length, 6)}) equal to best_fitness: ({round(best_fitness, 6)})")
route_length: (941.582778) equal to best_fitness: (941.582778)
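Alternatively, the same check can be done with the library's own TSP fitness function (a minimal sketch, assuming mlrose_ky keeps mlrose's TravellingSales API):

# Evaluate the tour length of the best state using the TravellingSales fitness
fitness = mlrose.TravellingSales(distances=problem.distances)
print(fitness.evaluate(ordered_state))  # expect ~941.582778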

Example 4: Using the NNGSRunner with the RHC algorithm#

# Load and Split data into training and test sets
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=123456
)

# Normalize feature data
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# One hot encode target values
one_hot = OneHotEncoder()
y_train_hot = np.asarray(one_hot.fit_transform(y_train.reshape(-1, 1)).todense())
y_test_hot = np.asarray(one_hot.transform(y_test.reshape(-1, 1)).todense())

grid_search_parameters = {
    'max_iters': [1000],  # nn params
    'learning_rate': [1e-2],  # nn params
    'activation': [mlrose.relu],  # nn params
    'restarts': [1],  # rhc params
}

nnr = NNGSRunner(
    x_train=X_train_scaled,
    y_train=y_train_hot,
    x_test=X_test_scaled,
    y_test=y_test_hot,
    experiment_name="nn_test_rhc",
    algorithm=mlrose.algorithms.rhc.random_hill_climb,
    grid_search_parameters=grid_search_parameters,
    iteration_list=[1, 10, 50, 100, 250, 500, 1000],
    hidden_layer_sizes=[[2]],
    clip_max=5,
    n_jobs=5,
    seed=123456,
)

run_stats_df, curves_df, cv_results_df, grid_search_cv = nnr.run()
Fitting 5 folds for each of 1 candidates, totalling 5 fits

The runner returns the run_stats and curves corresponding to the best hyperparameter combination, as well as the cross-validation results and the underlying GridSearchCV object used in the run.
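For example, the fitted GridSearchCV object reports the winning hyperparameter combination via sklearn's standard attributes (a minimal sketch):

# Best hyperparameter combination found by the grid search
print(grid_search_cv.best_params_)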

y_test_pred = grid_search_cv.predict(X_test_scaled)
y_test_accuracy = accuracy_score(np.asarray(y_test_hot), y_test_pred)
y_test_accuracy
0.3333333333333333

y_train_pred = grid_search_cv.predict(X_train_scaled)
y_train_accuracy = accuracy_score(y_train_hot, y_train_pred)
y_train_accuracy
0.3333333333333333
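The predictions come back one-hot encoded, so to inspect individual class labels you can take an argmax over the columns (a minimal sketch; the 0.33 accuracy above suggests these toy hyperparameters leave the network predicting a single class on the three-class Iris data):

# Recover integer class labels from the one-hot predictions
y_test_labels = np.argmax(np.asarray(y_test_pred), axis=1)
print(y_test_labels[:10])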

Run stats dataframe

run_stats_df[["current_restart", "Iteration", "Fitness", "FEvals", "Time", "learning_rate"]][:14]
current_restart Iteration Fitness FEvals Time learning_rate
0 0 0 1.306448 1 0.001905 0.01
1 0 1 1.290536 3 0.004327 0.01
2 0 10 1.246087 14 0.009143 0.01
3 0 50 1.115004 67 0.027937 0.01
4 0 100 1.044654 127 0.048129 0.01
5 0 250 0.877647 308 0.107263 0.01
6 0 500 0.739120 614 0.212110 0.01
7 0 1000 0.732424 1175 0.433252 0.01
8 1 0 1.306448 1175 0.436525 0.01
9 1 1 1.306448 1176 0.438441 0.01
10 1 10 1.247648 1188 0.444550 0.01
11 1 50 1.148455 1239 0.465959 0.01
12 1 100 1.105441 1298 0.491722 0.01
13 1 250 0.931072 1490 0.570107 0.01

Curves dataframe

curves_df[["current_restart", "Iteration", "Fitness", "FEvals", "Time", "learning_rate"]][:20]
current_restart Iteration Fitness FEvals Time learning_rate
0 0 0 1.290536 1.0 0.001905 0.01
1 0 1 1.290536 3.0 0.004327 0.01
2 0 2 1.290536 4.0 0.005450 0.01
3 0 3 1.290536 5.0 0.005843 0.01
4 0 4 1.263664 7.0 0.006530 0.01
5 0 5 1.263664 8.0 0.006857 0.01
6 0 6 1.246087 10.0 0.007742 0.01
7 0 7 1.246087 11.0 0.008119 0.01
8 0 8 1.246087 12.0 0.008459 0.01
9 0 9 1.246087 13.0 0.008787 0.01
10 0 10 1.246087 14.0 0.009143 0.01
11 0 11 1.246087 15.0 0.010364 0.01
12 0 12 1.246087 16.0 0.010792 0.01
13 0 13 1.246087 17.0 0.011168 0.01
14 0 14 1.246087 18.0 0.011561 0.01
15 0 15 1.246087 19.0 0.011916 0.01
16 0 16 1.246087 20.0 0.012291 0.01
17 0 17 1.232142 22.0 0.013562 0.01
18 0 18 1.216927 24.0 0.014337 0.01
19 0 19 1.216927 25.0 0.014832 0.01

CV results dataframe

cv_results_df[
    [
        "mean_test_score",
        "rank_test_score",
        "mean_train_score",
        "param_activation",
        "param_hidden_layer_sizes",
        "param_learning_rate",
        "param_max_iters",
        "param_restarts",
    ]
]
mean_test_score rank_test_score mean_train_score param_activation param_hidden_layer_sizes param_learning_rate param_max_iters param_restarts
0 0.333333 1 0.333333 relu [2] 0.01 1000 1