Modified by Kyle Nakamura
Overview#
These examples will not solve assignment 2 for you, but they will give you some idea on how to use the problem generator and runner classes.
Hopefully this will result in slightly fewer "How do I \<insert basic usage here>" questions every semester...
Also, and in case it hasn't been made clear enough by the TAs, using any of the visualizations from this tutorial for your report is a bad idea for two reasons:
1. It provides nothing useful as far as the assignment goes, and
2. The TAs will undoubtedly frown upon it.
Visualization is part of the analysis and, for the most part, you're supposed to do that by yourself. Just including
images of the before/after state of a problem really isn't useful in terms of what you're supposed to be analyzing.
Import Libraries#
Requirement already satisfied: chess in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (1.10.0)
Requirement already satisfied: IPython in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (8.25.0)
Requirement already satisfied: decorator in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (5.1.1)
Requirement already satisfied: jedi>=0.16 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (0.18.1)
Requirement already satisfied: matplotlib-inline in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (0.1.6)
Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (3.0.47)
Requirement already satisfied: pygments>=2.4.0 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (2.18.0)
Requirement already satisfied: stack-data in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (0.2.0)
Requirement already satisfied: traitlets>=5.13.0 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (5.14.3)
Requirement already satisfied: typing-extensions>=4.6 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (4.12.2)
Requirement already satisfied: pexpect>4.3 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from IPython) (4.9.0)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from jedi>=0.16->IPython) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from pexpect>4.3->IPython) (0.7.0)
Requirement already satisfied: wcwidth in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from prompt-toolkit<3.1.0,>=3.0.41->IPython) (0.2.13)
Requirement already satisfied: executing in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from stack-data->IPython) (0.8.3)
Requirement already satisfied: asttokens in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from stack-data->IPython) (2.0.5)
Requirement already satisfied: pure-eval in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from stack-data->IPython) (0.2.2)
Requirement already satisfied: six in /Users/kylenakamura/anaconda3/envs/machine-learning/lib/python3.11/site-packages (from asttokens->stack-data->IPython) (1.16.0)
Note: you may need to restart the kernel to use updated packages.
from IPython.display import HTML
import numpy as np
import logging
import networkx as nx
import matplotlib.pyplot as plt
import string
from ast import literal_eval
import chess
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.metrics import accuracy_score
import mlrose_ky as mlrose
from mlrose_ky.generators import QueensGenerator, MaxKColorGenerator, TSPGenerator
from mlrose_ky.runners import SARunner, GARunner, NNGSRunner
# Hide warnings from libraries
logging.basicConfig(level=logging.WARNING)
Example 1: Solving the 8-Queens problem using the SA algorithm#
Initializing and viewing the problem#
First, we'll use the QueensGenerator
to create an instance of the 8-Queens problem.
# Generate a new 8-Queens optimization problem using a fixed seed
problem = QueensGenerator.generate(seed=123456, size=8)
The initial, un-optimized state can be seen below, both as a list and as a chess board.
# View the initial state as a list
state = problem.get_state()
print('Initial state:', state)
# View the initial state as a chess board
board_layout = "/".join(["".join(([str(s)] if s > 0 else []) + ["Q"] + ([str((7 - s))] if s < 7 else [])) for s in state])
chess.Board(board_layout) # You may need to "trust" this notebook for the board visualization to work
Initial state: [2 3 3 2 7 0 1 1]
Solving 8-Queens using a Runner (i.e., grid search)#
Runners are used to execute "grid search" experiments on optimization problems.
We'll use the Simulated Annealing Runner (SARunner) to perform a grid search on the 8-Queens problem and then extract the optimal SA hyperparameters.
Here is a brief explanation of the SARunner parameters used in the example below:
- max_attempts
: A list of maximum attempts to try improving the fitness score before terminating a run
- temperature_list
: A list of temperatures to try when initializing the SA algorithm's decay function (e.g., GeomDecay(init_temp=1.0)
)
- decay_list
: A list of decay schedules to try as the SA algorithm's decay function (e.g., GeomDecay
, ExpDecay
, etc.)
- iteration_list
: A list of iterations to snapshot the state of the algorithm at (only determines the rows that the Runner will output)
Disclaimer: the values used here are just toy values picked specifically for this example.
You will have to choose your own range of values for your experiments.
I strongly recommend you don't just copy these, or you will find that the grading is unlikely to go the way you would like.
# Create an SA Runner instance to solve the problem
sa = SARunner(
problem=problem,
experiment_name="queens_8_sa",
seed=123456,
output_directory=None, # Note: specify an output directory (str) to have these results saved to disk
max_attempts=100,
temperature_list=[0.1, 0.5, 0.75, 1.0, 2.0, 5.0],
decay_list=[mlrose.GeomDecay],
iteration_list=2 ** np.arange(11), # list of 11 integers from 2^0 to 2^11
)
# Run the SA Runner and retrieve its results
df_run_stats, df_run_curves = sa.run()
# Calculate some simple stats about the experiment
temperatures_per_run = max(1, len(sa.temperature_list))
decays_per_run = max(1, len(sa.decay_list))
iters_per_run = len(sa.iteration_list) + 1
total_runs = temperatures_per_run * decays_per_run
print(f"The experiment executed {total_runs} runs, each with {iters_per_run} snapshots at different iterations.")
print(f"In total, the output dataframe should contain {total_runs * iters_per_run} rows.")
The experiment executed 6 runs, each with 12 snapshots at different iterations.
In total, the output dataframe should contain 72 rows.
The df_run_stats
dataframe contains snapshots of the state of the algorithm at the iterations specified in the iteration_list
.
Since iterations_list
contains 11 numbers, and iteration 0 is always included in the results, each run will take up 12 rows of the dataframe.
The first 12 rows (i.e., results from the first run) are shown below:
HTML(df_run_stats[["Iteration", "Fitness", "FEvals", "Time", "State"]][:iters_per_run].to_html(index=False))
Iteration | Fitness | FEvals | Time | State |
---|---|---|---|---|
0 | 11.0 | 0 | 0.000856 | [1, 2, 2, 1, 0, 3, 7, 3] |
1 | 9.0 | 2 | 0.002331 | [1, 2, 2, 0, 0, 3, 7, 3] |
2 | 8.0 | 4 | 0.002864 | [1, 2, 2, 0, 0, 3, 7, 5] |
4 | 8.0 | 7 | 0.003399 | [1, 2, 2, 5, 0, 3, 7, 5] |
8 | 5.0 | 13 | 0.004127 | [1, 2, 7, 5, 0, 3, 5, 5] |
16 | 4.0 | 24 | 0.005042 | [1, 2, 7, 5, 3, 0, 5, 5] |
32 | 4.0 | 47 | 0.006737 | [1, 5, 7, 5, 0, 0, 3, 4] |
64 | 1.0 | 86 | 0.009472 | [1, 5, 2, 6, 3, 0, 7, 4] |
128 | 1.0 | 155 | 0.014421 | [1, 5, 2, 6, 3, 0, 4, 7] |
256 | 1.0 | 295 | 0.025960 | [1, 7, 2, 6, 3, 5, 0, 4] |
512 | 0.0 | 461 | 0.044061 | [1, 5, 0, 6, 3, 7, 2, 4] |
1024 | 0.0 | 461 | 0.044061 | [1, 5, 0, 6, 3, 7, 2, 4] |
Some information was intentionally excluded from the previous output. Let's now preview the entirety of the first run:
Iteration | Fitness | FEvals | Time | State | schedule_type | schedule_init_temp | schedule_decay | schedule_min_temp | schedule_current_value | Temperature | max_iters |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 11.0 | 0 | 0.000856 | [1, 2, 2, 1, 0, 3, 7, 3] | geometric | 0.1 | 0.99 | 0.001 | 0.099999 | 0.1 | 1024 |
1 | 9.0 | 2 | 0.002331 | [1, 2, 2, 0, 0, 3, 7, 3] | geometric | 0.1 | 0.99 | 0.001 | 0.099998 | 0.1 | 1024 |
2 | 8.0 | 4 | 0.002864 | [1, 2, 2, 0, 0, 3, 7, 5] | geometric | 0.1 | 0.99 | 0.001 | 0.099997 | 0.1 | 1024 |
4 | 8.0 | 7 | 0.003399 | [1, 2, 2, 5, 0, 3, 7, 5] | geometric | 0.1 | 0.99 | 0.001 | 0.099997 | 0.1 | 1024 |
8 | 5.0 | 13 | 0.004127 | [1, 2, 7, 5, 0, 3, 5, 5] | geometric | 0.1 | 0.99 | 0.001 | 0.099996 | 0.1 | 1024 |
16 | 4.0 | 24 | 0.005042 | [1, 2, 7, 5, 3, 0, 5, 5] | geometric | 0.1 | 0.99 | 0.001 | 0.099995 | 0.1 | 1024 |
32 | 4.0 | 47 | 0.006737 | [1, 5, 7, 5, 0, 0, 3, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099993 | 0.1 | 1024 |
64 | 1.0 | 86 | 0.009472 | [1, 5, 2, 6, 3, 0, 7, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099990 | 0.1 | 1024 |
128 | 1.0 | 155 | 0.014421 | [1, 5, 2, 6, 3, 0, 4, 7] | geometric | 0.1 | 0.99 | 0.001 | 0.099986 | 0.1 | 1024 |
256 | 1.0 | 295 | 0.025960 | [1, 7, 2, 6, 3, 5, 0, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099974 | 0.1 | 1024 |
512 | 0.0 | 461 | 0.044061 | [1, 5, 0, 6, 3, 7, 2, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099956 | 0.1 | 1024 |
1024 | 0.0 | 461 | 0.044061 | [1, 5, 0, 6, 3, 7, 2, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099956 | 0.1 | 1024 |
What does all this information tell us about the first run of our experiment?
Iteration
shows the index of the snapshot, with 12 snapshots per run in this example.Fitness
shows the fitness score of the state at the corresponding iteration, where 0.0 is optimal for minimization problems.FEvals
shows the number of fitness function evaluations performed by the algorithm at the corresponding iteration.Time
shows the time elapsed (calculated usingtime.perf_counter()
) up to the corresponding iteration.State
shows the state of the algorithm at the corresponding iteration (seemlrose_ky.fitness.queens
for more details).Temperature
shows the decay function (and its parameters) that was used for this run. We will have 6 unique values in this column, one for each run.max_iters
shows the maximum number of iterations allowed for the algorithm to run. It defaults tomax(iteration_list)
for all Runners, which is 1024 in this case.
To pick out the most performant run from the dataframe, we need to find the row with the best fitness.
Since Queens is a minimization problem, we're looking for the row with minimal fitness (i.e., zero).
It's likely that multiple runs will achieve the same fitness, so we need to find the run that achieved the best Fitness
in the fewest FEvals
(Note: we could make this selection using Iterations
or Time
if we so desired.)
best_fitness = df_run_stats["Fitness"].min() # Should be 0.0 in this case
# Get all runs with the best fitness value
best_runs = df_run_stats[df_run_stats["Fitness"] == best_fitness]
best_runs
Iteration | Fitness | FEvals | Time | State | schedule_type | schedule_init_temp | schedule_decay | schedule_min_temp | schedule_current_value | Temperature | max_iters | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
10 | 512 | 0.0 | 461 | 0.044061 | [1, 5, 0, 6, 3, 7, 2, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099956 | 0.1 | 1024 |
11 | 1024 | 0.0 | 461 | 0.044061 | [1, 5, 0, 6, 3, 7, 2, 4] | geometric | 0.1 | 0.99 | 0.001 | 0.099956 | 0.1 | 1024 |
22 | 512 | 0.0 | 461 | 0.040817 | [1, 5, 0, 6, 3, 7, 2, 4] | geometric | 0.5 | 0.99 | 0.001 | 0.499795 | 0.5 | 1024 |
23 | 1024 | 0.0 | 461 | 0.040817 | [1, 5, 0, 6, 3, 7, 2, 4] | geometric | 0.5 | 0.99 | 0.001 | 0.499795 | 0.5 | 1024 |
58 | 512 | 0.0 | 427 | 0.044455 | [7, 1, 3, 0, 6, 4, 2, 5] | geometric | 2.0 | 0.99 | 0.001 | 1.999107 | 2.0 | 1024 |
59 | 1024 | 0.0 | 427 | 0.044455 | [7, 1, 3, 0, 6, 4, 2, 5] | geometric | 2.0 | 0.99 | 0.001 | 1.999107 | 2.0 | 1024 |
70 | 512 | 0.0 | 583 | 0.061001 | [6, 0, 2, 7, 5, 3, 1, 4] | geometric | 5.0 | 0.99 | 0.001 | 4.996936 | 5.0 | 1024 |
71 | 1024 | 0.0 | 583 | 0.061001 | [6, 0, 2, 7, 5, 3, 1, 4] | geometric | 5.0 | 0.99 | 0.001 | 4.996936 | 5.0 | 1024 |
This gives us our candidates for the best run.
The best run will be the one that achieved the best fitness in the fewest evaluations.
minimum_evaluations = best_runs["FEvals"].min() # Should be 461 in this case
# Extract the best run with the minimum number of evaluations
best_run = best_runs[best_runs["FEvals"] == minimum_evaluations]
The best run using these criteria is as follows:
Iteration 512
Fitness 0.0
FEvals 427
Time 0.044455
State [7, 1, 3, 0, 6, 4, 2, 5]
schedule_type geometric
schedule_init_temp 2.0
schedule_decay 0.99
schedule_min_temp 0.001
schedule_current_value 1.999107
Temperature 2.0
max_iters 1024
Name: 58, dtype: object
Which has the following identifying state information:
2.0
To map this result back to the original output of the Runner, we are looking for all rows in df_run_stats
where the temperature is equal to 2.
run_stats_best_run = df_run_stats[df_run_stats["schedule_init_temp"] == best_temperature_param]
run_stats_best_run[["Iteration", "Fitness", "FEvals", "Time", "State"]]
Iteration | Fitness | FEvals | Time | State | |
---|---|---|---|---|---|
48 | 0 | 11.0 | 0 | 0.000136 | [1, 2, 2, 1, 0, 3, 7, 3] |
49 | 1 | 9.0 | 2 | 0.001460 | [1, 2, 2, 0, 0, 3, 7, 3] |
50 | 2 | 8.0 | 4 | 0.002754 | [1, 2, 2, 0, 0, 3, 7, 5] |
51 | 4 | 8.0 | 7 | 0.004162 | [1, 2, 2, 5, 0, 3, 7, 5] |
52 | 8 | 7.0 | 14 | 0.005876 | [1, 2, 2, 5, 0, 3, 5, 5] |
53 | 16 | 6.0 | 27 | 0.007869 | [3, 2, 3, 5, 0, 1, 5, 5] |
54 | 32 | 4.0 | 57 | 0.011020 | [3, 5, 6, 5, 5, 0, 4, 7] |
55 | 64 | 5.0 | 114 | 0.015731 | [2, 0, 3, 6, 1, 2, 1, 7] |
56 | 128 | 3.0 | 205 | 0.022950 | [2, 0, 6, 3, 5, 0, 4, 3] |
57 | 256 | 2.0 | 358 | 0.036341 | [7, 1, 3, 6, 6, 4, 0, 5] |
58 | 512 | 0.0 | 427 | 0.044455 | [7, 1, 3, 0, 6, 4, 2, 5] |
59 | 1024 | 0.0 | 427 | 0.044455 | [7, 1, 3, 0, 6, 4, 2, 5] |
And the best state associated with this is:
best_state = run_stats_best_run[["schedule_current_value", "schedule_init_temp", "schedule_min_temp"]].tail(1)
best_state
schedule_current_value | schedule_init_temp | schedule_min_temp | |
---|---|---|---|
59 | 1.999107 | 2.0 | 0.001 |
The final state is as follows:
state = literal_eval(run_stats_best_run["State"].tail(1).values[0])
print(state)
board_layout = "/".join(["".join(([str(s)] if s > 0 else []) + ["Q"] + ([str((7 - s))] if s < 7 else [])) for s in state])
board = chess.Board(board_layout)
board
[7, 1, 3, 0, 6, 4, 2, 5]
Example 2: Generating and Running Max K Color using the GA algorithm#
# Generate a new Max K problem using a fixed seed.
problem = MaxKColorGenerator().generate(seed=123456, number_of_nodes=10, max_connections_per_node=3, max_colors=3)
The input graph generated for the problem looks like this:
# create a runner class and solve the problem
ga = GARunner(
problem=problem,
experiment_name="max_k_ga",
output_directory=None, # note: specify an output directory to have results saved to disk
seed=123456,
iteration_list=2 ** np.arange(11),
population_sizes=[10, 20, 50],
mutation_rates=[0.1, 0.2, 0.5],
)
# the two data frames will contain the results
df_run_stats, df_run_curves = ga.run()
The preceding code will run the GA
algorithm nine times for at most 1024 iterations per run.
Each run is a permutation from the list of population_sizes
and mutation_rates
.
Note that the initial state parameters here are just toy values picked specifically
for this example. You will have to choose your own range of values for your
assignment. I strongly recommend you don't just copy these, or you will find
that the grading is unlikely to go the way you would like.
Really. I mean it... A mutation rate of 0.5 is little better than a pure random search.
The output in the df_run_stats
dataframe contains snapshots of the state of the algorithm at the iterations
specified in the iteration_list
passed into the runner class.
The first row (corresponding to the first run of this algorithm) are as follows:
Iteration | Fitness | FEvals | Time | State | |
---|---|---|---|---|---|
0 | 0 | 3.0 | 10 | 0.000702 | [1, 2, 2, 1, 0, 0, 0, 0, 2, 2] |
The state information is excluded from the previous output.
A sample of this is below:
Population Size | Mutation Rate | |
---|---|---|
0 | 10 | 0.1 |
So, to pick out the most performant run from the dataframe, you need to find the row with the best fitness.
As Max-K-Color is a minimization problem, you'd pick the row with the minimum fitness.
However, I'm going to look in the run_curves
(which stores minimal basic information every iteration) to
find out which input state achieved the best fitness in the fewest fitness evaluations.
best_fitness = df_run_curves["Fitness"].min()
best_runs = df_run_curves[df_run_curves["Fitness"] == best_fitness]
best_runs
Iteration | Time | Fitness | FEvals | Population Size | Mutation Rate | max_iters | |
---|---|---|---|---|---|---|---|
4 | 4 | 0.003967 | 0.0 | 57.0 | 10 | 0.1 | 1024 |
9 | 4 | 0.003967 | 0.0 | 57.0 | 10 | 0.2 | 1024 |
18 | 8 | 0.001445 | 0.0 | 101.0 | 10 | 0.5 | 1024 |
24 | 5 | 0.000106 | 0.0 | 127.0 | 20 | 0.1 | 1024 |
27 | 2 | 0.003304 | 0.0 | 64.0 | 20 | 0.2 | 1024 |
31 | 3 | 0.003797 | 0.0 | 85.0 | 20 | 0.5 | 1024 |
41 | 9 | 0.001615 | 0.0 | 511.0 | 50 | 0.1 | 1024 |
45 | 3 | 0.003797 | 0.0 | 205.0 | 50 | 0.2 | 1024 |
50 | 4 | 0.003967 | 0.0 | 256.0 | 50 | 0.5 | 1024 |
This gives us nine candidates for the best run. We are going to pick the one with
that reached the best fitness value in the fewest number of evaluations.
(We could also have chosen to use Iterations
as our criteria.)
minimum_evaluations = best_runs["FEvals"].min()
best_run = best_runs[best_runs["FEvals"] == minimum_evaluations]
The best runs using these criteria is as follows:
Iteration | Time | Fitness | FEvals | Population Size | Mutation Rate | max_iters | |
---|---|---|---|---|---|---|---|
4 | 4 | 0.003967 | 0.0 | 57.0 | 10 | 0.1 | 1024 |
9 | 4 | 0.003967 | 0.0 | 57.0 | 10 | 0.2 | 1024 |
We will arbitrarily pick the first row for this example,
which has the following identifying state information:
best_mr = best_run["Mutation Rate"].iloc()[0]
best_pop_size = best_run["Population Size"].iloc()[0]
print(f"Best Mutation Rate: {best_mr}, best Population Size: {best_pop_size}")
Best Mutation Rate: 0.1, best Population Size: 10
To map this back to the run_stats
we look at the configuration data included in
the curve data. The curve data includes at least the minimum identifying information
to determine which run each row came from.
In this case, the values we are looking for are the Mutation Rate
and Population Size
.
So, we are looking for all rows in df_run_stats
where the mutation rate and population size are equal to our best values.
run_stats_best_run = df_run_stats[(df_run_stats["Mutation Rate"] == best_mr) & (df_run_stats["Population Size"] == best_pop_size)]
run_stats_best_run[["Iteration", "Fitness", "FEvals", "Time"]]
Iteration | Fitness | FEvals | Time | |
---|---|---|---|---|
0 | 0 | 3.0 | 10 | 0.000702 |
1 | 1 | 3.0 | 21 | 0.002124 |
2 | 2 | 2.0 | 33 | 0.003304 |
3 | 4 | 0.0 | 57 | 0.003967 |
4 | 8 | 0.0 | 57 | 0.003967 |
5 | 16 | 0.0 | 57 | 0.003967 |
6 | 32 | 0.0 | 57 | 0.003967 |
7 | 64 | 0.0 | 57 | 0.003967 |
8 | 128 | 0.0 | 57 | 0.003967 |
9 | 256 | 0.0 | 57 | 0.003967 |
10 | 512 | 0.0 | 57 | 0.003967 |
11 | 1024 | 0.0 | 57 | 0.003967 |
And the best state associated with this is:
State | |
---|---|
11 | [0, 1, 2, 0, 0, 2, 0, 1, 2, 1] |
For the following node ordering:
[0, 2, 8, 1, 3, 4, 6, 7, 9, 5]
Reordering the state by ascending node number gives the following:
color_indexes = literal_eval(run_stats_best_run["State"].tail(1).values[0])
ordered_state = [color_indexes[n] for n in problem.source_graph.nodes]
print(ordered_state)
[0, 2, 2, 1, 0, 0, 0, 1, 1, 2]
Which results in a graph looking like this:
colors = ["lightcoral", "lightgreen", "yellow"]
node_color_map = [colors[s] for s in ordered_state]
nx.draw(problem.source_graph, pos=nx.spring_layout(problem.source_graph, seed=3), with_labels=True, node_color=node_color_map)
plt.show()
Example 3: Generating and Running TSP using the GA algorithm#
# Generate a new TSP problem using a fixed seed.
problem = TSPGenerator().generate(seed=123456, number_of_cities=20)
The input graph generated for the problem looks like this:
fig, ax = plt.subplots(1) # Prepare 2 plots
ax.set_yticklabels([])
ax.set_xticklabels([])
for i, (x, y) in enumerate(problem.coords):
ax.scatter(x, y, s=200, c="cornflowerblue") # plot A
node_labels = {k: str(v) for k, v in enumerate(string.ascii_uppercase) if k < len(problem.source_graph.nodes)}
for i in node_labels.keys():
x, y = problem.coords[i]
plt.text(
x,
y,
node_labels[i],
ha="center",
va="center",
c="white",
fontweight="bold",
bbox=dict(boxstyle=f"circle,pad=0.15", fc="cornflowerblue"),
)
plt.tight_layout()
plt.show()
# create a runner class and solve the problem
ga = GARunner(
problem=problem,
experiment_name="tsp_ga",
output_directory=None, # note: specify an output directory to have results saved to disk
seed=123456,
iteration_list=2 ** np.arange(11),
population_sizes=[10, 20],
mutation_rates=[0.1, 0.25, 0.5],
)
# the two data frames will contain the results
df_run_stats, df_run_curves = ga.run()
The preceding code will run the GA
algorithm nine times for at most 1024 iterations per run.
Each run is a permutation from the list of population_sizes
and mutation_rates
.
Note that the initial state parameters here are just toy values picked specifically
for this example. You will have to choose your own range of values for your
assignment. I strongly recommend you don't just copy these, or you will find
that the grading is unlikely to go the way you would like.
Really. I mean it... A mutation rate of 0.5 is little better than a pure random search.
The output in the df_run_stats
dataframe contains snapshots of the state of the algorithm at the iterations
specified in the iteration_list
passed into the runner class.
The first row (corresponding to the first run of this algorithm) are as follows:
Iteration | Fitness | FEvals | Time | State | |
---|---|---|---|---|---|
0 | 0 | 2722.031402 | 10 | 0.00062 | [19, 13, 12, 9, 5, 6, 2, 3, 18, 8, 4, 7, 0, 14... |
The state information is excluded from the previous output.
A sample of this is below:
Population Size | Mutation Rate | |
---|---|---|
0 | 10 | 0.1 |
So, to pick out the most performant run from the dataframe, you need to find the row with the best fitness.
As TSP is a minimization problem, you'd pick the row with the minimum fitness.
However, I'm going to look in the run_curves
(which stores minimal basic information every iteration) to
find out which input state achieved the best fitness in the fewest fitness evaluations.
best_fitness = df_run_curves["Fitness"].min()
best_runs = df_run_curves[df_run_curves["Fitness"] == best_fitness]
best_runs[:10]
Iteration | Time | Fitness | FEvals | Population Size | Mutation Rate | max_iters | |
---|---|---|---|---|---|---|---|
3709 | 707 | 0.296142 | 941.582778 | 14903.0 | 20 | 0.1 | 1024 |
3710 | 708 | 0.296608 | 941.582778 | 14924.0 | 20 | 0.1 | 1024 |
3711 | 709 | 0.297102 | 941.582778 | 14945.0 | 20 | 0.1 | 1024 |
3712 | 710 | 0.297565 | 941.582778 | 14966.0 | 20 | 0.1 | 1024 |
3713 | 711 | 0.298021 | 941.582778 | 14987.0 | 20 | 0.1 | 1024 |
3714 | 712 | 0.298481 | 941.582778 | 15008.0 | 20 | 0.1 | 1024 |
3715 | 713 | 0.298940 | 941.582778 | 15029.0 | 20 | 0.1 | 1024 |
3716 | 714 | 0.299380 | 941.582778 | 15050.0 | 20 | 0.1 | 1024 |
3717 | 715 | 0.299820 | 941.582778 | 15071.0 | 20 | 0.1 | 1024 |
3718 | 716 | 0.300268 | 941.582778 | 15092.0 | 20 | 0.1 | 1024 |
This gives us nine candidates for the best run. We are going to pick the one with
that reached the best fitness value in the fewest number of evaluations.
(We could also have chosen to use Iterations
as our criteria.)
minimum_evaluations = best_runs["FEvals"].min()
best_run = best_runs[best_runs["FEvals"] == minimum_evaluations]
The best runs using these criteria is as follows:
Iteration | Time | Fitness | FEvals | Population Size | Mutation Rate | max_iters | |
---|---|---|---|---|---|---|---|
3709 | 707 | 0.296142 | 941.582778 | 14903.0 | 20 | 0.1 | 1024 |
This has the following identifying state information:
best_mr = best_run["Mutation Rate"].iloc()[0]
best_pop_size = best_run["Population Size"].iloc()[0]
print(f"Best Mutation Rate: {best_mr}, best Population Size: {best_pop_size}")
Best Mutation Rate: 0.1, best Population Size: 20
To map this back to the run_stats
we look at the configuration data included in
the curve data. The curve data includes at least the minimum identifying information
to determine which run each row came from.
In this case, the values we are looking for are the Mutation Rate
and Population Size
.
So, we are looking for all rows in df_run_stats
where the mutation rate and population size are equal to our best values.
run_stats_best_run = df_run_stats[(df_run_stats["Mutation Rate"] == best_mr) & (df_run_stats["Population Size"] == best_pop_size)]
run_stats_best_run[["Iteration", "Fitness", "FEvals", "Time"]]
Iteration | Fitness | FEvals | Time | |
---|---|---|---|---|
36 | 0 | 2722.031402 | 20 | 0.000232 |
37 | 1 | 2141.868537 | 42 | 0.002583 |
38 | 2 | 2141.868537 | 63 | 0.004905 |
39 | 4 | 2141.868537 | 105 | 0.007919 |
40 | 8 | 2044.319536 | 190 | 0.012401 |
41 | 16 | 1807.055513 | 360 | 0.019740 |
42 | 32 | 1785.714484 | 697 | 0.032535 |
43 | 64 | 1479.922718 | 1376 | 0.059770 |
44 | 128 | 1151.614761 | 2731 | 0.114027 |
45 | 256 | 1081.456468 | 5421 | 0.211686 |
46 | 512 | 970.009048 | 10803 | 0.410439 |
47 | 1024 | 941.582778 | 21560 | 0.831667 |
And the best state associated with this is:
State | |
---|---|
47 | [9, 0, 19, 17, 14, 15, 16, 8, 12, 5, 1, 6, 18,... |
Which results in a graph looking like this:
ordered_state = literal_eval(run_stats_best_run["State"].tail(1).values[0])
edge_labels = {(ordered_state[i], ordered_state[(i + 1) % len(ordered_state)]): f"{str(i + 1)}➜" for i in range(len(ordered_state))}
fig, ax = plt.subplots(1) # Prepare 2 plots
ax.set_yticklabels([])
ax.set_xticklabels([])
for i, (x, y) in enumerate(problem.coords):
ax.scatter(x, y, s=1, c="green" if i == 5 else "cornflowerblue") # plot A
for i in range(len(ordered_state)):
start_node = ordered_state[i]
end_node = ordered_state[(i + 1) % len(ordered_state)]
start_pos = problem.coords[start_node]
end_pos = problem.coords[end_node]
ax.annotate(
"",
xy=start_pos,
xycoords="data",
xytext=end_pos,
textcoords="data",
c="red",
arrowprops=dict(arrowstyle="->", ec="red", connectionstyle="arc3"),
)
node_labels = {k: str(v) for k, v in enumerate(string.ascii_uppercase) if k < len(problem.source_graph.nodes)}
for i in node_labels.keys():
x, y = problem.coords[i]
plt.text(
x,
y,
node_labels[i],
ha="center",
va="center",
c="white",
fontweight="bold",
bbox=dict(boxstyle=f"circle,pad=0.15", fc="green" if i == ordered_state[0] else "cornflowerblue"),
)
plt.tight_layout()
plt.show()
And, to verify that the route is correct (or at least, the shortest one found):
all_edge_lengths = {(x, y): d for x, y, d in problem.distances}
all_edge_lengths.update({(y, x): d for x, y, d in problem.distances})
route_length = sum([all_edge_lengths[k] for k in edge_labels.keys()])
print(f"route_length: ({round(route_length, 6)}) equal to best_fitness: ({round(best_fitness, 6)})")
route_length: (941.582778) equal to best_fitness: (941.582778)
Example 4: Using the NNGSRunner with the RHC algorithm#
# Load and Split data into training and test sets
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
data.data, data.target, test_size=0.3, random_state=123456
)
# Normalize feature data
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# One hot encode target values
one_hot = OneHotEncoder()
y_train_hot = np.asarray(one_hot.fit_transform(y_train.reshape(-1, 1)).todense())
y_test_hot = np.asarray(one_hot.transform(y_test.reshape(-1, 1)).todense())
grid_search_parameters = {
'max_iters': [1000], # nn params
'learning_rate': [1e-2], # nn params
'activation': [mlrose.relu], # nn params
'restarts': [1], # rhc params
}
nnr = NNGSRunner(x_train=X_train_scaled, y_train=y_train_hot, x_test=X_test_scaled, y_test=y_test_hot, experiment_name='nn_test_rhc',
algorithm=mlrose.algorithms.rhc.random_hill_climb, grid_search_parameters=grid_search_parameters,
iteration_list=[1, 10, 50, 100, 250, 500, 1000], hidden_layer_sizes=[[2]], clip_max=5, n_jobs=5, seed=123456)
run_stats_df, curves_df, cv_results_df, grid_search_cv = nnr.run()
Fitting 5 folds for each of 1 candidates, totalling 5 fits
The runner returns the run_stats
and curves
corresponding to best hyperparameter combination,
as well as the cross validation results and the underlying GridSearchCV
object used in the run.
y_test_pred = grid_search_cv.predict(X_test_scaled)
y_test_accuracy = accuracy_score(np.asarray(y_test_hot), y_test_pred)
y_test_accuracy
0.3333333333333333
y_train_pred = grid_search_cv.predict(X_train_scaled)
y_train_accuracy = accuracy_score(y_train_hot, y_train_pred)
y_train_accuracy
0.3333333333333333
Run stats dataframe
current_restart | Iteration | Fitness | FEvals | Time | learning_rate | |
---|---|---|---|---|---|---|
0 | 0 | 0 | 1.306448 | 1 | 0.001905 | 0.01 |
1 | 0 | 1 | 1.290536 | 3 | 0.004327 | 0.01 |
2 | 0 | 10 | 1.246087 | 14 | 0.009143 | 0.01 |
3 | 0 | 50 | 1.115004 | 67 | 0.027937 | 0.01 |
4 | 0 | 100 | 1.044654 | 127 | 0.048129 | 0.01 |
5 | 0 | 250 | 0.877647 | 308 | 0.107263 | 0.01 |
6 | 0 | 500 | 0.739120 | 614 | 0.212110 | 0.01 |
7 | 0 | 1000 | 0.732424 | 1175 | 0.433252 | 0.01 |
8 | 1 | 0 | 1.306448 | 1175 | 0.436525 | 0.01 |
9 | 1 | 1 | 1.306448 | 1176 | 0.438441 | 0.01 |
10 | 1 | 10 | 1.247648 | 1188 | 0.444550 | 0.01 |
11 | 1 | 50 | 1.148455 | 1239 | 0.465959 | 0.01 |
12 | 1 | 100 | 1.105441 | 1298 | 0.491722 | 0.01 |
13 | 1 | 250 | 0.931072 | 1490 | 0.570107 | 0.01 |
curves dataframe
current_restart | Iteration | Fitness | FEvals | Time | learning_rate | |
---|---|---|---|---|---|---|
0 | 0 | 0 | 1.290536 | 1.0 | 0.001905 | 0.01 |
1 | 0 | 1 | 1.290536 | 3.0 | 0.004327 | 0.01 |
2 | 0 | 2 | 1.290536 | 4.0 | 0.005450 | 0.01 |
3 | 0 | 3 | 1.290536 | 5.0 | 0.005843 | 0.01 |
4 | 0 | 4 | 1.263664 | 7.0 | 0.006530 | 0.01 |
5 | 0 | 5 | 1.263664 | 8.0 | 0.006857 | 0.01 |
6 | 0 | 6 | 1.246087 | 10.0 | 0.007742 | 0.01 |
7 | 0 | 7 | 1.246087 | 11.0 | 0.008119 | 0.01 |
8 | 0 | 8 | 1.246087 | 12.0 | 0.008459 | 0.01 |
9 | 0 | 9 | 1.246087 | 13.0 | 0.008787 | 0.01 |
10 | 0 | 10 | 1.246087 | 14.0 | 0.009143 | 0.01 |
11 | 0 | 11 | 1.246087 | 15.0 | 0.010364 | 0.01 |
12 | 0 | 12 | 1.246087 | 16.0 | 0.010792 | 0.01 |
13 | 0 | 13 | 1.246087 | 17.0 | 0.011168 | 0.01 |
14 | 0 | 14 | 1.246087 | 18.0 | 0.011561 | 0.01 |
15 | 0 | 15 | 1.246087 | 19.0 | 0.011916 | 0.01 |
16 | 0 | 16 | 1.246087 | 20.0 | 0.012291 | 0.01 |
17 | 0 | 17 | 1.232142 | 22.0 | 0.013562 | 0.01 |
18 | 0 | 18 | 1.216927 | 24.0 | 0.014337 | 0.01 |
19 | 0 | 19 | 1.216927 | 25.0 | 0.014832 | 0.01 |
cv results dataframe
cv_results_df[
[
"mean_test_score",
"rank_test_score",
"mean_train_score",
"param_activation",
"param_hidden_layer_sizes",
"param_learning_rate",
"param_max_iters",
"param_restarts",
]
]
mean_test_score | rank_test_score | mean_train_score | param_activation | param_hidden_layer_sizes | param_learning_rate | param_max_iters | param_restarts | |
---|---|---|---|---|---|---|---|---|
0 | 0.333333 | 1 | 0.333333 | relu | [2] | 0.01 | 1000 | 1 |