
\definecolor{commentsColor}{rgb}{0.497495, 0.497587, 0.497464}
\definecolor{keywordsColor}{rgb}{0.000000, 0.000000, 0.635294}
\definecolor{stringColor}{rgb}{0.558215, 0.000000, 0.135316}
backgroundcolor=\color{white}, % choose the background color
basicstyle=\small\ttfamily, % the size of the fonts that are used for the code
breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
breaklines=true, % sets automatic line breaking
captionpos=b, % sets the caption-position to bottom
commentstyle=\color{commentsColor}\textit, % comment style
deletekeywords={}, % if you want to delete keywords from the given language
escapeinside={\%*}{*)}, % if you want to add LaTeX within your code
extendedchars=true, % lets you use non-ASCII characters; for 8-bits encodings only, does not work with UTF-8
%frame=tb, % adds a frame around the code
keepspaces=true, % keeps spaces in text, useful for keeping indentation of code (possibly needs columns=flexible)
keywordstyle=\color{keywordsColor}\bfseries, % keyword style
language=C++, % the language of the code (can be overridden per snippet)
otherkeywords={rank\_t, customerID\_t, distance\_t, fitness\_t}, % if you want to add more keywords to the set
numbers=left, % where to put the line-numbers; possible values are (none, left, right)
numbersep=5pt, % how far the line-numbers are from the code
numberstyle=\tiny\color{commentsColor}, % the style that is used for the line-numbers
rulecolor=\color{black}, % if not set, the frame-color may be changed on line-breaks within not-black text (e.g. comments (green here))
showspaces=false, % show spaces everywhere adding particular underscores; it overrides 'showstringspaces'
showstringspaces=false, % underline spaces within strings only
showtabs=false, % show tabs within strings adding particular underscores
stepnumber=1, % the step between two line-numbers. If it's 1, each line will be numbered
stringstyle=\color{stringColor}, % string literal style
tabsize=4, % sets default tabsize to 4 spaces
title=\lstname, % show the filename of files included with \lstinputlisting; also try caption instead of title
columns=fixed % Using fixed column width (for e.g. nice alignment)
attach boxed title to top center
= {yshift=-8pt},
colback = black!5!white,
colframe = black!75!black,
fonttitle = \bfseries,
colbacktitle = gray!85!black,
title = #2,#1,
% Title Page
\title{\textbf{COSC 4P82 Assignment 1}}
\author{\textbf{Brett Terpstra}\\ - 692021}
\section{Symbolic regression}
\subsection{Parameter Table}
\begin{tabularx}{0.8\textwidth}{ | >{\centering\arraybackslash}X | >{\centering\arraybackslash}X | }
Parameter & Value \\ [0.25ex]
Runs & 10 \\
Population Size & 5000 \\
Generations & 50 \\
Training Set & N/A \\
Testing Set & N/A \\
Crossover Operator & Subtree Crossover\\
Mutation Operator & Grow Tree, Max Depth 4 \\
Crossover Rate & 0.9 or 1.0* \\
Mutation Rate & 0.1 or 1.0* \\
Elitism & Best 2 or 0 individuals Survive* \\
Selection & Fitness Proportionate \\
Function Set & *, /, +, -, exp, log, sin, cos \\
Terminal Set & X, Ephemeral Value \\
Tree Initialization & Half and Half, Max Depth 2-6 \\
Max Tree Depth & 17 \\
Raw Fitness & See Fitness Evaluation \\
Standardized Fitness & = Raw Fitness \\
*4 tests were run: 0.9 crossover with 0.1 mutation, and 1.0 crossover with 1.0 mutation, each with 0 elites and with 2 elites.
\subsection{Fitness Evaluation}
Fitness is evaluated by taking the absolute value of the predicted y value minus the actual y value.
If the difference is less than a user-provided cutoff (default 1.e15), it is added to the fitness value. If the difference is less than the float epsilon value ($\approx 0$), the number of hits is incremented. Lower fitness values are preferred.
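
As a sketch of the description above, using notation assumed here rather than taken from the source ($\hat{y}_i$ for the predicted value on case $i$, $y_i$ for the actual value, $c$ for the cutoff, and $\varepsilon$ for the float epsilon), the per-case error, the fitness, and the hit count can be written as:
\[
d_i = \left| \hat{y}_i - y_i \right|, \qquad
\mathrm{fitness} = \sum_{i \,:\, d_i < c} d_i, \qquad
\mathrm{hits} = \left| \{\, i : d_i < \varepsilon \,\} \right|
\]
Under this reading, a case whose error reaches the cutoff contributes nothing to the sum, and a program that matches every case exactly has a fitness of 0.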
\subsection{Fitness Plots}
\caption{2 Elites, 10 Runs Averaged}
\caption[]{0 Elites, 10 Runs Averaged}
\subsection{Analysis and Conclusion}
The best average fitness of all the tests was 0.19384 using 0.9 crossover and 0.1 mutation.
\section{Rice Classification}
\subsection{Parameter Table}
\begin{tabularx}{0.8\textwidth}{ | >{\centering\arraybackslash}X | >{\centering\arraybackslash}X | }
Parameter & Value \\ [0.25ex]
Runs & 10 \\
Population Size & 5000 \\
Generations & 51 \\
Training Set & Rice Classification (Cammeo and Osmancik) \\
Testing Set & Rice Classification (Cammeo and Osmancik) \\
Crossover Operator & Subtree Crossover\\
Mutation Operator & Grow Tree, Max Depth 4 \\
Crossover Rate & 0.9 or 0.9* \\
Mutation Rate & 0.1 or 0.9* \\
Elitism & Best 2 individuals Survive \\
Selection & Fitness Proportionate \\
Function Set & *, /, +, -, exp, log \\
Terminal Set & area, perimeter, major, minor, eccentricity, convex, extent, Ephemeral Value \\
Tree Initialization & Half and Half, Max Depth 2-6 \\
Max Tree Depth & 17 \\
Raw Fitness & See Fitness Evaluation \\
Standardized Fitness & = Raw Fitness \\
\subsection{Fitness Evaluation}
Evaluated on the input terminal values, each GP program produces a positive or negative value, which is interpreted as either Cammeo ($+$) or Osmancik ($-$). Raw fitness is equal to the number of hits, which is the number of correct classifications. The adjusted fitness is then calculated and subtracted from 1 to invert it, so that lower fitness is better, as required.
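
As a sketch of the rule described above, using notation assumed here rather than taken from the source ($g$ for the evolved program, $x_i$ for the terminal values of case $i$, $\mathrm{label}_i$ for its true class, and $N$ for the number of cases), the prediction and hit count can be written as follows. How an output of exactly zero is classified is not stated in the text, so it is left open here.
\[
\mathrm{class}(x_i) = \mathrm{Cammeo}\ \mathrm{if}\ g(x_i) > 0, \qquad
\mathrm{class}(x_i) = \mathrm{Osmancik}\ \mathrm{if}\ g(x_i) < 0
\]
\[
\mathrm{hits} = \left| \{\, i : \mathrm{class}(x_i) = \mathrm{label}_i \,\} \right|, \qquad
\mathrm{classification\ rate} = \frac{\mathrm{hits}}{N}
\]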
\subsection{Fitness Plots}
\caption{2 Elites, 10 Runs Averaged}
\caption{2 Elites, 10 Runs Averaged}
\subsection{Confusion Matrix}
\caption{0.9 Crossover 0.1 Mutation 2 Elites Best Program Results}
\caption{0.9 Crossover 0.1 Mutation 2 Elites 10 Run Average Results}
\caption{0.9 Crossover 0.9 Mutation 2 Elites Best Program Results}
\caption{0.9 Crossover 0.9 Mutation 2 Elites 10 Run Average Results}
\subsection{Analysis and Conclusion}
The best result found was a correct classification rate of 91.9\%. On average, 0.9 crossover with 0.1 mutation produced the best results, with the best 0.9/0.9 result being almost equal.
\section{Compiling / Executing}
This assignment was made for Linux using GCC 13.2.0; however, any C++17-compliant compiler should work.
The minimum GCC version appears to be 8.5, meaning this assignment can be built on sandcastle.
cd your_path_to_this_source/
mkdir build
cd build
cmake ../
make -j 32
The actual assignment executable is called |Assignment_1|, while the automatic run system is called |Assignment_1_RUNNER|. |Assignment_1_RUNNER| has a help menu with options, but the defaults will work assuming you run from the build directory and are using Part B only. If you want to build for Part A, run |cmake -DPART_B=OFF| and run |Assignment_1_RUNNER| with |-b|.
I made a few changes to lilgp, mostly memory fixes, along with elitism specified as a number of individuals instead of a proportion. There appears to be some kind of issue in the GP, which won't matter as assignment two will likely use my own GP system. I might look into it, but I was not aware there was an issue until compiling the stats here. My results have been generally positive; however, I did notice in the course of collecting data that at some point the Part A results stopped being consistently good, while the Part B results have remained unchanged. This might have happened when I changed my custom random number seeder to not produce divide-by-zero errors during testing. Could be anything. I don't like writing reports, so I procrastinated on writing and instead spent the last couple of weeks messing around with the GP. Fun fact: a bunch of additions to my standard lib were made for this assignment. Next time will be better, hopefully.
Next assignment these will be proper. LaTeX is being annoying to set up for bib.\\\\\\