The ImageDataGenerator command essentially creates generators of batches of images that have been transformed in some way. In our case, we used the rescale argument to rescale the pixel values so that they fell between 0 and 1 instead of between 0 and 255. There are many additional arguments you can supply as well, but they mostly deal with telling the generator how you want to transform the images. The command can also read images from the subdirectories provided to it and label each image according to the subdirectory it came from.

To flow from the directory to the generated object you must use ImageDataGenerator.flow_from_directory() and specify which directory you want to get the images from. You can also specify the labels you want the images to be given, the batch size, the target size, and the class mode, which in our case was binary. As opposed to the arguments for ImageDataGenerator, the arguments for .flow_from_directory() are more about telling the computer how to classify the images provided. By setting a target size you can resize all the images so that they come out at the same dimensions, which makes them easier to use, interpret, and compare. When setting the class mode it is a good idea to consider what type of, and how many, labels you want the generator to return for each image. For instance, if you have multiple possible classifications you might use class_mode = 'categorical', while if you only have two possible labels you might use class_mode = 'binary'.
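A minimal sketch of that pipeline might look like the following; the directory name, target size, and batch size here are illustrative assumptions rather than values from this project:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# rescale=1./255 maps pixel values from [0, 255] into [0, 1]
train_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'horse-or-human/',       # hypothetical directory; one subdirectory per class
    target_size=(300, 300),  # resize every image to the same dimensions
    batch_size=128,
    class_mode='binary')     # two labels, so binary rather than categorical
```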
I created three convolution and pooling layers, one flatten layer, and two dense layers, finishing off with a sigmoid activation, since this is a binary classification example where the only options are 0 or 1 (horse or human). The convolution layers always used 3x3 filters, whereas the pooling layers always used 2x2 pooling windows. The image dimensions decreased by about half each time they passed from a Conv2D layer through a MaxPooling2D layer and on to the next iteration. In my model compiler I specified the loss function (binary_crossentropy, since this is a binary classification problem), the optimizer (RMSprop, which automates learning-rate tuning), and the metric (accuracy) I want the model to keep track of.
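A sketch of that architecture is below; the input size and the specific filter counts and dense-layer width are assumptions for illustration, not the exact numbers used here:

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    # three convolution + pooling blocks: 3x3 filters, 2x2 max pooling
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu',
                           input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    # flatten, then two dense layers ending in a single sigmoid unit
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')])

model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              metrics=['accuracy'])
```

Each 2x2 max-pooling layer is what halves the height and width of the feature maps on the way to the next convolution.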
This function shows how the variables interact with each other: it plots the relationship between every possible pair of variables, which lets you easily see whether there are relationships in your dataset worth investigating further. For instance, in this plot there appear to be exponential relationships among some of the variables. Along the diagonal axis you can see how each variable relates to itself, i.e. its own distribution.
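Assuming the plot was produced with seaborn's pairplot (an assumption on my part), a self-contained sketch looks like this; the mpg dataset and column choices here are stand-ins for the actual data:

```python
import seaborn as sns
import matplotlib.pyplot as plt

# stand-in dataset; substitute your own DataFrame here
df = sns.load_dataset('mpg').dropna()

# one scatter plot per pair of variables; diag_kind='kde' puts each
# variable's own distribution on the diagonal instead of a degenerate scatter
sns.pairplot(df[['mpg', 'displacement', 'horsepower', 'weight']],
             diag_kind='kde')
plt.show()
```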
As seen in the table and graphs above, during the last five epochs the model does not appear to be improving by any great measure. To illustrate this point, the MSE and MAE scores level off toward the end of the graphs as the number of epochs increases.
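One way to check that leveling off is to plot the training history directly. This sketch assumes `history` is the object returned by model.fit() with 'mae' tracked as a metric (both assumptions about the surrounding code):

```python
import matplotlib.pyplot as plt

# plot training and validation MAE per epoch; a flat tail means the
# model has stopped improving and further epochs add little
plt.plot(history.history['mae'], label='train MAE')
plt.plot(history.history['val_mae'], label='val MAE')
plt.xlabel('Epoch')
plt.ylabel('Mean Absolute Error')
plt.legend()
plt.show()
```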
As you can see from these plots, the model predicts the true values relatively well. The errors are roughly normally distributed, centered around 2.
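A hedged sketch of the two diagnostic plots, assuming a trained `model` and `test_data`/`test_labels` arrays from the regression example (all three names are assumptions):

```python
import matplotlib.pyplot as plt

predictions = model.predict(test_data).flatten()

# predicted vs. true values: a good model hugs the diagonal
plt.scatter(test_labels, predictions)
plt.xlabel('True values')
plt.ylabel('Predictions')
plt.show()

# histogram of prediction errors; roughly bell-shaped if errors are normal
errors = predictions - test_labels
plt.hist(errors, bins=50)
plt.xlabel('Prediction error')
plt.ylabel('Count')
plt.show()
```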
By comparing the four different-sized models, we were able to see how a model's size affects its training and validation loss. In this case, the smaller models achieved lower validation loss, while the larger models overfit: their training and validation loss curves diverged drastically, with the validation loss rising rapidly even as the training loss continued to fall.
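An illustrative sketch of that comparison is below. The layer widths, the two-hidden-layer shape, and the `train_features`/`train_labels` names are all assumptions; the point is only the pattern of training the same task at four capacities:

```python
import tensorflow as tf

def build_model(units, input_dim):
    """Two hidden layers of the given width, single linear output."""
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(units, activation='relu',
                              input_shape=(input_dim,)),
        tf.keras.layers.Dense(units, activation='relu'),
        tf.keras.layers.Dense(1)])
    model.compile(loss='mse', optimizer='rmsprop', metrics=['mae'])
    return model

# four capacities, small to large, trained on the same data;
# in the overfit (larger) models, history.history['val_loss'] climbs
# while history.history['loss'] keeps falling
histories = {}
for units in [4, 16, 64, 256]:
    model = build_model(units, input_dim=train_features.shape[1])
    histories[units] = model.fit(train_features, train_labels,
                                 epochs=100, validation_split=0.2,
                                 verbose=0)
```

Plotting each model's loss and val_loss from `histories` side by side makes the divergence in the larger models easy to spot.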