AppliedMachineLearningPublic

Question 1: In the video First Steps in Computer Vision, Laurence Moroney introduces the Fashion MNIST dataset and uses it to train a neural network in order to teach a computer “how to see.” One of the first steps toward this goal is splitting the data into two groups: a set of training images and training labels, and a set of test images and test labels. Why is this done? What is the purpose of splitting the data into a training set and a test set?

We split the data into training and test sets because the two sets do different jobs. The training data teaches the model the relationship between the images and their labels: the model predicts an answer, checks it against the correct label, and adjusts so that it becomes as accurate as possible. Once the model has been trained on the training data, we want to see how well it does on data it has not seen before. The test labels are never shown to the model during training, so its performance on the test set tells us whether it has learned the general rules of the data or has overfit (i.e., merely memorized the training data).
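The difference between memorizing and learning can be seen in a small sketch. Everything here is a made-up illustration rather than part of the course material: the toy data, the 80/20 split, and the `memorizer_predict` helper are all assumptions chosen to show why a held-out test set is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 1,000 inputs with 8 features each.
# The label is 1 when the features sum to a positive number, else 0.
X = rng.normal(size=(1000, 8))
y = (X.sum(axis=1) > 0).astype(int)

# Hold out the last 20% as a test set that is never used for "training".
split = int(0.8 * len(X))
train_X, test_X = X[:split], X[split:]
train_y, test_y = y[:split], y[split:]

# A "model" that only memorizes: it stores each training row with its label.
memory = {row.tobytes(): label for row, label in zip(train_X, train_y)}

def memorizer_predict(rows):
    # Return the stored label for rows seen before; otherwise guess 0.
    return np.array([memory.get(row.tobytes(), 0) for row in rows])

train_acc = (memorizer_predict(train_X) == train_y).mean()
test_acc = (memorizer_predict(test_X) == test_y).mean()
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The memorizer is perfect on the training rows but does no better than guessing on the held-out rows, which is exactly the failure that only a separate test set can reveal.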

Question 2: The Fashion MNIST example has increased the number of layers in our neural network from 1 in the past example to 3. The last two are Dense layers that have activation arguments using the relu and softmax functions. What is the purpose of each of these functions? Also, why are there 10 neurons in the third and last layer of the neural network?

The activation arguments tell each layer how to transform a neuron's output before passing it on to the next layer. The relu function says that if the output of a neuron is less than zero, the output is set to zero, so that negative values do not skew things downstream and cancel positive values produced elsewhere; positive outputs pass through unchanged. The softmax function helps the network find the most likely candidate by rescaling the outputs of the last layer into probabilities that sum to one. The neuron with the largest raw value receives the largest probability, which in a confident prediction is close to one while the rest are close to zero.
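As a rough illustration of the two activations, here is a minimal NumPy sketch. The `relu` and `softmax` functions below are hand-written stand-ins, not the Keras implementations, and the input values are made up.

```python
import numpy as np

def relu(x):
    # Negative values become 0; positive values pass through unchanged.
    return np.maximum(0.0, x)

def softmax(x):
    # Subtract the maximum first for numerical stability, then
    # exponentiate and normalize so the outputs sum to 1.
    exps = np.exp(x - np.max(x))
    return exps / exps.sum()

raw = np.array([-2.0, 0.5, 3.0])  # made-up neuron outputs
activated = relu(raw)
probs = softmax(raw)

print(activated)    # the negative entry is replaced by 0
print(probs)        # three probabilities; the largest input gets the largest share
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```

Note that softmax does not throw away the smaller values. It ranks every candidate, and the largest raw value simply ends up with the highest probability.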

In this example there are 10 neurons in the last layer because there are ten possible categories that an image can be classified as. Each neuron represents one category, and the category of the neuron with the highest value is assigned to the image.
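The mapping from the ten output neurons to a category can be sketched as follows. The class names are the ten Fashion MNIST categories in label order; the `output` vector is invented for illustration and does not come from a trained model.

```python
import numpy as np

# The ten Fashion MNIST categories, indexed by label 0 through 9.
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical output of the last layer for one image (made-up values).
output = np.array([0.01, 0.01, 0.02, 0.01, 0.02, 0.05, 0.01, 0.15, 0.02, 0.70])

# The index of the largest value selects the predicted category.
predicted_index = int(np.argmax(output))
predicted_name = class_names[predicted_index]
print(predicted_index, predicted_name)
```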

Question 3: In the past example we used the optimizer and loss function, while in this one we are using the function adam in the optimizer argument and sparse_categorical_crossentropy for the loss argument. How do the optimizer and loss functions operate to produce model parameters (estimates) within the model.compile() function?

The loss and optimizer functions work together to help the model learn the rules of the data. The loss function calculates how far the model's guesses are from the correct answers, so a lower loss means a better guess. The optimizer then uses the result of the loss function to adjust the model's parameters so that the next guess is more accurate. That guess is re-evaluated by the loss function, whose result is again fed to the optimizer, and the cycle repeats for the specified number of epochs, with the loss ideally shrinking each time. Note that model.compile() only specifies which loss and optimizer to use; the cycle itself runs when model.fit() is called.
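The loss/optimizer cycle can be sketched by hand on a tiny problem. This is only an illustration under stated assumptions: it uses plain gradient descent in place of adam, mean squared error in place of sparse categorical cross-entropy, and hypothetical toy data that follows y = 2x - 1. The names `w`, `b`, and `learning_rate` are chosen for this example.

```python
import numpy as np

# Hypothetical toy data following y = 2x - 1.
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = 2.0 * xs - 1.0

w, b = 0.0, 0.0        # model parameters, starting from an arbitrary guess
learning_rate = 0.05   # step size used by the optimizer (assumed value)
losses = []

for step in range(500):
    predictions = w * xs + b              # the model makes a guess
    errors = predictions - ys
    loss = np.mean(errors ** 2)           # loss: how wrong was the guess?
    losses.append(loss)
    grad_w = 2.0 * np.mean(errors * xs)   # how the loss changes with w
    grad_b = 2.0 * np.mean(errors)        # how the loss changes with b
    w -= learning_rate * grad_w           # optimizer: nudge parameters downhill
    b -= learning_rate * grad_b

print(f"w = {w:.3f}, b = {b:.3f}")
print(f"first loss = {losses[0]:.3f}, last loss = {losses[-1]:.6f}")
```

Each pass through the loop is one turn of the cycle described above: guess, measure the loss, adjust the parameters, and guess again.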

Question 4: Using the MNIST drawings dataset (the dataset of handwritten numbers with corresponding labels), answer the following questions:

1. What is the shape of the images training set (how many and the dimension of each)?

There are 60,000 images and each image is 28 by 28 pixels.

2. What is the length of the labels training set?

There are 60,000 labels in the training set.

3. What is the shape of the images test set?

The test set has 10,000 images and each image is 28 by 28 pixels.
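The shapes and lengths reported above are the kind of values returned by `.shape` and `len()`. The sketch below uses placeholder arrays of zeros with those same dimensions so that it runs without downloading anything. With TensorFlow installed, the real arrays would come from `tf.keras.datasets.mnist.load_data()`.

```python
import numpy as np

# Placeholder arrays with the dimensions reported in the answers above.
# These are NOT the real MNIST data, only stand-ins of the same shape.
train_images = np.zeros((60000, 28, 28), dtype=np.uint8)
train_labels = np.zeros(60000, dtype=np.uint8)
test_images = np.zeros((10000, 28, 28), dtype=np.uint8)

print(train_images.shape)  # (number of images, height, width)
print(len(train_labels))   # one label per training image
print(test_images.shape)
```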

4. Estimate a probability model and apply it to the test set in order to produce the array of probabilities that a randomly selected image is each of the possible numeric outcomes (look toward the end of the basic image classification exercises for how to do this; you can apply the same method that was applied to the Fashion MNIST dataset to the handwritten numbers MNIST dataset).

The array of probabilities for image 9,026 in the test set is: [1.0489175e-09, 2.6143069e-13, 2.9991051e-07, 1.8468669e-11, 1.7452842e-06, 2.5827162e-13, 1.3445548e-13, 2.4090318e-06, 1.7083970e-09, 9.9999559e-01]

5. Use np.argmax() with your predictions object to return the numeral with the highest probability from the test labels dataset.

The most likely numeral for image 9,026 in the test set is 9.
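The step from the probability array to the predicted numeral can be reproduced directly with the values reported above. The variable names are chosen for this sketch. In the actual exercise the array would be one row of the predictions object returned by the model.

```python
import numpy as np

# The probabilities reported above for the selected test image.
probabilities = np.array([
    1.0489175e-09, 2.6143069e-13, 2.9991051e-07, 1.8468669e-11,
    1.7452842e-06, 2.5827162e-13, 1.3445548e-13, 2.4090318e-06,
    1.7083970e-09, 9.9999559e-01,
])

# np.argmax returns the index of the largest probability. For MNIST,
# index i corresponds to the numeral i.
predicted_numeral = int(np.argmax(probabilities))
print(predicted_numeral)
```

Because the last entry (index 9) holds almost all of the probability mass, the predicted numeral is 9, matching the answer above.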

6. Produce the following plot for your randomly selected image from the test dataset.

[Image: number]