Transfer Learning for Dog Breed classifier

Sep 16, 2018

Dogs are man’s best friend and they deserve to be identified correctly. In pursuit of differentiating a Husky (Go Dawgs!) from an Alaskan Malamute, let’s learn how to use transfer learning to classify dog breeds.

Find the entire Jupyter Notebook on my GitHub.

NOTE: This project/article is based off of Udacity’s skeleton Dog Breed Classifier project as part of the AIND program with certain modifications.

Data

As with most of my technical posts, we first need to make sure we have the data we want to work with. This post is not about building an entire architecture from scratch and training it; I will cover that in a separate post. For now, we will focus on how to learn from someone else’s work and adapt it to our use case.

We need several different pieces of data to solve this problem. Let’s start by downloading them:
1. Dog Images
2. Human Images
3. ResNet50 Features
4. OpenCV Human Face Detection Model

We are using a sampled dataset. Feel free to use the Stanford Dogs dataset instead; just note that you might need to modify the image-reading functions, as the folder structure is slightly different. Also, DON’T train on the entire Stanford Dogs dataset without a GPU and without checkpointing your work. If you are not sure about this, use the smaller sample for now and experiment with the larger dataset later.

So now that we have the data, let’s get to the exciting part of creating the entire app. First off, we need to read the images into Python, so let’s do it. We will use Keras to read most of our images instead of matplotlib’s imread() function, as Keras lets us supply a target_size parameter to its image-loading function, so all our images have a standard size and we don’t have to worry about any dimension mismatch.

# Import the required packages
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob
import matplotlib.pyplot as plt
%matplotlib inline

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/valid')
test_files, test_targets = load_dataset('dogImages/test')

# load list of dog names
dog_names = [item[20:-1] for item in sorted(glob("dogImages/train/*/"))]

# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % len(np.hstack([train_files, valid_files, test_files])))
print('There are %d training dog images.' % len(train_files))
print('There are %d validation dog images.' % len(valid_files))
print('There are %d test dog images.'% len(test_files))

import random
random.seed(100)

# load filenames in shuffled human dataset
human_files = np.array(glob("lfw/*/*"))
random.shuffle(human_files)

# print statistics about the dataset
print('There are %d total human images.' % len(human_files))

Make sure the file paths point to where you keep the downloaded files, or nothing will work. If everything works, you should see 133 different dog breeds and ~8.3k dog images.
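
If you are curious what the to_categorical call inside load_dataset does, here is a minimal NumPy sketch of the same idea. The helper name one_hot and the toy labels are made up for illustration; this is not the Keras implementation, just a way to see the shape of the targets.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1.0 in each label's column."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Three images labelled with breed indices 0, 2 and 1 (toy values, 4 classes)
targets = one_hot([0, 2, 1], num_classes=4)
print(targets.shape)               # (3, 4)
print(np.argmax(targets, axis=1))  # [0 2 1]
```

In the real dataset the second dimension is 133, one column per breed, and np.argmax recovers the original integer label.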

Detection: Humans and Dogs

You might be wondering why we need human images if we are building a dog classifier. That’s just to have some fun at the end. We will build our pipeline so that it detects whether a picture contains a human or a dog, and if it’s a human, we will tell which dog breed they most closely resemble. It will be fun to try with your friends.

import cv2

def face_detector(img_path, display = False):
    face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_alt.xml')
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    if display:
        for (x,y,w,h) in faces:
            # add bounding box to color image
            cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
        # convert BGR image to RGB for plotting
        cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # display the image once, with all bounding boxes drawn
        plt.imshow(cv_rgb)
        plt.show()
    return len(faces) > 0

# Test with a random face
face_detector('lfw/Aaron_Eckhart/Aaron_Eckhart_0001.jpg', True)

The function above should display something like the image to the left, with a box drawn around the detected face, and return True. What we have done here is use one of OpenCV’s pre-trained Haar cascade face detectors. It detects frontal human faces quite accurately in most pictures and, as a bonus, gives us the (x, y) corner and the (width, height) of the box that contains each face. That makes it a handy model for normalizing photos and cropping them so that your dataset contains only faces.
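
To make the cropping idea concrete, here is a small sketch that cuts faces out of an image using boxes in the (x, y, w, h) format. The helper crop_faces, the blank stand-in image and the box values are all hypothetical; in practice the boxes would come from detectMultiScale and the image from cv2.imread.

```python
import numpy as np

def crop_faces(img, boxes):
    """Return one cropped array per (x, y, w, h) box; img is indexed [row, column]."""
    return [img[y:y + h, x:x + w] for (x, y, w, h) in boxes]

# Stand-in for an image array: 100 rows, 120 columns, 3 color channels
img = np.zeros((100, 120, 3), dtype=np.uint8)
# Hypothetical detection: top-left corner (30, 20), 40 wide, 50 tall
boxes = [(30, 20, 40, 50)]

faces = crop_faces(img, boxes)
print(faces[0].shape)  # (50, 40, 3)
```

Note that rows are indexed by y and columns by x, which is why the height comes first in the cropped shape.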

To detect dogs, we will use the ResNet50 model with ImageNet weights. It comes prepackaged with Keras, so there is nothing extra to download by hand. We do, however, have to process our data and create a pipeline that converts our dog images, which have the shape (224, 224, 3), into something TensorFlow works best with: (1, 224, 224, 3). The tensor follows the pattern (NumberOfImages, Height, Width, ColorDimensions).
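
The shape change is easy to see with NumPy alone. The sketch below uses blank arrays as stand-ins for decoded images (an assumption for illustration, no real files are read) and shows how a batch axis is added and how several single-image tensors stack into one batch.

```python
import numpy as np

# Stand-ins for two decoded RGB images of shape (height, width, channels)
img_a = np.zeros((224, 224, 3), dtype=np.float32)
img_b = np.ones((224, 224, 3), dtype=np.float32)

# Add a leading batch axis: (224, 224, 3) -> (1, 224, 224, 3)
tensor_a = np.expand_dims(img_a, axis=0)
tensor_b = np.expand_dims(img_b, axis=0)
print(tensor_a.shape)  # (1, 224, 224, 3)

# Stack single-image tensors along the batch axis to build a larger batch
batch = np.vstack([tensor_a, tensor_b])
print(batch.shape)     # (2, 224, 224, 3)
```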

from keras.applications.resnet50 import ResNet50
# Activate the resnet model with imagenet data
ResNet50_model = ResNet50(weights='imagenet')

from keras.preprocessing import image
from tqdm import tqdm

# Function to reshape the images to what TF likes
def path_to_tensor(img_path):
    # Load the image with a size of 224x224 using Keras's load_img function
    img = image.load_img(img_path, target_size=(224, 224))
    # Convert to a (224, 224, 3) array, with the RGB channels last
    x = image.img_to_array(img)
    # Reshape to include batches as well: (1, 224, 224, 3)
    return np.expand_dims(x, axis=0)

from keras.applications.resnet50 import preprocess_input, decode_predictions

# Function to predict the label
def ResNet50_predict_labels(img_path):
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(ResNet50_model.predict(img))

# Function to detect dogs
def dog_detector(img_path):
    prediction = ResNet50_predict_labels(img_path)
    # Return True if ResNet's predicted index is between 151 and 268. These are dog predictions
    return ((prediction <= 268) & (prediction >= 151))

# Detect Dogs in images and find if any humans are misclassified as dogs
# Assumption: the post does not show how these samples are built, so here we
# take the first 100 human images and the first 100 training dog images
human_files_short = human_files[:100]
dog_files_short = train_files[:100]

human_count = 0.0
for img in tqdm(human_files_short):
    isDog = dog_detector(img)
    if isDog:
        human_count += 1
percentage = (human_count/len(human_files_short)) * 100
print('Percentage of humans misclassified as dogs: {}%'.format(percentage))

dog_count = 0.0
for img in tqdm(dog_files_short):
    isDog = dog_detector(img)
    if isDog:
        dog_count += 1
percentage = (dog_count/len(dog_files_short)) * 100
print('Percentage of dogs correctly classified as dogs: {}%'.format(percentage))

You should get the percentage of humans misclassified as dogs and the percentage of dogs correctly detected by this dog finder. It should be fairly accurate at finding dogs in images, as the ResNet model we are using has been trained on a very large dataset.

You might be wondering about return ((prediction <= 268) & (prediction >= 151)). This is not some mumbo jumbo: in the 1,000-class ImageNet output that ResNet50 predicts, the indices 151 through 268 all correspond to dog breeds. We are simply stating that if the argmax index is ≥ 151 and ≤ 268, then it’s a dog.
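
Here is the same range check on its own, with toy probability vectors instead of a real ResNet50 output. The function name is_dog_prediction and the vectors are made up for illustration.

```python
import numpy as np

# ImageNet class indices treated as dog breeds (inclusive range)
DOG_FIRST, DOG_LAST = 151, 268

def is_dog_prediction(probabilities):
    """True when the most likely class index falls inside the dog range."""
    top_class = int(np.argmax(probabilities))
    return DOG_FIRST <= top_class <= DOG_LAST

# Toy 1000-class outputs: one peaks at index 207, the other at index 0
dog_like = np.zeros(1000)
dog_like[207] = 0.9
other = np.zeros(1000)
other[0] = 0.8

print(is_dog_prediction(dog_like))  # True
print(is_dog_prediction(other))     # False
```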

Transfer Learning

Remember the ResNet50 features we downloaded at the start of this exercise? We are going to use them now. These are bottleneck features: the outputs of the pre-trained ResNet50 convolutional layers, already computed for our dog images. The code below takes them as input and adds a final dense layer with a softmax activation and 133 outputs, which is exactly the number of dog breeds we have in our data. Here is the code to do that in Keras:

# Get the train, test and validation features
bottleneck_features = np.load('bottleneck_features/DogResnet50Data.npz')
train = bottleneck_features['train']
valid = bottleneck_features['valid']
test = bottleneck_features['test']
# Import important functions
from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D, Dense, Dropout, MaxPooling2D, Flatten
from keras import regularizers

# Create a sequential model that takes the ResNet50 bottleneck features as input
model = Sequential()
model.add(Flatten(input_shape=train.shape[1:]))
model.add(Dense(len(dog_names), activation='softmax'))

# Print the model summary
model.summary()

As you will see, the summary does not show the ResNet50 layers themselves. That is because the convolutional part of ResNet50 has already been applied to our images to produce the bottleneck features, so the only thing left to train is the small head we just defined. Our model is a very simple one and it works great, so let’s train it.
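
To see what this small head actually computes, here is a rough NumPy sketch of the Flatten + Dense(softmax) forward pass. It assumes each bottleneck feature has shape (1, 1, 2048); check train.shape[1:] in your own notebook. The weights here are random and untrained, so this is only an illustration, not the Keras model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed bottleneck shape: 4 images, each a (1, 1, 2048) feature map
features = rng.normal(size=(4, 1, 1, 2048)).astype(np.float32)
num_breeds = 133

# Flatten: (4, 1, 1, 2048) -> (4, 2048)
flat = features.reshape(features.shape[0], -1)

# Dense layer with randomly initialised (untrained) weights and zero bias
weights = rng.normal(scale=0.01, size=(flat.shape[1], num_breeds))
bias = np.zeros(num_breeds)
logits = flat @ weights + bias

# Softmax turns each row of scores into probabilities that sum to 1
shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(probs.shape)               # (4, 133)
print(weights.size + bias.size)  # 272517
```

Under that shape assumption the head has 2048 × 133 weights plus 133 biases, which is the parameter count you would expect model.summary() to report.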

# Compile the model
model.compile(optimizer='rmsprop', metrics=['accuracy'], loss='categorical_crossentropy')

# Save the best model to disk (the saved_models/ directory must already exist)
from keras.callbacks import ModelCheckpoint
checkpointer = ModelCheckpoint(filepath='saved_models/weights.best.hdf5', 
                               verbose=1, save_best_only=True)

model.fit(train, train_targets, 
          validation_data=(valid, valid_targets),
          epochs=20, batch_size=10, callbacks=[checkpointer], verbose=1)
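
A quick note on save_best_only=True: with its default settings, ModelCheckpoint watches the validation loss and only writes the file when that loss improves. The pure-Python sketch below mimics that rule on made-up loss values; the function epochs_that_save is hypothetical and only illustrates the behavior.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'save best only' rule would write a file."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than anything seen so far
            best = loss
            saved.append(epoch)
    return saved

# Made-up validation losses for five epochs
history = [1.20, 0.95, 1.01, 0.90, 0.93]
print(epochs_that_save(history))  # [1, 2, 4]
```

So the file on disk ends up holding the weights from the best epoch, not necessarily the last one.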

You will see the training accuracy reach ~99% and a validation accuracy of ~80–82%. It’s not the best model in the world, but we wrote so few lines of code that it is amazing to see such a good outcome. Let’s find out what our test accuracy looks like.

# Predict the values from our model
predictions = [np.argmax(model.predict(np.expand_dims(feature, axis=0))) for feature in test]

# report test accuracy
test_accuracy = 100*np.sum(np.array(predictions)==np.argmax(test_targets, axis=1))/len(predictions)
print('Test accuracy: %.4f%%' % test_accuracy)

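We get somewhere around 80% accuracy, which is similar to our validation accuracy. This tells us the validation score was a fair estimate of how the model performs on images it has never seen, which is great. The gap to the ~99% training accuracy does suggest the model has overfit the training set to some degree, though.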
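
If the accuracy line above looks dense, here is the same calculation on toy data. The arrays are made up (five images, three classes) and named toy_* so they don’t overwrite anything in the notebook.

```python
import numpy as np

# Toy one-hot targets for five test images across three classes
toy_targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
])
# Made-up predicted class indices; the last one is wrong
toy_predictions = np.array([0, 1, 2, 1, 2])

true_labels = np.argmax(toy_targets, axis=1)  # back from one-hot to indices
toy_accuracy = 100 * np.mean(toy_predictions == true_labels)
print('Test accuracy: %.4f%%' % toy_accuracy)  # Test accuracy: 80.0000%
```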

Functions to consolidate the entire app

Now let’s write a function that wraps this whole process and gives us an easy way to predict a dog’s breed by just providing the path to the image.

# Predict the dog breed provided an image path
def predict_breed(img_path):
    from keras.applications.resnet50 import ResNet50, preprocess_input
    bottleneck_feature = ResNet50(weights='imagenet', 
                                  include_top=False).predict(preprocess_input(path_to_tensor(img_path)))
    predicted_vector = model.predict(bottleneck_feature)
    # return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

# Predict the dog or human breed provided an image
def predict_dog_or_human_breed(img_path):
    is_dog = dog_detector(img_path)
    is_human = face_detector(img_path)
    if is_dog:
        return "Dog Breed: {}".format(predict_breed(img_path))
    elif is_human:
        return "Human Resembles: {}".format(predict_breed(img_path))
    else:
        return 'Error: Neither Human nor Dog Detected'

# Function to run the image scan and show the labelled image
def run_image_scan(img_path):
    img = cv2.imread(img_path)
    label = predict_dog_or_human_breed(img_path)
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(img_rgb)
    plt.title(label)
    plt.show()