Exploring Computer Vision (Part II): Transfer Learning

Welcome back to our three part series on computer vision. In the previous post, we discussed convolutional neural networks (CNNs). This post will assume that you have a basic understanding of CNNs; we encourage you to reread the first post if you want a refresher on convolutional networks.

Introduction to Transfer Learning

When we start learning about programming, we tend to think of algorithms as “black boxes” that transform an input to an output. Sorting algorithms convert unordered sequences into ordered sequences, graphics algorithms convert matrices into pretty pictures, recommendation systems take a history of events and yield a prediction for the next best action, and so on. At first, it’s tempting to write long, monolithic functions to perform a single task, and this can be very valuable because now you have an automated solution! However, you can often optimize and re-use pieces of that monolithic solution. By composing a solution from reusable chunks, our algorithms can be applied in many different projects. Similarly, in the context of machine learning, given a good objective and training, we can reuse parts of a model trained in one domain to solve problems in different domains. This is called transfer learning.

Transfer Learning From CNNs

Based on the previous post, you’ve seen how stacking layers of convolution filters allows us to accurately classify the content of images. This raises the question – is it necessary to train a brand new CNN for every task? In traditional machine learning, you collect a new dataset and train a new algorithm to make predictions on those examples. This feels awfully similar to when we were learning how to program and didn’t realize that we could share parts of code in different algorithms. Which pieces of the current solution can be optimized and reused to compose new solutions?

Let’s say we have a CNN which can accurately classify images as containing either cats or dogs. This model contains a stack of convolutional kernels which learn features necessary for distinguishing between two animals (e.g., patterns of edges that represent hair, fur color, head shape, etc). Can the same features that distinguish cats from dogs be used to distinguish between other animals like horses and wombats? To test this hypothesis, we’d change our classifier to predict horses vs wombats and use the kernels from our pre trained model. This idea – that features from one model can be used to inform the classifier of another model – is the heart of transfer learning.

Visualization of transfer learning

You might still be skeptical about reusing kernels from one model to make another model. After all, if the model learns extremely specific features about cats and dogs, it will get completely confused when it sees a wombat. If your kernels/features are specific to cats and dogs, this could very well happen. That’s one reason why it’s so important to use large, diverse datasets, to encourage our models to learn generalizable (reusable) features about images.

Real World Example of Transfer Learning

If you’re still unsure about the power of transfer learning, let’s go through a concrete case study of how transfer learning can be used for image recognition tasks. There is a Kaggle competition sponsored by Yelp where you are given photos of restaurants and are asked to predict characteristics of that restaurant, such as whether or not it serves alcohol. Competitors have had success creating models from scratch on the provided training set. We trained a fresh CNN on the data and achieved a 66% accuracy rate. This seems pretty solid, but we wondered what would happen if we used transfer learning instead.

We decided to try to transfer features from Google’s CNN, inception, which they recently released to the public. In order to use this model for transfer learning, we took the training images, ran them through the inception network, and extracted the output of the network from the layer before the classifier. We then applied a new classifier, which predicted restaurant characteristics based on the features we extracted from inception. This approach increased our accuracy by nearly 7%, bringing us up to 73% accuracy, and putting us in the top 22% of contestants. Clearly, the transfer learning approach was a success. Since inception was trained on a massive dataset and has such a deep architecture, it has the capacity to extract general features about any image it sees.

Architecture of Google’s inception network

You don’t always need to train new models to solve new problems. Indeed, one strength of deep learning is that it allows us to reuse parts of one model to solve new problems without collecting a massive dataset for the new task.

Transfer Learning at indico

If you want to start tackling image recognition problems via transfer learning, indico exposes high-quality pre-trained convolutional features via the Custom Collections API and the Image Features API. The Custom Collection API enables you to quickly train a custom text or image classifier when you only have a small amount of data; the Image Features API transforms text or image samples into rich feature vectors. Under the hood, both APIs leverage a deep neural network that has been trained to classify image and text samples. Before you begin making predictions from your Custom Collection, the underlying model has already seen millions of training data examples. Since the model has already seen so much data ahead of time, you only have to send it a few new examples before it learns how to solve the classification task at hand.

A good way to understand transfer learning is to think of it like writing modular code. If you want to write fewer lines of code, you break your programs down into reusable parts. Similarly, if you want to solve classification problems and don’t have enough data or the capacity to train deep models, you can reuse parts of existing models. Next time you find yourself trying to solve a problem and only have a small amount of labeled data, we encourage you to try out our Custom Collection API or our Image Features API.

We’ve concluded Part II of our series on computer vision. In the next article, you will learn about how we used transfer learning in real life to create a smart security camera. If you have any questions around how Indico has incorporated transfer learning into our commercial Intelligent Process Automation software or need help getting started, feel free to reach out to us at contact@indico.io!

[addtoany]

Increase intake capacity. Drive top line revenue growth.

Schedule Demo

Unstructured Unlocked podcast

April 24, 2024 | E45

Unstructured Unlocked episode 45 with Daniel Faggella, Head of Research, CEO at Emerj Artificial Intelligence Research

Listen Now

April 10, 2024 | E44

Unstructured Unlocked episode 44 with Tom Wilde, Indico Data CEO, and Robin Merttens, Executive Chairman of InsTech

Listen Now

March 27, 2024 | E43

Unstructured Unlocked episode 43 with Sunil Rao, Chief Executive Officer at Tribble

Listen Now

View All

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Get Started

Industry

Use Cases

Get Started

Resources

Documentation

Customer Stories

Get Started

Get Started

Get Started

Indico Named as Major Contender and Star Performer in Everest Group's PEAK Matrix® for Intelligent Document Processing (IDP)

BLOG

Exploring Computer Vision (Part II): Transfer Learning

Introduction to Transfer Learning

Transfer Learning From CNNs

Real World Example of Transfer Learning

Transfer Learning at indico

Increase intake capacity. Drive top line revenue growth.

Related Posts

Announcements, Machine Learning

Understanding Indico’s Staggered Loop

Machine Learning, Release Notes

Release Notes – Indico Unstructured Data Platform v5.3

Citizen Developer, Machine Learning

Overcome the complexity of machine learning: get to know machine teaching

Unstructured Unlocked podcast

Unstructured Unlocked episode 45 with Daniel Faggella, Head of Research, CEO at Emerj Artificial Intelligence Research

Unstructured Unlocked episode 44 with Tom Wilde, Indico Data CEO, and Robin Merttens, Executive Chairman of InsTech

Unstructured Unlocked episode 43 with Sunil Rao, Chief Executive Officer at Tribble

Get started with Indico

Schedule1-1 Demo

Resources

Blog

Gain insights from experts in automation, data, machine learning, and digital transformation.

Unstructured Unlocked

Enterprise leaders discuss how to unlock value from unstructured data.

YouTube Channel

Check out our YouTube channel to see clips from our podcast and more.

Get our best content on intelligent automation sent to your inbox weekly!

Schedule
1-1 Demo