Should a machine learning beginner go straight for deep learning?

Let me be clear: I love deep learning. It has radically improved the scope of problems that machine learning can be practically applied to. I’ve built a company on the back of deep learning and owe quite a lot to it.
DO NOT start with deep learning. I’m not saying that you shouldn’t approach deep learning at all, or even that you shouldn’t end up exclusively studying deep learning. I do believe, though, that focusing on deep learning prematurely would be a short-sighted decision.

Deep learning isn’t a magic bullet, and in many (common) situations it is in fact a very bad fit for the problems you may be faced with. Precisely because deep learning is such a flexible and powerful tool, it’s important to learn when not to use it. Deep learning is capable of attacking almost any problem, but there is a much smaller set of problems for which it is practically effective or useful.

Let’s use an example:

Manager: We’re looking to predict customer churn. We only have about 10,000 customer records, and for each one we have about 10 categorical variables. We want to know if we can use those categorical variables to predict churn.

Approach 1: Hmm, categorical information isn’t a perfect fit for any of the out-of-the-box architectures I’m familiar with. I’d probably set this up as a simple fully-connected network. Now, that said, we don’t have much data, so I probably need to play around with some smart initialization approaches to get the network in the right neighborhood. I’ll start by getting a basic network up that we can test with, and then we can iterate on the architecture to improve the accuracy. Do we have any GPUs lying around? It would be really helpful to have a couple to run experiments on.

Approach 2: I used a Random Forest from sklearn. It took five minutes, it’s well suited to the problem definition, and I was able to predict churn with 80% precision and 60% recall.

Could Approach 1 have worked? Certainly. Would Approach 1 have given the Manager higher accuracy? Maybe, but not likely. This isn’t a particularly complex problem and it’s not one where we’re likely to effectively use a combination of factors more complex than those mapped by a Random Forest. The data is well structured and it’s not very large, so our ability to learn anything truly sophisticated is limited.
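To make Approach 2 concrete, here is a minimal sketch of what those five minutes might look like. Everything about the data is an assumption: the records are synthetic (10,000 rows, 10 categorical variables with 4 levels each), the churn rule is invented, and the hyperparameters are arbitrary, so the numbers it prints say nothing about the 80% / 60% figures quoted above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the manager's data: 10,000 customers, each described
# by 10 categorical variables encoded as integers 0..3. Purely illustrative.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(10_000, 10))

# Invented churn rule: a couple of category combinations drive churn,
# and 10% of the labels are flipped to mimic noise.
signal = (X[:, 0] == 3) | ((X[:, 1] == 0) & (X[:, 2] == 2))
noise = rng.random(10_000) < 0.1
y = (signal ^ noise).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# One-hot encode the categories, then fit a Random Forest on top.
model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    RandomForestClassifier(n_estimators=200, random_state=0, n_jobs=-1),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
precision = precision_score(y_test, pred)
recall = recall_score(y_test, pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

No GPUs, no architecture search, no initialization tricks: the whole experiment is one pipeline and a held-out split, which is what makes it a good first benchmark.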

Here’s another example: Bag of Words Meets Bags of Popcorn. This tutorial created a bit of a stir when it first came out. (Note: this tutorial is NOT deep learning, despite the fact that it is billed as such. It is, however, indicative of deep learning solutions in the NLP space.) It was originally intended to be an example of how powerful word2vec was, and it was billed as a taste of what deep learning could do for NLP. There is a long, detailed tutorial on how to apply word vectors to this problem, and it’s a great introduction to the topic for people who aren’t familiar with it.

Except that’s not how you should do the problem. It created a stir because in this problem (as in many others) the more complex solution actually performs worse than the most basic, obvious approach you could think of. You would be shocked at how strong a benchmark logistic regression on top of tf-idf vectors is. It’s a great example of a very simple solution to what may often seem a very complex task.
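For reference, the logistic-regression-on-tf-idf benchmark is only a few lines with sklearn. This is a sketch, not the tutorial's code: the six-sentence corpus is invented just to show the shape of the pipeline, and the bigram setting is an arbitrary choice. On a real task you would fit it on the actual labeled reviews.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented sentiment corpus (1 = positive, 0 = negative).
train_texts = [
    "a wonderful, moving film with great acting",
    "great story and wonderful characters",
    "loved it, a great and touching movie",
    "terrible plot and awful acting",
    "boring, awful, a complete waste of time",
    "hated it, terrible and boring movie",
]
train_labels = [1, 1, 1, 0, 0, 0]

# tf-idf features over unigrams and bigrams, fed to plain logistic regression.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)

predictions = baseline.predict(["a great and wonderful movie", "awful and boring"])
print(predictions)
```

Whatever more elaborate model you build afterwards, this is the number it has to beat.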

The problem is that if you focus exclusively on deep learning very early on, you will make these mistakes. You’ll miss obvious solutions sitting right in front of your face, you’ll over-complicate solutions when something very simple and straightforward would have worked, and worst of all, you probably won’t even know. You’ll probably end up shipping a very expensive model into production, only to learn months or years later that it underperforms a simple benchmark. You’ll also find yourself falling off the deep end quickly when you try to move beyond being the ML version of a script-kiddie. Deep learning didn’t invalidate historic machine learning; it built upon it. You’ll find that cutting-edge deep learning papers contain a lot of callbacks to old research and assume a lot of fundamental knowledge that you would likely never get if you studied deep learning exclusively.

View original question on Quora >

Follow Slater on Quora >>
