Is it possible to learn without data (or at least one or two samples)—like humans—in AI?

Humans do not learn without data, but as you’ve said, they can often learn from very small amounts of it. This is an extremely active area of research, typically referred to as “one-shot learning” (learning from one example) or even “zero-shot learning” (learning from zero examples, given just the name of the label you’re looking to apply). [Helpful paper: Zero-Shot Learning: A Comprehensive Evaluation of the Good, the Bad and the Ugly]
In general, these branches of investigation are part of “transfer learning”, which focuses on leveraging information learned in previous tasks to solve new tasks with less data. It’s a bit of a misnomer to say that you’re learning without data; what you’re actually doing is taking things that you’ve learned previously and applying them to new problems. To learn the original problem you still need a massive amount of data.
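To make the "reuse what you learned previously" idea concrete, here is a minimal sketch of one-shot classification by nearest neighbor in a feature space. It assumes the feature vectors come from some model already trained on a large dataset; the three-dimensional vectors, the labels, and the function names below are made up for illustration and do not come from any particular library or from the paper mentioned above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def one_shot_classify(query, support):
    """Return the label whose single support example is most similar to query.

    `support` maps each label to the feature vector of its one labeled example.
    """
    return max(support, key=lambda label: cosine(query, support[label]))


# Hypothetical features. In practice they would be produced by a model
# pretrained on a large dataset, which is where the real data cost lives.
support = {"cat": [0.9, 0.1, 0.0], "dog": [0.1, 0.9, 0.0]}
query = [0.8, 0.3, 0.1]

print(one_shot_classify(query, support))
```

Note that the only "learning" on the new task here is storing one vector per label. Everything that makes the comparison meaningful is assumed to have been learned earlier from much more data.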

Obviously, no machine learning model is going to have access to as much data as a human has (decades of training time, constantly processing all of the data coming in from every sensory organ), but we can train models on relatively large collections of images, video, text, and so on, and get behavior that comes very close to this kind of learning from few examples.

Now comes the kicker, though. How do you know if you’ve done a good job when you learn on one data point? You don’t. People often assume that they’ll just look at the results and see whether they “feel” right. The real issue is that humans generally operate without oversight, and their errors are ignored. When we test humans on their consistency in a given task, we often find that different people disagree very frequently, even on tasks as simple as sentiment analysis (is this positive or negative?).

The issue is that without some data (not millions of examples, but at least dozens or hundreds) you have no idea whether it works. You might think that you do, but you have no idea unless you create a dataset, have several different humans label it, check how consistently they agree with one another, and then compare that to the accuracy of the machine.
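The comparison described above can be sketched in a few lines. The example below assumes three annotators and one model have each labeled the same eight items for sentiment (1 for positive, 0 for negative). All of the labels are invented for illustration, and using the majority vote as the reference answer is one reasonable choice among several.

```python
from collections import Counter
from itertools import combinations


def agreement(a, b):
    """Fraction of items on which two label sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_agreement(annotations):
    """Average agreement over every pair of annotators."""
    pairs = list(combinations(annotations, 2))
    return sum(agreement(a, b) for a, b in pairs) / len(pairs)


def majority_vote(annotations):
    """Most common label per item across annotators."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*annotations)]


# Hypothetical sentiment labels for the same eight items.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1]
annotator_3 = [1, 1, 0, 1, 1, 0, 0, 0]
model = [1, 1, 0, 0, 1, 1, 1, 0]

humans = [annotator_1, annotator_2, annotator_3]
reference = majority_vote(humans)

print("human-human agreement:", round(mean_pairwise_agreement(humans), 3))
print("model accuracy vs. majority:", agreement(model, reference))
```

Raw agreement is the simplest possible measure. A chance-corrected statistic such as Cohen's kappa is a common refinement when the classes are imbalanced.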

The other problem is that learning from a single example is not useful. Why? Because labeling a couple of hundred examples is extremely cheap. Depending on the problem, you can generally create a dataset of a few hundred examples in a couple of hours. It takes much longer than this to create and evaluate a machine learning model (or even to read the paper mentioned above). If you’re not willing to label a couple of hundred examples, then your problem is not important.

So can you learn on zero or one example? Certainly. The transfer learning literature has no shortage of examples where people have done this. The problem is that in a real-world scenario, without some data to validate that the model works, it would be tremendously irresponsible to ship something. It’s fascinating from a research perspective, and people are making more and more effective models with less data every day, but this is generally a solution to an imagined problem.
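One way to see why a handful of validation examples tells you almost nothing is to put a confidence interval around the measured accuracy. The sketch below uses the Wilson score interval, which is my choice for illustration and is not something the answer above prescribes. The counts passed in are hypothetical.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy of correct/total."""
    if total == 0:
        return (0.0, 1.0)  # no data: accuracy could be anything
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


# Hypothetical outcomes: one correct out of one, versus 180 correct out of 200.
for correct, total in [(1, 1), (18, 20), (180, 200)]:
    low, high = wilson_interval(correct, total)
    print(f"{correct}/{total}: plausible accuracy between {low:.2f} and {high:.2f}")
```

With a single test example the interval spans most of the range from 0 to 1, while a couple of hundred examples narrow it to a band of under ten percentage points in this case.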

The question should be about learning in a low-data environment. Learning from zero or one data point generally just means you don’t want to define the problem that you’re solving: you can’t solve it, and you have no idea whether people are capable of evaluating it effectively.

View original question on Quora >

Follow Slater on Quora >>


