ODSC 2018: Effective Transfer Learning for NLP
July 5, 2018 / Data Science, Machine Learning, Text Data Use Case
Our machine learning architect and co-founder, Madison May, was a featured presenter at the 2018 Open Data Science Conference (ODSC). His technical talk focused on transfer learning: how it can make machine learning on text-based content more efficient, and how Indico has incorporated it into our commercial Intelligent Process Automation software. If you missed it, you can now watch the recording below.
Transfer learning, the practice of applying knowledge gained on one machine learning task to aid the solution of another, has seen historic success in the field of computer vision. Output representations of generic image classification models trained on ImageNet have been leveraged to build models that detect the presence of custom objects in natural images. Classification tasks that might otherwise require hundreds of thousands of images can be tackled with mere dozens of training examples per class thanks to these pre-trained representations.
The field of NLP, however, has seen more modest gains from transfer learning, with most approaches limited to the use of pre-trained word representations. This session will explore parameter- and data-efficient mechanisms for transfer learning on text and show practical improvements on real-world tasks. We’ll also demo Enso, a newly open-sourced library designed to simplify benchmarking of transfer learning methods on a variety of target tasks. Attendees will learn about:
- How transfer learning can deliver efficiencies in machine learning on text-based content.
- Simplifying machine learning benchmarking with Enso.
- Tools in Enso that enable the fair comparison of varied feature representations and target task models as the amount of training data is incrementally increased.
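
To make the last point concrete, here is a minimal sketch of what comparing feature representations as training data grows can look like. It does not use Enso's API: every name in it (`word_features`, `char_trigram_features`, `NearestCentroid`, `benchmark`) is hypothetical, the dataset is a made-up toy, and the hand-written featurizers stand in for the pre-trained representations a real benchmark would compare. The idea it illustrates is that every representation is scored on identical train/test splits at each training set size.

```python
import math
import random
from collections import Counter

# Toy labeled corpus, invented for illustration only.
DATA = [
    ("great product and fast delivery", "pos"),
    ("really love the friendly support", "pos"),
    ("excellent quality, works great", "pos"),
    ("love it, excellent value", "pos"),
    ("fast and friendly service", "pos"),
    ("great value and great quality", "pos"),
    ("terrible product, broke quickly", "neg"),
    ("really slow and rude support", "neg"),
    ("awful quality, very disappointed", "neg"),
    ("broke after a day, terrible", "neg"),
    ("slow delivery and rude service", "neg"),
    ("awful value, very slow", "neg"),
]


def word_features(text):
    """Bag-of-words counts."""
    return Counter(text.lower().split())


def char_trigram_features(text):
    """Character trigram counts, a second representation to compare."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class NearestCentroid:
    """Target-task model: predict the class whose summed vector is closest."""

    def fit(self, vectors, labels):
        self.centroids = {}
        for vec, label in zip(vectors, labels):
            self.centroids.setdefault(label, Counter()).update(vec)
        return self

    def predict(self, vec):
        return max(self.centroids, key=lambda lab: cosine(vec, self.centroids[lab]))


def benchmark(data, featurizers, train_sizes, n_trials=5, seed=0):
    """Return {featurizer_name: {examples_per_class: mean_accuracy}}."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in data:
        by_label.setdefault(label, []).append(text)
    results = {name: {size: 0.0 for size in train_sizes} for name in featurizers}
    for size in train_sizes:
        for _ in range(n_trials):
            # Draw one stratified split per trial and reuse it for every
            # featurizer, so each representation sees the same examples.
            train, test = [], []
            for label, texts in by_label.items():
                shuffled = rng.sample(texts, len(texts))
                train += [(t, label) for t in shuffled[:size]]
                test += [(t, label) for t in shuffled[size:]]
            for name, featurize in featurizers.items():
                model = NearestCentroid().fit(
                    [featurize(t) for t, _ in train], [y for _, y in train]
                )
                correct = sum(model.predict(featurize(t)) == y for t, y in test)
                results[name][size] += correct / len(test) / n_trials
    return results


FEATURIZERS = {"words": word_features, "char_trigrams": char_trigram_features}
results = benchmark(DATA, FEATURIZERS, train_sizes=[1, 2, 4])
for name, curve in results.items():
    print(name, {size: round(acc, 2) for size, acc in curve.items()})
```

Each printed line is one representation's accuracy at 1, 2, and 4 training examples per class, averaged over five random splits. Reusing the same splits for every featurizer is the design choice that keeps the comparison fair: differences between rows then reflect the representations rather than a lucky draw of training examples.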