If you’ve been following our Founder’s Guide to Machine Learning series, you’ll know that machine learning has been around for decades, but commercial applications have only emerged recently, largely because of the high barriers to entry involved in building a machine learning model from scratch. To make the technology more accessible to smaller companies, machine learning services that provide pre-trained models have become increasingly available to businesses looking to enhance their predictive analytics. Great! So now you’re trying to find a service that will help you solve your business’s problems with this seemingly magical technology, but it’s hard to differentiate among the numerous available options, as you can see in Shivon Zilis’ awesome graphic below.

Feeling overwhelmed by all the possible service providers? You’re not alone. Because the industry is so young, people looking to make use of machine learning often rely on crowd wisdom, third-party guidance, or even personal relationships with firms when choosing a service provider.

Don’t worry! There are a few simple ways to evaluate the pre-trained models provided by these services and then identify the ones that will bring value to your company. Generally, there are three main criteria for gauging a model’s performance: accuracy, speed, and scalability. Some of these models operate at 60% accuracy, while others perform at 85% or higher. Speed and scalability also vary widely across these models.

It’s logical to assume that higher accuracy means a better model. However, there’s often a trade-off: higher accuracy usually requires more computation per prediction, which means the model runs more slowly on the same hardware. If you’re trying to analyze millions of tweets in real-time, it makes more sense to choose a model that can keep up in real-time with 85% accuracy than a model with 90% accuracy that is too slow to get the job done.

The Speed & Accuracy Trade-off in Machine Learning Models
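To make the trade-off concrete, here is a minimal sketch of the selection rule described above: among the models that are fast enough for your workload, take the most accurate one. The model names, accuracy figures, and throughput numbers are made-up placeholders, not measurements of any real service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str                 # hypothetical model name
    accuracy: float           # fraction correct, measured on your own test sample
    items_per_second: float   # throughput, measured on your own data


def pick_model(candidates: List[Candidate],
               required_items_per_second: float) -> Optional[Candidate]:
    """Return the most accurate candidate that meets the speed requirement."""
    fast_enough = [c for c in candidates
                   if c.items_per_second >= required_items_per_second]
    if not fast_enough:
        return None  # nothing keeps up; revisit the requirement or the shortlist
    return max(fast_enough, key=lambda c: c.accuracy)


# Placeholder numbers for illustration only.
candidates = [
    Candidate("model_a", accuracy=0.90, items_per_second=200),
    Candidate("model_b", accuracy=0.85, items_per_second=5000),
]

choice = pick_model(candidates, required_items_per_second=1000)
print(choice.name)  # model_b: less accurate, but the only one that keeps up
```

With a relaxed requirement (say, 100 items per second), the same function would pick the more accurate model_a instead.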

This trade-off is just one of a few things to consider when you’re shopping around for a service. To help you navigate the complex machine learning landscape, here are six best practices for choosing the right service for your business.

Six Best Practices for Choosing a Machine Learning Service

1. Ensure appropriate data specialization.

Some services concentrate on text and image data, which encompasses anything from understanding how people feel about your company on Twitter to detecting their emotions when they see your new advertisement. Other services provide solutions for heterogeneous data, which involves analysis of many disparate sources of data such as server logs, database dumps, and customer transactions. Still others specialize even further to provide machine-powered solutions to specific problems such as recommendation, fraud detection, and ad bidding.

The key is finding the service or model that is specialized for the type of data that your business uses. Most services will use a technique called “supervised learning” to train their models, which involves learning from a dataset of labeled examples, such as a million product reviews in English, each tagged as positive or negative. If you’re trying to analyze Facebook posts in English, there’s going to be a useful amount of crossover. However, if you’re trying to analyze product reviews in Spanish, then that model won’t be appropriate for you.

The crossover between the ML service's training data and your data demonstrates the utility of the service's pre-trained model.

2. Don’t forget! High speed and accuracy are important, but balance matters more.

When it comes to speed and accuracy for data analysis, there’s always a trade-off, and depending on your business’s problem, one may be more important than the other. To find the right service, consider how much data you need to analyze and at what rate you need results. For example, do you need to analyze a thousand tweets every second, or a hundred customer reviews every hour?
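A quick back-of-the-envelope calculation shows how different those two workloads are. The sketch below converts each one to items per second and estimates how many requests would need to be in flight at once to keep up. The 50 ms per-item latency is an assumed figure for illustration; measure the real one for any service you evaluate.

```python
import math


def items_per_second(count: float, period_seconds: float) -> float:
    """Convert 'count items every period' into a per-second rate."""
    return count / period_seconds


def parallel_requests_needed(rate_per_second: float, latency_ms: float) -> int:
    """Estimate concurrent requests needed: arrival rate times time per item."""
    return max(1, math.ceil(rate_per_second * latency_ms / 1000))


tweets = items_per_second(1000, 1)        # a thousand tweets every second
reviews = items_per_second(100, 60 * 60)  # a hundred reviews every hour

ASSUMED_LATENCY_MS = 50  # hypothetical per-item response time

print(parallel_requests_needed(tweets, ASSUMED_LATENCY_MS))   # 50
print(parallel_requests_needed(reviews, ASSUMED_LATENCY_MS))  # 1
```

The tweet workload needs a service that can handle dozens of simultaneous requests, while the review workload would be comfortable with one request at a time, which leaves room to favor accuracy.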

3. Look for customization options.

For some businesses, general pre-trained models might be the right solution. However, many businesses choose pre-trained models because there is no other option available, and building models in-house is expensive. Sacrificing the needed specificity for a general solution could mean a loss in competitive advantage. Instead, you should consider a machine learning service that allows for model refinement and customization to create the best solution for your problems and available datasets.

4. Remember, ease-of-use is critical.

The most flexible way to implement machine learning is through an Application Programming Interface (API). APIs are easily integrated into software, require little initial investment, and spare you the hidden costs of maintenance and hardware. The API should also support widely used programming languages, such as Python, Java, and R, to make things easier for your engineers and data scientists.

APIs allow machine learning to be easily integrated into your software.

5. Make sure there’s transparency.

The most effective way to judge suitability is to test models from at least two different services on your own data and see which offers the best balance between speed and accuracy. Most services offer demos or free plans that allow a certain amount of data to be processed. Be wary of companies that don’t make this kind of testing available.
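A side-by-side test like this can be a few lines of code. The sketch below scores any prediction function on a small labeled sample and reports both accuracy and throughput. The two `service_*` functions are local stand-ins invented for this example; in practice each would wrap a call to a real provider’s API, and the sample would come from your own data.

```python
import time


def evaluate(predict, samples):
    """Return (accuracy, items_per_second) for predict on (text, label) pairs."""
    start = time.perf_counter()
    predictions = [predict(text) for text, _ in samples]
    elapsed = time.perf_counter() - start
    correct = sum(p == label for p, (_, label) in zip(predictions, samples))
    accuracy = correct / len(samples)
    throughput = len(samples) / elapsed if elapsed > 0 else float("inf")
    return accuracy, throughput


# Hypothetical stand-ins for two providers' sentiment models.
def service_a(text):
    return "positive" if "great" in text or "love" in text else "negative"


def service_b(text):
    return "positive"


# A tiny labeled sample; use a few hundred of your own examples in practice.
samples = [
    ("great product, works well", "positive"),
    ("love it", "positive"),
    ("broke after one day", "negative"),
    ("would not buy again", "negative"),
]

for name, predict in [("service_a", service_a), ("service_b", service_b)]:
    accuracy, throughput = evaluate(predict, samples)
    print(f"{name}: accuracy={accuracy:.2f}, items/sec={throughput:.0f}")
```

Running both services against the same sample keeps the comparison fair. Throughput measured this way includes network time once real API calls are involved, which is the number that matters for your application.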

Pricing should also be transparent. Cost is typically determined by usage, and the rates should be established up front, before you commit to a purchase.
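When rates are published, estimating your bill is simple arithmetic. The sketch below assumes a common usage-based scheme, a free monthly allowance followed by a flat price per thousand items. The allowance and price are invented numbers, not any provider’s actual rates.

```python
def monthly_cost(items_per_month: int, free_items: int,
                 price_per_1000: float) -> float:
    """Estimate a monthly bill under an assumed free-tier-plus-flat-rate plan."""
    billable = max(0, items_per_month - free_items)
    return billable / 1000 * price_per_1000


# Hypothetical plan: 10,000 free items, then $0.50 per 1,000 items.
print(monthly_cost(250_000, free_items=10_000, price_per_1000=0.50))  # 120.0
print(monthly_cost(5_000, free_items=10_000, price_per_1000=0.50))    # 0.0
```

If you can’t fill in those parameters from a provider’s pricing page, that is a sign the pricing isn’t transparent enough.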

6. Keep in mind: community is key.

Community plays an integral role in the development of robust machine learning technology: contributions from community members to open source projects, research, and ideas make for better software. By the same token, a firm that supports its community through educational content, tutorials, and workshops helps make this technology more seamless, friendly, and widespread. This kind of support will help you get the most out of machine learning.

The support cycle between an ML service and its community.

Summing it all up

Machine learning technology has existed for decades, and it offers an automated, cost-effective improvement on traditional approaches to predictive analytics and business intelligence. However, because it has only recently (and rapidly) moved out of academia and into the commercial sphere, the industry can be overwhelming to navigate. This can sometimes leave smaller businesses at a competitive disadvantage against larger enterprises, but you can overcome these difficulties by looking for these six standards:

  1. Data type specialization
  2. Balanced speed and accuracy
  3. Customizability
  4. Ease-of-use
  5. Transparency
  6. Community support

 
By keeping these key principles in mind, you’ll be better equipped to make an informed decision and maximize the benefits of implementing a machine learning service. If you’re interested in learning about how indico respects these standards, visit our website or email us at contact@indico.io.

Stay tuned for our next post in the series about demystifying machine learning jargon!
