Benchmarks

How will my model’s accuracy compare to other common techniques?
Especially with small amounts of data, Custom Collections should generally give higher accuracy than common DIY algorithms. We benchmarked Custom Collections against common algorithms on three typical machine learning tasks to give an idea of how they compare.

Task: Sentiment Detection (Text)
Dataset: Large Movie Review Dataset
Custom Collection Domain: “sentiment”
Benchmarked Against: TF-IDF vectors of the samples (with stop words removed) fed into a logistic regression model, with a grid search for the optimal regularization parameter

Samples | Custom Collection Accuracy | DIY Algorithm Accuracy
100     | 0.89                       | 0.58
1,000   | 0.93                       | 0.82
10,000  | 0.94                       | 0.86
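
To make the DIY baseline for this task more concrete, here is a minimal sketch of its first stage: turning text samples into TF-IDF vectors after removing stop words. It is an illustration only, not the code used for the benchmark. The class name, the tiny stop-word list, the tokenizer, and the exact weighting formula (term frequency times the natural log of N over document frequency) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; not part of the indico client library.
final class TfIdfSketch {
    // Tiny illustrative stop-word list (an assumption; real lists are much longer).
    static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "the", "is", "was", "it", "this", "and", "of"));

    // Lowercase the text, split on non-letters, and drop stop words and empty tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one sparse term -> weight map per document.
    // Weight = (term count / document length) * ln(N / document frequency).
    static List<Map<String, Double>> tfidf(List<String> docs) {
        List<List<String>> tokenized = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : docs) {
            List<String> tokens = tokenize(doc);
            tokenized.add(tokens);
            // Count each term once per document for the document frequency.
            for (String term : new HashSet<>(tokens)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> tokens : tokenized) {
            Map<String, Double> counts = new HashMap<>();
            for (String term : tokens) {
                counts.merge(term, 1.0, Double::sum);
            }
            Map<String, Double> vector = new HashMap<>();
            for (Map.Entry<String, Double> e : counts.entrySet()) {
                double tf = e.getValue() / tokens.size();
                double idf = Math.log((double) docs.size() / docFreq.get(e.getKey()));
                vector.put(e.getKey(), tf * idf);
            }
            vectors.add(vector);
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "This movie was great",
                "This movie was terrible",
                "A great cast and a great script");
        System.out.println(tfidf(docs));
    }
}
```

Each resulting map would then be converted to a feature vector and passed to a classifier. Words that appear in every document get a weight of zero under this formula, which is one reason production implementations often smooth the IDF term.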

Task: Topic Classification (Text, 4 Categories)
Dataset: Aggregated News
Custom Collection Domain: “topics”
Benchmarked Against: TF-IDF vectors of the samples (with stop words removed) fed into a logistic regression model, with a grid search for the optimal regularization parameter

Samples | Custom Collection Accuracy | DIY Algorithm Accuracy
100     | 0.81                       | 0.60
1,000   | 0.86                       | 0.82
10,000  | 0.88                       | 0.88
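
The "grid search for the optimal regularization parameter" step in the DIY baseline can be sketched roughly as follows: train one logistic regression model per candidate value, score each on held-out data, and keep the best. This is a simplified binary, dense-feature illustration, not the benchmark code. The class name, the L2 penalty, the gradient-descent settings, the candidate grid, and the toy data are all assumptions.

```java
// Hypothetical helper class; not part of the indico client library.
final class RegularizationGridSearch {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Linear score w . x + bias, with the bias stored as the last weight.
    static double score(double[] w, double[] xi) {
        double z = w[w.length - 1];
        for (int j = 0; j < xi.length; j++) {
            z += w[j] * xi[j];
        }
        return z;
    }

    // Batch gradient descent on the L2-regularized logistic loss.
    // The bias term is not regularized.
    static double[] train(double[][] x, int[] y, double lambda, int epochs, double lr) {
        int n = x.length;
        int d = x[0].length;
        double[] w = new double[d + 1];
        for (int epoch = 0; epoch < epochs; epoch++) {
            double[] grad = new double[d + 1];
            for (int i = 0; i < n; i++) {
                double err = sigmoid(score(w, x[i])) - y[i];
                for (int j = 0; j < d; j++) {
                    grad[j] += err * x[i][j];
                }
                grad[d] += err;
            }
            for (int j = 0; j < d; j++) {
                w[j] -= lr * (grad[j] / n + lambda * w[j]);
            }
            w[d] -= lr * grad[d] / n;
        }
        return w;
    }

    // Fraction of samples whose predicted label (score >= 0 means 1) matches the true label.
    static double accuracy(double[] w, double[][] x, int[] y) {
        int correct = 0;
        for (int i = 0; i < x.length; i++) {
            int predicted = score(w, x[i]) >= 0 ? 1 : 0;
            if (predicted == y[i]) {
                correct++;
            }
        }
        return (double) correct / x.length;
    }

    // Returns the grid value with the best validation accuracy (the first one wins ties).
    static double bestLambda(double[][] trainX, int[] trainY,
                             double[][] valX, int[] valY, double[] grid) {
        double best = grid[0];
        double bestAcc = -1.0;
        for (double lambda : grid) {
            double[] w = train(trainX, trainY, lambda, 1000, 0.1);
            double acc = accuracy(w, valX, valY);
            if (acc > bestAcc) {
                bestAcc = acc;
                best = lambda;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy one-feature data, made up for illustration.
        double[][] trainX = {{-1.0}, {-0.5}, {0.5}, {1.0}};
        int[] trainY = {0, 0, 1, 1};
        double[][] valX = {{-0.8}, {0.7}};
        int[] valY = {0, 1};
        double[] grid = {0.001, 0.01, 0.1, 1.0};
        System.out.println("Selected lambda: " + bestLambda(trainX, trainY, valX, valY, grid));
    }
}
```

The accuracy figures in the tables on this page are this kind of measurement: the fraction of held-out samples classified correctly. With only a single validation split, as here, the selected value can be noisy on small datasets, so cross-validation is a common alternative.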

Task: Classification (Image, 25 Categories)
Dataset: Caltech 101
Custom Collection Domain: Not set
Benchmarked Against: Logistic regression model trained on HoG features and color histograms computed for each sample, with a grid search for the optimal regularization parameter

Samples | Custom Collection Accuracy | DIY Algorithm Accuracy
100     | 0.82                       | 0.22
1,000   | 0.95                       | 0.55
10,000  | 0.94                       | 0.67
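
As an illustration of one of the hand-crafted image features used by the DIY baseline for this task, the sketch below computes a normalized per-channel color histogram. It is not the benchmark code: the class name, the bin count, and the assumed pixel layout (packed 0xRRGGBB integers, as returned by `BufferedImage.getRGB` with the alpha byte ignored) are choices made for this example, and HoG features are not shown.

```java
// Hypothetical helper class; not part of the indico client library.
final class ColorHistogramSketch {
    // Builds a feature vector of length 3 * binsPerChannel:
    // red bins first, then green, then blue. Each channel's bins sum to 1.
    static double[] colorHistogram(int[] pixels, int binsPerChannel) {
        double[] hist = new double[3 * binsPerChannel];
        if (pixels.length == 0) {
            return hist;
        }
        for (int p : pixels) {
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            // Map a 0..255 channel value to a bin index 0..binsPerChannel-1.
            hist[r * binsPerChannel / 256] += 1;
            hist[binsPerChannel + g * binsPerChannel / 256] += 1;
            hist[2 * binsPerChannel + b * binsPerChannel / 256] += 1;
        }
        // Normalize so that images of different sizes are comparable.
        for (int i = 0; i < hist.length; i++) {
            hist[i] /= pixels.length;
        }
        return hist;
    }

    public static void main(String[] args) {
        int[] pixels = {0xFF0000, 0x00FF00}; // one pure red pixel, one pure green pixel
        System.out.println(java.util.Arrays.toString(colorHistogram(pixels, 4)));
    }
}
```

A histogram like this discards all spatial layout, which is why such baselines usually pair it with a shape-oriented descriptor such as HoG.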

import java.util.ArrayList;
import java.util.List;

import io.indico.Indico;
import io.indico.api.custom.CollectionData;
import io.indico.api.custom.IndicoCollection;

public class CustomCollectionExample {
    public static void main(String[] args) throws Exception {
        Indico indico = new Indico("YOUR_API_KEY");
        IndicoCollection newCollection = indico.custom.getCollection("collectionName");

        // Add labeled training data
        List<CollectionData> data = new ArrayList<>();
        data.add(new CollectionData("text1", "label1"));
        data.add(new CollectionData("text2", "label2"));
        // ...
        newCollection.addData(data);

        // Start training
        newCollection.train();

        // Block until the collection is ready
        newCollection.waitUntilReady();

        // Done! Start analyzing text
        newCollection.predict("indico is so easy to use!");
    }
}