Benchmarks

How will my model’s accuracy compare to other common techniques?
Especially with small amounts of data, Custom Collections should generally give higher accuracy than common DIY algorithms. We benchmarked Custom Collections against some common algorithms on three typical machine learning tasks to give an idea of how they compare.

Task: Sentiment Detection (Text)
Dataset: Large Movie Review Dataset
Custom Collection Domain: “sentiment”
Benchmarked Against: tf-idf vectors of the samples (with stop words removed) fed into a logistic regression, with a grid search for an optimal regularization parameter

Samples    Custom Collection Accuracy    DIY Algorithm Accuracy
100        0.89                          0.58
1,000      0.93                          0.82
10,000     0.94                          0.86
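
The tf-idf step of the DIY baseline above can be sketched as follows. This is a minimal illustration, not the code used for the benchmark: the class name `TfIdfSketch`, the tiny stop-word list, the tokenizer, and the unsmoothed `log(N / df)` weighting are all assumptions chosen for brevity, and libraries typically use longer stop-word lists and smoothed variants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class, not part of the indico client.
final class TfIdfSketch {
    // Tiny illustrative stop-word list (an assumption; real lists are much longer).
    static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "the", "is", "was", "and", "of", "to", "this"));

    // Lowercase the text, split on non-letters, and drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one sparse tf-idf vector (term -> weight) per document.
    static List<Map<String, Double>> tfidf(List<String> docs) {
        List<List<String>> tokenized = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : docs) {
            List<String> tokens = tokenize(doc);
            tokenized.add(tokens);
            // Count each term once per document it appears in.
            for (String term : new HashSet<>(tokens)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> tokens : tokenized) {
            Map<String, Double> vec = new HashMap<>();
            for (String term : tokens) {
                vec.merge(term, 1.0, Double::sum); // raw term counts
            }
            for (Map.Entry<String, Double> e : vec.entrySet()) {
                double tf = e.getValue() / tokens.size();
                double idf = Math.log((double) docs.size() / docFreq.get(e.getKey()));
                e.setValue(tf * idf);
            }
            vectors.add(vec);
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("This movie was great", "This movie was terrible");
        System.out.println(tfidf(docs));
    }
}
```

In this weighting, a term that appears in every document (such as "movie" in the two toy reviews) gets a weight of zero, so only the distinguishing words carry signal into the classifier.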

Task: Topic Classification (Text, 4 Categories)
Dataset: Aggregated News
Custom Collection Domain: “topics”
Benchmarked Against: tf-idf vectors of the samples (with stop words removed) fed into a logistic regression, with a grid search for an optimal regularization parameter

Samples    Custom Collection Accuracy    DIY Algorithm Accuracy
100        0.81                          0.60
1,000      0.86                          0.82
10,000     0.88                          0.88
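
The "logistic regression with a grid search for an optimal regularization parameter" part of the baseline can be sketched as below. This is a from-scratch illustration under stated assumptions, not the benchmark code: the class `GridSearchSketch`, the L2 penalty, the gradient-descent settings, the binary (two-class) setup, and scoring each candidate by accuracy on a held-out validation set are all choices made here for the example. The source does not say which library, penalty, or scoring scheme was used.

```java
// Hypothetical helper class, not part of the indico client.
final class GridSearchSketch {
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Linear score w.x + b, with the bias b stored in the last slot of w.
    static double score(double[] w, double[] xi) {
        double z = w[w.length - 1];
        for (int j = 0; j < xi.length; j++) {
            z += w[j] * xi[j];
        }
        return z;
    }

    // Fits binary logistic regression with an L2 penalty by batch gradient descent.
    static double[] fit(double[][] x, int[] y, double lambda) {
        int n = x.length;
        int d = x[0].length;
        double[] w = new double[d + 1];
        double learningRate = 0.05; // illustrative setting, not tuned
        for (int step = 0; step < 500; step++) {
            double[] grad = new double[d + 1];
            for (int i = 0; i < n; i++) {
                double err = sigmoid(score(w, x[i])) - y[i];
                for (int j = 0; j < d; j++) {
                    grad[j] += err * x[i][j] / n;
                }
                grad[d] += err / n;
            }
            for (int j = 0; j < d; j++) {
                w[j] -= learningRate * (grad[j] + lambda * w[j]);
            }
            w[d] -= learningRate * grad[d]; // the bias is not regularized
        }
        return w;
    }

    // Fraction of samples whose predicted label matches the true label.
    static double accuracy(double[] w, double[][] x, int[] y) {
        int correct = 0;
        for (int i = 0; i < x.length; i++) {
            int predicted = score(w, x[i]) > 0 ? 1 : 0;
            if (predicted == y[i]) {
                correct++;
            }
        }
        return (double) correct / x.length;
    }

    // Tries every lambda in the grid; returns {bestLambda, bestValidationAccuracy}.
    static double[] gridSearch(double[][] trainX, int[] trainY,
                               double[][] valX, int[] valY, double[] grid) {
        double bestLambda = grid[0];
        double bestAcc = -1.0;
        for (double lambda : grid) {
            double acc = accuracy(fit(trainX, trainY, lambda), valX, valY);
            if (acc > bestAcc) { // ties keep the earlier (smaller) lambda
                bestAcc = acc;
                bestLambda = lambda;
            }
        }
        return new double[] {bestLambda, bestAcc};
    }

    public static void main(String[] args) {
        double[][] trainX = {{-2}, {-1}, {1}, {2}};
        int[] trainY = {0, 0, 1, 1};
        double[][] valX = {{-1.5}, {1.5}};
        int[] valY = {0, 1};
        double[] best = gridSearch(trainX, trainY, valX, valY,
                new double[] {0.01, 0.1, 1.0, 10.0});
        System.out.println("lambda=" + best[0] + " accuracy=" + best[1]);
    }
}
```

The accuracy figures in the tables are this same ratio of correct predictions to total predictions, computed on data the model was not trained on.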

Task: Classification (Image, 25 Categories)
Dataset: Caltech 101
Custom Collection Domain: Not set
Benchmarked Against: Logistic regression trained on HoG features and color histograms of each sample, with a grid search for an optimal regularization parameter

Samples    Custom Collection Accuracy    DIY Algorithm Accuracy
100        0.82                          0.22
1,000      0.95                          0.55
10,000     0.94                          0.67
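
Of the two image features in the baseline above, the color histogram is the simpler one to illustrate. The sketch below is an assumption-laden example rather than the benchmark's feature extractor: the class `ColorHistogramSketch`, the per-channel binning, and the normalization by pixel count are choices made here. It expects pixels as packed RGB integers, the format returned by `BufferedImage.getRGB`.

```java
// Hypothetical helper class, not part of the indico client.
final class ColorHistogramSketch {
    // Builds a per-channel color histogram: [red bins | green bins | blue bins].
    // pixels: packed 0xRRGGBB (or 0xAARRGGBB) integers.
    // binsPerChannel: assumed to divide 256 evenly (for example 4, 8, or 16).
    static double[] histogram(int[] pixels, int binsPerChannel) {
        double[] hist = new double[3 * binsPerChannel];
        int binWidth = 256 / binsPerChannel;
        for (int p : pixels) {
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            hist[r / binWidth] += 1;
            hist[binsPerChannel + g / binWidth] += 1;
            hist[2 * binsPerChannel + b / binWidth] += 1;
        }
        // Normalize so images of different sizes are comparable.
        if (pixels.length > 0) {
            for (int i = 0; i < hist.length; i++) {
                hist[i] /= pixels.length;
            }
        }
        return hist;
    }

    public static void main(String[] args) {
        int[] pixels = {0xFF0000, 0x0000FF}; // one pure red pixel, one pure blue pixel
        System.out.println(java.util.Arrays.toString(histogram(pixels, 4)));
    }
}
```

The resulting vector would be concatenated with the HoG features before being passed to the classifier.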

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import io.indico.Indico;
import io.indico.api.custom.CollectionData;
import io.indico.api.custom.IndicoCollection;

Indico indico = new Indico("YOUR_API_KEY");
IndicoCollection newCollection = indico.custom.getCollection("collectionName");

// Add Data (each CollectionData pairs a sample with its label)
newCollection.addData(new ArrayList<CollectionData>() { {
    add(new CollectionData("text1", "label1"));
    add(new CollectionData("text2", "label2"));
    // ...
} });

// Training
newCollection.train();

// Telling Collection to block until ready
newCollection.waitUntilReady();

// Done! Start analyzing text
newCollection.predict("indico is so easy to use!");