What Is Labellio?

Labellio is a web service that lets you create your own image classifier in a few minutes without any programming. A good classifier depends on a training image dataset that describes well what you want to recognize, and Labellio helps you build such a dataset. After setting up the training dataset, you get a classifier that you can use in your own system.

Tutorial

Here we walk you through the complete flow of building your own classification model.

Login

Labellio supports OAuth sign-in with GitHub or Google. Go to the Login page and click one of the buttons to authenticate via OAuth.

My Models

This page shows all the classification models you have created. Click the Create Model button to create a new classification model.

Create Model

Now we teach the Labellio engine what you want to recognize. Type your model name in the Name text box, then click the Add data button.

Labellio supports three different ways to input a training dataset.

Upload ZIP

You can upload a ZIP file that contains your training images, organized according to the following rules.

The folder structure should look as follows.

dog/dog_01.jpg
   /dog_02.jpg
   /dog_03.jpg
   /mydog.png
cat/cat_01.jpg
   /cat_02.jpg
   /cat_03.jpg
   /yourcat.png
bird/bird_01.jpg
    /bird_02.jpg
    /bird_03.jpg
unlabeled_01.jpg
unlabeled_02.jpg
unlabeled_03.jpg
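As a rough illustration of the layout above, the following Python sketch packs a local directory into a ZIP file while keeping each image's path relative to the dataset root. The function name `build_training_zip`, the `IMAGE_SUFFIXES` set, and the directory names are hypothetical; the sketch assumes that subfolder names act as labels and that top-level images are unlabeled, as the example listing suggests.

```python
import zipfile
from pathlib import Path

# Assumption: only these extensions are collected; adjust to your images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_training_zip(source_dir, zip_path):
    """Zip the images under source_dir, keeping paths relative to it.

    Images in a subfolder are stored as "<folder>/<file>"; images directly
    under source_dir are stored at the top level of the archive.
    Returns the sorted list of archive member names.
    """
    source = Path(source_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in source.rglob("*"):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                name = path.relative_to(source).as_posix()
                archive.write(path, arcname=name)
                names.append(name)
    return sorted(names)


if __name__ == "__main__":
    # Hypothetical paths: "dataset" holds dog/, cat/, bird/ and loose images.
    if Path("dataset").is_dir():
        for member in build_training_zip("dataset", "training.zip"):
            print(member)
```

Printing the member names before uploading is a cheap way to confirm that every image sits under the folder you intended.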

Upload TSV

You can upload a TSV file that lists image URLs with their associated labels, according to the following rules.

The TSV file content should look as follows.

dog<tab>http://url.to/image
dog<tab>http://url.to/image
dog<tab>http://url.to/image
cat<tab>http://url.to/image
cat<tab>http://url.to/image
cat<tab>http://url.to/image
http://url.to/unlabeled_image
http://url.to/unlabeled_image
http://url.to/unlabeled_image
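A file in this shape can be generated with a few lines of Python. The sketch below is only an illustration: the helper names `format_tsv_line` and `write_training_tsv` and the output file name are hypothetical, and it assumes, following the example above, that a labelled row is `label<tab>URL` and an unlabeled row is the URL alone.

```python
def format_tsv_line(url, label=None):
    """Return one TSV row: "label<tab>url", or just "url" when unlabeled."""
    for value in (url, label or ""):
        if "\t" in value or "\n" in value:
            raise ValueError("tabs and newlines are not allowed in a field")
    return f"{label}\t{url}" if label else url


def write_training_tsv(rows, tsv_path):
    """Write (label, url) pairs to tsv_path; use None as label for unlabeled."""
    with open(tsv_path, "w", encoding="utf-8", newline="\n") as handle:
        for label, url in rows:
            handle.write(format_tsv_line(url, label) + "\n")


if __name__ == "__main__":
    # Placeholder URLs taken from the example above.
    rows = [
        ("dog", "http://url.to/image"),
        ("cat", "http://url.to/image"),
        (None, "http://url.to/unlabeled_image"),
    ]
    write_training_tsv(rows, "training.tsv")
```

Rejecting tabs and newlines inside a field keeps each row unambiguous, since the tab is the only separator.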

Two search engines are supported.

Select one of these engines and type search keywords in the Labels input box, separating them with the tab, comma, or enter key. By default, 50 images are retrieved for each keyword; this number can be configured up to 100. Each keyword is used as the label for the images it retrieves.

Image Format

The following rules apply to the format of images.

Start Labelling

After the training data is retrieved, click the Continue button. You will see the labelling screen.

Initially, the labels available in the given dataset are displayed. If you want to re-label images, delete these labels and enter your new ones. As long as there are unlabelled images, you are taken to the labelling screen.

You manually label images by clicking or dragging each image. The current label name is shown at the top left of the top box. Make sure all images to be given the current label are placed in the top box and the others in the bottom box. Keep clicking the All Ok button until you have gone through all the unlabelled images. As you label, Labellio learns how images should be labelled.

Start Training

After labelling all images, you are led to the final training process automatically. This step may take several minutes. Note that the training process continues even if you close your browser.

Deep Visualization

Coming Soon ...

Test Model

Once the training completes, you can see how the classifier works from the Test Model tab.

You can either upload a raw image file or type an image URL in the input box. The same rules as in Image Format apply.

Download Model

You can download your classification model by clicking the Download Model button in the Detail tab. There are two files to download.

These links are valid for an hour.

Advanced Topics

Model File Format

The model files exported from Labellio can be used by the open source Caffe framework.

The Caffe model file is a tgz archive that contains the following files.

File name               Content
labellio.json           configuration file
caffemodel.binaryproto  Caffe model file
mean.binaryproto        mean file
deploy.prototxt         Caffe's network definition file
labels.json             label information

The Caffe model file and the mean file are what you give to a Caffe program.
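Before passing the files to Caffe, it can be useful to confirm that a downloaded archive holds everything listed above. The following is a minimal sketch using Python's standard `tarfile` module; the function name `missing_model_files` and the archive name are hypothetical, and the sketch compares base names only, on the assumption that the files may sit inside a top-level directory in the archive.

```python
import tarfile
from pathlib import PurePosixPath

# File names taken from the list of archive contents above.
EXPECTED_FILES = {
    "labellio.json",
    "caffemodel.binaryproto",
    "mean.binaryproto",
    "deploy.prototxt",
    "labels.json",
}


def missing_model_files(archive_path):
    """Return the expected file names that are absent from the tgz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        present = {
            PurePosixPath(member.name).name
            for member in archive.getmembers()
            if member.isfile()
        }
    return sorted(EXPECTED_FILES - present)


if __name__ == "__main__":
    import os

    # Hypothetical file name for the downloaded archive.
    if os.path.exists("model.tgz"):
        print(missing_model_files("model.tgz") or "all expected files present")
```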

Labellio CLI

Labellio CLI lets you use trained models in your own environment.

Installation of Labellio CLI

Labellio CLI is an open source project that can be installed with the pip command. It is also available as a Docker image and as an AMI on AWS.

Usage of Labellio CLI

Labellio CLI classifies images using the Caffe model trained with and downloaded from Labellio.

Store image files under a directory.

$ ls images
alpaca1.jpg  alpaca2.jpg  sheep1.jpg  sheep2.jpg

Extract the Caffe model file downloaded from Labellio.

$ ls model
caffemodel.binaryproto  labellio.json  mean.binaryproto
deploy.prototxt         labels.json

Run the labellio_classify command.

$ labellio_classify model images
...
images/sheep2.jpg   sheep   [ 0.02713127  0.97286868]
images/sheep1.jpg   sheep   [  7.87437428e-04   9.99212503e-01]
images/alpaca1.jpg  alpaca  [ 0.99114448  0.00885548]
images/alpaca2.jpg  sheep   [ 0.44422832  0.55577165]
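If you want to post-process these results, the output lines can be parsed with a short script. The sketch below assumes each result line has the shape shown above (image path, predicted label, then bracketed scores, one per label) and that paths and labels contain no spaces; the names `parse_classify_line` and `uncertain_predictions` and the 0.6 threshold are hypothetical choices for illustration.

```python
import re

# Assumed line shape: "<path>  <label>  [ <score> <score> ... ]"
LINE_PATTERN = re.compile(
    r"^(?P<path>\S+)\s+(?P<label>\S+)\s+\[(?P<scores>[^\]]*)\]$"
)


def parse_classify_line(line):
    """Return (path, label, scores) for a result line, or None otherwise."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    scores = [float(value) for value in match.group("scores").split()]
    return match.group("path"), match.group("label"), scores


def uncertain_predictions(lines, threshold=0.6):
    """Return paths whose highest score is below the threshold."""
    flagged = []
    for line in lines:
        parsed = parse_classify_line(line)
        if parsed is not None and parsed[2] and max(parsed[2]) < threshold:
            flagged.append(parsed[0])
    return flagged


if __name__ == "__main__":
    sample = [
        "images/sheep2.jpg   sheep   [ 0.02713127  0.97286868]",
        "images/alpaca2.jpg  sheep   [ 0.44422832  0.55577165]",
    ]
    print(uncertain_predictions(sample))
```

In the sample output above, alpaca2.jpg is classified as sheep with a top score of only about 0.56, so a check like this would single it out for manual review.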

How to read the chart of training progress

Normal case

Both loss_0 and loss_test are low, and accuracy is high.

Easy

The class is something obvious or already known by the base model.

loss_0 goes down and loss_test goes up

This pattern is called over-training. The model fits the training data too closely and does not work well on other data. Increasing the number of training images may help.
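The same pattern can be checked numerically if you note down the loss values from the chart. The following sketch is a simple heuristic, not something Labellio provides: it assumes loss_0 is the loss on the training data, and it looks for a run of consecutive steps in which loss_0 falls while loss_test rises. The function name `find_overtraining_start`, the `patience` parameter, and the sample values are all hypothetical.

```python
def find_overtraining_start(loss_0, loss_test, patience=3):
    """Return the index of the last step before loss_test starts rising
    while loss_0 keeps falling for `patience` consecutive steps.

    Returns None when no such run is found.
    """
    if len(loss_0) != len(loss_test):
        raise ValueError("both loss sequences must have the same length")
    run = 0
    for i in range(1, len(loss_0)):
        diverging = loss_0[i] < loss_0[i - 1] and loss_test[i] > loss_test[i - 1]
        run = run + 1 if diverging else 0
        if run >= patience:
            return i - patience
    return None


if __name__ == "__main__":
    # Made-up values for illustration only.
    train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2]
    test = [1.0, 0.85, 0.7, 0.65, 0.7, 0.8, 0.9]
    print(find_overtraining_start(train, test))
```

A larger `patience` makes the check less sensitive to short-lived fluctuations in loss_test.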