Here at indico, we work to make it as easy as possible to translate an idea into reality, and I want to show you how using indico’s predictive models can be the simplest part of your web app.


1. Intro

Today we’ll be building a web app using indico’s predictive APIs and Flask, a web (micro) framework that’s great for getting acquainted with server-side code. I’m going to walk you through a bare-bones web app that displays the sentiment of your tweets over time. My focus is to teach you how to analyze a chunk of social media data. One way to visualize the data is with a bar graph – check out the image below to see what the final app will look like.

[Screenshot: the final app, showing a bar graph of tweet sentiment]

Looks nifty, right? What, you wanna see it live? Well, that can be arranged…

Try it out!

To give you a more concrete idea of what you’ll be building, let’s take the final product for a test drive. Firstly, we want a CSV of your tweets, which can be supplied by http://www.tweetdownload.net/. If you don’t do the Twitter thing, you’re more than welcome to download a copy of my tweets.

Once you have the CSV file, go ahead and “Open with” TextEdit on Mac or Notepad on PC (or similar text editor). Then copy all its contents and head over to our live version of the app and paste everything into the text box. When you press submit, a chart should appear (as seen in the screenshot above), and you can examine the sentiment of your (or my) last 40 tweets! If something doesn’t work, I’d appreciate it a ton if you let us know through the support contact in the bottom right of this page.

Now let’s get started!


What you’ll need

Before we dive in, I just want to outline the tools you’ll need to install if you want to follow along and hack on the project while I guide you through it. If you don’t have one or more of the following, you’ll want to get them installed with the links provided.

Although this tutorial should be helpful for anybody looking to build with indico, it’s aimed primarily at those who haven’t had a ton of experience with web development and is intentionally bare to facilitate the learning process. Using vanilla Flask for creating a proof of concept like you’ll see here is great, but using it in production is not encouraged unless you tweak a few things.

HTML, CSS and JavaScript
I’m not going to dig deeply into HTML, CSS or JavaScript in this tutorial, but there are great free resources available for getting familiar with them. Also, if you’re going to do anything web-based it’s great to know jQuery.

Python
I’m not going to cover the basic syntax of Python code, but I will explain some specific functions. I recommend Learn Python the Hard Way for an excellent introduction to Python. To work with Python, you’ll need to have it installed.

    • pip (Python Package Manager)
      pip is a program that allows you to easily install python packages. You will use pip to install the Flask and indicoio Python packages. Find install instructions here.

 

    • Flask (Python Package)
      In a minute you’ll need to install Flask. I’ll be explaining enough Flask to understand its interaction with indico, but if you want to dig deeper before moving forward, I recommend checking out Miguel Grinberg’s Flask Mega-Tutorial. It covers enough ground that if you work through all of it, you should have a solid idea of where you’d like to direct future learning.

 

    • indicoio (Python Package)
      You’ll also need to install indicoio to use our API’s sentiment model. To use the API, you’ll need to sign up and get an API key (opens in a new tab). Once signed up, you’ll land on the dashboard which will have your key waiting for you. Leave that guy open since we don’t need the key right away.

 

    • virtualenv (Python Package – optional)
      Although it isn’t necessary, you might also want to get virtualenv for isolating your Python work environment. This is very good practice and recommended for all Python projects.

 

Command line
This tutorial assumes you have some experience with the command line (terminal). If you need a refresher, use the arrow-keys to navigate this tutorial.

Git
In order to get a clean copy of this project, you’ll need to have git (install instructions here).


Let’s get set up!

Now that we have our tools, let’s build! The following steps are all commands you’ll run in the command line (terminal).

1) Install the basics: Make sure you have git, pip, and python. If they’re installed, you should be able to enter their names into your terminal and either see a prompt asking for more arguments or a Python interpreter.

2) Clone the project: Open up terminal and navigate directories until you find a good spot for the indi-flask project on your computer, then use the following to grab a copy.

git clone https://github.com/IndicoDataSolutions/indi-flask.git

3) Jump into the project: Once you have the project, run the following in your terminal and you’re home.

cd indi-flask

4) Install project requirements: Finally, use pip to install the requirements that you’ll need for this project.

pip install -r requirements.txt

Now you should have a clean copy of the template project with all of the necessary dependencies installed, so close!

5) Enter your API key: Open up the app.py file and replace “YOUR_API_KEY” with your indico API key, which you can find on your dashboard.

6) Run the project locally: With your API key in place, we’re ready to rock! Run the following command,

python app.py

and then navigate to localhost:5000 in your browser to see the project live. Awesome! You can start hacking right away on the files in the project if you want, or you can read my description of the project below.


3. Our Flask App

To go along with this tutorial, I recommend opening up the project folder in a place where you can easily edit the files being referenced. For any web app it helps to sit down and look for a primary logic file (which for us is app.py), dive into its guts to see how it interacts with other files, and then do the same for those referenced files until you have nowhere left to explore. Even if you’re not sure what’s going on behind the scenes at first, having a holistic picture of how a project coordinates its pieces gets you pretty far. That’s why we’re kicking off with our app’s structure.

Starting out with structure

When checking out Flask apps, you’ll often see a folder structure that resembles the following:

ProjectName
    |-- app.py  +contains server logic
    |-- requirements.txt  +specifies python dependencies
    |
    \- static  +holds folders that contain static files
    |   \- img    
    |   \- css
    |       |--main.css  +our primary css file (not covered in tut)
    |   \- js
    |       |--scripts.js  +where frontend communicates with the server
    |
    \- templates  +holds all of your html files
        |-- layout.html
        |-- index.html

The three files we’ll mainly dig into today will be index.html, scripts.js and app.py.


Keeping HTML simple

There are only two files in the HTML department that our app is using right now: layout.html and index.html. Both are located in the project’s templates folder. layout.html is simply a template that our main page index.html will use as boilerplate around its content. Many web frameworks have some form of templating engine that allows you to include blocks of code, auto-generate HTML in a loop, and more. By default, Flask uses Jinja2, which I encourage getting to know if you like Python-based frameworks. I wouldn’t say it’s necessary to retype these files if you’re following along, but it’s good to know what’s happening here.


layout.html

Let’s take a look at the boilerplate in layout.html. In the head we define some metadata about the site (which you’re welcome to go wild with) as well as loading in our main.css file for making things a little prettier. I include Bootstrap here to make it more convenient if you’d like to use it when building off of this project; at the moment it’s doing little more than changing the font. The rest of our resources are loaded in at the bottom of the body to save time on loading since they don’t affect the display immediately.

<!doctype html>

<html lang="en">
    <head>
        <meta charset="utf-8">
        <title>My first indibuild</title>
        <meta name="description" content="Description of your app!">
        <meta name="author" content="You">
        <link rel="stylesheet" href="/static/css/main.css">
        <link rel="stylesheet" href="/static/css/external/bootstrap.min.css">
    </head>
    <body>
        {% block content %}
        {% endblock content %}
        <script src="//code.jquery.com/jquery-1.11.3.min.js"></script>

        <!-- loading the plotting library that we'll be using in scripts.js -->
        <script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/1.0.2/Chart.js"></script>

        <!-- where all our server communication and plotting is being handled! -->
        <script type="text/javascript" src="/static/js/scripts.js"></script>
    </body>
</html>

The piece that’s unique to Jinja2 is the block content placeholder. This is filled in by pages that extend the boilerplate and include their own block, as seen in index.html.


index.html

For our app, all we want to do is take a chunk of text and analyze it using indico’s API with the results displayed at the end. You can see in the code below that this involves a form with a textarea.

{% extends 'layout.html' %}

{% block content %}
<div id="form_holder">
    <form>
        <textarea type="text" rows="4" cols="50" name="tweets"></textarea>
        <br>
        <input type="submit">
    </form>
</div>
<canvas id="result"></canvas>
{% endblock %}

When the form is submitted, you can use jQuery to grab the text in the textarea and send that information to our Flask backend for crunching. The result in canvas will plot the data that comes back in our server’s response. So now you might be wondering – what does all this data posting look like?


Server communication with JavaScript

If you haven’t interacted with JavaScript before, I suggest you rename the existing scripts.js file in your local version of the project and make a new one for practice, typing in each line as shown below. If you are familiar with JS, this section shouldn’t be too tricky.

In order to intercept the form’s post action and display the response from our server, you need custom JavaScript. jQuery has a .submit() function that lets you control what happens when a user either clicks “Submit” or presses enter. Everything you need to cover when posting to the server and displaying the response is handled in static/js/scripts.js.

// override default submit action to dictate what to do with response (often shortened to res)
$('form').submit(function() {
    ...    

    return false; // so that form doesn't try to complete post
});

Now you just need to take the values in the form and send them to our server for analysis! jQuery comes in the clutch once again with the .post(endpoint, data, [success function]) method that lets you post data to an endpoint and optionally provide a callback function to handle the server’s response.

$('form').submit(function() {
    $.post('/crunch', $(this).serialize() , function(res) {
        ...
    });

    return false;
});

$(this).serialize() takes the form at the time of submission and encodes its input names and their corresponding values as a URL-encoded string (for our form, something like tweets=...), which is then passed as the post data.
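To make the posted data a little more concrete, here’s a minimal Python sketch of what travels over the wire. It assumes the form’s only named field is tweets (as in index.html) and uses the standard library’s urllib.parse to mimic the URL-encoded body and then decode it again; the sample text is made up.

```python
from urllib.parse import urlencode, parse_qs

# Roughly what the browser sends for our form: a URL-encoded string of
# "name=value" pairs (the sample text is illustrative).
form_fields = {"tweets": "14/04/2015,21:37,Hello world"}
post_body = urlencode(form_fields)
print(post_body)

# The server decodes that string back into names and values.
decoded = parse_qs(post_body)
print(decoded["tweets"][0])
```

Flask does this decoding for you, which is why the server can look values up by input name.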

I’m going to cheat for a minute and just tell you what the returned data in the server’s response will look like so that we can dig into the plotting. It’s a dictionary with two keys: tweets and scores.

{'tweets': [string, string, string, ...], 'scores': [float, float, float, ...]}

The following code all happens inside the callback function for $.post(), working with the res from the server. First off, we have to parse the response in order to use it as an actual dictionary rather than a string.

        res = JSON.parse(res)

Next we need to grab the context of our canvas to manipulate it in the plotting process.

        var ctx = document.getElementById("result").getContext("2d");

Now we plot! For this library you create a Chart object from the ctx we just created and then pass data to the charting method we’d like to use. This will look like Chart(ctx).Bar(data, options) where options is defining the display of the overall chart.

With data we need to provide both the values we’d like to plot and their related labels. Included with each dataset you pass (only one for now) are options specific to that set.

        var data = {
            labels: res['tweets'],
            datasets: [
                {
                    label: "Tweet Sentiment",
                    fillColor: "white",
                    data: res['scores']
                }
            ]
        };

Once we have data, we push it to the graph!

        var myLineChart = new Chart(ctx).Bar(data, {
            showScale: false
        });

You can customize the chart with the individual and overall plotting options. I started out with a pretty bare bar graph, but there’s a lot you can do with Chart.js (and I wholly encourage you to explore data visualization; it’s some really cool stuff). Anyway, it’s time to see what’s going on server-side.


Writing some back-end logic

 

Using indico

Using indico’s text analytics is simple. Here’s some example code:

import indicoio

indicoio.sentiment("If it wasn't for you meddling kids and your dog!", api_key="YOUR_API_KEY")
# 0.07985420537062864

If you don’t have one already, make sure you get an API key to pass along data.
Using our API server-side is just as simple, so the only logic you have to worry about is unpacking the data from the post and specifying the model to use. That said, let’s take a look at app.py. Again, if you haven’t played with Flask before, it’s often very helpful to type through a file on your own to help internalize what’s going on and, more importantly, to recognize any bits that you may not quite understand.


Importing and instantiating

Let’s start with our imports.

import json

from flask import Flask, render_template, request
import indicoio

First, import the standard json library to encode the result of your text analysis so it can be passed back to the client. Now let’s take a look at the critical pieces of Flask that you’ll need. Flask is used for instantiating the main app object, render_template for responding with an HTML page to be rendered, and request for parsing data from the form you’ll submit. Finally, import indicoio to handle the text analysis.

Next comes the instantiation of the app, which handles the server logic with routing. The first example is as easy as we can make it, using the route(rule, **options) decorator to determine what happens when a user hits the base endpoint /. Whatever function comes next defines the behavior of that rule, and here you’re just returning templates/index.html (Flask knows to look in the base directory for a templates folder by default).

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

So now, while the server is running, go to localhost:5000/ and you’ll see the page outlined earlier. Nice!

This next route contains the actual analysis! First, let’s decide which endpoint we want to post to. I put /crunch for now, but feel free to make it whatever you like – just make sure you update scripts.js to post to the same place if you do change it.

@app.route('/crunch', methods=['POST'])
def send_to_indico():

First off, you need to specify the POST method in the route, otherwise it defaults to only allowing GET requests.

In this next chunk you unpack the data submitted by the form.

    tweets_csv_string = request.form.get('tweets')

request.form is the dictionary that you saw serialized in JavaScript earlier. .get() is a way to get a value if it exists, and None if it doesn’t, rather than getting a KeyError.
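If the difference between .get() and square-bracket access is new to you, this small sketch shows it. A plain dict stands in for request.form here, which is an assumption made to keep the example runnable without a server.

```python
# A plain dict stands in for request.form (assumption for this sketch).
form = {"tweets": "some,csv,text"}

print(form.get("tweets"))    # the value, because the key exists
print(form.get("missing"))   # None, because the key is absent

try:
    form["missing"]          # square-bracket access raises instead
except KeyError as err:
    print("KeyError:", err)
```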


A lil’ bit of data wrangling

Now we need to do a little bit of data wrangling to get the text we want to analyze out of our big chunk of CSV. For any project involving data analysis, there tends to be some amount of formatting required to get the information to a workable state. There are great libraries for doing this – a very popular one being pandas – but for this tutorial I’m going to keep things to vanilla Python.
To give you an idea of the process, let’s see what our incoming data is, and how we want it to look for analysis:

14/04/2015,21:37,Installing IE on my mac for some frontend testing. This is not the good kind of wrong.
02/04/2015,16:00,RT @FORTIFIEDbike: Brand new on @Kickstarter: FORTIFIED: We Caught A Bike Thief. This is Payback. http://t.co/C52yJeBDXp
05/03/2015,19:53,"Shout out to microwaves! From cooking meals to heating coffee, you're the truth."
28/02/2015,22:02,Loving Slaughter-House Five http://t.co/Dr8WoIGrXQ

This can be viewed as:

date,time,tweet

We want to go through each line and only grab the tweets. You could go a step further and see what sentiment looks like at different times of day but for now let’s keep things simple. At the end of the day we want a list of tweets that we can throw at the batch_sentiment function.

Right off the bat we want to remove any carriage return characters (\r) from the text and then split on new lines (\n).

    csv_list = tweets_csv_string.replace('\r', '').splitlines()
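Here’s that line in isolation on a made-up two-line string, just to show what comes out the other side.

```python
# Pasted text often arrives with Windows-style "\r\n" line endings.
raw = "01/01/2015,10:00,first tweet\r\n02/01/2015,11:00,second tweet\r\n"

lines = raw.replace("\r", "").splitlines()
print(lines)
# ['01/01/2015,10:00,first tweet', '02/01/2015,11:00,second tweet']
```

Note that splitlines() doesn’t leave an empty string at the end for the trailing newline, which saves us a cleanup step.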

This means we end up with a list of each line in the csv chunk. Next, I put a limit on how many of these lines would be taken into account since the bar graph gets a little out of hand if you have too many. I’m drawing the line at 40, but this is something you can tweak to see what you like best.

    if len(csv_list) > 40:
        csv_list = csv_list[0:40]

Those brackets are part of Python’s slice notation. In this instance, we’re saying that we’d like to start from the beginning and get every element of the list up to 40 (not including 40). It’s generally good to know how you can manipulate strings if you intend to work with text data.
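A quick sketch of that slice on a stand-in list of numbers, in case you want to poke at it in the interpreter:

```python
numbers = list(range(50))      # stand-in for a long list of CSV lines

first_forty = numbers[0:40]    # indices 0 through 39
print(len(first_forty))        # 40
print(first_forty[-1])         # 39

short = [1, 2, 3]
print(short[0:40])             # slicing past the end is safe: [1, 2, 3]
```

Since slicing past the end of a list just returns what’s there, the length check is a readability choice rather than a requirement.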


Getting a list of tweets

Next up is actually grabbing the tweet part of each line. The nice thing here is that we can just split on commas and then take every element from the 3rd on. This is to ensure that if a tweet contains commas, we’re not slicing it up more than we want. However, this means we need to rejoin the tweet with commas in case it’s been cut up.

    tweet_list = []
    for csv_tweet in csv_list:
        tweet_only = csv_tweet.split(',')[2:]
        tweet_list.append(','.join(tweet_only))

Sweet, now we have the list of tweets we wanted! If you’d like the order to be chronological from left to right, you’ll need to reverse the list. This can be done using slice notation by taking the whole list and iterating through it with a -1 step, meaning it’s working backwards.

    tweet_list = tweet_list[::-1]
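To see the split, rejoin and reverse steps working together, here’s a small standalone sketch. The helper name extract_tweets and the sample lines are mine, not part of the project, and it assumes the lines arrive newest first like the CSV shown earlier.

```python
def extract_tweets(csv_lines):
    """Pull the tweet text out of 'date,time,tweet' lines, oldest first."""
    tweets = []
    for line in csv_lines:
        # Everything after the second comma belongs to the tweet, so
        # rejoin those pieces with the commas that split() removed.
        tweet_parts = line.split(",")[2:]
        tweets.append(",".join(tweet_parts))
    return tweets[::-1]  # reverse so the oldest tweet comes first


sample = [
    "05/03/2015,19:53,Coffee, meals, and microwaves",
    "28/02/2015,22:02,Loving Slaughter-House Five",
]
print(extract_tweets(sample))
# ['Loving Slaughter-House Five', 'Coffee, meals, and microwaves']
```

One thing to keep in mind: a tweet that’s wrapped in double quotes in the CSV keeps its quotes with this approach.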

Now we’re ready to analyze the tweets! All we need to do is pass the list to batch_sentiment and we’ll get a list of values back in corresponding order.

    tweet_scores = indicoio.batch_sentiment(tweet_list, api_key="YOUR_API_KEY")

There we have it. In this case, I passed my API key as an argument to the function but there are other ways to configure your key that are more convenient.
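One option that keeps the key out of your source code is to read it from an environment variable. This is only a sketch: the helper get_api_key and the variable name INDICO_API_KEY are assumptions for illustration, so use whatever name you export in your own shell.

```python
import os


def get_api_key(env=os.environ, name="INDICO_API_KEY"):
    """Return the API key stored in an environment variable.

    Both this helper and the variable name are assumptions for the sketch.
    """
    key = env.get(name)
    if not key:
        raise RuntimeError("Set the %s environment variable first" % name)
    return key


# Usage (not run here): pass the result wherever the tutorial passes the key,
# e.g. indicoio.batch_sentiment(tweet_list, api_key=get_api_key())
```

Failing loudly when the variable is missing tends to be easier to debug than sending a request with an empty key.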

Finally, we just send back the tweets and values in the expected format that I discussed above…

    return json.dumps({'scores': tweet_scores, 'tweets': tweet_list})

…and boom! That’s all there is to it.
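If you’re curious what that response looks like on its way to the browser, here’s a tiny round trip with the json library. The tweets and scores are made-up values; real scores come from the sentiment model.

```python
import json

# Illustrative values only; real scores come from the sentiment model.
tweet_list = ["Loving this book", "Stuck in traffic again"]
tweet_scores = [0.93, 0.12]

payload = json.dumps({"scores": tweet_scores, "tweets": tweet_list})
print(type(payload).__name__)   # str: this is what goes over the wire

# The client does the equivalent of this with JSON.parse(res).
decoded = json.loads(payload)
print(decoded["tweets"][0], decoded["scores"][0])
```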

At the end we declare that if this file is called with python app.py, the server should run in debug mode. Flask defaults to port 5000, and we’re running locally on 127.0.0.1 so that’s where localhost:5000 comes from.

if __name__ == '__main__':
    app.run(debug=True)

4. What’s next?

Now you’ve hopefully got a better grasp of how you could hack an app together using indico. Your next step might be to integrate social data from Facebook or Twitter using their APIs. Try changing the number of tweets you show in the chart or maybe try sorting from least to most positive. Also, consider taking a look at our other models and see if any of them interest you. The world is yours!
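As a starting point for the sorting idea, here’s one way you might reorder the two lists before sending them back. The helper name sort_by_sentiment and the sample values are hypothetical; it assumes the tweets and scores are in matching order.

```python
def sort_by_sentiment(tweets, scores):
    """Return tweets and scores reordered from least to most positive."""
    # Pair each score with its tweet, then sort the pairs by score.
    pairs = sorted(zip(scores, tweets), key=lambda pair: pair[0])
    sorted_scores = [score for score, _ in pairs]
    sorted_tweets = [tweet for _, tweet in pairs]
    return sorted_tweets, sorted_scores


# Illustrative values only.
tweets = ["meh", "great day", "awful day"]
scores = [0.5, 0.9, 0.1]
print(sort_by_sentiment(tweets, scores))
# (['awful day', 'meh', 'great day'], [0.1, 0.5, 0.9])
```

Sorting the pairs together keeps each tweet attached to its own score, so the chart labels still line up with the bars.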

If you’d like to push a site live, give Heroku a try. It’s a hosting provider with a nice free plan and easy scaling if you need it.

If you have any questions, please feel free to contact us as we’re always happy to help!
