Email, a tool invented more than 45 years ago, remains one of the most trusted forms of online interaction, in part because it is still decentralized in a world of social applications. With a little help from the indico Sentiment API, you can quickly go from a large corpus of written emails to a visualization of how the sentiment in your writing has changed over time.

Before diving into the analysis, you can get an email.json file with your personal emails by following these simple steps:

  1. Set up your email with Context.IO
  2. Enter your credentials in the script as strings
  3. Wait about 30 minutes for your email to sync with Context.IO
  4. Run the script

Now let’s dissect the file used to analyze your emails, starting with the imports. Before continuing, make sure you have registered with indico to get an API key. This will set you up with 1 million free calls per month, more than enough to complete this tutorial with plenty of calls to spare.

The email_reply_parser lets you grab the last reply in an ongoing thread. The other libraries, pickle and json, will be used for I/O.

import indicoio
indicoio.config.api_key = 'YOUR_API_KEY'

from email_reply_parser import EmailReplyParser
import pickle
import json

The sentiment_sliding function looks at 1,000-word windows from your sent emails, shifting forward by 20 words at a time (the first window covers words 0-1,000, the next covers 20-1,020, and so on). Each window is scored with one call to the Sentiment API. The moving-window approach works well for smoothing out short-term fluctuations while revealing long-term trends.

def sentiment_sliding(messages, window=1000, shift=20):
    allwords = []
    data = {}
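    for m in messages:
        # Keep only messages from the Sent folder
        if '\\Sent' not in m.get('folders', tuple()):
            continue
        # Skip messages that have no body content
        if not m.get('body') or not m['body'].get('content'):
            continue
        # Keep the latest reply only, dropping quoted text from the thread
        allwords.append(EmailReplyParser.parse_reply(m['body']['content']))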

    allwords = " ".join(allwords)
    allwords = allwords.encode('ascii','ignore')
    allwords = allwords.split()

    current_window = 0
    next_window = window
    print 'number of words', len(allwords)
    while True:
        if len(allwords) < next_window:
            print 'sliding-sentiment reached end at lengths:%s' % len(allwords)
            # No full window left, so stop sliding
            break
        print 'sliding-sentiment start:%s end:%s' % (current_window, next_window)
        data[current_window] = " ".join(allwords[current_window:next_window])
        data[current_window] = indicoio.sentiment(data[current_window])
        print data[current_window]
        current_window += shift
        next_window += shift
    return data
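
To see which slices of text the loop above scores, the windowing logic can be separated from the API call. The sketch below is illustrative only: window_bounds is a hypothetical helper name, not part of the original script, and it makes no calls to indico.

```python
def window_bounds(n_words, window=1000, shift=20):
    """Yield (start, end) word indices for each full sliding window.

    Hypothetical helper, not part of the original script. It mirrors the
    loop in sentiment_sliding: stop once a full window no longer fits.
    """
    start, end = 0, window
    while end <= n_words:
        yield (start, end)
        start += shift
        end += shift


# A 1,050-word corpus yields three full windows:
# (0, 1000), (20, 1020) and (40, 1040).
bounds = list(window_bounds(1050))
```

Because every window costs one API call, the count is worth estimating up front: for a corpus of n words with n >= window, there are (n - window) // shift + 1 windows. With the defaults, a 100,000-word corpus would need 4,951 calls.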

Finally, we have the I/O code that lets you run the script to generate a pickle file with your sentiment data.

# Load the exported emails, closing the file when done
with open('email.json', 'r') as f:
    rawemails = json.load(f)
print 'number of emails', len(rawemails)
data = sentiment_sliding(rawemails)
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)
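
Before plotting, it can help to load the pickle back and list the scores in window order. This is a sketch that assumes data.pkl holds a dict mapping each window's starting word index to a sentiment score, as the script above produces. The sample values are made up so the snippet runs without an API key, and load_sorted is a hypothetical helper.

```python
import os
import pickle
import tempfile


def load_sorted(path):
    """Load a {window_start: sentiment} dict and return pairs sorted by start."""
    with open(path, 'rb') as f:
        data = pickle.load(f)
    return sorted(data.items())


# Made-up scores standing in for real API results.
sample = {40: 0.58, 0: 0.61, 20: 0.64}
path = os.path.join(tempfile.mkdtemp(), 'data.pkl')
with open(path, 'wb') as f:
    pickle.dump(sample, f)

rows = load_sorted(path)
for start, score in rows:
    print('%d\t%.2f' % (start, score))
```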

Lastly, boot up IPython Notebook (now known as Jupyter Notebook) and run each of the following cells from viz_sentiment.ipynb to produce a seaborn scatter plot with your sentiment values.

%matplotlib inline
import pandas as pd
import seaborn as sns
metadata = pd.read_pickle('data.pkl')
munge = []
for w, s in metadata.items():
    munge.append({'wordnum': w, 'sentiment': s})
df = pd.DataFrame.from_records(munge)
sns.regplot(x='wordnum', y='sentiment', data=df, color='seagreen')
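
The line that regplot draws is, by default, an ordinary least-squares fit, so its slope summarizes whether sentiment drifts up or down as the word count grows. Here is a minimal sketch of that slope calculation in plain Python. trend_slope is a hypothetical helper, and the sample points are made up.

```python
def trend_slope(points):
    """Least-squares slope of sentiment against word index.

    points is a sequence of (wordnum, sentiment) pairs and needs at
    least two distinct wordnum values.
    """
    n = float(len(points))
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


# Made-up scores rising by 0.02 every 20 words, so roughly 0.001 per word.
points = [(0, 0.50), (20, 0.52), (40, 0.54), (60, 0.56)]
slope = trend_slope(points)
```

In the notebook, the same function could be applied to your own results with trend_slope(list(metadata.items())). A positive value suggests your writing has grown more positive over time, a negative one the opposite.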

While I don’t want to jump to any conclusions, here is Madison’s take on the findings…

[Image: screenshot, 2015-04-16]

All code from this tutorial is housed at

For more on how our models work, refer to
