Thursday, 22 August 2013

Measuring Twitter with Universal Analytics

Background

Like most people working in web analytics, I was excited by the announcement of Universal Analytics. So excited, in fact, that I still have the Raspberry Pi, RFID readers and tokens ready to start doing some real-world tracking, after seeing the inspirational Loves Data video what seems like an age ago.

For those of you who don't know about Universal Analytics: it is the next evolution of Google Analytics, providing a means of making your analytics more user-centric rather than visit-centric. This means being able to look at journeys that not only cross devices but potentially bridge offline and online.

Universal Analytics is built on something called the Measurement Protocol. This protocol was developed to allow third parties to send data to Google / Universal Analytics. The Universal Analytics JavaScript library wraps this nicely for you; however, I would still advise people to read the protocol documentation and become familiar with it.
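
To make the protocol a little more concrete, here is a minimal sketch of how a single pageview hit can be assembled as a URL. The function name build_pageview_url, the UA-XXXXX-Y tracking ID and the example values are placeholders for illustration only; the parameter names (v, tid, cid, t, dp) are the Measurement Protocol's own.

```php
<?php
// Hypothetical helper (not part of any library): builds the URL for one
// Measurement Protocol pageview hit.
function build_pageview_url($trackingId, $clientId, $pagePath)
{
    $params = array(
        'v'   => 1,            // protocol version
        'tid' => $trackingId,  // the UA number of the property
        'cid' => $clientId,    // anonymous client ID
        't'   => 'pageview',   // hit type
        'dp'  => $pagePath,    // document path shown in the reports
    );

    // http_build_query URL-encodes every value, so paths containing
    // spaces or symbols cannot break the query string.
    return 'http://www.google-analytics.com/collect?'
        . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

// Placeholder values; replace with your own UA number and IDs.
echo build_pageview_url('UA-XXXXX-Y', '12345', '/users/example_user'), "\n";
```

Requesting that URL (for example with file_get_contents) is all it takes to record the hit.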

Loves Data used the Measurement Protocol to send data to Universal Analytics using Arduino boards and RFID tags. This should excite any retailer with a loyalty card and, I'm going to say, smart EPOS, as technically it means you can send offline sales data back into Analytics. Of course, you don't need a loyalty card, Raspberry Pi or Arduino board to send data back. There are many other events that happen off your website that you may wish to record in Analytics, and the Measurement Protocol makes this possible.

In September, a number of my colleagues are doing a Digital Hike (4PsHike): over 5 days they will walk the length of Hadrian's Wall, stopping off for Google Hangouts to discuss various developments within the digital marketing space. Last week one of my colleagues, Charlotte, approached me to ask about Twitter monitoring, in particular looking at the Scottish independence debate, primarily to see the number of tweets, active users and hashtags. We use a tool called Brandwatch, but I wondered whether I could do this quickly and easily using Universal Analytics. The answer, of course, is yes.

There are some things I need to improve, like making hashtags lowercase so they are de-duplicated, and investigating the limits (500 hits per session), but as you'll see below it is possible and relatively straightforward. I'd also like to look at using uid= rather than cid= to identify a visitor, as we could then emulate visits in Google Analytics a little more closely.
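
The hashtag de-duplication mentioned above could be handled with a small normalising step before the hit is sent. This is only a sketch of one way to do it: normalise_hashtag is a hypothetical helper, and the example tags are illustrative rather than taken from the data.

```php
<?php
// Hypothetical helper: lower-cases a hashtag so that differently cased
// variants are reported as a single page in Analytics.
function normalise_hashtag($hashtagText)
{
    // Drop surrounding whitespace and a leading '#', if present.
    $tag = ltrim(trim($hashtagText), '#');

    // Prefer the multibyte-aware function when the mbstring extension
    // is available, so non-ASCII hashtags are lower-cased correctly.
    if (function_exists('mb_strtolower')) {
        return mb_strtolower($tag, 'UTF-8');
    }
    return strtolower($tag);
}

// Illustrative tags only: both variants collapse to the same value.
echo normalise_hashtag('IndyRef'), "\n";
echo normalise_hashtag('#INDYREF'), "\n";
```

In parse_tweets.php the hashtag text would be passed through this function before being appended to the dp= value, for example "&dp=/hashtag/" . rawurlencode(normalise_hashtag($hashtag->text)).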

How?

Step 1 - Create a new account
First, we need to create a new account. This is very easy, although with the new look and feel of Analytics remember that it is under Admin, in the Account drop-down. I made a new Universal Analytics account for this experiment. You then need to note the UA number.

Step 2 - Install PHP / MySQL
I downloaded a WAMP stack called XAMPP, as I wanted to use a PHP library for my Twitter monitoring. XAMPP includes Apache, PHP and MySQL. You can use any tool of your choice, provided you are able to edit the code and add the necessary Measurement Protocol requests. The library I used was from 140Dev; you can download it here - http://140dev.com/free-twitter-api-source-code-library/

Step 3 - Create Twitter Application
In order to use the PHP monitoring library you need to have a Twitter Application. You can create this by signing in at https://dev.twitter.com/. Click My Applications:

Create your application; once you've done this, note the Consumer Key, Consumer Secret, Access Token and Access Token Secret.

Step 4 - Start Monitoring
Now that we've got our Twitter application we can begin monitoring. In the 140dev package you need to modify a few files, starting with db_config.php:
$db_host = 'MySQL Host Here';
$db_user = 'Put your MySQL username here';
$db_password = 'Put the MySQL password here';
$db_name = 'Put the database name here';

Then you need to edit the 140dev_config.php file:
define('DB_CONFIG_DIR', 'PHYSICAL PATH OF CONFIGURATION PAGES');

// Server path for scripts within the framework to reference each other
define('CODE_DIR', 'PHYSICAL PATH OF CODE FILES');

// External URL for Javascript code in browsers to call the framework with Ajax
define('AJAX_URL', '');

// OAuth settings for connecting to the Twitter streaming API
// Fill in the values for a valid Twitter app
define('TWITTER_CONSUMER_KEY','Your Consumer Key');
define('TWITTER_CONSUMER_SECRET','Your Consumer Secret');
define('OAUTH_TOKEN','YOUR OAUTH TOKEN');
define('OAUTH_SECRET','YOUR OAUTH SECRET');

define('STREAM_ACCOUNT', 'TWITTER LOGIN');
define('STREAM_PASSWORD', 'TWITTER PASSWORD');
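
One thing to watch on Windows / XAMPP is how the two path constants are written. A single-quoted PHP string that ends in a backslash, such as 'h:\xampp\htdocs\140dev\db\', escapes its own closing quote and produces a parse error on a later line. A sketch of one way round this is to use forward slashes, which PHP accepts on Windows. The install location below is an assumption, so adjust it to wherever your copy of the code lives.

```php
<?php
// Assumed XAMPP install location; change these to your own paths.
// Forward slashes avoid the trailing-backslash problem, where \'
// at the end of a single-quoted string escapes the closing quote.
define('DB_CONFIG_DIR', 'h:/xampp/htdocs/140dev/db/');
define('CODE_DIR', 'h:/xampp/htdocs/140dev/');
```

If you prefer backslashes, doubling the final one ('...\db\\') has the same effect.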

After that, edit the get_tweets.php file to monitor what you need:
$stream->setTrack(array('Term to Track','Term to Track'));

The script can now be run from the command line with 'php get_tweets.php'. This will populate a cache of tweets. There is a second part, the parse_tweets.php file, which extracts the data into MySQL and is where we add our Measurement Protocol requests. Edit the file, adding the $strUA and $strData lines at the end of each of the following blocks where appropriate:

To track WHO is tweeting:
// Add the new tweet
// The streaming API sometimes sends duplicates,
// so test the tweet_id before inserting
if (! $oDB->in_table('tweets','tweet_id=' . $tweet_id )) {

    // The entities JSON object is saved with the tweet
    // so it can be parsed later when the tweet text needs to be linkified
    $field_values = 'tweet_id = ' . $tweet_id . ', ' .
        'tweet_text = "' . $tweet_text . '", ' .
        'created_at = "' . $created_at . '", ' .
        'geo_lat = ' . $geo_lat . ', ' .
        'geo_long = ' . $geo_long . ', ' .
        'user_id = ' . $user_id . ', ' .
        'screen_name = "' . $screen_name . '", ' .
        'name = "' . $name . '", ' .
        'entities ="' . base64_encode(serialize($entities)) . '", ' .
        'profile_image_url = "' . $profile_image_url . '"';

    $oDB->insert('tweets',$field_values);

    // Added: record a pageview for the tweeting user (value URL-encoded)
    $strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-43297900-1&cid=" . $user_id . "&t=pageview&dp=/users/" . rawurlencode($screen_name);
    $strData = file_get_contents($strUA);
}

To track MENTIONS:
// The mentions, tags, and URLs from the entities object are also
// parsed into separate tables so they can be data mined later
foreach ($entities->user_mentions as $user_mention) {
    $where = 'tweet_id=' . $tweet_id . ' ' .
        'AND source_user_id=' . $user_id . ' ' .
        'AND target_user_id=' . $user_mention->id;

    if(! $oDB->in_table('tweet_mentions',$where)) {

        $field_values = 'tweet_id=' . $tweet_id . ', ' .
            'source_user_id=' . $user_id . ', ' .
            'target_user_id=' . $user_mention->id;
        $oDB->insert('tweet_mentions',$field_values);
    }

    // Added: record a pageview for the mentioned user (value URL-encoded)
    $strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-43297900-1&cid=" . $user_id . "&t=pageview&dp=/mentions/" . rawurlencode($user_mention->screen_name);
    $strData = file_get_contents($strUA);
}

To track HASHTAGS:
foreach ($entities->hashtags as $hashtag) {

    $where = 'tweet_id=' . $tweet_id . ' ' .
        'AND tag="' . $hashtag->text . '"';

    if(! $oDB->in_table('tweet_tags',$where)) {
        $field_values = 'tweet_id=' . $tweet_id . ', ' .
            'tag="' . $hashtag->text . '"';
        $oDB->insert('tweet_tags',$field_values);
    }

    // Added: record a pageview for the hashtag (URL-encoded, as tags may contain non-ASCII characters)
    $strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-43297900-1&cid=" . $user_id . "&t=pageview&dp=/hashtag/" . rawurlencode($hashtag->text);
    $strData = file_get_contents($strUA);

}

To track URL mentions:
foreach ($entities->urls as $url) {
    if (empty($url->expanded_url)) {
        $url = $url->url;
    } else {
        $url = $url->expanded_url;
    }

    $where = 'tweet_id=' . $tweet_id . ' ' .
        'AND url="' . $url . '"';

    if(! $oDB->in_table('tweet_urls',$where)) {
        $field_values = 'tweet_id=' . $tweet_id . ', ' .
            'url="' . $url . '"';

        $oDB->insert('tweet_urls',$field_values);
    }

    // Added: record a pageview for the shared URL. It must be URL-encoded,
    // otherwise its own ?, & and = characters corrupt the hit's query string.
    $strUA = "http://www.google-analytics.com/collect?v=1&tid=UA-43297900-1&cid=" . $user_id . "&t=pageview&dp=/urls/" . rawurlencode($url);
    $strData = file_get_contents($strUA);
}

After doing this, as with get_tweets.php, leave parse_tweets.php running at a command prompt by entering 'php parse_tweets.php'.
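
Because parse_tweets.php is left running for long periods, one possible refinement (not what the code above does) is to send each hit as a POST with a short timeout, so a slow response from Google does not stall the parser. The sketch below is an assumption-laden example: hit_context_options and send_hit are hypothetical helpers, and the tracking ID and values are placeholders.

```php
<?php
// Hypothetical helper: stream-context options for POSTing one
// Measurement Protocol payload with a short timeout.
function hit_context_options($payload, $timeoutSeconds = 2)
{
    return array(
        'http' => array(
            'method'        => 'POST',
            'header'        => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content'       => $payload,
            'timeout'       => $timeoutSeconds, // give up rather than block the parser
            'ignore_errors' => true,            // still return the body on HTTP error codes
        ),
    );
}

// Hypothetical helper: sends the hit and reports whether any response came back.
function send_hit($payload, $endpoint = 'http://www.google-analytics.com/collect')
{
    $context  = stream_context_create(hit_context_options($payload));
    $response = @file_get_contents($endpoint, false, $context);
    return $response !== false;
}

// Example payload; the tracking ID and values are placeholders.
$payload = http_build_query(array(
    'v'   => 1,
    'tid' => 'UA-XXXXX-Y',
    'cid' => '12345',
    't'   => 'pageview',
    'dp'  => '/hashtag/example',
), '', '&');

// send_hit($payload);  // uncomment to actually send the hit
```

A false return from send_hit only means no response arrived in time; the tweet has still been stored in MySQL, so the hit can be skipped without losing data.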

Results

The reporting interface of Google Analytics is actually very effective for monitoring Twitter, as you can look at Real Time, use Dashboards, or build custom reports.

Real Time Analytics is fantastic at showing how active the things you are monitoring on Twitter are, even from just the Real Time overview, as this screenshot shows:

You can use Dashboards to report on key areas of interest and apply whatever filtering you need. The dashboard below shows the key hashtags, users, users mentioned and URLs shared:

Custom Reporting also allows us to produce charts such as what times of the day users were active:



If you are interested in using the Measurement Protocol, Google Analytics or Universal Analytics or have any comments or feedback then I'd love to hear from you!

14 comments:

  1. Thanks a lot Matt.

    I don't find the "process_tweets.php" file. Is it the correct name?

  2. Hi Mehdi,

    Thank you for pointing that out - I meant parse_tweets.php - this is the PHP which actively processes the tweets! I've corrected the post now.

    Let me know how you get on.

    M

  3. Hi Matt,

    What a terrific post! I'll definitely be looking at implementing this in the short-term.
    From a marketing perspective, there's a lot more value here than just getting interaction data out too ;)

  4. Thanks for the work and for sharing it, Matt. It´s great and with a lot of potential!

  5. Could you do a video of this. I got lost on step 4, I guess you assumed somethings from that step.

  6. Hi Matt, how to setup it on Google Analytic? because i don't get any data on my Google Analytic

  7. Thank you everyone for your comments.

    I think I will do a screen record as this post does make some assumptions as to people understanding with Apache or IIS, MySQL and PHP (LAMP / MAMP, WAMP) - and then running PHP from the console. I will try and sort this out in the next few days.

    In terms of passing data to Google Analytics, make sure you have setup a Universal Analytics account and you are using the right UA number. The scripts also should be run from a command line, so using SSH on a Linux box, or CMD on Windows.

  8. Hi. I Also don't get any data in the Google Universal Account. I'v also double checked the UA number. Any further idea?

  9. This is cool. I set mine up and it's working great! Thanks.

  10. This comment has been removed by the author.

  11. Hello Matt,

    nice guide for setting up 140dev on xampp. Can you please post your settings for db_config_dir and code_dir. my settings are:
    define('DB_CONFIG_DIR', 'h:\xampp\htdocs\140dev\db\');
    and define('CODE_DIR', 'h:\xampp\htdocs\140dev\');
    but I always get an error with the db_test.php
    Parse error: syntax error, unexpected 'CODE_DIR' (T_STRING) in H:\xampp\htdocs\140dev\db\140dev_config.php on line 17

    Maybe you can give me a new point of view. Thanks

  12. Hey Matt, great post! I've got this working for the most part but I'm having trouble seeing anything in Google Analytics other than real-time data. How exactly do you get it to display the information like in your screenshots?

    Also, does this only track keywords or is it possible to track a list of specific users? Thank you very much! Keep it up!

  13. Matt, I've had this blog post opened on a tab for a couple months now. What you've done is great and this morning I tried to take a stab at it. Like some others, I got a little lost between steps 3 and 4. Wondering if you are still tracking to do a screencast or not. I would love to set this up for some clients without going to a developer.
