Sentiment analysis and adaptive phrase match learning with PHP

I have recently released a sentiment analysis PHP class under the GPL licence. It analyses the sentiment of text and also matches text against previously analysed phrases that are positive, negative or neutral. In other words, it learns from your input and becomes more accurate over time. Below I will outline the general concept.

Simple example

// Load the class and create an analyser
include ('sentiment_analyser.class.php');
$sa = new SentimentAnalysis();
$sa->initialize();

// Analyse a string, then read back its sentiment rating
$sa->analyse("Thank you. This was the best customer service I have ever received.");
$score = $sa->return_sentiment_rating();
var_dump($score);

Concept

This class serves three purposes:

  1. Estimate the sentiment for a string based on emotion words, booster words, emoticons and polarity changers
  2. Allow you to save analysed data into positive, negative or neutral datasets
  3. Identify if we have any phrase matches on previously analysed positive, negative and neutral phrases

Should there be any high-quality phrase matches, they take precedence over the sentiment analysis and the phrase match rating is returned instead.

Sentiment Analysis

Strings are broken into tokenised arrays of single words. These words are analysed against TXT files that contain emotion words with ratings, emoticons with ratings, booster words with ratings and possible polarity changers.

A score is then calculated based on this analysis, and this forms the “Sentiment analysis score”.
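
As a rough illustration of the idea (not the class’s actual code), the sketch below scores a tokenised string against small inline word lists. The word lists, their ratings and the function name `sketch_sentiment_score` are all made up for this example; the real class reads its lists from TXT files, and emoticons are left out here for brevity.

```php
<?php
// Hypothetical word lists; the real class loads these from TXT files.
$emotionWords = ['best' => 3, 'good' => 2, 'thank' => 1, 'bad' => -2, 'worst' => -3];
$boosterWords = ['very' => 1, 'really' => 1, 'slightly' => -1];
$polarityChangers = ['not', 'never', "don't"];

// Illustrative scorer: sums emotion ratings, adjusts them with a
// preceding booster and flips the sign after a polarity changer.
function sketch_sentiment_score(string $text, array $emotion, array $boosters, array $negators): int
{
    // Tokenise into lowercase single words.
    $tokens = preg_split('/[^a-z\']+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $score = 0;
    $boost = 0;
    $negate = false;

    foreach ($tokens as $token) {
        if (in_array($token, $negators, true)) {
            $negate = true;
            continue;
        }
        if (isset($boosters[$token])) {
            $boost += $boosters[$token];
            continue;
        }
        if (isset($emotion[$token])) {
            $rating = $emotion[$token];
            // A positive booster pushes the rating away from zero,
            // a negative booster pulls it back towards zero.
            $rating += $rating > 0 ? $boost : -$boost;
            $score += $negate ? -$rating : $rating;
        }
        // Boosters and polarity changers only carry over to the next word.
        $boost = 0;
        $negate = false;
    }
    return $score;
}
```

With these sample lists, “not very good” ends up negative because the booster strengthens “good” before the polarity changer flips it.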

Phrase Analysis

This function is key to identifying whether the phrase in question can be compared to phrases that we have analysed and stored before. It uses Levenshtein distance to compare every phrase of 4 to 10 words against the dataset we already have. We also make use of PHP’s similar_text to double-check proximity.

This means that the more phrases we have analysed previously, the better the dataset becomes, which allows new phrases to be scored more accurately against historical data.

  1. The phrase is broken up into n-grams of each length
  2. The array is reverse sorted so we compare 10 word length phrases first, then 9, and so on
  3. Phrases are matched against positive, negative and neutral phrases in the relevant TXT files
  4. Only matches that meet the minimum levenshtein_min_distance and similiarity_min_distance are kept
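
The four steps above can be sketched roughly as follows. This is not the class’s own code: the function names, the sample dataset layout and the two threshold parameters are assumptions for illustration. They stand in for the levenshtein_min_distance and similiarity_min_distance settings, whose exact semantics are not shown here. Only PHP’s built-in levenshtein() and similar_text() are used, and n-grams are generated longest first rather than reverse sorted afterwards.

```php
<?php
// Build all n-grams of $min to $max words, longest first.
function sketch_ngrams(string $text, int $min = 4, int $max = 10): array
{
    $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
    $ngrams = [];
    for ($n = min($max, count($words)); $n >= $min; $n--) {
        for ($i = 0; $i + $n <= count($words); $i++) {
            $ngrams[] = implode(' ', array_slice($words, $i, $n));
        }
    }
    return $ngrams;
}

// Return the first stored phrase that is close enough to any n-gram,
// or null if nothing matches. $dataset maps a label ("positive",
// "negative", "neutral") to a list of stored phrases.
// $maxDistance and $minSimilarity are illustrative thresholds.
function sketch_phrase_match(string $text, array $dataset, int $maxDistance = 3, float $minSimilarity = 90.0): ?array
{
    foreach (sketch_ngrams($text) as $ngram) {
        foreach ($dataset as $label => $phrases) {
            foreach ($phrases as $phrase) {
                $stored = strtolower($phrase);
                // Edit distance between the n-gram and the stored phrase.
                $distance = levenshtein($ngram, $stored);
                // similar_text() writes a percentage into $percent.
                similar_text($ngram, $stored, $percent);
                if ($distance <= $maxDistance && $percent >= $minSimilarity) {
                    return ['label' => $label, 'phrase' => $phrase, 'ngram' => $ngram];
                }
            }
        }
    }
    return null;
}
```

Requiring both checks to pass means a phrase has to be close in edit distance and share most of its characters before it counts as a match.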

The Evolution of myScoop

It has been over 4 years since I created myScoop, once called “the real time South African blog aggregator”. Let’s face it, aggregation is dead. As with all things in life, change should be the only constant, and this is the reason myScoop has been completely overhauled into something new: a premium blog network that offers bloggers the opportunity to monetise their blogs and offers advertisers the opportunity to get their message out to millions of South Africans in a relevant and unique manner.

The idea

I have been playing with this idea for quite some time now. Just like AdDynamo’s “sponsored tweets” concept, the concept of “paid blogging” is relatively new to South Africa and there have been a few discussions around it over the past year. I have been working full tilt to get the system up and running and I’m finally proud to announce the relaunch of myScoop.

How it works

It’s quite a simple tool at the moment. Bloggers have their choice of which campaigns they would like to be a part of and get paid according to how popular their blog is. Advertisers create campaigns within myScoop and then choose which blogs they would like to have writing about their brand, product or service. For ethical reasons, all sponsored posts need to be fully disclosed on the blogs. All of this is controlled through the myScoop App.

myScoop has over 1000 South African blogs registered, giving potential advertisers the luxury of choice as to when and how their message is spread among the millions of South African internet users.

 

Introducing PingPong, site uptime and performance monitor

I’ve been feverishly working on my new startup, PingPong. It’s a website uptime and performance monitoring tool that sends you an SMS and email as soon as your website goes down.

I believe this tool to be absolutely vital for any online business, agency or e-commerce website. If you’re making money via a website or have a strong online presence, it’s essential to reduce the amount of time your website is offline. It’s inevitable that your site will go down, but knowing when it happens allows you to act quicker and get your site operational again. Hence the reason for creating PingPong.

PingPong is a site monitoring tool that:

  • Alerts you via SMS & email when your website(s) go down
  • Provides performance metrics
  • Provides error logs to help you debug your website downtime

There are a few benefits of having such a tool:

  • If you’re in ecommerce and your site goes down, you obviously lose money.
  • If you are running AdWords campaigns and your site goes down, all your ads run the risk of being disapproved.
  • If you’re an agency, it’s helpful to know that a client’s site is down before they know. This allows you to act quicker.

PingPong is still very new and there is a host of features I plan on releasing over the next few months:

  • Automatic weekly/monthly reports (emailed to you)
  • Better, more modern user interface
  • More “checks” (such as checking if your mail server is up, checking if your MySQL server is up, etc.)
  • Public page (allow others to see your downtime and performance reports)

I’d love to hear your thoughts on my new startup.

Copy Compass – a South African SEO Plugin

If you don’t already know, Copy Compass is the latest project under the wing of the newly branded digital marketing agency, Talooma. Copy Compass is a WordPress plugin that analyses your copy against search engine optimisation best practices. It provides you with an overall score and highlights key areas of your copy that may need optimising.

Copy Compass is the ideal plugin for your blog or website. It allows you to optimise your copy in line with generally accepted SEO best practices.

It is ideal for the casual blogger or the more serious blog master or website owner who wants to ensure their content always ranks well within the search engine results pages.

For many months I have been wanting to release a meaningful WordPress plugin that can actually compete with the international big boys, and I believe Copy Compass may just be able to do it. Many hours of research and practice, and a whole heap of brain cell usage, have gone into this plugin. Copy Compass was originally developed in my spare time as an in-house tool for Talooma, meant to improve the SEO quality of our copy and to help our customers rank higher in the search engine results pages. Since we started using Copy Compass internally, most of our articles and content have climbed the rankings substantially.

The complete plugin is not yet available; only a scaled-down version has been released so far. This version includes the following functionality:

  • Monitor your reading ease score: This is an indication of how easy or difficult your text is to read.
  • Monitor your Gunning fog index: A useful representation of what grade level you need in order to comprehend the text (e.g. Grade 7, 8, etc.)
  • Keyword Density: All online marketers know the importance of keyword density and how it can affect your rankings.
  • Basic SEO principles: Copy Compass monitors some basic principles needed to do well in the rankings.
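
To give a feel for what these metrics measure, here is a minimal sketch using the commonly published formulas. It is not Copy Compass’s code: it assumes “reading ease” means the Flesch reading ease score, it estimates syllables with a crude vowel-group heuristic, and all function names are made up for this example.

```php
<?php
// Rough syllable estimate: count groups of consecutive vowels (heuristic).
function sketch_syllables(string $word): int
{
    return max(1, (int) preg_match_all('/[aeiouy]+/', strtolower($word)));
}

// Split text into lowercase words.
function sketch_words(string $text): array
{
    return preg_split('/[^a-z\']+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Count sentences by runs of terminal punctuation (at least one).
function sketch_sentence_count(string $text): int
{
    return max(1, (int) preg_match_all('/[.!?]+/', $text));
}

// Flesch reading ease: higher scores mean easier text.
function sketch_reading_ease(string $text): float
{
    $words = sketch_words($text);
    if (count($words) === 0) {
        return 0.0;
    }
    $syllables = array_sum(array_map('sketch_syllables', $words));
    return 206.835
        - 1.015 * (count($words) / sketch_sentence_count($text))
        - 84.6 * ($syllables / count($words));
}

// Gunning fog index: approximate grade level needed to read the text.
// "Complex" words are those estimated at three or more syllables.
function sketch_gunning_fog(string $text): float
{
    $words = sketch_words($text);
    if (count($words) === 0) {
        return 0.0;
    }
    $complex = count(array_filter($words, fn($w) => sketch_syllables($w) >= 3));
    return 0.4 * (count($words) / sketch_sentence_count($text) + 100 * $complex / count($words));
}

// Keyword density: share of words equal to $keyword, as a percentage.
function sketch_keyword_density(string $text, string $keyword): float
{
    $words = sketch_words($text);
    if (count($words) === 0) {
        return 0.0;
    }
    $hits = count(array_keys($words, strtolower($keyword), true));
    return 100 * $hits / count($words);
}
```

Because the syllable count is only a heuristic, the two readability scores should be read as estimates rather than exact values.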

If you have any ideas, suggestions and/or comments about Copy Compass, please do not hesitate to make use of the form on the site. All feedback is welcome.

myScoop’s first WordPress Plugin – “myScoop Rank Display”

It’s been quite some time that I’ve been wanting to develop a WordPress widget for myScoop users. I have finally put something together. Once activated, this plugin will let you select a Widget to display on your blog. This Widget will automatically connect to myScoop and display your blog’s rank history for the past 14 days.

Some background information:

  • The plugin comes packaged with the source for Open Flash charts.
  • The widget connects to myScoop to retrieve a data file and displays information relevant to your DOMAIN.
  • This could potentially be a very processor-intensive widget for myScoop (not your blog). Therefore I am looking at creating a cache file with your blog’s rank information for less server stress and quicker display on your side. This will be available in the next version.
  • There are currently no configurable settings for this widget. Simply install and activate. Once activated, the widget will be available for use.
  • In future versions I would like to let users choose whether to display the graph or their current rank within myScoop.
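
The cache file mentioned in the list above does not exist yet. As a minimal sketch of how such a time-based file cache could look, assuming a hypothetical `$fetch` callback that retrieves the rank data and an arbitrary one-hour lifetime:

```php
<?php
// Return cached rank data if the cache file is fresh enough; otherwise
// call $fetch (a hypothetical callback that retrieves the data from
// the server) and store the result for next time.
function sketch_cached_rank(string $cacheFile, callable $fetch, int $maxAge = 3600): string
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAge) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false) {
            return $cached;
        }
    }
    // Cache missing, stale or unreadable: fetch fresh data and save it.
    $data = (string) $fetch();
    file_put_contents($cacheFile, $data, LOCK_EX);
    return $data;
}
```

With something like this in place, repeated page views within the cache lifetime would read the local file instead of triggering a new request.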

Before officially releasing this I would need to do some testing and stability checks. Are there any bloggers that would be willing to help with this? All I would need from you is to install the plugin and give me feedback on any bugs or errors that you find, as well as any suggestions.

myScoop WordPress Rank Widget