Thoughts on Google Priority Inbox, The How and the Why

Google Priority Inbox launched this week and there has been a huge fuss about it on sites such as TechCrunch and Wired; even the BBC is getting in on the action. Most people are calling what Google is doing a miracle, but that’s not necessarily the case.

A lot of the stuff Google is doing has been around for a while. In fact, during the final year of my undergraduate computer science course I created a system myself to help deal with email overload. The key to the system was determining which emails were important to the user and which were not. The basics of how a system such as Priority Inbox works are fairly simple. That’s not to say that what Google has done isn’t impressive, but more on that later. I figured I would share some of the knowledge that I picked up while I created my project all those years ago.

Determining the importance of an Email

So how do we determine how important an email is? For a while now spam has been a real issue for those of us that receive email. For a long time people spent huge amounts of time determining the characteristics of an email that meant it was a spam message. Things like coloring the message red were amongst the first key indicators. The thing was, the spammers soon caught on to this and a new solution needed to be found. In 2002 Paul Graham (of Hacker News and Y Combinator fame) released an article called A Plan for Spam, followed in 2003 by Better Bayesian Filtering.

The articles outlined a method that could be trained to determine which messages were spam and which were important. Those articles were hugely interesting for me, not only because they were my first introduction to Bayesian algorithms but also because of the potential to let users give feedback so the system could better understand which messages were spam.

Without going into too much detail, the basic premise of these articles is that you take each word within the email and look at it. If the email is a good email then you put a good mark next to that word. If the email is bad you put a bad mark next to that word. You can then see how likely it is that a given word is contained in a good email or a bad email. When a new email arrives you look at the words of that email. If the email contains words that are in more bad messages than good ones then the chances are the new email is spam.

This concept led to a huge leap in spam filters, but it’s easy to see that you can take this concept further. Instead of treating words as good or bad you can treat the words as important or unimportant. That’s exactly how that piece of my final year project worked. But why stop there? What if instead of counting good and bad words you created a bunch of different categories? You could automatically filter emails into any category you choose.
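
To make the word-marking idea a bit more concrete, here is a rough Ruby sketch. It is not Paul Graham’s exact algorithm and certainly not what Google runs; the class name WordClassifier, the tokenizer and the add-one smoothing are all assumptions made for illustration. It counts how many training emails in each category contain each word, then picks the category whose words best match a new message.

```ruby
# A minimal word-count classifier (illustrative sketch only).
class WordClassifier
  def initialize
    # @counts[category][word] => number of training emails in that
    # category that contained the word
    @counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
    # @totals[category] => number of training emails seen for it
    @totals = Hash.new(0)
  end

  # Put a "mark" for this category next to every word in the email.
  def train(category, text)
    @totals[category] += 1
    tokens(text).each { |word| @counts[category][word] += 1 }
  end

  # Return the category whose marks best match the words in the email.
  def classify(text)
    words = tokens(text)
    all_emails = @totals.values.inject(0) { |sum, n| sum + n }
    @totals.keys.max_by do |category|
      # Start from how common the category is overall...
      prior = Math.log(@totals[category].to_f / all_emails)
      # ...then add the (smoothed) log-likelihood of each word.
      words.inject(prior) do |score, word|
        seen = @counts[category][word]
        score + Math.log((seen + 1.0) / (@totals[category] + 2.0))
      end
    end
  end

  private

  # Lower-case the text and keep each distinct word once.
  def tokens(text)
    text.downcase.scan(/[a-z0-9']+/).uniq
  end
end

classifier = WordClassifier.new
classifier.train(:important,   "meeting agenda for project review")
classifier.train(:important,   "project deadline moved to friday")
classifier.train(:unimportant, "weekly newsletter special offer")
classifier.train(:unimportant, "special discount offer just for you")

puts classifier.classify("new project meeting")
puts classifier.classify("special offer inside")
```

Because the categories are just hash keys, the same sketch handles good/bad, important/unimportant or any other set of labels you care to train it with.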

Of course this all gets far cleverer when you look at how people classify things as a whole and how they classify them individually. These algorithms have also gone much, much further than I have described, often using ontologies and other techniques to improve things.

Why is Google Priority Email so great then and What else can I do?

Well, this is the killer. Spam filters existed long before Google launched Gmail, but somehow Google just managed to get it right. Switching to Google Apps I personally saw my 100 spam messages per day (that’s 100 getting through from 4,000) cut down to maybe one or two a week. Google has access to a lot of data, and more specifically a lot of email accounts. As more and more people use this sort of system the data can be refined. It’s for this reason that I think Google’s Priority Inbox could be a real killer app and a real asset in our current overloaded times. However, it’s not the only option.

In case you aren’t aware though there are a couple of other things that you can do to help take control of your inbox. The first is to take a look at OtherInbox. These guys have been doing a lot of what Priority Inbox is offering for a long time now and by doing it I mean doing it well. The OtherInbox system can even do things like pull out delivery company notices to tell you when your parcel is going to arrive. It’s basically Google’s Priority Inbox on steroids.

If you’re a dev then you could also look at the SpamBayes project; it’s open source and has a load of different options. The other key is training. Of course there’s a lot of email coming from external advertising sources, but have you ever considered that you might be part of someone else’s problem? A lot of the research conducted at the university I was at showed that the sender felt that an email was far more important than the recipient felt it was. It also showed that people are often too cavalier in their attitudes to email, including too many other people in their emails or sending email when it really wasn’t necessary. There’s a bunch of research on the subject of information overload on Tom Jackson’s website that’s really worth taking a look at if you’re serious about reducing the problem for your business.

So yes Google Priority inbox is cool, but take a look at the other options. Above all think about how you are contributing to the solution rather than the problem of Information Overload.

MySQL, Snow Leopard and Rails 2.2.x, where has my Gem gone?

A couple of days ago I was updating some legacy code on one of our old sites. The setup was using MySQL and Rails 2.2.x. When trying to run one of the rake tasks I received the error:

!!! The bundled mysql.rb driver has been removed from Rails 2.2. Please install the mysql gem and try again: gem install mysql.

I thought it was quite odd that I had never seen this before but it turns out I had installed a new version of MySQL for a different project and this seems to have caused some issues. So no problem just run:

gem install mysql # right?

Well, no. Running that command will give you the following on Snow Leopard:

Building native extensions. This could take a while...
ERROR: Error installing mysql:
ERROR: Failed to build gem native extension.

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lm... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lz... yes
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lsocket... no
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lnsl... no
checking for mysql_query() in -lmysqlclient... no
checking for main() in -lmygcc... no
checking for mysql_query() in -lmysqlclient... no

I remembered seeing something like this before with MySQL. It’s because the libraries needed to compile the native extensions against are placed in a non-standard location on Snow Leopard. The following command should help though:

sudo env ARCHFLAGS="-arch x86_64" gem install mysql -- --with-mysql-config=/usr/local/mysql/bin/mysql_config

I did receive a bunch of compilation info/errors but ultimately the gem installed fine and works as it should. Hopefully this helps anyone else who runs into these problems.

Oh and it seems it was bundler that was the root cause of my issues along with a reinstallation/upgrade of MySQL.

Automatically prepending URLs with http://

Recently we added functionality that allowed users to include links to images that they uploaded to one of our sites. In order to make the experience as easy as possible for users we allowed them to enter the url with or without the protocol (http:// or https://).

In order to make sure that any of our models that stored the information would always return a link with the protocol in it I wanted to create a simple mixin that would override the existing link method returned from the database and prepend http:// to it if it needed to.

Checking for the protocol and inserting it
This is actually quite a simple method. The following code was used to override the source_url method that was returning the link from the database.

def source_url
  link = super
  # \A anchors the check to the start, so a protocol later in the string does not count
  "#{link.match(/\Ahttps?:\/\//i) ? '' : 'http://'}#{link}"
end

Since I was going to add this to a number of models it made sense to convert this to a mixin that could be used on any of the models.

module Protocolize
  def self.included(klass)
    klass.class_eval do
      def self.protocolize(link_method)
        define_method link_method.to_sym do
          link = super()
          return nil if link.blank?
          # \A anchors the check to the start, so a protocol later in the string does not count
          "#{link.match(/\Ahttps?:\/\//i) ? '' : 'http://'}#{link}"
        end
      end
    end
  end
end

This can then be called using the following in your model:

include Protocolize
protocolize :method_name

Notice that you have to explicitly call super() with parentheses and not just super when you use it within define_method. If you don’t you will get the following error:

implicit argument passing of super from method defined by define_method() is not supported. Specify all arguments explicitly.

Just a tiny snippet that might be useful to people to ensure their links work correctly.
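
If you want to play with the idea outside of Rails, the following is a rough plain-Ruby sketch. The Record and Image classes are hypothetical stand-ins for an ActiveRecord model, the ClassMethods module is just one alternative way of wiring up the class method, and the nil/empty check stands in for Rails’ blank?. It is only meant to show that super() inside define_method reaches the parent’s reader.

```ruby
module Protocolize
  def self.included(klass)
    klass.extend(ClassMethods)
  end

  module ClassMethods
    # Wraps the inherited reader so it always returns a link with a protocol.
    def protocolize(link_method)
      define_method(link_method) do
        link = super() # explicit parentheses are required inside define_method
        return nil if link.nil? || link.strip.empty?
        link =~ /\Ahttps?:\/\//i ? link : "http://#{link}"
      end
    end
  end
end

# Hypothetical stand-in for a model whose reader comes from a parent class.
class Record
  def initialize(url)
    @url = url
  end

  def source_url
    @url
  end
end

class Image < Record
  include Protocolize
  protocolize :source_url
end

puts Image.new("example.com/cat.png").source_url
puts Image.new("https://example.com/cat.png").source_url
```

Note that the wrapped method has to live further up the ancestor chain (as the database attribute readers do in the snippet above) for super() to have something to call.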

Rails 3, Rake and url_for

Before I start I just want to make it clear that I know the arguments against using url_for in models and even in rake tasks. Sometimes, however, it makes sense to use url_for in a rake task. In my case I am trying to query another site’s API, which requires the URI of the page on my site that I want to gather information about.

The approach in Rails 2.x

task :collect_stats => :environment do
  include ActionController::UrlWriter

  default_url_options[:host] = 'www.example.com'
  url = url_for(:controller => 'foo', :action => 'bar')
end

Notice that because there is no current request you have to specify the

default_url_options[:host]

as the helper has no idea what the host will be otherwise.

Doing the same thing in Rails 3

The following code does the same thing in Rails 3.

task :collect_stats => :environment do
  include ActionDispatch::Routing::UrlFor
  #include ActionController::UrlFor  #requires a request object
  include ActionController::PolymorphicRoutes
  include Rails.application.routes.url_helpers

  default_url_options[:host] = 'www.example.com'
  url = url_for(post)
end

There are two key points to notice here.

  1. The first is that I have included ActionDispatch::Routing::UrlFor rather than ActionController::UrlFor. The latter requires a request object and will attempt to automatically fill in the host name. Since we are in a rake task there is no request and the method will fail.
  2. The second thing is that I have also added two additional includes. These will allow you to work with polymorphic routes and named routes, giving a bit more flexibility.

Just a short snippet that might be of use to people but if there are any improvements out there then please let me know and I will update this. You can of course hard code the routes but there are scenarios where it makes much more sense to make use of the helpers provided, especially when using polymorphic routes.

Update: 08/06/2010

In the comments Jakub has stated that in the latest version of Rails you don’t need to include the polymorphic routes.

include ActionController::PolymorphicRoutes

Render ‘Rails Style’ Partials in Sinatra

We love Sinatra. Not only does it make a great framework in its own right but in addition it can be used to mimic parts of rails in a real simple environment for front-end designers. Instead of having to get them set up and explain the whole of rails they just get a nice simple app to work on without having to worry about creating different controllers or even models.

Although there is not a 1 to 1 translation between a rails app and a sinatra one, it does allow these developers to work with things like haml in a really easy to work with environment.

One of the features that I was asked for recently though was “How do you render a partial in sinatra?”

Rendering Partials in Sinatra

Sinatra is a super-lightweight framework. Because of this it doesn’t have the notion of partials built into it. However, a partial, in its simplest form, is nothing more than a call out to render the template as a string and then embed that string into your page.

A quick look at the Sinatra site’s FAQ shows that partials can be rendered in the following way in erb.

<%= erb(:mypartial, :layout => false) %>

In haml you could use exactly the same thing but call haml like so.

= haml(:mypartial, :layout => false)

Notice that

:layout => false

is set to ensure that the layout is not also rendered.

Going a little further

The FAQs also recommend using the code in the following gist.

http://gist.github.com/119874

The code shows a helper method called partial. This helper method can be used to render a partial from your code. The helper also allows you to pass collections and is a really cool and useful piece of code.

Making things work the rails way

The above helpers are great and really useful for sinatra. However, what if you want to render a partial the ‘rails way’? In our situation we were using sinatra as a mock up of what would eventually be brought into a rails app. Rails allows partials to be included like so:

<%= render :partial => 'partial_name' %>

By overriding the built-in render method in Sinatra it is actually possible to mimic the rails partials. I came up with the following helper to quickly mock things up. The helper checks to see if the first argument passed to it is a hash and whether that hash contains the key :partial. If so it renders the partial; if not it just uses the default render method.

  helpers do
    def render(*args)
      if args.first.is_a?(Hash) && args.first.keys.include?(:partial)
        return haml "_#{args.first[:partial]}".to_sym, :layout => false
      else
        super
      end
    end
  end

The helper could easily be extended to allow for collections etc. but for now it does the job. Any better solutions?
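
As a starting point for the collections idea, here is a rough plain-Ruby sketch of how the dispatch could look. It does not use Sinatra at all: FakeBase, FakeApp and the stubbed haml and render methods are hypothetical stand-ins so the logic can be run on its own, and the choice to expose each item as a local named after the partial is an assumption borrowed from the way Rails behaves.

```ruby
# Stand-in for the framework: both methods are stubs for illustration.
class FakeBase
  def render(*args)
    :default
  end

  def haml(template, options = {})
    locals = options[:locals] || {}
    "#{template}(#{locals.values.join(',')})"
  end
end

class FakeApp < FakeBase
  def render(*args)
    options = args.first
    # Anything that is not render(:partial => ...) goes to the default render.
    return super unless options.is_a?(Hash) && options.key?(:partial)

    template = "_#{options[:partial]}".to_sym
    if options.key?(:collection)
      # Render the partial once per item, passing the item in as a local
      # named after the partial, then join the results.
      options[:collection].map { |item|
        haml(template, :layout => false,
                       :locals => { options[:partial].to_sym => item })
      }.join("\n")
    else
      haml(template, :layout => false, :locals => options[:locals] || {})
    end
  end
end

app = FakeApp.new
puts app.render(:partial => 'nav')
puts app.render(:partial => 'user', :collection => %w[ann bob])
```

In a real Sinatra app the same body would sit inside the helpers block in place of the stubbed classes.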

Rounded Corners with an arrow on a UIView with the iPhone SDK 3.0

This article is a follow-up to a previous article Rounded Corners on a UIView with the iPhone SDK 3.0.

In the previous post I included the code to create a UIView with rounded corners the easy way using the iPhone SDK 3.0. However the next challenge came when I wanted to also add an arrow to the edge of the rounded box.

In order to add the arrow I would be manually creating the path to draw the new type of box. I added four variables:

  • (CGFloat) pointY to specify where on the left hand edge the point starts.
  • (CGFloat) pointWidth to specify the width of the point.
  • (CGFloat) pointHeight for the height.
  • (UIColor) rectColor to set the color of the new rectangle (I actually copy the value of self.backgroundColor and set the background color to [UIColor clearColor]).

I added these variables so they can be set on the view but I will leave these for you to implement however you like.

Once you have created the variables pointY, pointWidth, pointHeight and rectColor you can then use the following code.

- (void)drawRect:(CGRect)rect {
  // Drawing code

  CGContextRef context = UIGraphicsGetCurrentContext();
  CGFloat radius = self.layer.cornerRadius;

  // Make sure corner radius isn't larger than half the shorter side
  if (radius > self.bounds.size.width/2.0) radius = self.bounds.size.width/2.0;
  if (radius > self.bounds.size.height/2.0) radius = self.bounds.size.height/2.0;

  CGFloat minx = CGRectGetMinX(self.bounds) + self.pointWidth;
  CGFloat midx = CGRectGetMidX(self.bounds);
  CGFloat maxx = CGRectGetMaxX(self.bounds);
  CGFloat miny = CGRectGetMinY(self.bounds);
  CGFloat midy = CGRectGetMidY(self.bounds);
  CGFloat maxy = CGRectGetMaxY(self.bounds);

  /*
  CGContextMoveToPoint(context, minx, midy);
  CGContextAddArcToPoint(context, minx, miny, midx, miny, radius);
  CGContextAddArcToPoint(context, maxx, miny, maxx, midy, radius);
  CGContextAddArcToPoint(context, maxx, maxy, midx, maxy, radius);
  CGContextAddArcToPoint(context, minx, maxy, minx, midy, radius);
  */

  CGContextMoveToPoint(context, minx, miny + pointY);
  CGContextAddArcToPoint(context, minx, miny, midx, miny, radius);
  CGContextAddArcToPoint(context, maxx, miny, maxx, midy, radius);
  CGContextAddArcToPoint(context, maxx, maxy, midx, maxy, radius);
  CGContextAddArcToPoint(context, minx, maxy, minx, miny + pointY + pointHeight, radius);
  CGContextAddLineToPoint (context, minx, miny + pointY + pointHeight);
  CGContextAddLineToPoint (context, minx - pointWidth, miny + pointY + (pointHeight / 2));

  CGContextClosePath(context);

  [self.rectColor setFill];
  CGContextDrawPath(context, kCGPathFill);
}

Rounded Corners on a UIView with the iPhone SDK 3.0

I’ve been doing a bit more iPhone development recently and one of the challenges we encountered was to control the background color of a UIView programmatically. Now this is a pretty simple thing to do until you decide you also want to round the corners of that view.

One option is of course to use an image as a background but this wouldn’t allow us to programmatically change the background color of the UIView.

After a bit of digging I finally came across the property cornerRadius, declared as the following:

@property CGFloat cornerRadius

The corner radius is a property of the layer which is a CALayer. In order to use this though you must be using iPhone SDK Version 3.0 or above and you must include QuartzCore/QuartzCore.h

Another tip was that the corner radius does not work until you set masksToBounds to true. With all of this combined the following snippet should allow you to create rounded rectangles on your UIViews.

//Includes
#import "QuartzCore/QuartzCore.h"

//To set the rounded corners
self.layer.masksToBounds = YES;
self.layer.cornerRadius = 5.0;

Using Rack::Rewrite to remove the www

Rack::Rewrite is a pretty useful tool. It allows you to change the location of a url or perform 301 or 302 redirects using rack. The real benefit of this is that the logic of your routes is contained within your app and is server agnostic. Sure, there is a performance hit compared to using mod_rewrite etc. in Apache, Nginx and so on, but we feel it’s worth it in most cases. If you are using caching systems like varnish or squid the redirect can be cached too.

Using Rack::Rewrite to go NoWWW

There has been a bit of a movement on the web for a while, called no-www, to deprecate the www. in front of domain names. As the site says, the www prefix isn’t really of any use anymore. As well as the no-www movement, having a single host can also be useful for things like analytics and cleaning up search results. There are a couple of rails solutions for doing this, including a piece of rack middleware at this site http://almosteffortless.com/2009/11/05/no-www-rack-middleware/.

If you are already using Rack::Rewrite though you may as well just use that to make the change. The following is a rule to make the no-www change using rack middleware.

# \A anchors the match so only a leading "www." is detected and stripped
r301 /.*/,  Proc.new {|path, rack_env| "http://#{rack_env['SERVER_NAME'].sub(/\Awww\./i, '') }#{path}" },
    :if => Proc.new {|rack_env| rack_env['SERVER_NAME'] =~ /\Awww\./i}
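
If you are not using Rack::Rewrite, the same redirect can be sketched as a tiny piece of rack middleware in plain Ruby. This is not the code from the middleware linked above; the NoWww class name is made up for illustration, and it assumes plain http and that SERVER_NAME, PATH_INFO and QUERY_STRING are present in the rack environment.

```ruby
# Hypothetical minimal middleware: 301 any www. host to the bare host.
class NoWww
  def initialize(app)
    @app = app
  end

  def call(env)
    host = env['SERVER_NAME'].to_s
    if host =~ /\Awww\./i
      location = "http://#{host.sub(/\Awww\./i, '')}#{env['PATH_INFO']}"
      query = env['QUERY_STRING'].to_s
      location += "?#{query}" unless query.empty?
      [301, { 'Location' => location, 'Content-Type' => 'text/html' }, []]
    else
      # Not a www. host, so hand the request on to the app untouched.
      @app.call(env)
    end
  end
end

# Trying it out with a stand-in app and hand-built environments.
inner = lambda { |env| [200, { 'Content-Type' => 'text/plain' }, ['ok']] }
stack = NoWww.new(inner)

status, headers, _body = stack.call('SERVER_NAME' => 'www.example.com',
                                    'PATH_INFO' => '/about',
                                    'QUERY_STRING' => '')
puts status
puts headers['Location']
```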

ID free pretty permalink based URLs in Rails

Slug based URLS

I love the way that rails allows you to make use of REST and create meaningful urls for your web applications. The only problem is that, although these urls are meaningful, having to use a model’s id in the URL can be pretty ugly.

http://site.com/users/123

Replacing the model’s id with a meaningful title can be useful both to users and for SEO purposes. A lot of approaches in rails make use of the fact that calling to_i on a string will start at the beginning of the string and work forward until it reaches a character that is not a digit, like so:

"1".to_i # => 1
"123four".to_i # => 123

This has led to a number of rails plugins that allow you to make id and permalink based urls like the following:

http://site.com/users/123-username

acts_as_friendly_param by Chris Farms is one of the plugins that you can use to achieve this.
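
To see why this style of URL works without touching the finders, here is a small plain-Ruby sketch. The slugify and id_with_slug helpers are made up for illustration and are not taken from any of the plugins mentioned; the point is only that to_i recovers the id from the front of the parameter.

```ruby
# Hypothetical helper: turn a title into a URL-friendly slug.
def slugify(text)
  text.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

# Hypothetical helper: build the "123-username" style parameter.
def id_with_slug(id, title)
  "#{id}-#{slugify(title)}"
end

param = id_with_slug(123, "User Name!")
puts param       # the value that ends up in the URL
puts param.to_i  # the id a finder would receive after conversion
```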

Rails Slugs Are Bad

Slugs are bad, kids. Okay? (There’s actually nothing wrong with slugs that contain an ID, and while I don’t have any objection to creating URLs like the above, for some things they just seem unnecessary.)

There is, however, another approach that allows you to use a unique reference for each record in your models. If you store a unique permalink for each model then you can use that permalink in your model’s finder in order to find that record like so:

User.find_by_permalink(params[:id])

The only problem with this approach is you have to edit all of your code to use the different finder method. Enter the slugs_are_bad plugin. I created the plugin to override rails’ default finder methods. The plugin is based on a couple of other plugins acts_as_friendly_param by Chris Farms and permalink_fu by Technoweenie but was adapted to suit my needs a little more.

Using the plugin

The slugs_are_bad plugin forms a drop-in replacement to create pretty, id-free, urls. In most cases you don’t have to make any changes from the default rails scaffold apart from adding the slugs_are_bad plugin.

app/models/user.rb
  slugs_are_bad(:permalink_attribute, :generate_from)

  # ie. slugs_are_bad(:permalink, :title)
  # will automatically generate a slug-less permalink from the title attribute and store it in the model.

Then in your view
  link_to 'User', User.new(:name => 'foo bar') # nothing new needed here
  # Generates /users/foo-bar instead of /users/1

Controllers
  # create the user as usual
  User.create!(:name => 'foo bar')

  # To find the model by its permalink, nothing else is required
  User.find(params[:id]) # nothing needed here (where id will be 'foo-bar')

  # you could also manually specify the permalink in this instance if wanted.
  User.create!(:name => 'foo-bar', :permalink => 'foo')

However, there are still a few things missing. The plugin is not quite as flexible as I would like it to be and there are no tests for it at the moment. I have a set of tests that I use in a production site to ensure the plugin works for that project but I haven’t created any tests for the standalone plugin itself yet. I should also note that the plugin was created for rails 2.3.x. I know that rails 2.3 is obsolete now :P but hey, some people might find it useful.

If anyone wants to expand upon the plugin then fork away on github and ping me the changes.

Using the new Rails 3.0 Gem bundler with Passenger, Mongrel and Heroku

UPDATE 19th Feb 2010: Bundler is moving pretty fast! For the most up to date information I’d check out http://github.com/carlhuda/bundler and specifically this gist (http://gist.github.com/302406) to find out the latest!

Recently we were unsure whether we would be deploying a site to our own hosted system or heroku. I love heroku but there are times when it just doesn’t suit the project and a bit more fine-grained control is necessary.

In order to use heroku it was suggested that you move over to the new gem bundler that Yehuda has been working on as part of rails 3.0. However, it seems there are a couple of different ways to get the bundler running. The default recommended way works great for mongrel and heroku but didn’t play so nice with passenger.

The default is to place all of the following code into config/preinitializer.rb:

require "#{File.dirname(__FILE__)}/../vendor/bundler_gems/environment"

class Rails::Boot
  def run
    load_initializer
    extend_environment
    Rails::Initializer.run(:set_load_path)
  end

  def extend_environment
    Rails::Initializer.class_eval do
      old_load = instance_method(:load_environment)
      define_method(:load_environment) do
        Bundler.require_env RAILS_ENV
        old_load.bind(self).call
      end
    end
  end
end

After a bit of searching around it seems that in some spawn methods this does not work great with passenger.
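
As an aside, the extend_environment method in that snippet relies on a general Ruby pattern: grab the original method with instance_method, redefine it with define_method, and call the original through bind(self).call. The following plain-Ruby sketch shows the pattern on its own. The Loader class and the :bundler_setup marker are made up for illustration and stand in for Rails::Initializer and the Bundler.require_env call.

```ruby
# Hypothetical class standing in for Rails::Initializer.
class Loader
  attr_reader :log

  def initialize
    @log = []
  end

  def load_environment
    @log << :environment
  end
end

Loader.class_eval do
  # Keep a handle on the original method (an UnboundMethod).
  old_load = instance_method(:load_environment)

  # Redefine it: do the extra work first, then call the original.
  define_method(:load_environment) do
    @log << :bundler_setup # stand-in for Bundler.require_env RAILS_ENV
    old_load.bind(self).call
  end
end

loader = Loader.new
loader.load_environment
puts loader.log.inspect
```

The extra step runs before the original body, which is the ordering the preinitializer depends on.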

The solution

Instead use the following:

In your config/preinitializer.rb just include this part

require "#{File.dirname(__FILE__)}/../vendor/bundler_gems/environment"

Then in config/boot.rb place this just before the last Rails.boot! line like so:

# for bundler
class Rails::Boot
  def run
    load_initializer
    extend_environment
    Rails::Initializer.run(:set_load_path)
  end

  def extend_environment
    Rails::Initializer.class_eval do
      old_load = instance_method(:load_environment)
      define_method(:load_environment) do
        Bundler.require_env RAILS_ENV
        old_load.bind(self).call
      end
    end
  end
end

# All that for this:
Rails.boot!

That should allow you to easily boot from either server and works great with heroku.

Thanks to Mathew Todd for giving the solution based upon Gemcutter’s commit here.
