Twitter Wordle for 2011

This morning I published two Wordles based on the content of my Twitter timeline for 2011, which I’ve been archiving to a SQLite database since July 2010.

Basic method:

  1. Export tweets
  2. Process into words
  3. Count word frequency
  4. Upload to Wordle

Wordle accepts data input in the form:

word1:55
word2:23
...
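
As a rough illustration of that `word:count` format, here is a minimal sketch using only standard shell tools. It is not the Perl used for the actual Wordles. The sample text and the `wordle_counts` function name are made up for the example.

```shell
# Hypothetical sample input standing in for exported tweet text.
sample='Train delayed again
Delayed train to London
Rain again http://example.com/x'

# wordle_counts (hypothetical helper): reads text on stdin and
# writes one word:count pair per line, sorted alphabetically.
wordle_counts() {
    sed 's|https\{0,1\}://[^ ]*||g' |   # drop links
    tr '[:upper:]' '[:lower:]' |        # fold case
    tr -cs '[:alnum:]_' '\n' |          # one word per line
    sed '/^$/d' |                       # drop empty lines
    sort | uniq -c |                    # count each distinct word
    awk '{ print $2 ":" $1 }'           # reformat "count word" as word:count
}

printf '%s\n' "$sample" | wordle_counts
```

For the sample above this should give lines such as `again:2` and `train:2`, ready to paste into Wordle's advanced input.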

First output was:

[Image: Raw Wordle for 2011]

This was a little skewed towards the various travel- and weather-related feeds I follow (@SEtrafficnews, @nationalrailenq, @NRE_SEastern, @SEplaying, @KentWeatherObs), so I then excluded them…

Much better...

And finally… a Wordle of my own Twitterings/Ramblings:

[Image: fooflington]

Actual method:

sqlite3 save.db "
  select text
  from tweets
  where substr(created_at, 27, 4) = '2011'
      and user_screen_name not in 
         ('SEtrafficnews', 'nationalrailenq', 
          'NRE_SEastern', 'SEplaying', 'KentWeatherObs')
" |
perl -e '
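my %d;    # word => number of occurrences
while (<>) {
    chomp;
    s/https?:\S+//g; # exclude links
    my @words = split(/\W+/);
    for (@words) {
        $d{ lc $_ }++;    # count case-insensitively
    }
}
# print word:count pairs, least frequent first, keeping words seen more than 250 times
for ( sort { $d{$a} <=> $d{$b} } keys %d ) {
    print "$_:$d{$_}\n" if $d{$_} > 250 and $_ ne "";
}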
'
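
The `substr(created_at, 27, 4)` filter only works if the year sits at characters 27 to 30 of the timestamp. That holds for the classic Twitter API format (e.g. `Sat Dec 31 23:59:59 +0000 2011`), which is assumed here because the post does not show the stored values. A quick sketch to check the offsets with `cut`:

```shell
# Assumed format: classic Twitter API timestamp (an assumption, not confirmed by the database schema).
created_at='Sat Dec 31 23:59:59 +0000 2011'

# Characters 27-30, the same span as SQLite's substr(created_at, 27, 4).
year=$(printf '%s' "$created_at" | cut -c27-30)
echo "$year"
```

If the archive stores timestamps in another format (ISO 8601, say), the offsets in the query would need adjusting.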
