Thursday, 30 December 2010

Postprocessing scanned crayon images

Postprocessing crayon drawings

I have some scans of crayon drawings, but the colours are washed out (well, the colours are pretty light on paper too):


First, some due diligence. Google for solutions. (Actually, first fire up GIMP and try a few ideas. But let's pretend I hit the books before hitting the lab.)

Lit review:

  1. google'd: post processing scanned crayon drawings - no help, but kids drawing reenacted was entertaining.

  2. google'd: scanning crayon drawings - better, results in a yahoo answers entry... which suggests GIMP. Also a couple of references warning about crayon wax sticking to scanners (I didn't have this problem).

  3. searched flickr for examples in the hope of finding a discussion in comments. Photos tagged with crayon art was the best of the flickr searches. Several people, including Steve Brandon, used a camera rather than a scanner.


Ok, that's enough searching. Let's play with some balances. Here are screenshots of the settings (from GIMP's colour menu) and the changed versions of the image.

original


take 1






take 2





I bumped up blue similarly.


take 3





Applied these settings after inverting the image.


take 4






take 5





I expanded this settings box out to make fine control easier.



In the end I went with the first take, as the least abstract of the bunch.


Tuesday, 28 December 2010

data: list of top sites from alexa

Alexa has a free list of the top 1m websites: http://s3.amazonaws.com/alexa-static/top-1m.csv.zip

sample:

1,google.com
2,facebook.com
3,youtube.com
4,yahoo.com
5,live.com

A few curiosities:

  • while most entries are just domains, 10007 have path information:

    2760,feedproxy.google.com/~r
    5824,mail.qip.ru/~Inbox
    7108,xhamster.com/user/video
    7634,journeyplanner.tfl.gov.uk/user/XSLT_TRIP_REQUEST2


  • Two of the entries with path info contain commas:

    490727,pomoho.com/user/cmuser,1
    936298,intranet.espace-privilege.leclercvoyages.com/user/eleclerc-voyages,2

    which causes weirdness when using R's read.csv() function.

    Script I used to find where the ranks diverged from the row indexes (before I found the CSV had unescaped commas):

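    # the data from the CSV file is in scores (columns: rank, domain)
    onem = seq_len(nrow(scores))  # row indexes; 1000002 rows in this data
    # first few row indexes where the rank no longer matches the index
    head(onem[scores$rank != onem])
    # look at the first row where they diverge
    scores[490728,]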




Oh, and here are the two extra rows that read.csv silently created:

> scores[scores$domain == "",]
       rank domain
490728    1
936300    2
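
One way to sidestep the extra rows is to split each line at the first comma only, so any later commas stay inside the domain field. This is a rough sketch rather than what I actually ran: parse_top_sites is a made-up helper name, and the file name in the last comment assumes the zip was extracted as top-1m.csv.

```r
# Sketch: split each line at the FIRST comma only.
# parse_top_sites is a hypothetical helper, not part of any package.
parse_top_sites <- function(lines) {
  # position of the first comma in each line (-1 if there is none)
  comma <- as.integer(regexpr(",", lines, fixed = TRUE))
  data.frame(
    rank   = as.integer(substr(lines, 1, comma - 1)),
    domain = substr(lines, comma + 1, nchar(lines)),
    stringsAsFactors = FALSE
  )
}

# A few lines copied from the samples above
sample_lines <- c("1,google.com",
                  "2760,feedproxy.google.com/~r",
                  "490727,pomoho.com/user/cmuser,1")
scores <- parse_top_sites(sample_lines)

# entries with path information, and entries containing a comma
with_path  <- sum(grepl("/", scores$domain, fixed = TRUE))
with_comma <- sum(grepl(",", scores$domain, fixed = TRUE))

# For the full list (assumed file name):
# scores <- parse_top_sites(readLines("top-1m.csv"))
```

Parsed this way, the trailing ",1" stays part of the pomoho.com entry instead of spilling onto a row of its own, and the same two grepl() counts can be run over the full list.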

Wednesday, 1 December 2010

The Hive Brisbane, 2010-11-30 - interesting people with exciting projects

I went to a Hive event yesterday, featuring Richard Slatter from We Are Hunted.

I enjoyed the event. The talk was good, but the best bit of the evening was talking to some interesting people about their exciting projects:

  • The speaker, Richard Slatter, about We Are Hunted (music charts based on online chatter) and the advantages of RERO (release early, release often).

  • Alice and Leo from Davinway Marketing, who are applying agile methodologies to marketing (which is an idea that appeals to me).

  • Mike Boyd, part of The Hive's Brisbane team, who's working on Cupstart, a project that will let you "Order your coffee online using Cupstart and collect it as you arrive". Oh, he also has a survey.