MagickWand and OCR

The problem with running OCR on photographs is that they contain lots of noise and colour that really doesn’t help.  Levels and Posterise, alongside other tools in the GIMP or Photoshop, help, but they’re hardly automatic and can’t be compiled into code, so I looked at ImageMagick.  It turns out it has a new, simpler API called MagickWand.  One of the examples on the site is a sigmoidal contrast enhancer, whatever that really does.  It helps.  I decided to take that code and add a call to MagickWand’s adaptive threshold function, which takes the size in pixels of the square neighbourhood it applies its algorithm to.  My thinking was that there would be a sweet spot where it occluded most of the noise but still picked up the grid and the pen strokes of the letters.  I was right.  It’s anywhere between 40 and 100 pixels, as the sequence below shows; mouse over for a bit of info.
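For anyone wondering what those two steps actually do to a pixel, here is a rough pure-Python sketch of the idea. It is not ImageMagick's code and not what I ran: the function names, the default contrast and midpoint values, and the list-of-rows image format are all my own assumptions for illustration. In MagickWand itself I believe the relevant calls are MagickSigmoidalContrastImage and MagickAdaptiveThresholdImage, but check the docs for the exact signatures in your version.

```python
import math


def sigmoidal_contrast(value, contrast=5.0, midpoint=0.5):
    """Push a grey level in [0, 1] through a logistic curve rescaled to [0, 1].

    Assumed parameters: contrast must be > 0; midpoint is the grey level
    that the curve is steepest around.
    """
    def sig(x):
        return 1.0 / (1.0 + math.exp(contrast * (midpoint - x)))

    lo, hi = sig(0.0), sig(1.0)
    return (sig(value) - lo) / (hi - lo)


def adaptive_threshold(image, window, offset=0.0):
    """Local-mean threshold: a pixel becomes 1 (paper) if it is brighter than
    the mean of the window x window square around it plus offset, else 0 (ink).

    image is a list of rows of grey levels (hypothetical format).
    """
    height, width = len(image), len(image[0])

    # Summed-area table with an extra zero row and column, so any
    # rectangle sum costs four lookups.
    sat = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0.0
        for x in range(width):
            row_sum += image[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum

    half = window // 2
    out = []
    for y in range(height):
        # Clip the window at the image border.
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        row = []
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            row.append(1 if image[y][x] > mean + offset else 0)
        out.append(row)
    return out
```

The window size is the knob I was hunting for the sweet spot on: too small and every speck of noise beats its local mean, too large and it behaves like one global threshold. A slightly negative offset stops flat areas of paper, where every pixel equals the local mean, from flipping to ink.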

Now I just need to implement the idea I have, which as ever is cribbed.  We detect the grid and the boundaries of each cell.  Then we detect the edges of the contents of that cell and quarter it.  We should then have our number in quarters.  By comparing the shape of each quarter to a database (nearest match?) we can determine the number.  I’ll post some images as examples below later.  You’ll note that these posts are very light on the math. 😛
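To make the quartering idea concrete, here is a minimal sketch of how it might look, assuming the grid has already been found and each cell has been cut out as a thresholded image (0 for ink, 1 for paper). Everything here is hypothetical: the function names, using the fraction of ink in each quarter as the "shape", and squared distance for the nearest match are stand-ins I picked to keep it short, not a tested recogniser.

```python
def bounding_box(cell):
    """Return (top, left, bottom, right) around the ink pixels (value 0),
    with bottom and right exclusive, or None if the cell is empty."""
    rows = [y for y, row in enumerate(cell) if 0 in row]
    if not rows:
        return None
    cols = [x for x in range(len(cell[0])) if any(row[x] == 0 for row in cell)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def quarter_features(cell):
    """Split the bounding box of the ink into four quarters and return the
    fraction of ink in each: (top-left, top-right, bottom-left, bottom-right)."""
    box = bounding_box(cell)
    if box is None:
        return (0.0, 0.0, 0.0, 0.0)
    top, left, bottom, right = box
    mid_y, mid_x = (top + bottom) // 2, (left + right) // 2
    quarters = [
        (top, mid_y, left, mid_x), (top, mid_y, mid_x, right),
        (mid_y, bottom, left, mid_x), (mid_y, bottom, mid_x, right),
    ]
    features = []
    for y0, y1, x0, x1 in quarters:
        area = (y1 - y0) * (x1 - x0)
        ink = sum(1 for y in range(y0, y1) for x in range(x0, x1)
                  if cell[y][x] == 0)
        # A very thin glyph can leave a quarter with zero area.
        features.append(ink / area if area else 0.0)
    return tuple(features)


def nearest_match(cell, database):
    """Return the label in database (label -> feature tuple) whose features
    are closest to the cell's, by squared Euclidean distance."""
    features = quarter_features(cell)

    def distance(item):
        return sum((a - b) ** 2 for a, b in zip(features, item[1]))

    return min(database.items(), key=distance)[0]
```

Working from the bounding box of the ink rather than the whole cell means it shouldn't matter where in the cell the number was written. Four ink fractions are almost certainly too crude to tell every digit apart, so treat the feature as a placeholder for whatever shape comparison ends up working.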
