Books, the digitising and text-to-speechifying thereof

A couple of book projects for you today. One is simple, practical and of great use to the visually impaired. The other is over-complicated and a little bit nuts; nonetheless, we think it’s rather wonderful, and actually kind of useful if you’ve got a lot of patience.

We’ll start with the simple and practical one: Kolibre is a Finnish non-profit making open-source audiobook software so you can build a reader with very simple controls. This is Vadelma, an internet-enabled audio e-reader. It’s very easy to put together at home with a Raspberry Pi: you can find full instructions and discussion of the project at Kolibre’s website.

The overriding problem with automated audio e-readers is always the quality of the text-to-speech voice, and it’s the reason that books recorded with real, live actors reading them are currently so much more popular. But those are expensive, and it’s likely we’ll see innovations in text-to-speech as natural language processing research progresses (it’s challenging: people have been hammering away at this problem for half a century), and as this stuff becomes easier to automate and more widespread.

How easy is automation? Well, the good people at Dexter Industries decided that what the Pi community (which, you’ll have noticed, has a distinct crossover with the LEGO community) really needed was a robot that could use optical character recognition (OCR) to digitise the text of a book, Google Books style. They got that up and running pretty quickly with a Pi and a camera module, using the text on a Kindle as proof of concept.
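As a rough illustration of that capture-and-OCR step, here is a minimal shell sketch. It assumes the camera utility `raspistill` and the Tesseract command-line tool `tesseract` are installed; neither is confirmed here as what Dexter Industries used, and the function and file names are made up for the example.

```shell
#!/bin/sh
# Sketch only: photograph a page and OCR it with the Tesseract CLI.
# Assumptions: `raspistill` (Pi camera tool) and `tesseract` are installed.
# Function and file names are illustrative, not taken from the Dexter writeup.

# Strip the image extension to get the output base that tesseract expects;
# tesseract appends ".txt" to this base itself.
ocr_output_base() {
    printf '%s\n' "${1%.*}"
}

# Capture one page to the given image path, then OCR it.
# Prints the path of the resulting text file.
scan_page() {
    image="$1"
    base=$(ocr_output_base "$image")
    raspistill -o "$image" || return 1
    tesseract "$image" "$base" -l eng || return 1
    printf '%s\n' "$base.txt"
}

# Example: scan_page page_001.jpg
```

Running `scan_page page_001.jpg` would leave the recognised text in `page_001.txt`; a page-turning robot would simply call it once per page with a new file name.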

But if you’re that far along, why stop there? The Dexter team went on to add LEGO features, until they ended up with a robot capable of wrangling real paper books, down to turning pages with one of those rubber wheels when the device has finished scanning the current text.

So there you have it: a Google Books project you can make at home, and a machine you can make to read the books to you when you’re done. If you want to read more about what Dexter Industries did, they’ve made a comprehensive writeup available at Makezine. Let us know how you get on if you decide to reduce your own library to bits.


AndrewS avatar

With the addition of an extra few buttons, I guess the first project could be combined with ?

Richard Mullens avatar

Any idea what OCR package is used for this?

Mike Redrobe avatar

I used tesseract ocr on the pi last year:

But found it too slow (5 mins for an A4 page).
Have things improved?

simon avatar

I guess you could use the RPi as the digitiser controller, then subcontract the OCR to a PC or farm-of. Less portable, but much quicker.

nathanael avatar

For extra geek points you could use the LEGO device to digitise the part in Chapter 2 of Dirk Gently’s Holistic Detective Agency that reads ‘… The Electric Monk was a labor-saving device, like a dishwasher or a video recorder. Dishwashers washed tedious dishes for you, thus saving you the bother of washing them yourself, video recorders watched tedious television for you, thus saving you the bother of looking at it yourself. Electric Monks believed things for you, thus saving you what was becoming an increasingly onerous task, that of believing all the things the world expected you to believe… ‘ – and then never actually read it.

Liz Upton avatar

100 internet points to that man.

simon avatar

BTW: new website hasn’t got a “Like” button for comments.

jclerman avatar

Please use a font with more contrast!
Otherwise some of us cannot read it. We can enlarge the text but my tablet does not allow me to change fonts.

Mark Swope avatar

This has probably been said before, but I find it ironic that I can’t watch the video on a native Raspbian distro… :-)

tvjon avatar

You CAN watch all these videos, so hopefully it hasn’t been said before, thus perhaps discouraging readers.

I’ve just downloaded both of the above videos onto the RPi I’m typing this reply on so that I can show them to a friend who has no internet access.

sudo apt-get install cclive

Top right near the i symbol, left-click the symbol to its left. A popup will appear that highlights the video’s URL.

In an RPi terminal window, type:

cclive

followed by a space, then press your middle mouse button, which will paste in the URL.

When the download finishes, type:

omxplayer name_of_file_you_just_downloaded

then watch & listen on your RPi in high quality sound & video.
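The steps above could be wrapped in a small script. This is only a sketch: it assumes `cclive` and `omxplayer` are installed as described, and the helper names (`newest_file`, `fetch_and_play`) are invented for illustration. It downloads into a fresh temporary directory so the file cclive has just written is easy to find without knowing its name in advance.

```shell
#!/bin/sh
# Sketch: download a video with cclive, then play it with omxplayer.
# Helper names are hypothetical; cclive and omxplayer must be installed.

# Print the name of the most recently modified entry in a directory.
newest_file() {
    ls -t "$1" | head -n 1
}

# Download the video at URL $1 into a new temporary directory and play it.
fetch_and_play() {
    url="$1"
    dir=$(mktemp -d) || return 1
    ( cd "$dir" && cclive "$url" ) || return 1
    file=$(newest_file "$dir")
    [ -n "$file" ] || return 1
    omxplayer "$dir/$file"
}

# Example: fetch_and_play "PASTE_VIDEO_URL_HERE"
```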

Malachi avatar

The real issue with scanning and OCRing is proofreading. There’s always one more error. If anyone can come up with a Pi project to address that we’d love to hear from them.

Liz Upton avatar

Lord – tell me about it. Even crowdsourcing, a la Project Gutenberg, appears to be a poor way to approach the problem (and I say this as an inveterate submitter of corrections to PG). When someone comes up with another way to fix it, I’ll be over the moon.

Steve Lovell avatar

Anyone else wonder if the two projects could be combined to outsource the reading of, say, bedtime stories? ;-)
