MonReader is a mobile document digitization experience by AI LABS for the blind, for researchers and for anyone else who needs fully automated, fast and high-quality scanning of documents in bulk.
As the user browses through a book, the app detects when the browsing pauses for a short time. During this pause, the text is captured using optical character recognition (OCR). Once the user has finished browsing, the contents of the book are summarized for them.
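The "idle for a short time" rule above can be sketched as a streak counter over per-frame idle decisions. This is only an illustration, not the app's actual logic: the function name `capture_points` and the threshold `min_idle_frames` are hypothetical, and the input is assumed to be one boolean per video frame.

```python
from typing import Iterable, Iterator


def capture_points(idle_flags: Iterable[bool], min_idle_frames: int = 3) -> Iterator[int]:
    """Yield the frame index at which each pause becomes long enough to capture.

    idle_flags: one boolean per video frame, True when the frame was judged idle.
    min_idle_frames: hypothetical threshold standing in for "a short time".
    """
    run = 0  # length of the current streak of consecutive idle frames
    for index, idle in enumerate(idle_flags):
        run = run + 1 if idle else 0
        # Fire exactly once per pause, when the streak first reaches the threshold.
        if run == min_idle_frames:
            yield index


# Example: two pauses are long enough, one short pause (frame 6) is ignored.
flags = [False, True, True, True, True, False, True, False, True, True, True]
print(list(capture_points(flags)))  # [3, 10]
```

Triggering once per streak, rather than on every idle frame, would keep a single page from being captured repeatedly while the user holds it still.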
A sub-process for detecting the idle moment is available here as a live demo. Clicking a video frame triggers the analysis, and a traffic light icon shows whether the selected frame is suitable for OCR analysis.
MobileNet is used here as the neural network for the initial image analysis; a support vector machine (SVM) then makes the final determination of the browsing state.
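The two-stage structure described above might look roughly like the following sketch. Several things here are assumptions rather than details from the project: the SVM is taken to be linear, the `embed` callable is a stand-in for whatever features MobileNet produces, the weights and bias are made-up numbers, and the three-state traffic light with a `margin` band is one possible way to drive the icon.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def linear_svm_score(features: Features, weights: Sequence[float], bias: float) -> float:
    """Decision value of a linear SVM, w . x + b (positive is read as 'idle' here)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def traffic_light(
    frame: object,
    embed: Callable[[object], Features],  # stand-in for a MobileNet feature extractor
    weights: Sequence[float],
    bias: float,
    margin: float = 1.0,  # hypothetical confidence band around the decision boundary
) -> str:
    """Map one frame to 'green' (idle, OCR-ready), 'red' (still browsing) or 'yellow'."""
    score = linear_svm_score(embed(frame), weights, bias)
    if score >= margin:
        return "green"
    if score <= -margin:
        return "red"
    return "yellow"  # too close to the boundary to call either way


# Toy usage: the "frames" are already feature vectors, so embed is the identity.
toy_weights, toy_bias = [1.0, -2.0], 0.5
for toy_frame in ([3.0, 0.5], [0.0, 2.0], [1.0, 0.5]):
    print(traffic_light(toy_frame, lambda f: f, toy_weights, toy_bias))
```

Passing the feature extractor in as a callable keeps the sketch independent of any particular deep learning library; in a real setup it would wrap the network's forward pass on the clicked video frame.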