Can a machine read?

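About a century ago, the physicist Emanuel Goldberg developed a machine that read characters and converted them into telegraph code, and around the same time Edmund Fournier d’Albe built the Optophone, a reading aid for blind people. These inventions marked the beginning of optical character recognition (OCR) technology. In the 1970s, OCR advanced further and became able to read omni-font text, that is, text printed in virtually any typeface.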

Today, this technology is used in web-based services, cloud computing, smartphones, and more. The classic use case of OCR is converting printed books and documents into digital format. When a paper document is scanned, a grayscale image is created first. The image then passes through several steps: preprocessing, feature extraction, classification, and post-processing.
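To make those four steps concrete, here is a minimal sketch of such a pipeline on tiny 3x3 glyphs. Everything in it is an illustrative assumption: the templates, the threshold of 128, and the function names are hypothetical, and real OCR engines use far richer features and classifiers than pixel-by-pixel template matching.

```python
# Toy OCR pipeline: preprocessing -> feature extraction -> classification -> post-processing.
# All names, templates, and the threshold are hypothetical, for illustration only.

# 3x3 reference glyphs (1 = ink, 0 = background).
TEMPLATES = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
    "T": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
}

def preprocess(gray, threshold=128):
    """Binarize a grayscale image (0-255): dark pixels become ink (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def extract_features(binary):
    """Flatten the binary image into a feature vector."""
    return tuple(px for row in binary for px in row)

def classify(features):
    """Pick the template with the fewest mismatching pixels (Hamming distance)."""
    def distance(label):
        return sum(a != b for a, b in zip(features, TEMPLATES[label]))
    return min(TEMPLATES, key=distance)

def postprocess(chars):
    """Join the recognized characters into a string."""
    return "".join(chars)

def recognize(glyph_images):
    """Run every glyph image through the four steps."""
    return postprocess(
        classify(extract_features(preprocess(g))) for g in glyph_images
    )

# Two "scanned" glyphs with uneven lighting, and one with a stray dark pixel.
gray_L = [[20, 240, 250], [35, 230, 245], [10, 40, 90]]
gray_T = [[30, 25, 40], [220, 15, 235], [250, 120, 245]]
noisy_I = [[250, 30, 240], [245, 20, 250], [100, 25, 230]]

print(recognize([gray_L, gray_T, noisy_I]))
```

The noisy glyph still comes out as "I" because classification picks the nearest template rather than demanding an exact match, which is the same idea, in miniature, that lets OCR tolerate smudges and uneven scans.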

OCR technology now reads almost anything (well, almost!)

Modern OCR technologies can also capture irregular text from notes, video, and other sources, for example street ads painted on banners or graffiti written on buildings and cars. Today, a variety of industries, including retail, transport and logistics, education, healthcare, telecom, and manufacturing, use OCR to speed up their business processes.

For instance, in the retail food industry, pre-packaged food products have a product name, ingredients, and ‘best by’ dates printed on their packages. These details are critical to ensure that food is distributed to stores properly and is safe for consumers to eat. Human or machine errors in reading these labels can lead to health hazards. OCR, along with other ML technologies, is used to facilitate the timely distribution of food.
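As a rough illustration of what happens after the label has been read, the sketch below pulls a ‘best by’ date out of OCR output and refuses to ship when the date is missing, expired, or impossible. The label layout (the words "BEST BY" followed by a DD/MM/YYYY date) and the function names are assumptions made for this example, not a description of any particular product.

```python
import re
from datetime import date, datetime

# Hypothetical label layout: "BEST BY" followed by a DD/MM/YYYY date.
BEST_BY = re.compile(r"best\s*by[:\s]*(\d{2}/\d{2}/\d{4})", re.IGNORECASE)

def parse_best_by(ocr_text):
    """Return the 'best by' date found in OCR text, or None if absent or invalid."""
    match = BEST_BY.search(ocr_text)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%d/%m/%Y").date()
    except ValueError:
        return None  # e.g. a misread digit produced an impossible date

def is_safe_to_ship(ocr_text, today):
    """Ship only when a valid 'best by' date was read and has not passed."""
    best_by = parse_best_by(ocr_text)
    return best_by is not None and best_by >= today

label = "Oat Cookies\nIngredients: oats, sugar\nBEST BY: 15/03/2025"
print(parse_best_by(label))
print(is_safe_to_ship(label, date(2025, 3, 1)))
```

Treating an unreadable date as "do not ship" is a deliberate choice here: when a misread label can become a health hazard, failing closed is the safer default.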

OCR technology is widely used in the banking and financial sectors. One of its best-known uses in banking is automatic cheque processing. Without OCR, processing a cheque was time-consuming and prone to human error, and financial fraud was more common. Now, with OCR, cheques can be deposited easily through an app. The technology reads details such as the amount, date, and signature. When OCR is combined with a 2D convolutional neural network (CNN), the detection of letters and other details improves.
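For readers curious about what the "2D convolution" in a CNN actually does, here is a minimal sketch in plain Python. It slides a small kernel over an image patch and sums the products at each position. The kernel below is hand-picked to respond to vertical strokes purely for illustration; a trained CNN would learn its kernels from data, and this is not a description of any bank's system.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the operation CNN layers compute."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hand-picked kernel (an assumption): positive where ink starts to the right.
VERTICAL_EDGE = [[-1, 1],
                 [-1, 1]]

# 4x4 patch containing a vertical stroke in the third column (1 = ink).
patch = [[0, 0, 1, 0],
         [0, 0, 1, 0],
         [0, 0, 1, 0],
         [0, 0, 1, 0]]

response = conv2d(patch, VERTICAL_EDGE)
for row in response:
    print(row)
```

Each output row shows a positive value where the stroke begins and a negative value where it ends, which is the kind of local evidence a CNN stacks up, layer by layer, to tell one character from another.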

Perhaps the most noteworthy use of OCR technology is in the healthcare industry. Unstructured data such as patient reports and doctors’ notes can be read with OCR and then converted into a structured dataset. Medical insurance claims often require tedious paperwork. OCR can be used to read notes from doctors and care teams, convert them into electronic health records, and so speed up insurance paperwork.

OCR for different languages and different scripts

As OCR technology advances, exciting new possibilities are opening up. For instance, until recently machines could not read handwritten notes, but today they can with the help of OCR. Furthermore, machines can convert handwritten notes into text and store them in a dataset. Another possibility is to use OCR for a variety of languages whose scripts run in different directions, e.g., right to left or top to bottom.
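One small piece of multi-script support can be sketched with the Python standard library: once characters have been recognized, their Unicode bidirectional class hints at whether the text reads left to right or right to left. The function name and the left-to-right default are assumptions for this example. Top-to-bottom text is a page-layout property rather than a character property, so it is not covered here.

```python
import unicodedata

def text_direction(text):
    """Guess reading direction from the first strongly directional character."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # right-to-left scripts such as Hebrew and Arabic
            return "rtl"
        if bidi == "L":          # left-to-right scripts such as Latin and Cyrillic
            return "ltr"
    return "ltr"  # assumption: default when no strong character is found

print(text_direction("hello"))
print(text_direction("שלום"))
print(text_direction("سلام"))
```

Digits and spaces carry no strong direction, so a string such as "123 שלום" is still classified by its first Hebrew letter.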

OCR technology has gone global. In 2019, a market survey estimated that the global OCR market was valued at US$ 6.8 billion and was expected to grow at a rate of 13.6% each year to reach US$ 16.60 billion by 2027. The report also notes that a key driver of this market is start-ups in various sectors. As start-up companies develop and adopt this technology, they face challenges such as acquiring talent and retaining trained personnel. To circumvent these issues, established companies and start-ups alike choose to use external tools such as the BitRefine Heads platform.
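The market figures quoted above can be sanity-checked with a few lines of compound-growth arithmetic. The function name is hypothetical, and the seven-period horizon is an inference rather than something the survey states: compounding US$ 6.8 billion at 13.6% over seven annual periods lands on the quoted 2027 figure, which would be consistent with a forecast starting from a 2020 base.

```python
def project(value, annual_rate, periods):
    """Compound a starting value at a fixed annual growth rate."""
    return value * (1 + annual_rate) ** periods

# Market size (US$ billion) and growth rate as quoted in the survey;
# the seven compounding periods are an assumption, not a stated fact.
projected_2027 = project(6.8, 0.136, 7)
print(f"{projected_2027:.2f}")
```

With eight periods (counting every year from 2019 to 2027) the same formula gives a noticeably higher number, so the choice of base year matters when comparing forecasts like this one.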
