Image to Text Conversion

Status: Completed

Tags: Keras Tesseract OpenCV Python



AIM

To develop an OCR application capable of extracting a candidate's important details from his/her resume.


COMPONENTS AND TECHNOLOGIES USED

  • Streamlit

  • Python

  • OpenCV

  • Tesseract

  • CSS

  • Keras


OVERVIEW

 

Project Report

Image to Text Conversion

 


 

CONTRIBUTORS –

  1. Utkarsh Jha CSE - 20214138
  2. Urbi Das ME - 20213057

 

MENTORS –

  1. Anurag Gupta ECE - 20195168
  2. Shashank Singh EE - 20202085

Aim –

To develop an OCR application capable of extracting a candidate's important details from his/her resume.

Tech stack –

Streamlit, Python, OpenCV, Tesseract, CSS, Keras

 


Introduction –

The OCR application takes a picture of a candidate's resume as its input and returns the important details extracted from it as its output.

Components –

  • Streamlit – Used for developing the web app
  • Tesseract – The core OCR engine
  • CSS – For improving the UI of the web app
  • Python – The main programming language

 

Working of the web application –

  • Care has been taken to make the application as user-friendly as possible, with an appealing GUI.
  • A file upload button lets the user upload an image of a resume.
  • The application then uses the OCR engine and some post-processing steps to extract the candidate's important details, such as the phone number, email, and address, and displays them as its output.
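The post-processing step described above could look roughly like the following sketch. It assumes the OCR engine has already returned the resume text as a plain string, and it pulls out the email address and phone number with regular expressions. The function name `extract_contact_details`, the two patterns, and the assumption of a 10-digit phone number with an optional country code are illustrative only and are not taken from the project's source code.

```python
import re

# Assumed patterns for illustration; real resumes may need looser rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional country code (e.g. +91), then exactly 10 digits.
PHONE_RE = re.compile(r"(?<!\d)(?:\+?\d{1,3}[\s-]?)?\d{10}(?!\d)")


def extract_contact_details(ocr_text):
    """Return the first email and phone number found in OCR output.

    Fields that cannot be found are returned as None.
    """
    email = EMAIL_RE.search(ocr_text)
    phone = PHONE_RE.search(ocr_text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


if __name__ == "__main__":
    sample = "Name: A. Kumar\nEmail: a.kumar@example.com\nPhone: +91 9876543210\n"
    print(extract_contact_details(sample))
```

Free-form fields such as the address do not follow a fixed pattern, so a rule of this kind would not be enough for them on its own.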

Methodology –

 

  • The project was divided into three phases so that we could build an understanding of basic computer vision techniques before taking on the bigger challenge.
  • In the first phase, we learnt about basic neural networks and worked on the popular MNIST dataset, where we obtained a good validation accuracy using the simple architecture described in the tutorials.
  • In the second phase, we learnt about Convolutional Neural Networks (CNNs) and applied them to the EMNIST dataset using the popular Keras library, achieving a good accuracy here as well.
  • In the third and final phase, we familiarised ourselves with different OCR models, such as EasyOCR, Tesseract, and KerasOCR, and finally chose the Tesseract engine to build our web application for extracting information from resumes.
  • Our final web application uses bounding boxes to mark the regions that contain the data to be extracted.
  • Running the OCR engine on each region separately lets us extract each piece of information with high accuracy.
  • Because each box corresponds to a known field, we also know which category every piece of extracted text belongs to, so details are assigned to the correct fields.
  • The bounding boxes for a given type of document can be found manually using an application such as GIMP. They can then be fed into our application to retrieve the required details from any document of that type.
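The bounding-box approach described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the template `RESUME_TEMPLATE`, its field names and coordinates, and the helper names `tesseract_ocr` and `extract_fields` are all hypothetical. The image is assumed to be a NumPy-style array, as returned by OpenCV, and Tesseract is called through the `pytesseract` wrapper.

```python
from typing import Callable, Dict, Tuple

# (x, y, width, height) in pixels, measured on the template image.
Box = Tuple[int, int, int, int]

# Hypothetical template; the field names and coordinates are made up.
RESUME_TEMPLATE: Dict[str, Box] = {
    "name": (40, 30, 400, 50),
    "email": (40, 90, 400, 40),
    "phone": (40, 140, 400, 40),
}


def tesseract_ocr(region) -> str:
    """Run Tesseract on one cropped region.

    Needs the pytesseract package and the Tesseract binary; the import is
    done lazily so the rest of the sketch has no hard dependency on them.
    """
    import pytesseract

    # --psm 7 asks Tesseract to treat the region as a single line of text.
    return pytesseract.image_to_string(region, config="--psm 7")


def extract_fields(
    image,
    template: Dict[str, Box],
    ocr: Callable[[object], str] = tesseract_ocr,
) -> Dict[str, str]:
    """Crop every box out of the image and OCR each crop separately."""
    fields = {}
    for name, (x, y, w, h) in template.items():
        region = image[y:y + h, x:x + w]  # rows index y, columns index x
        fields[name] = ocr(region).strip()
    return fields
```

With OpenCV installed, a call might look like `extract_fields(cv2.imread("resume.png"), RESUME_TEMPLATE)`, where the file name is a placeholder. Because each crop is tied to a field name, the category of every extracted string is known without any further parsing.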


Source Code – 

https://github.com/roboclub-mnnit/Image_to_text_conversion-2022-23-Project

 

Resources –

https://www.youtube.com/watch?v=ZVKaWPW9oQY&pp=ygUIZWFzeSBvY3I%3D


https://victorzhou.com/blog/keras-cnn-tutorial/

https://medium.com/nanonets/a-comprehensive-guide-to-ocr-with-tesseract-opencv-and-python-fd42f69e8ca8

 

Real-life applications –

  • Our application aims to reduce the hassle of filling in candidate details on online job portals and related platforms. Why fill in all the details separately when our application can extract them from a single image that can be uploaded with a click?
  • With a bit of modification, it can also be used to convert printed text into editable documents with a quick scan.
  • The processing of business documents such as cheques, bank statements, and invoices can be made faster, saving both time and money.
  • It can be used to digitize physical records for better access and preservation.

 

Problems faced –

  • Initially we tackled unstructured documents, but applying OCR to them directly gave poor results, so we focused on structured documents from then on.
  • For structured documents, we were working towards an application that would work on pictures taken with a camera. However, the bounding-box technique relies on the input image being correctly aligned with the template image, and non-optimal lighting conditions and resolution made it difficult to apply image processing techniques to align the images correctly.
  • Some color combinations can also make it difficult to distinguish the text from the background, and noise may be present in the output.
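One common way to reduce the effect of low-contrast color combinations is to binarize the grayscale image before running OCR. The sketch below is not part of the project; it is a pure-Python illustration of Otsu's method, which picks the grey level that best separates dark and light pixels. The function names `otsu_threshold` and `binarize` are hypothetical, and the input is assumed to be a flat sequence of grey levels from 0 to 255.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the grey level t that maximises the between-class variance.

    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low = 0  # number of pixels with value <= t
    sum_low = 0     # sum of those pixel values
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break     # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        var_between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels: Sequence[int], threshold: int) -> List[int]:
    """Map every pixel to 0 or 255 depending on the threshold."""
    return [255 if p > threshold else 0 for p in pixels]
```

In practice, OpenCV offers the same operation through `cv2.threshold` with the `cv2.THRESH_OTSU` flag. Binarization of this kind addresses contrast only; it does not solve the alignment problem described above.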

 

Thank you, 

Team Image-To-Text-Conversion