CONTRIBUTORS –

Name	Branch	Reg. no.
Utkarsh Jha	CSE	20214138
Urbi Das	ME	20213057

MENTORS –

Name	Branch	Reg. no.
Anurag Gupta	ECE	20195168
Shashank Singh	EE	20202085

Aim –

To develop an OCR application that extracts a candidate's important details from their resume.

Tech stack –

Streamlit, Python, OpenCV, Tesseract, CSS, Keras

Introduction –

The OCR application takes a picture of a candidate's resume as input and returns the extracted details as output.

Components –

Streamlit – Used for developing the web app
Tesseract – The core OCR engine
CSS – For improving the UI of the web app
Python – The main programming language

Working of the web application –

Care has been taken to make the application as user-friendly as possible, with an appealing GUI.
A file upload button lets the user upload an image of a resume.
The application then uses the OCR engine and some post-processing steps to extract important details such as the candidate's phone number, email and address, and displays them as output.
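
The post-processing steps are not spelled out above. One plausible form is pattern matching on the raw OCR text. The sketch below is an assumption, not the project's actual code: the function name `extract_details`, both regular expressions and the 10-digit phone format with an optional country code are illustrative choices.

```python
import re
from typing import Dict, Optional

# Assumed patterns, for illustration only; real resumes may need looser rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional country code (e.g. "+91 "), then exactly 10 digits not embedded
# in a longer run of digits.
PHONE_RE = re.compile(r"(?<!\d)(?:\+?\d{1,3}[\s-]?)?\d{10}(?!\d)")


def first_match(pattern: "re.Pattern[str]", text: str) -> Optional[str]:
    """Return the first substring matching the pattern, or None if absent."""
    match = pattern.search(text)
    return match.group(0) if match else None


def extract_details(ocr_text: str) -> Dict[str, Optional[str]]:
    """Pull the first email address and phone number out of raw OCR text."""
    return {
        "email": first_match(EMAIL_RE, ocr_text),
        "phone": first_match(PHONE_RE, ocr_text),
    }
```

Calling `extract_details` on the string returned by the OCR engine gives a dictionary whose values are `None` for any field that was not found, so the web app can show a placeholder instead of failing.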

Methodology –

The project was divided into three phases so that we could build a working understanding of basic computer vision techniques before taking on the main challenge.
In the first phase, we learnt about basic neural networks and worked on the popular MNIST dataset, where we obtained good validation accuracy using the simple architecture described in the tutorials.
In the second phase, we learnt about Convolutional Neural Networks (CNNs) and applied that knowledge to the EMNIST dataset using the popular Keras library, again achieving good accuracy.
In the third and final phase, we familiarized ourselves with different OCR models such as EasyOCR, Tesseract and KerasOCR, and finally chose the Tesseract engine to build our web application for extracting information from resumes.
Our final web application relies on bounding boxes that mark the regions containing the data to be extracted.
With bounding boxes, we can run the OCR engine on each region separately, which lets it extract that particular piece of information with high accuracy.
Because each box corresponds to a known field, we also know which category each piece of extracted text belongs to, so the details are assigned to the correct fields.
The bounding boxes for a given document type can be found manually using applications like GIMP. These can then be fed into our application to retrieve the required details from any document of that type.
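
The bounding-box approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the repository: the template name `RESUME_TEMPLATE`, its field names and pixel coordinates, and the helper names `tesseract_ocr` and `extract_fields` are all hypothetical. It assumes the image is a Pillow `Image` (whose `crop` method takes a `(left, top, right, bottom)` box) and that `pytesseract` and the Tesseract binary are installed.

```python
from typing import Callable, Dict, Tuple

# (left, top, right, bottom) in pixels, measured on the template, e.g. in GIMP.
Box = Tuple[int, int, int, int]

# Hypothetical template: field names and coordinates are made up for illustration.
RESUME_TEMPLATE: Dict[str, Box] = {
    "name": (40, 30, 600, 90),
    "email": (40, 100, 600, 140),
    "phone": (40, 150, 600, 190),
}


def tesseract_ocr(region) -> str:
    """Run Tesseract on one cropped region."""
    # Imported lazily so the rest of the module loads without pytesseract.
    import pytesseract

    # "--psm 7" asks Tesseract to treat the region as a single line of text.
    return pytesseract.image_to_string(region, config="--psm 7")


def extract_fields(
    image,
    template: Dict[str, Box],
    ocr: Callable[[object], str] = tesseract_ocr,
) -> Dict[str, str]:
    """Crop each template box out of the image and OCR it separately."""
    fields = {}
    for name, box in template.items():
        region = image.crop(box)  # Pillow: box is (left, top, right, bottom)
        fields[name] = ocr(region).strip()
    return fields
```

With Pillow installed, a call might look like `extract_fields(Image.open("resume.png"), RESUME_TEMPLATE)`, where `resume.png` is a placeholder file name. Because the field name is the dictionary key, each piece of text arrives already labelled with its category. The `ocr` parameter is only there so a different engine can be swapped in.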

Source Code –

https://github.com/roboclub-mnnit/Image_to_text_conversion-2022-23-Project

Resources –

https://www.youtube.com/watch?v=ZVKaWPW9oQY&pp=ygUIZWFzeSBvY3I%3D

https://victorzhou.com/blog/keras-cnn-tutorial/

https://medium.com/nanonets/a-comprehensive-guide-to-ocr-with-tesseract-opencv-and-python-fd42f69e8ca8

Real-life applications –

Our application aims to reduce the hassle of filling in candidate details on online job portals and related platforms. Why fill in all the details separately when our application can extract them from a single image uploaded with one click?
Beyond that, with a bit of modification it can be used to convert printed text into editable documents with a quick scan.
The processing of business documents like cheques, bank statements, invoices, etc. can be made faster, saving both time and money.
It can be used to digitize physical records for better access and preservation.

Problems faced –

Initially we tackled unstructured documents, but applying OCR to them directly gave poor results, so we focused on structured documents from then on.
For structured documents, we tried to build an application that would work on pictures taken with a camera. The difficulty was that using bounding boxes to mark regions depends on the photographed image being correctly aligned with the template image, and non-optimal lighting conditions and resolution made it hard to apply image processing techniques to align the images correctly.
It is also important to note that some color combinations can make it difficult to distinguish the text from the background, and noise may then appear in the output.

Thank you,

Team Image-To-Text-Conversion
