Aadhar Card Entity Extraction

Overview

The Aadhar Card is a government-issued photo ID document in India. This project aims to provide an easy-to-use method for extracting the key information from a picture of an Aadhar Card: the Aadhar Number, Name of the Person, Date of Birth and Address. It does so using a Text Detection model and a Text Recognition model.

The Text Detection model is tasked with detecting the four entities on an image of an Aadhar Card and returning the bounding box location of each entity. We use the YOLOv8 model provided by Ultralytics. We chose YOLOv8 over other detection models for its ease of use throughout the lifecycle of the model: Ultralytics provides a low-code approach to fine-tuning and deploying a YOLOv8 model. We also want the model to have a low impact on CPU and memory, so we opted for the Nano variant.

The Text Recognition model is tasked with identifying the text contained inside a bounding box. The YOLOv8 model provides the bounding box locations, which are piped into the text recognition model. It identifies the text in each box, and the overall output is returned as a JSON object.

For the front-side, expect the following output:

{
    "AADHAR_NUMBER": "123456789012",
    "NAME": "John Doe",
    "DATE_OF_BIRTH": "01-01-1997"
}
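A front-side record in this shape can be sanity-checked before it is used downstream. The following is a minimal sketch, not part of the project: the helper name `check_front_side` is hypothetical, and it assumes the number is 12 digits with no separators and that the date is in DD-MM-YYYY order, as the sample above suggests.

```python
import re
from datetime import datetime


def check_front_side(record: dict) -> list:
    """Return a list of problems found in a front-side record (empty if none)."""
    problems = []

    # Assumption: the number is returned as 12 digits without spaces.
    if not re.fullmatch(r"\d{12}", record.get("AADHAR_NUMBER", "")):
        problems.append("AADHAR_NUMBER is not 12 digits")

    if not record.get("NAME", "").strip():
        problems.append("NAME is empty")

    # Assumption: the date is day-month-year, separated by hyphens.
    try:
        datetime.strptime(record.get("DATE_OF_BIRTH", ""), "%d-%m-%Y")
    except ValueError:
        problems.append("DATE_OF_BIRTH is not DD-MM-YYYY")

    return problems


sample = {
    "AADHAR_NUMBER": "123456789012",
    "NAME": "John Doe",
    "DATE_OF_BIRTH": "01-01-1997",
}
print(check_front_side(sample))
```

An empty list means the record passed all three checks. OCR misreads tend to show up as a wrong digit count or an unparseable date.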

For the back-side, expect the following output:

{
    "ADDRESS": "Some Random Address",
    "AADHAR_NUMBER": "123456789012"
}
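The hand-off from detection to recognition described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `assemble_entities`, the box coordinates, and the lookup-based stand-in recognizer are all hypothetical, and stripping the number down to digits is an assumed post-processing step.

```python
import json
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def assemble_entities(
    detections: List[Tuple[str, Box]],
    read_text: Callable[[Box], str],
) -> Dict[str, str]:
    """Recognize the text in each detected box and key it by entity label."""
    entities: Dict[str, str] = {}
    for label, box in detections:
        text = read_text(box).strip()
        if label == "AADHAR_NUMBER":
            # Assumption: keep digits only, so "1234 5678 9012" becomes "123456789012".
            text = "".join(ch for ch in text if ch.isdigit())
        entities[label] = text
    return entities


# Hypothetical detector output for the back side of a card.
back_detections = [
    ("ADDRESS", (40, 60, 520, 180)),
    ("AADHAR_NUMBER", (150, 300, 430, 340)),
]

# Stand-in recognizer: looks the text up instead of running OCR on the crop.
fake_ocr = {
    (40, 60, 520, 180): " Some Random Address ",
    (150, 300, 430, 340): "1234 5678 9012",
}

back_side = assemble_entities(back_detections, lambda box: fake_ocr[box])
print(json.dumps(back_side, indent=4))
```

Passing the recognizer in as a callable keeps the assembly step independent of any particular OCR library, so a real recognizer could be swapped in without changing the loop.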

How to use

There are two ways you can use this model:

Using as a Gradio web app.
Using as a FastAPI Server.

Run from Source Code

Create a virtual environment using venv or conda, whichever you prefer. Use Python 3.9, 3.10 or 3.11.

Install dependencies

$ pip install ultralytics supervision huggingface_hub easyocr fastapi uvicorn python-multipart

Run the server

$ uvicorn "src.server:server" --host=127.0.0.1 --port=8000 --workers=4

Using Docker

Docker

# build the image
$ docker build -t server:latest .

# run container
$ docker run -itd --rm --name server -p 8000:8000 server:latest

Docker Compose

$ docker compose up -d

Opening the Web UI

Once the FastAPI server is running, head over to http://localhost:8000/ to open the Gradio UI.
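When the server is started from a script or a container, it may take a few seconds before the page responds. The following is a minimal sketch for waiting until it does, using only the Python standard library. The helper name `wait_for_server` and its timeout values are assumptions for illustration, and the URL is the one given above.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(
    url: str = "http://localhost:8000/",
    timeout: float = 30.0,
    interval: float = 1.0,
) -> bool:
    """Poll the URL until it answers with HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not reachable yet (refused, timed out, or an HTTP error).
            pass
        time.sleep(interval)
    return False
```

Calling `wait_for_server()` returns `True` as soon as the page loads and `False` if nothing answers within the timeout, which makes it usable as a guard before opening the browser.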

Links

Fine-Tuned Model: Huggingface
EasyOCR: GitHub
Notebook used for Fine-Tuning: Link
