The Aadhar card is an official photo ID document in India. This project aims to provide an easy-to-use way to extract the key information from a picture of an Aadhar card: the Aadhar number, the holder's name, the date of birth and the address. It does this with a text detection model and a text recognition model.
The text detection model detects the four entities in an image of an Aadhar card and returns the bounding box of each one. We use the YOLOv8 model provided by Ultralytics. We chose YOLOv8 over other detection models for its ease of use throughout the model lifecycle: Ultralytics provides a low-code approach to fine-tuning and deploying a YOLOv8 model. We also want the model to have a low CPU and memory footprint, so we opted for the Nano variant.
The text recognition model identifies the text contained inside a bounding box. The YOLOv8 model provides the bounding box locations, which are piped into the text recognition model; it reads the text in each box and returns the overall output as a JSON object.
For the front side, expect the following output:
{
"AADHAR_NUMBER": "123456789012",
"NAME": "John Doe",
"DATE_OF_BIRTH": "01-01-1997"
}
For the back side, expect the following output:
{
"ADDRESS": "Some Random Address",
"AADHAR_NUMBER": "123456789012"
}
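The flow described above, where detected boxes are passed to a recognizer and the results are collected into one JSON object, can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `Detection`, `build_output` and the `recognize` callback are hypothetical names, and the recognizer is stubbed with fixed strings so the example runs without YOLOv8 or EasyOCR installed. It also assumes that whitespace is stripped from the Aadhar number, since the card prints the number in groups of four digits.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Detection:
    """One detected entity: a class label plus its bounding box (hypothetical type)."""
    label: str
    box: Box
    confidence: float


def build_output(detections: List[Detection],
                 recognize: Callable[[Box], str]) -> Dict[str, str]:
    """Run the recognizer on each box and collect the text under its label.

    If a label is detected more than once, the highest-confidence box wins.
    """
    best: Dict[str, Detection] = {}
    for det in detections:
        if det.label not in best or det.confidence > best[det.label].confidence:
            best[det.label] = det

    output: Dict[str, str] = {}
    for label, det in best.items():
        text = recognize(det.box).strip()
        if label == "AADHAR_NUMBER":
            # Assumption: drop whitespace so "1234 5678 9012" becomes "123456789012".
            text = "".join(text.split())
        output[label] = text
    return output


# Stub standing in for the text recognition model: maps a box to fixed text.
fake_ocr = {
    (40, 200, 300, 240): "1234 5678 9012",
    (40, 60, 220, 90): "John Doe",
    (40, 100, 200, 130): "01-01-1997",
}

detections = [
    Detection("AADHAR_NUMBER", (40, 200, 300, 240), 0.93),
    Detection("NAME", (40, 60, 220, 90), 0.88),
    Detection("DATE_OF_BIRTH", (40, 100, 200, 130), 0.90),
]

result = build_output(detections, lambda box: fake_ocr[box])
print(json.dumps(result, indent=2))
```

In a real pipeline the `recognize` callback would crop the image to the box and run the OCR model on the crop; passing it in as a function keeps the assembly logic testable without either model.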
There are two ways to use this model:
- As a Gradio web app.
- As a FastAPI server.
Create a virtual environment using venv or conda, whichever you prefer. Please use Python 3.9, 3.10 or 3.11.
Install dependencies
$ pip install ultralytics supervision huggingface_hub easyocr fastapi uvicorn python-multipart
Run the server
$ uvicorn "src.server:server" --host=127.0.0.1 --port=8000 --workers=4
Docker
# build the image
$ docker build -t server:latest .
# run container
$ docker run -itd --rm --name server -p 8000:8000 server:latest
Docker Compose
$ docker compose up -d
Once the FastAPI server is running, head over to http://localhost:8000/ to open the Gradio UI.
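Besides the UI, the server could also be called from a script. The sketch below builds a multipart image upload using only the Python standard library. The endpoint path `/predict` and the form field name `file` are assumptions made for illustration, since the actual route is not documented here; check `src/server.py` for the real values before using it.

```python
import json
import urllib.request
import uuid
from typing import Tuple

SERVER_URL = "http://localhost:8000"
ENDPOINT = "/predict"  # hypothetical route, not confirmed by the project
FIELD_NAME = "file"    # hypothetical form field name


def encode_multipart(field: str, filename: str, data: bytes,
                     content_type: str = "image/jpeg") -> Tuple[bytes, str]:
    """Encode a single file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_request(image_bytes: bytes, filename: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the image."""
    body, content_type = encode_multipart(FIELD_NAME, filename, image_bytes)
    return urllib.request.Request(
        SERVER_URL + ENDPOINT,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


def extract(image_path: str) -> dict:
    """Upload an image and parse the JSON response (requires a running server)."""
    with open(image_path, "rb") as fh:
        request = build_request(fh.read(), image_path)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `extract("front.jpg")` would be expected to return a dictionary shaped like the front-side output shown earlier, provided the assumed route and field name match the server.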
- Fine-Tuned Model: Hugging Face
- EasyOCR: GitHub
- Notebook used for Fine-Tuning: Link