Check https://github.com/wey-gu/NebulaGraph-Fraud-Detection-GNN/tree/main/notebooks/Train_GraphSAGE.ipynb for details.
- Input: Graph of Historical Yelp Reviews
- Output: a GraphSAGE node classification model, which can be used inductively (on nodes not seen during training)
[Diagram: the Graph of Historical Reviews is loaded through Nebula-DGL (NebulaLoader) into GNN training, where GNN layers with ReLU activations aggregate each node's neighborhood; the output is the trained GNN Model.]
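To make the "could be inductive" remark concrete, here is a minimal sketch of a single GraphSAGE layer with a mean aggregator. It is written in plain Python for readability, whereas the notebook trains with DGL; the toy graph, the weight matrices, and the function names (`mean`, `matvec`, `sage_layer`) are illustrative assumptions and are not taken from the notebook.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector when a node has no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def matvec(weight: List[Vector], x: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def sage_layer(
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
    w_self: List[Vector],
    w_neigh: List[Vector],
) -> Dict[str, Vector]:
    """h_v = ReLU(W_self * x_v + W_neigh * mean(x_u for u in N(v)))."""
    out: Dict[str, Vector] = {}
    for node, x in features.items():
        agg = mean([features[u] for u in neighbors.get(node, [])], len(x))
        z = [a + b for a, b in zip(matvec(w_self, x), matvec(w_neigh, agg))]
        out[node] = [max(0.0, v) for v in z]
    return out


# Toy graph of three reviews with 2-dimensional features (made-up values).
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
w_self = [[1.0, 0.0], [0.0, 1.0]]
w_neigh = [[0.5, 0.0], [0.0, -1.0]]

h = sage_layer(features, neighbors, w_self, w_neigh)
print(h["a"])  # [1.25, 0.0]

# Inductive use: a review "d" that was not in the original graph is embedded
# with the same weights, using only its own features and its neighbors'.
features_new = dict(features, d=[0.0, 2.0])
neighbors_new = dict(neighbors, d=["c"])
h_new = sage_layer(features_new, neighbors_new, w_self, w_neigh)
print(h_new["d"])  # [0.5, 1.0]
```

Because the layer's parameters are shared weight matrices rather than per-node embeddings, nothing has to be retrained when a new review node appears; this is the property the inference flow below relies on.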
- For how it works, see notebooks/Inference_API.ipynb.
- For its reference implementation, see src/fraudd_backend and src/fraudd_frontend.
- Input: a new review
- Output: is_fraud prediction
- Flow:
  0. A review is inserted into NebulaGraph
  1. A subgraph query is issued for the new review
  2. The subgraph is sent to the Inference API
  3. The Inference API predicts the review's is_fraud label with the trained model
[Diagram: 0. a new record is inserted into the Graph of Historical Transactions; 1. the subgraph of the new record is fetched from the graph; 2. a fraud risk prediction is requested from the Inference API with that subgraph; 3. the Inference API sends a node classification request to the GNN Model and receives the predicted risk in response.]
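The numbered flow above can be sketched as a small orchestration function. This is only an outline under stated assumptions: `handle_new_review` and the three callables passed to it are hypothetical names, and the in-memory stand-ins below replace NebulaGraph and the trained model, which the real backend in src/fraudd_backend talks to.

```python
from typing import Any, Callable, Dict, List


def handle_new_review(
    review: Dict[str, Any],
    insert_review: Callable[[Dict[str, Any]], str],
    fetch_subgraph: Callable[[str], Dict[str, Any]],
    predict: Callable[[Dict[str, Any], str], bool],
) -> Dict[str, Any]:
    """Run steps 0 to 3 of the flow for one incoming review."""
    vertex_id = insert_review(review)        # 0. insert the record
    subgraph = fetch_subgraph(vertex_id)     # 1. get the new record's subgraph
    is_fraud = predict(subgraph, vertex_id)  # 2./3. node classification
    return {"vertex_id": vertex_id, "is_fraud": is_fraud}


# In-memory stand-ins (assumptions for illustration only).
calls: List[str] = []
store: Dict[str, Dict[str, Any]] = {}


def fake_insert(review: Dict[str, Any]) -> str:
    calls.append("insert")
    store[review["vertex_id"]] = review
    return review["vertex_id"]


def fake_subgraph(vertex_id: str) -> Dict[str, Any]:
    calls.append("subgraph")
    return {"center": vertex_id, "nodes": [store[vertex_id]], "edges": []}


def fake_predict(subgraph: Dict[str, Any], vertex_id: str) -> bool:
    calls.append("predict")
    return False  # placeholder; a trained model would be applied here


result = handle_new_review(
    {"vertex_id": "2049"}, fake_insert, fake_subgraph, fake_predict
)
print(calls)   # ['insert', 'subgraph', 'predict']
print(result)  # {'vertex_id': '2049', 'is_fraud': False}
```

Passing the three steps in as callables keeps the ordering explicit and lets each step be swapped for a real graph query or model call without changing the flow itself.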
When a review request is sent to the graph database and the Inference API, the fraud prediction is returned to the Inference API caller and, in parallel, broadcast to the real-time fraud monitor dashboards.
The dashboards are tables that subscribe to the stream of incoming reviews. When a record is flagged as high fraud risk, the responsible party is notified and brought in for follow-up actions.
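The broadcast described above is a publish/subscribe pattern, which can be sketched in a few lines. The class names `FraudMonitor` and `DashboardTable`, and the `alerts` list standing in for "notify the responsible party", are assumptions made for this example; the actual frontend in src/fraudd_frontend may be wired differently.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


class FraudMonitor:
    """Fan out each prediction result to every subscribed dashboard."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Record], None]] = []

    def subscribe(self, callback: Callable[[Record], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, record: Record) -> None:
        for callback in self._subscribers:
            callback(record)


class DashboardTable:
    """A dashboard table that marks each incoming review OK or NOK."""

    def __init__(self) -> None:
        self.rows: List[Record] = []
        self.alerts: List[object] = []

    def on_record(self, record: Record) -> None:
        status = "NOK" if record["is_fraud"] else "OK"
        self.rows.append({**record, "status": status})
        if status == "NOK":
            # Stand-in for notifying the responsible party.
            self.alerts.append(record["vertex_id"])


monitor = FraudMonitor()
table = DashboardTable()
monitor.subscribe(table.on_record)

monitor.publish({"vertex_id": "2049", "is_fraud": False})
monitor.publish({"vertex_id": "2050", "is_fraud": True})
print([row["status"] for row in table.rows])  # ['OK', 'NOK']
print(table.alerts)                           # ['2050']
```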
Demo video: GraphSAGE_FraudDetection_demo.mov
[Diagram: new reviews/requests are generated continuously and enter the Transaction Record and Inference API pipeline (2. Fraud Risk Prediction with Sub Graph); the Inference API pushes each result up to the Real-Time Online Fraud Monitor Web Service, a table whose rows are marked OK or NOK.]
...
We use the Yelp-Fraud dataset, which comes from the paper Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters.
There will be one type of node and three types of edges:
- Node: a review of a restaurant or hotel, with label and feature properties:
  - is_fraud as the label
  - 32 engineered features
- Edge: three types between review nodes, without properties:
  - R-U-R: the two reviews share the same reviewer, named shares_user_with
  - R-S-R: the two reviews give the same rating to the same object, named shares_restaurant_rating_with
  - R-T-R: the two reviews were submitted for the same object in the same month, named shares_restaurant_in_one_month_with
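As an illustration of the schema above, the sketch below groups a flat edge list by the three edge types into a dictionary keyed by (source type, edge type, destination type). That keyed layout is the kind of input graph libraries such as DGL accept when building a heterogeneous graph; whether the notebooks build their graph exactly this way is an assumption, and `group_edges` plus the sample edges are made up for the example.

```python
from typing import Dict, Iterable, List, Tuple

NODE_TYPE = "review"
EDGE_TYPES = (
    "shares_user_with",                     # R-U-R
    "shares_restaurant_rating_with",        # R-S-R
    "shares_restaurant_in_one_month_with",  # R-T-R
)

Edge = Tuple[int, str, int]  # (source review id, edge type, target review id)
Canonical = Tuple[str, str, str]


def group_edges(edges: Iterable[Edge]) -> Dict[Canonical, Tuple[List[int], List[int]]]:
    """Split a flat edge list into per-type (sources, targets) lists."""
    grouped: Dict[Canonical, Tuple[List[int], List[int]]] = {
        (NODE_TYPE, etype, NODE_TYPE): ([], []) for etype in EDGE_TYPES
    }
    for src, etype, dst in edges:
        if etype not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {etype}")
        sources, targets = grouped[(NODE_TYPE, etype, NODE_TYPE)]
        sources.append(src)
        targets.append(dst)
    return grouped


# Made-up sample edges between review ids 0..3.
sample_edges = [
    (0, "shares_user_with", 1),
    (1, "shares_restaurant_rating_with", 2),
    (0, "shares_user_with", 3),
]
graph_data = group_edges(sample_edges)
print(graph_data[("review", "shares_user_with", "review")])  # ([0, 0], [1, 3])
```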
Before this project, I built a playground that ingests the Yelp data graph into NebulaGraph; see https://github.com/wey-gu/nebulagraph-yelp-frauddetection for more.
You can run the following commands to set it up:
# Deploy NebulaGraph for Playground
curl -fsSL nebula-up.siwei.io/install.sh | bash
# Clone the data downloader repo
git clone https://github.com/wey-gu/nebulagraph-yelp-frauddetection && cd nebulagraph-yelp-frauddetection
# Install requirement, then download the data ready for NebulaGraph
python3 -m pip install -r requirements.txt
python3 data_download.py
# Import it to NebulaGraph
docker run --rm -ti \
--network=nebula-net \
-v ${PWD}/yelp_nebulagraph_importer.yaml:/root/importer.yaml \
-v ${PWD}/data:/root \
vesoft/nebula-importer:v3.1.0 \
--config /root/importer.yaml
Then see notebooks/* for model training and for how the real-time fraud detection web service works, and src/* for its reference implementation.
Following the steps below, you should be able to run the real-time fraud detection web service with my trained model loaded.
Get your machine's IP address (not 127.0.0.1); say it is 10.0.0.5.
export MY_IP="10.0.0.5"
Run Backend:
git clone https://github.com/wey-gu/NebulaGraph-Fraud-Detection-GNN.git
cd NebulaGraph-Fraud-Detection-GNN/src
# Replace the demo hostname with MY_IP in the CORS settings, the frontend, and nginx.conf
sed -i "s/nebula-demo.siwei.io/$MY_IP/g" fraudd_backend/fraudd/__init__.py
sed -i "s/nebula-demo.siwei.io/$MY_IP/g" fraudd_frontend/src/components/Table.vue
sed -i "s/nebula-demo.siwei.io/$MY_IP/g" nginx.conf
# install backend dependencies
python3 -m pip install -r requirements.txt
export NG_ENDPOINTS="127.0.0.1:9669";
export FLASK_ENV=development;
export FLASK_APP=wsgi;
# run backend
cd fraudd_backend
python3 -m flask run --reload --host=0.0.0.0
# verify
curl localhost:5000/api
{
"status": "ok"
}
From another terminal, build frontend:
cd NebulaGraph-Fraud-Detection-GNN/src
cd fraudd_frontend
# sudo apt install npm
npm install
npm run build
From another terminal, run Nginx:
cd NebulaGraph-Fraud-Detection-GNN/src
docker-compose up -d
# end-to-end verify backend
curl -X POST localhost:15000/api/add_review \
-d '{"vertex_id": "2049"}' \
-H 'Content-Type: application/json'
# return value
{
"is_fraud": false
}
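The same end-to-end check can be done from Python with only the standard library. This is a sketch mirroring the curl call above (the `/api/add_review` path, port 15000, and the `vertex_id` / `is_fraud` fields come from that call); the helper names `build_add_review_request` and `add_review` are assumptions made for the example.

```python
import json
import urllib.request


def build_add_review_request(base_url: str, vertex_id: str) -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    payload = json.dumps({"vertex_id": vertex_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/add_review",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def add_review(base_url: str, vertex_id: str, timeout: float = 10.0) -> bool:
    """Send the request and return the predicted is_fraud flag."""
    request = build_add_review_request(base_url, vertex_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return bool(json.load(response)["is_fraud"])


# Building the request does not touch the network, so it can be inspected first.
request = build_add_review_request("http://localhost:15000", "2049")
print(request.get_method(), request.full_url)
```

With the backend and Nginx running, calling `add_review("http://localhost:15000", "2049")` sends the request and returns the boolean from the JSON response.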
From a web browser, open http://10.0.0.5:8080/
You can also check my demo: http://nebula-demo.siwei.io:8080