This project implements an online learning approach for handling class imbalance in streaming data using Adaptive Width Kernel Density Estimation (AWKDE). The repository also includes several baseline methods and notebooks for comparative analysis.
- Python 3.9+
- TensorFlow 2.x
- NumPy
- Scikit-learn
- Pandas
- Clone the repository:
git clone https://github.com/danielledaeun/obawkde.git
cd obawkde
- Set up Python environment:
conda create --name obawkde python=3.12
conda activate obawkde
pip install -r requirements.txt
- Install the AWKDE package:
git submodule update --init
cd lib/awkde
conda install -c conda-forge compilers # Required for compilation on some systems
pip install -e .
cd ../..
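After installation, a quick check that the dependencies are visible to the active environment can save a failed run later. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (in particular `awkde` for the submodule package and `sklearn` for Scikit-learn) are assumptions based on the requirements listed above.

```python
import importlib.util

# Import names assumed from the requirements list; "sklearn" is the import
# name of Scikit-learn, and "awkde" is assumed for the submodule package.
REQUIRED_MODULES = ["tensorflow", "numpy", "sklearn", "pandas", "awkde"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

If `awkde` is reported as missing, re-running the submodule and `pip install -e .` steps above is the first thing to try.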
obawkde/
├── data/ # Directory for generated datasets
├── lib/
│ └── awkde/ # AWKDE implementation
├── notebooks/ # Analysis notebooks
│ ├── generate_data.ipynb # Data generation scripts
│ ├── table_results.ipynb # Results analysis
│ └── figure_results.ipynb # Visualization tools
├── main.py # Main execution script
├── proposing.py # Proposed method implementation
├── base.py # Base models for online learning
└── README.md
Before running experiments, you need to generate the required datasets:
- Create necessary data files:
jupyter notebook notebooks/generate_data.ipynb
Running the notebook generates several synthetic datasets (sea, sine, circle) with different characteristics in the data/ directory.
Note: The data/ directory is kept in the repository structure, but the generated .csv files are not tracked by Git.
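To give a sense of what an imbalanced synthetic stream looks like, here is a rough sketch of a SEA-style generator. It is not the code in generate_data.ipynb: the function name `sea_stream`, the threshold `theta`, the choice of which class is the minority, and the rejection-sampling scheme are all assumptions made for illustration.

```python
import random


def sea_stream(n, minority_ratio=0.01, theta=7.0, seed=0):
    """Yield n (features, label) pairs from a SEA-like concept.

    Each sample has three features drawn uniformly from [0, 10]; the label is 1
    when the first two features sum to at most theta (the third is irrelevant).
    Label 1 is treated as the minority class here, which is an assumption.
    """
    rng = random.Random(seed)
    for _ in range(n):
        # Choose the class first so the stream matches the target ratio on average.
        label = 1 if rng.random() < minority_ratio else 0
        # Rejection-sample a point whose concept label matches the chosen class.
        while True:
            x = [rng.uniform(0.0, 10.0) for _ in range(3)]
            if int(x[0] + x[1] <= theta) == label:
                break
        yield x, label


if __name__ == "__main__":
    samples = list(sea_stream(1000, minority_ratio=0.1))
    minority = sum(label for _, label in samples)
    print(f"{minority} minority samples out of {len(samples)}")
```

Picking the class before the features keeps the imbalance ratio under direct control, which is convenient when comparing settings such as 0.1%, 1%, and 10%.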
- Generate data (if not already done):
jupyter notebook notebooks/generate_data.ipynb
- Execute experiments:
python main.py
- Analyze results:
jupyter notebook notebooks/table_results.ipynb
Results are stored in the res/ directory with the following structure:
res/
├── sea/
│ ├── noise/
│ │ ├── 0/ # Imbalance Ratio: 0.1%
│ │ ├── 1/ # Imbalance Ratio: 1%
│ │ └── 10/ # Imbalance Ratio: 10%
│ ├── safe/
│ └── borderline/
Note: Result files are not included in the repository.
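When collecting results across many runs, it can help to turn a result path back into its settings. The sketch below assumes every result directory follows the res/<dataset>/<variant>/<ratio> layout shown above (including under safe/ and borderline/, which the tree does not spell out). The function name `describe_result_dir` and the key name `variant` are hypothetical and not part of the repository.

```python
from pathlib import PurePosixPath

# Directory name -> imbalance ratio, taken from the tree above.
IMBALANCE_RATIO = {"0": "0.1%", "1": "1%", "10": "10%"}


def describe_result_dir(path):
    """Split a path like 'res/sea/noise/0' into its experiment settings."""
    parts = PurePosixPath(path).parts
    if "res" not in parts:
        raise ValueError(f"no 'res' component in {path!r}")
    start = parts.index("res") + 1
    levels = parts[start:start + 3]
    if len(levels) != 3:
        raise ValueError(f"expected res/<dataset>/<variant>/<ratio>, got {path!r}")
    dataset, variant, ratio_dir = levels
    return {
        "dataset": dataset,
        "variant": variant,
        # Unlisted directory names are reported as "unknown" rather than guessed.
        "imbalance_ratio": IMBALANCE_RATIO.get(ratio_dir, "unknown"),
    }


if __name__ == "__main__":
    print(describe_result_dir("res/sea/noise/0"))
```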
Contributions are welcome! Please feel free to submit a Pull Request.