* update README
Adam Kosiorek committed Jul 10, 2017
1 parent be61fed, commit 8bd4c59
Showing 5 changed files with 73 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,50 @@ | ||
# Hierarchical Attentive Object Tracking | ||
|
||
This codebase (in progress) implements the system described in the paper: | ||
This is an official TensorFlow implementation of single object tracking in videos using attentive recurrent neural networks, as presented in the following paper: | ||
|
||
Hierarchical Attentive Object Tracking | ||
[A. R. Kosiorek](https://www.linkedin.com/in/adamkosiorek/?locale=en_US), [A. Bewley](http://ori.ox.ac.uk/mrg_people/alex-bewley/), [I. Posner](http://ori.ox.ac.uk/mrg_people/ingmar-posner/), ["Hierarchical Attentive Object Tracking", arXiv preprint arxiv:1706.09262](https://arxiv.org/abs/1706.09262). by | ||
|
||
[Adam R. Kosiorek](https://www.linkedin.com/in/adamkosiorek/?locale=en_US), [Alex Bewley](http://ori.ox.ac.uk/mrg_people/alex-bewley/), [Ingmar Posner](http://ori.ox.ac.uk/mrg_people/ingmar-posner/) | ||
* **author**: Adam Kosiorek, Oxford Robotics Institute, University of Oxford | ||
* **email**: adamk(at)robots.ox.ac.uk | ||
* **paper**: https://arxiv.org/abs/1706.09262 | ||
* **webpage**: http://ori.ox.ac.uk/ | ||
|
||
See [the paper](https://arxiv.org/abs/1706.09262) for more details. Please contact Adam Kosiorek (adamk@robots.ox.ac.uk) if you have any questions. | ||
## Installation | ||
Install [Tensorflow v1.1](https://www.tensorflow.org/versions/r1.1/install/) and the following dependencies | ||
(using `pip install -r requirements.txt` (preferred) or `pip install [package]`): | ||
* matplotlib==1.5.3 | ||
* numpy==1.12.1 | ||
* pandas==0.18.1 | ||
* scipy==0.18.1 | ||
|
||
## Demo | ||
The notebook `scripts/demo.ipynb` contains a demo that shows how to evaluate the tracker on an arbitrary image sequence. By default, it runs on the images in the `imgs` folder. | ||
Before running the demo, download the AlexNet weights (described in the Training section). | ||
|
||
|
||
# Training on KITTI | ||
## Data preparation | ||
## Data | ||
|
||
1. Download KITTI dataset from [here](http://www.cvlibs.net/datasets/kitti/eval_tracking.php). We need [left color imagesi](http://www.cvlibs.net/download.php?file=data_tracking_image_2.zip) and [tracking labels](http://www.cvlibs.net/download.php?file=data_tracking_label_2.zip). | ||
2. Unpack data into a data folder; images should be in an image folder and labels should be in a label folder. | ||
3. Resize all the images to (621, 187). | ||
1. Download KITTI dataset from [here](http://www.cvlibs.net/datasets/kitti/eval_tracking.php). We need [left color images](http://www.cvlibs.net/download.php?file=data_tracking_image_2.zip) and [tracking labels](http://www.cvlibs.net/download.php?file=data_tracking_label_2.zip). | ||
2. Unpack data into a data folder; images should be in an image folder and labels should be in a label folder. | ||
3. Resize all the images to `(height=187, width=621)`, e.g. by using the `scripts/resize_imgs.sh` script. | ||
|
||
## Training | ||
|
||
1. Download the AlexNet weights from [here](http://www.cs.toronto.edu/~guerzhoy/tf_alexnet/bvlc_alexnet.npy) and put them file in the `checkpoints` folder. | ||
2. Run `python scripts/train_kitti.py --img_folder=path/to/image/folder --label_folder=/path/to/label/folder`. | ||
1. Download the AlexNet weights: | ||
* Execute `scripts/download_alexnet.sh` or | ||
* Download the weights from [here](http://www.cs.toronto.edu/~guerzhoy/tf_alexnet/bvlc_alexnet.npy) and put the file in the `checkpoints` folder. | ||
2. Run | ||
|
||
The training script will save model checkpoints in the `checkpoints` folder and report train and test scores every couple of epochs. You can run tensorboard in the `checkpoints` folder to visualise training progress. Training should converge in about 400k iterations, which should take about 3 days. It might take a couple of hours between logging messages, so don't worry. | ||
python scripts/train_kitti.py --img_folder=path/to/image/folder --label_folder=/path/to/label/folder | ||
|
||
The training script will save model checkpoints in the `checkpoints` folder and report train and test scores every couple of epochs. You can run tensorboard in the `checkpoints` folder to visualise training progress. Training should converge in about 400k iterations, which should take about 3 days. It might take a couple of hours between logging messages, so don't worry. | ||
|
||
## Evaluation on KITTI dataset | ||
The `scripts/eval_kitti.ipynb` notebook contains the code necessary to prepare (IoU, timesteps) curves for the train and validation sets of KITTI. Before running the evaluation: | ||
* Download AlexNet weights (described in the Training section). | ||
* Update image and label folder paths in the notebook. | ||
|
||
|
||
## Release Notes | ||
**Version 1.0** | ||
* Original version from the paper. It contains the KITTI tracking experiment. |
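
The evaluation notebook plots IoU against timesteps. As a reminder of what that metric measures, here is a minimal sketch of intersection over union for two axis-aligned boxes. The `(y, x, height, width)` box layout and the function name `iou` are assumptions made for illustration; they are not taken from the repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (y, x, height, width) tuples. This layout is an assumption
    made for the sketch, not the repository's own convention.
    """
    ya, xa, ha, wa = box_a
    yb, xb, hb, wb = box_b

    # Overlap along each axis; clamp at zero when the boxes do not touch.
    inter_h = max(0.0, min(ya + ha, yb + hb) - max(ya, yb))
    inter_w = max(0.0, min(xa + wa, xb + wb) - max(xa, xb))
    intersection = inter_h * inter_w

    union = ha * wa + hb * wb - intersection
    return intersection / union if union > 0 else 0.0
```

Averaging this value over all sequences at each timestep would give one point of an (IoU, timesteps) curve.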
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
matplotlib==1.5.3 | ||
numpy==1.12.1 | ||
pandas==0.18.1 | ||
scipy==0.18.1 |
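
Because the pins above are exact, it can help to check an environment against them before training. The following is a rough sketch only: `parse_pins` and `find_mismatches` are hypothetical helpers, not part of the repository, and the caller is assumed to supply the installed versions.

```python
REQUIREMENTS = """\
matplotlib==1.5.3
numpy==1.12.1
pandas==0.18.1
scipy==0.18.1
"""


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form `name==version`."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Map each package whose installed version differs from its pin
    to a (wanted, found) pair; `found` is None if it is not installed."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }
```

One way to build the `installed` mapping is to import each package and read its `__version__` attribute; an empty result from `find_mismatches` means the environment matches the pins.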
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/usr/bin/env bash | ||
|
||
mkdir -p checkpoints | ||
wget -P checkpoints http://www.cs.toronto.edu/~guerzhoy/tf_alexnet/bvlc_alexnet.npy |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
#!/usr/bin/env bash | ||
|
||
# This script finds images in a directory tree, resizes them and puts them into a new directory, | ||
# preserving the directory structure. | ||
# | ||
# Usage: | ||
# ./resize_imgs.sh input_dir output_dir | ||
# | ||
|
||
input_dir=$1 | ||
output_dir=$2 | ||
|
||
echo $input_dir | ||
echo $output_dir | ||
|
||
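for img_path in $(find "$input_dir" -iname '*.png'); do | ||
echo "processing $img_path" | ||
# strip the input_dir prefix so that the directory structure is mirrored under output_dir | ||
output_path="$output_dir/${img_path#"$input_dir"}" | ||
|
||
# create the parent directory of the output file, not a directory named after the file | ||
mkdir -p "$(dirname "$output_path")" | ||
|
||
convert "$img_path" -resize 621x187 "$output_path" | ||
done |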
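
The script's header says the directory structure should be preserved, which comes down to mapping each input path to the same relative path under the output directory. Below is a minimal Python sketch of that mapping, using only the standard library. The helper names `find_pngs` and `mirrored_output_path` are hypothetical, and the resizing itself is left to whichever image tool is used.

```python
import os


def find_pngs(input_dir):
    """Yield paths of all .png files below input_dir (case-insensitive)."""
    for root, _dirs, files in os.walk(input_dir):
        for name in sorted(files):
            if name.lower().endswith(".png"):
                yield os.path.join(root, name)


def mirrored_output_path(img_path, input_dir, output_dir):
    """Return the path under output_dir that mirrors img_path's
    position relative to input_dir."""
    rel_path = os.path.relpath(img_path, input_dir)
    return os.path.join(output_dir, rel_path)
```

For example, with `input_dir="data"` and `output_dir="resized"`, the image `data/0000/000001.png` maps to `resized/0000/000001.png`. The parent directory of each output path still has to be created before writing the resized image.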