release v0.2.0
--fixup
KiddoZhu committed Oct 12, 2019
1 parent e45d0a0 commit abb6036
Showing 86 changed files with 4,004 additions and 3,295 deletions.
34 changes: 34 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,34 @@
Change log
==========

This file lists all notable changes to the GraphVite library.

v0.2.0 - 2019-10-11
-------------------
- Add scalable multi-GPU prediction for node embedding and knowledge graph embedding.
Evaluation on link prediction is 4.6x faster than v0.1.0.
- New demo dataset `math` and entity prediction evaluation for knowledge graph.
- Support Kepler and Turing GPU architectures.
- Automatically choose the best episode size with regard to RAM limit.
- Add template config files for applications.
- Change the update of global embeddings from average to accumulation. Fix a serious
numeric problem in the update.
- Move file format settings from graph to application. Now one can customize formats
and use comments in evaluation files. Add document for data format.
- Separate GPU implementation into training routines and models. Routines are in
`include/instance/gpu/*` and models are in `include/instance/model/*`.

v0.1.0 - 2019-08-05
-------------------
- Multi-GPU training of large-scale graph embedding
- 3 applications: node embedding, knowledge graph embedding and graph &
high-dimensional data visualization
- Node embedding
    - Model: DeepWalk, LINE, node2vec
    - Evaluation: node classification, link prediction
- Knowledge graph embedding
    - Model: TransE, DistMult, ComplEx, SimplE, RotatE
    - Evaluation: link prediction
- Graph & High-dimensional data visualization
    - Model: LargeVis
    - Evaluation: visualization(2D / 3D), animation(3D), hierarchy(2D)
75 changes: 55 additions & 20 deletions README.md
@@ -34,7 +34,7 @@ Here is a summary of the training time of GraphVite along with the best open-sou
implementations on 3 applications. All the time is reported based on a server with
24 CPU threads and 4 V100 GPUs.

Node embedding on [Youtube] dataset.
Training time of node embedding on [Youtube] dataset.

| Model | Existing Implementation | GraphVite | Speedup |
|------------|-------------------------------|-----------|---------|
@@ -50,24 +50,24 @@ Node embedding on [Youtube] dataset.
[2]: https://github.com/tangjianpku/LINE
[3]: https://github.com/aditya-grover/node2vec

Knowledge graph embedding on [FB15k] dataset.
Training / evaluation time of knowledge graph embedding on [FB15k] dataset.

| Model | Existing Implementation | GraphVite | Speedup |
|-----------------|-------------------------------|-----------|---------|
| [TransE] | [1.31 hrs (1 GPU)][3] | 14.8 mins | 5.30x |
| [RotatE] | [3.69 hrs (1 GPU)][4] | 27.0 mins | 8.22x |
| Model | Existing Implementation | GraphVite | Speedup |
|-----------------|-----------------------------------|--------------------|---------------|
| [TransE] | [1.31 hrs / 1.75 mins (1 GPU)][3] | 13.5 mins / 54.3 s | 5.82x / 1.93x |
| [RotatE] | [3.69 hrs / 4.19 mins (1 GPU)][4] | 28.1 mins / 55.8 s | 7.88x / 4.50x |

[FB15k]: http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf
[TransE]: http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf
[RotatE]: https://arxiv.org/pdf/1902.10197.pdf
[3]: https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding
[4]: https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding

High-dimensional data visualization on [MNIST] dataset.
Training time of high-dimensional data visualization on [MNIST] dataset.

| Model | Existing Implementation | GraphVite | Speedup |
|--------------|-------------------------------|-----------|---------|
| [LargeVis] | [15.3 mins (CPU parallel)][5] | 15.1 s | 60.8x |
| [LargeVis] | [15.3 mins (CPU parallel)][5] | 13.9 s | 66.8x |

[MNIST]: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
[LargeVis]: https://arxiv.org/pdf/1602.00370.pdf
@@ -85,19 +85,15 @@ Installation

### From Conda ###

GraphVite can be installed through conda with only one line.

```bash
conda install -c milagraph graphvite cudatoolkit=x.x
conda install -c milagraph graphvite cudatoolkit=$(nvcc -V | grep -Po "(?<=V)\d+\.\d+")
```

where `x.x` is your CUDA version, e.g. 9.2 or 10.0.

If you only need embedding training without evaluation, you can use the following
alternative with minimal dependencies.

```bash
conda install -c milagraph graphvite-mini cudatoolkit=x.x
conda install -c milagraph graphvite-mini cudatoolkit=$(nvcc -V | grep -Po "(?<=V)\d+\.\d+")
```
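
The command substitution in the install lines pulls the `major.minor` CUDA version out of the `V…` token printed by `nvcc -V`. A minimal Python sketch of the same extraction follows. The helper name `cuda_version` and the sample banner line are illustrative assumptions, not part of GraphVite, and real `nvcc` output may differ between CUDA releases.

```python
import re


def cuda_version(nvcc_output):
    """Return the 'major.minor' CUDA version found in `nvcc -V` output, or None."""
    # Same idea as the grep pattern: digits, a literal dot, digits,
    # located immediately after an upper-case 'V'.
    match = re.search(r"(?<=V)\d+\.\d+", nvcc_output)
    return match.group(0) if match else None


# Illustrative banner line (assumed format, not captured from a real machine).
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(cuda_version(sample))
```

Escaping the dot keeps the pattern from matching an arbitrary character between the two digit groups.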

### From Source ###
@@ -113,6 +109,24 @@ cd build && cmake .. && make && cd -
cd python && python setup.py install && cd -
```

### On Colab ###

```bash
!wget -c https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!chmod +x Miniconda3-latest-Linux-x86_64.sh
!./Miniconda3-latest-Linux-x86_64.sh -b -p /usr/local -f

!conda install -y -c milagraph -c conda-forge graphvite \
python=3.6 cudatoolkit=$(nvcc -V | grep -Po "(?<=V)\d+\.\d+")
!conda install -y wurlitzer ipykernel
```

```python
import site
site.addsitedir("/usr/local/lib/python3.6/site-packages")
%reload_ext wurlitzer
```

Quick Start
-----------

@@ -126,10 +140,14 @@ Typically, the example takes no more than 1 minute. You will obtain some output

```
Batch id: 6000
loss = 0.371641
loss = 0.371041
------------- link prediction --------------
AUC: 0.899933
macro-F1@20%: 0.236794
micro-F1@20%: 0.388110
----------- node classification ------------
macro-F1@20%: 0.242114
micro-F1@20%: 0.391342
```

Baseline Benchmark
@@ -139,13 +157,30 @@ To reproduce a baseline benchmark, you only need to specify the keywords of the
experiment, e.g. model and dataset.

```bash
graphvite baseline [keyword ...] [--no-eval] [--gpu n] [--cpu m]
graphvite baseline [keyword ...] [--no-eval] [--gpu n] [--cpu m] [--epoch e]
```

You may also set the number of GPUs and the number of CPUs per GPU.

Use ``graphvite list`` to get a list of available baselines.

Custom Experiment
-----------------

Create a yaml configuration scaffold for graph, knowledge graph, visualization or
word graph.

```bash
graphvite new [application ...] [--file f]
```

Fill in the necessary entries in the configuration following the instructions. You
can then run the configuration with

```bash
graphvite run [config] [--no-eval] [--gpu n] [--cpu m] [--epoch e]
```
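
When experiments are launched from a script, the `graphvite run` invocation above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in the usage line; the helper name `graphvite_run_command` and the file name `my_config.yaml` are hypothetical, and nothing here checks that the values are accepted by the CLI.

```python
import shlex


def graphvite_run_command(config, no_eval=False, gpu=None, cpu=None, epoch=None):
    """Build the argv list for `graphvite run` using the flags from the usage line."""
    args = ["graphvite", "run", config]
    if no_eval:
        args.append("--no-eval")
    # Optional numeric flags are emitted only when a value is given.
    for flag, value in (("--gpu", gpu), ("--cpu", cpu), ("--epoch", epoch)):
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical configuration file name, for illustration only.
cmd = graphvite_run_command("my_config.yaml", gpu=4, cpu=6)
print(" ".join(shlex.quote(a) for a in cmd))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids shell quoting issues for paths containing spaces.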

High-dimensional Data Visualization
-----------------------------------

@@ -156,8 +191,8 @@ GraphVite.
graphvite visualize [file] [--label label_file] [--save save_file] [--perplexity n] [--3d]
```

The file can be either in numpy dump or text format. For the save file, we recommend
to use a `png` format, while `pdf` is also supported.
The file can be either a numpy dump (`*.npy`) or a text matrix (`*.txt`). For the save
file, we recommend the `png` format; `pdf` is also supported.

Contributing
------------
3 changes: 2 additions & 1 deletion conda/graphvite-mini/meta.yaml
@@ -1,6 +1,6 @@
package:
name: graphvite-mini
version: 0.1.0
version: 0.2.0

source:
path: ../..
@@ -39,6 +39,7 @@ requirements:
- easydict
- six
- future
- psutil

build:
string:
3 changes: 2 additions & 1 deletion conda/graphvite/meta.yaml
@@ -1,6 +1,6 @@
package:
name: graphvite
version: 0.1.0
version: 0.2.0

source:
path: ../..
@@ -40,6 +40,7 @@ requirements:
- six
- future
- imageio
- psutil
- scipy
- matplotlib
- pytorch
1 change: 1 addition & 0 deletions conda/requirements.txt
@@ -17,6 +17,7 @@ conda-forge::easydict
six
future
imageio
psutil
scipy
matplotlib
pytorch
40 changes: 40 additions & 0 deletions config/demo/math.yaml
@@ -0,0 +1,40 @@
application:
  knowledge graph

resource:
  gpus: [0]
  cpu_per_gpu: 8
  dim: 512

graph:
  file_name: <math.train>

build:
  optimizer:
    type: Adam
    lr: 5.0e-3
    weight_decay: 0
  num_partition: auto
  num_negative: 8
  batch_size: 100000
  episode_size: 100

train:
  model: RotatE
  num_epoch: 2000
  margin: 9
  sample_batch_size: 2000
  adversarial_temperature: 2
  log_frequency: 100

evaluate:
  task: link prediction
  file_name: <math.test>
  filter_files:
    - <math.train>
    - <math.valid>
    - <math.test>
  target: tail

save:
  file_name: rotate_math.pkl
15 changes: 11 additions & 4 deletions config/quick_start.yaml → config/demo/quick_start.yaml
@@ -6,6 +6,10 @@ resource:
cpu_per_gpu: 8
dim: 128

format:
delimiters: " \t\r\n"
comment: "#"

graph:
file_name: <blogcatalog.train>
as_undirected: true
@@ -30,10 +34,13 @@ train:
log_frequency: 1000

evaluate:
task: node classification
file_name: <blogcatalog.label>
portions: [0.2]
times: 1
- task: link prediction
file_name: <blogcatalog.test>
filter_file: <blogcatalog.train>
- task: node classification
file_name: <blogcatalog.label>
portions: [0.2]
times: 1

save:
file_name: line_blogcatalog.pkl
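
The new `format` section in this config sets `delimiters: " \t\r\n"` and `comment: "#"`. As a rough illustration of what such settings imply for an input line, here is a minimal Python sketch. It assumes that a comment runs to the end of the line and that consecutive delimiters collapse; both are assumptions about the parser, not confirmed GraphVite behavior, and `split_fields` is a hypothetical helper.

```python
def split_fields(line, delimiters=" \t\r\n", comment="#"):
    """Drop a trailing comment, then split the rest on any delimiter character."""
    # Assumption: everything from the comment marker to end of line is ignored.
    if comment:
        line = line.split(comment, 1)[0]
    fields, token = [], []
    for ch in line:
        if ch in delimiters:
            # Assumption: runs of delimiters produce no empty fields.
            if token:
                fields.append("".join(token))
                token = []
        else:
            token.append(ch)
    if token:
        fields.append("".join(token))
    return fields


print(split_fields("node_a\tnode_b  # an edge with a comment\n"))
print(split_fields("# a comment-only line"))
```

Under these assumptions a comment-only line yields no fields, so a reader can skip it without special-casing.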
2 changes: 1 addition & 1 deletion config/graph/deepwalk_flickr.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <flickr.train>
file_name: <flickr.graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/deepwalk_friendster-small.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <friendster.small_train>
file_name: <friendster.small_graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/deepwalk_friendster.yaml
@@ -7,7 +7,7 @@ resource:
dim: 96

graph:
file_name: <friendster.train>
file_name: <friendster.graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/deepwalk_youtube.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <youtube.train>
file_name: <youtube.graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/line_flickr.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <flickr.train>
file_name: <flickr.graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/line_friendster-small.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <friendster.small_train>
file_name: <friendster.small_graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/line_friendster.yaml
@@ -7,7 +7,7 @@ resource:
dim: 96

graph:
file_name: <friendster.train>
file_name: <friendster.graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/line_youtube.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <youtube.train>
file_name: <youtube.graph>
as_undirected: true

build:
2 changes: 1 addition & 1 deletion config/graph/node2vec_youtube.yaml
@@ -7,7 +7,7 @@ resource:
dim: 128

graph:
file_name: <youtube.train>
file_name: <youtube.graph>
as_undirected: true

build:
4 changes: 2 additions & 2 deletions config/knowledge_graph/complex_fb15k-237.yaml
@@ -1,5 +1,5 @@
application:
knowledge_graph
knowledge graph

resource:
gpus: []
@@ -12,7 +12,7 @@ graph:
build:
optimizer:
type: Adam
lr: 5.0e-4
lr: 2.0e-5
weight_decay: 0
num_partition: auto
num_negative: 64