
great job #3

Open
ucasiggcas opened this issue Jun 23, 2020 · 38 comments

Comments
@ucasiggcas

Hi, well done! I will try to reproduce the repo.
By the way, are there any metrics for the recall stage?

Thanks.

@xuetf
Owner

xuetf commented Jun 23, 2020

Hi,
The Tianchi forum provides the official evaluation scripts; you can refer to https://tianchi.aliyun.com/forum/postDetail?spm=5176.12586969.1002.3.6c3f29e8qbwsHt&postId=102089.

But our repo hasn't provided the offline evaluation code so far. We will provide it as soon as possible.
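
For reference, here is a minimal sketch (not the official Tianchi script) of how hitrate@50 and NDCG@50 can be computed when each user has a single held-out ground-truth click; the _full/_half variants reported later in this thread additionally split the test cases by item popularity, which is not reproduced here:

import numpy as np

def evaluate_topk(rec_lists, ground_truth, k=50):
    # rec_lists: {user_id: [item_id, ...]} ranked recommendations
    # ground_truth: {user_id: item_id} single held-out click per user
    hits, ndcgs = [], []
    for user, true_item in ground_truth.items():
        topk = rec_lists.get(user, [])[:k]
        if true_item in topk:
            rank = topk.index(true_item)           # 0-based position in the list
            hits.append(1.0)
            ndcgs.append(1.0 / np.log2(rank + 2))  # single relevant item, so IDCG = 1
        else:
            hits.append(0.0)
            ndcgs.append(0.0)
    return np.mean(hits), np.mean(ndcgs)           # hitrate@k, ndcg@k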

@xuetf
Owner

xuetf commented Jun 25, 2020

Hi, you can now reproduce the results on the 'offline' branch. Just read the 'Evaluation' part of the updated README.md file.

@ucasiggcas
Author

Is this for online or offline use?
How do I deploy it?

@xuetf
Owner

xuetf commented Jun 29, 2020

Hi, please refer to the README file. It explains the environment and how to run the code in detail.

@ucasiggcas
Author

What does session-id mean in the trained models?
[screenshot]

@ucasiggcas
Author

ucasiggcas commented Jul 7, 2020

Does each user_id have one session_id?
Also, what does phase mean? The models are also stored per phase.
How is the P metric computed? I see SR-GNN reports a P value.
Can AUC be computed for the recall stage?
Thanks.

@ucasiggcas
Author

Hello, how do I evaluate the SR-GNN recall results?
Following the README:
You can reproduce these results by checkout the 'offline' branch by git checkout offline and run python3 code/sr_gnn_main.py and python3 code/recall_main.py in sequence.
I ran sr_gnn_main.py first and then recall_main.py.
recall_main.py has not finished yet; the output so far is:

train/validate split done...
create offline eval answer done...
begin read item df...
begin compute similarity using faiss...
108916
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
(2643000, 4)
(1223242, 4)
using multi_processing
phase: 7
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test, target_phase=7
user-cf user-sim begin
bi-graph item-sim begin
item-cf item-sim begin
swing item-sim begin
100%|██████████████████████████████████████████████████████████████████████| 18004/18004 [00:00<00:00, 239463.23it/s]
100%|███████████████████████████████████████████████████████████████████████| 45190/45190 [00:02<00:00, 18915.69it/s]
100%|███████████████████████████████████████████████████████████████████████| 18004/18004 [00:00<00:00, 18381.86it/s]
user-cf user-sim-pair done, pair_num=18004
100%|████████████████████████████████████████████████████████████████████████| 45190/45190 [00:07<00:00, 6030.94it/s]
 27%|███████████████████▋                                                    | 12385/45190 [00:08<00:21, 1540.94it/s]swing item-sim-pair done, pair_num=45060
100%|████████████████████████████████████████████████████████████████████████| 45190/45190 [00:30<00:00, 1458.91it/s]
bi-graph item-sim-pair done, pair_num=45190
100%|█████████████████████████████████████████████████████████████████████████| 18004/18004 [01:03<00:00, 281.99it/s]
100%|███████████████████████████████████████████████████████████████████████| 45190/45190 [00:04<00:00, 10404.96it/s]
item-cf item-sim-pair done, pair_num=45190
current_len=0
current_len=1
current_len=2
current_len=3
drop duplicates...
recall-source-num=4
do recall for swing
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
do recall for user-cf
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
do recall for bi-graph
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
do recall for item-cf
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test

Is this normal? I also don't quite see the NDCG output.
Thanks.

@xuetf
Owner

xuetf commented Jul 8, 2020

> What does session-id mean in the trained models?

session_id is just a prefix for the saved model name; it is specified in sr_gnn_main.py.
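
As an aside, a minimal sketch (assuming the TensorFlow 1.x Saver that the training logs in this thread suggest) of how a checkpoint_path prefix such as .../session_id turns into checkpoint names like the session_id-2556 seen later in this issue:

import os
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
_ = tf.Variable(0, name='global_step', trainable=False)
saver = tf.train.Saver()

os.makedirs('./models/v1/offline/7', exist_ok=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # 'session_id' is only a filename prefix; the global step is appended,
    # producing files such as ./models/v1/offline/7/session_id-2556.index
    saver.save(sess, './models/v1/offline/7/session_id', global_step=2556)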

@xuetf
Owner

xuetf commented Jul 8, 2020

> Is this normal? I also don't quite see the NDCG output.

Yes, this is normal. The semi-final has 3 phases in total: 7, 8 and 9; you are currently running phase 7. After all 3 phases have finished, the official evaluation code is run to compute NDCG and hitrate.

@xuetf
Owner

xuetf commented Jul 8, 2020

If you want to evaluate SR-GNN on its own, just change cf_methods = {'item-cf', 'bi-graph', 'swing', 'user-cf'} to cf_methods = {}; then only the SR-GNN results will be read.

@ucasiggcas
Author

How should data like train_click.csv be interpreted? Could you explain it? Some sample rows:

2255,18,0.984280312637961
18349,35,0.9841110719125039
4489,35,0.9842627512424618
16846,66,0.984069476211683
1888,66,0.9842584372310226
21919,80,0.9842570189950302
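
Judging from the read_csv calls quoted later in this thread, the three columns appear to be user_id, item_id and a normalized click timestamp; below is a minimal loading sketch under that assumption (the file path is only illustrative):

import pandas as pd

# assumed layout, based on the read_csv calls shown later in this thread:
# column 0 = user_id, column 1 = item_id, column 2 = normalized click time
click_df = pd.read_csv('user_data/offline_underexpose_train/underexpose_train_click-0.csv',
                       header=None, names=['user_id', 'item_id', 'time'])
print(click_df.head())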

@ucasiggcas
Author

Why does running recall_main.py produce the following error?

train/validate split done...
create offline eval answer done...
begin read item df...
108916
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
(2643000, 4)
(1223242, 4)
using multi_processing
phase: 7
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test, target_phase=7
drop duplicates...
recall-source-num=0
0
read sr-gnn results....
sr-gnn begin...
sr-gnn rec path=user_data/sr-gnn/offline/7/data/standard_rec.txt
Traceback (most recent call last):
  File "my_sr_gnn_eval2.py", line 62, in <module>
    recall_methods={'sr-gnn'})
  File "/data1/xulm1/debiasing_rush/code/recall/do_recall_multi_processing.py", line 115, in do_multi_recall_results_multi_processing
    standard_sr_gnn_recall_item_dict = read_sr_gnn_results(phase, prefix='standard', adjust_type=adjust_type)
  File "/data1/xulm1/debiasing_rush/code/recall/sr_gnn/read_sr_gnn_results.py", line 54, in read_sr_gnn_results
    with open(sr_gnn_rec_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'user_data/sr-gnn/offline/7/data/standard_rec.txt'

I only used the v1 version of sr-gnn.
Looking at what runs, the relevant code is (partial excerpt):

def sr_nn_version_1(phase, item_cnt):
    model_path = './models/v1/{}/{}'.format(mode, phase)
    file_path = '{}/{}/data'.format(sr_gnn_root_dir, phase)
    sr_gnn_lib_path = 'code/recall/sr_gnn/lib'
    if os.path.exists(model_path):
        print('model_path={} exists, delete'.format(model_path))
        shutil.rmtree(model_path)
    if not os.path.exists(model_path):
        os.makedirs(model_path)

    os.system("python3 {sr_gnn_lib_path}/my_main_.py --task train --node_count {item_cnt} "
              "--checkpoint_path {model_path}/session_id --train_input {file_path}/train_item_seq_enhanced.txt "
              "--test_input {file_path}/test_item_seq.txt --gru_step 2 --epochs 10 "
              "--lr 0.001 --lr_dc 2 --dc_rate 0.1 --early_stop_epoch 3 "
              "--hidden_size 256 --batch_size 256 --max_len 20 --has_uid True "
              "--feature_init {file_path}/item_embed_mat.npy --sigma 8 ".format(sr_gnn_lib_path=sr_gnn_lib_path,
                                                                                item_cnt=item_cnt,
                                                                                model_path=model_path,
                                                                                file_path=file_path))
    # generate rec
    checkpoint_path = find_checkpoint_path(phase, version='v1')
    prefix = 'standard_'

    rec_path = '{}/{}rec.txt'.format(file_path, prefix)
    print("WOC"*20)
    print(rec_path)
    os.system("python3 {sr_gnn_lib_path}/my_main_.py --task recommend --node_count {item_cnt} "
              "--checkpoint_path {checkpoint_path} --item_lookup {file_path}/item_lookup.txt "
              "--recommend_output {rec_path} --session_input {file_path}/test_user_sess.txt "
              "--gru_step 2 --hidden_size 256 --batch_size 256 --rec_extra_count 50 --has_uid True "
              "--feature_init {file_path}/item_embed_mat.npy "
              "--max_len 10 --sigma 8".format(sr_gnn_lib_path=sr_gnn_lib_path,
                                              item_cnt=item_cnt, checkpoint_path=checkpoint_path,
                                              file_path=file_path, rec_path=rec_path))

# module-level driver loop over all phases
for phase in range(start_phase, now_phase+1):
    print('phase={}'.format(phase))
    sr_nn_version_1(phase, phase_item_cnt_dict[phase])

The rec_path here points to the online directory, while eval reads from the offline one:
user_data/sr-gnn/online/7/data/standard_rec.txt
So does something need to be changed somewhere? Here??
is_use_whole_click = True if mode == 'online' else False # True if online

@xuetf
Owner

xuetf commented Jul 10, 2020 via email

@ucasiggcas
Author

I changed it to offline; the relevant code and results are as follows:

mode = 'offline' # offline/online: offline validation or online submission
start_phase = 7
now_phase = 9

2020-07-11 10:28:15,286 main:INFO:The passed save_path is not a valid checkpoint: ./models/v1/offline/7/session_id
2020-07-11 10:28:15,528 main:INFO:Total Batch: 852
2020-07-11 10:28:16.024855: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-07-11 10:28:16,340 main:INFO:Batch 0, Loss: 10.65137
2020-07-11 10:28:22,253 main:INFO:Batch 200, Loss: 10.16564
2020-07-11 10:28:28,126 main:INFO:Batch 400, Loss: 10.05143
2020-07-11 10:28:34,057 main:INFO:Batch 600, Loss: 9.96646
2020-07-11 10:28:39,952 main:INFO:Batch 800, Loss: 9.89508
2020-07-11 10:28:42,776 main:INFO:Test Loss: 9.6436  @50, Recall: 0.1406  MRR: 0.0150
2020-07-11 10:28:43,895 main:INFO:Test Loss: 9.6414  @50, Recall: 0.1562  MRR: 0.0261
2020-07-11 10:28:44,964 main:INFO:Test Loss: 9.5721  @50, Recall: 0.1211  MRR: 0.0142
2020-07-11 10:28:46,034 main:INFO:Test Loss: 9.5152  @50, Recall: 0.0938  MRR: 0.0220
2020-07-11 10:28:47,104 main:INFO:Test Loss: 9.5149  @50, Recall: 0.1250  MRR: 0.0227
2020-07-11 10:28:48,171 main:INFO:Test Loss: 9.4880  @50, Recall: 0.1406  MRR: 0.0246
2020-07-11 10:28:48,404 main:INFO:Test Loss: 9.6964  @50, Recall: 0.1273  MRR: 0.0224
2020-07-11 10:28:48,405 main:INFO:Epoch: 0 Train Loss: 9.8782 Test Loss: 9.5816 Recall: 0.1295 MRR: 0.0208
2020-07-11 10:28:48,405 main:INFO:Best Recall and MRR: 0.1295,  0.0208  Epoch: 0,  0
2020-07-11 10:28:49,054 main:INFO:Total Batch: 852
2020-07-11 10:28:49,084 main:INFO:Batch 0, Loss: 8.70335
2020-07-11 10:28:55,017 main:INFO:Batch 200, Loss: 8.60159
2020-07-11 10:29:00,918 main:INFO:Batch 400, Loss: 8.63617
2020-07-11 10:29:06,821 main:INFO:Batch 600, Loss: 8.67724
2020-07-11 10:29:12,725 main:INFO:Batch 800, Loss: 8.70898
2020-07-11 10:29:15,292 main:INFO:Test Loss: 9.5713  @50, Recall: 0.1406  MRR: 0.0197
2020-07-11 10:29:16,364 main:INFO:Test Loss: 9.5316  @50, Recall: 0.1680  MRR: 0.0376
2020-07-11 10:29:17,434 main:INFO:Test Loss: 9.4835  @50, Recall: 0.1367  MRR: 0.0195
2020-07-11 10:29:18,505 main:INFO:Test Loss: 9.4644  @50, Recall: 0.1172  MRR: 0.0227
2020-07-11 10:29:19,573 main:INFO:Test Loss: 9.4060  @50, Recall: 0.1445  MRR: 0.0252
2020-07-11 10:29:20,643 main:INFO:Test Loss: 9.3890  @50, Recall: 0.1719  MRR: 0.0272
2020-07-11 10:29:20,876 main:INFO:Test Loss: 9.7295  @50, Recall: 0.1273  MRR: 0.0285
2020-07-11 10:29:20,876 main:INFO:Epoch: 1 Train Loss: 8.7163 Test Loss: 9.5107 Recall: 0.1458 MRR: 0.0254
2020-07-11 10:29:20,876 main:INFO:Best Recall and MRR: 0.1458,  0.0254  Epoch: 1,  1
2020-07-11 10:29:21,360 main:INFO:Total Batch: 852
2020-07-11 10:29:21,391 main:INFO:Batch 0, Loss: 7.77480
2020-07-11 10:29:27,353 main:INFO:Batch 200, Loss: 7.65744
2020-07-11 10:29:33,264 main:INFO:Batch 400, Loss: 7.64605
2020-07-11 10:29:39,178 main:INFO:Batch 600, Loss: 7.63783
2020-07-11 10:29:45,073 main:INFO:Batch 800, Loss: 7.63648
2020-07-11 10:29:47,642 main:INFO:Test Loss: 9.6038  @50, Recall: 0.1328  MRR: 0.0203
2020-07-11 10:29:48,716 main:INFO:Test Loss: 9.5496  @50, Recall: 0.1680  MRR: 0.0392
2020-07-11 10:29:49,792 main:INFO:Test Loss: 9.5350  @50, Recall: 0.1367  MRR: 0.0207
2020-07-11 10:29:50,866 main:INFO:Test Loss: 9.5222  @50, Recall: 0.1172  MRR: 0.0261
2020-07-11 10:29:51,935 main:INFO:Test Loss: 9.4398  @50, Recall: 0.1445  MRR: 0.0253
2020-07-11 10:29:53,007 main:INFO:Test Loss: 9.4178  @50, Recall: 0.1719  MRR: 0.0300
2020-07-11 10:29:53,237 main:INFO:Test Loss: 9.8248  @50, Recall: 0.1273  MRR: 0.0292
2020-07-11 10:29:53,238 main:INFO:Epoch: 2 Train Loss: 7.6361 Test Loss: 9.5561 Recall: 0.1446 MRR: 0.0270
2020-07-11 10:29:53,238 main:INFO:Best Recall and MRR: 0.1458,  0.0270  Epoch: 1,  2
2020-07-11 10:29:53,725 main:INFO:Total Batch: 852
2020-07-11 10:29:53,756 main:INFO:Batch 0, Loss: 7.37897
2020-07-11 10:29:59,665 main:INFO:Batch 200, Loss: 7.46712
2020-07-11 10:30:05,581 main:INFO:Batch 400, Loss: 7.47876
2020-07-11 10:30:11,496 main:INFO:Batch 600, Loss: 7.48664
2020-07-11 10:30:17,376 main:INFO:Batch 800, Loss: 7.49818
2020-07-11 10:30:19,940 main:INFO:Test Loss: 9.6434  @50, Recall: 0.1367  MRR: 0.0196
2020-07-11 10:30:21,010 main:INFO:Test Loss: 9.5648  @50, Recall: 0.1602  MRR: 0.0399
2020-07-11 10:30:22,100 main:INFO:Test Loss: 9.5620  @50, Recall: 0.1367  MRR: 0.0205
2020-07-11 10:30:23,174 main:INFO:Test Loss: 9.5489  @50, Recall: 0.1172  MRR: 0.0260
2020-07-11 10:30:24,245 main:INFO:Test Loss: 9.4694  @50, Recall: 0.1406  MRR: 0.0253
2020-07-11 10:30:25,320 main:INFO:Test Loss: 9.4457  @50, Recall: 0.1719  MRR: 0.0299
2020-07-11 10:30:25,551 main:INFO:Test Loss: 9.8384  @50, Recall: 0.1273  MRR: 0.0286
2020-07-11 10:30:25,551 main:INFO:Epoch: 3 Train Loss: 7.5000 Test Loss: 9.5818 Recall: 0.1433 MRR: 0.0269
2020-07-11 10:30:25,552 main:INFO:Best Recall and MRR: 0.1458,  0.0270  Epoch: 1,  2
2020-07-11 10:30:25,579 main:INFO:Total Batch: 852
2020-07-11 10:30:25,610 main:INFO:Batch 0, Loss: 7.22226
2020-07-11 10:30:31,536 main:INFO:Batch 200, Loss: 7.33525
2020-07-11 10:30:37,459 main:INFO:Batch 400, Loss: 7.34088
2020-07-11 10:30:43,366 main:INFO:Batch 600, Loss: 7.34335
2020-07-11 10:30:49,260 main:INFO:Batch 800, Loss: 7.34473
2020-07-11 10:30:51,829 main:INFO:Test Loss: 9.6634  @50, Recall: 0.1328  MRR: 0.0196
2020-07-11 10:30:52,900 main:INFO:Test Loss: 9.5799  @50, Recall: 0.1562  MRR: 0.0396
2020-07-11 10:30:53,970 main:INFO:Test Loss: 9.5829  @50, Recall: 0.1250  MRR: 0.0204
2020-07-11 10:30:55,039 main:INFO:Test Loss: 9.5698  @50, Recall: 0.1133  MRR: 0.0265
2020-07-11 10:30:56,108 main:INFO:Test Loss: 9.4902  @50, Recall: 0.1406  MRR: 0.0250
2020-07-11 10:30:57,176 main:INFO:Test Loss: 9.4646  @50, Recall: 0.1641  MRR: 0.0284
2020-07-11 10:30:57,408 main:INFO:Test Loss: 9.8624  @50, Recall: 0.1273  MRR: 0.0289
2020-07-11 10:30:57,408 main:INFO:Epoch: 4 Train Loss: 7.3447 Test Loss: 9.6019 Recall: 0.1383 MRR: 0.0266
2020-07-11 10:30:57,409 main:INFO:Best Recall and MRR: 0.1458,  0.0270  Epoch: 1,  2
2020-07-11 10:30:57,435 main:INFO:Total Batch: 852
2020-07-11 10:30:57,466 main:INFO:Batch 0, Loss: 7.36846
2020-07-11 10:31:03,410 main:INFO:Batch 200, Loss: 7.32874
2020-07-11 10:31:09,330 main:INFO:Batch 400, Loss: 7.33265
2020-07-11 10:31:15,241 main:INFO:Batch 600, Loss: 7.32835
2020-07-11 10:31:21,188 main:INFO:Batch 800, Loss: 7.33017
2020-07-11 10:31:23,767 main:INFO:Test Loss: 9.6705  @50, Recall: 0.1328  MRR: 0.0195
2020-07-11 10:31:24,840 main:INFO:Test Loss: 9.5850  @50, Recall: 0.1562  MRR: 0.0393
2020-07-11 10:31:25,911 main:INFO:Test Loss: 9.5887  @50, Recall: 0.1250  MRR: 0.0210
2020-07-11 10:31:26,980 main:INFO:Test Loss: 9.5768  @50, Recall: 0.1133  MRR: 0.0263
2020-07-11 10:31:28,049 main:INFO:Test Loss: 9.4961  @50, Recall: 0.1406  MRR: 0.0251
2020-07-11 10:31:29,119 main:INFO:Test Loss: 9.4713  @50, Recall: 0.1641  MRR: 0.0289
2020-07-11 10:31:29,349 main:INFO:Test Loss: 9.8707  @50, Recall: 0.1273  MRR: 0.0287
2020-07-11 10:31:29,349 main:INFO:Epoch: 5 Train Loss: 7.3303 Test Loss: 9.6085 Recall: 0.1383 MRR: 0.0268
2020-07-11 10:31:29,349 main:INFO:Best Recall and MRR: 0.1458,  0.0270  Epoch: 1,  2
2020-07-11 10:31:29,350 main:INFO:After 3 epochs not improve, early stop
2020-07-11 10:31:29,350 main:INFO:Best Recall and MRR: 0.1458,  0.0270  Epoch: 1,  2
CheckPoint: ./models/v1/offline/7/session_id-2556


The eval stage still has a problem:

train/validate split done...
create offline eval answer done...
begin read item df...
108916
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
(2643000, 4)
(1223242, 4)
using multi_processing
phase: 7
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test, target_phase=7
drop duplicates...
recall-source-num=0
0
read sr-gnn results....
sr-gnn begin...
sr-gnn rec path=user_data/sr-gnn/offline/7/data/standard_rec.txt
read sr-gnn done, num=1600
160000
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test, target_phase=7
train_path=user_data/offline_underexpose_train, test_path=user_data/offline_underexpose_test
(2643000, 4)
(1223242, 4)
       user_id  item_id      time  phase
2847         1    47611  0.983887      0
17907        1    76240  0.983770      0
18017        1    78142  0.983742      0
18604        1    89568  0.983763      0
19045        1    97795  0.983877      0
group done
num=159301, filter_num=699
read standard sr-gnn results done....
sr-gnn begin...
sr-gnn rec path=user_data/sr-gnn/offline/7/data/pos_node_weight_rec.txt
Traceback (most recent call last):
  File "my_sr_gnn_eval2.py", line 62, in <module>
    recall_methods={'sr-gnn'})
  File "/data1/xulm1/debiasing_rush/code/recall/do_recall_multi_processing.py", line 119, in do_multi_recall_results_multi_processing
    adjust_type=adjust_type)
  File "/data1/xulm1/debiasing_rush/code/recall/sr_gnn/read_sr_gnn_results.py", line 54, in read_sr_gnn_results
    with open(sr_gnn_rec_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'user_data/sr-gnn/offline/7/data/pos_node_weight_rec.txt'

Why is that?

@xuetf
Owner

xuetf commented Jul 11, 2020

The v1 results were already read successfully during eval:
sr-gnn rec path=user_data/sr-gnn/offline/7/data/standard_rec.txt
read sr-gnn done, num=1600
Reading the v2 results failed: user_data/sr-gnn/offline/7/data/pos_node_weight_rec.txt
You did not run the v2 SR-GNN, so of course that read fails.

@ucasiggcas
Author

The metrics for the v1 version don't seem to be shown. Is something wrong somewhere?

@xuetf
Owner

xuetf commented Jul 11, 2020

Please read the code carefully. Evaluation is performed only after the results of the multiple recall sources are merged; each source is not evaluated separately. If you want to evaluate each source on its own, you can modify the code to do so.

@ucasiggcas
Author

When training with mode set to offline there seems to be much less data, only 800-odd batches, whereas online gives several thousand total batches. What is the reason for this? Thanks.

@xuetf
Owner

xuetf commented Jul 11, 2020 via email

@ucasiggcas
Author

Hello, is there a detailed explanation of the dataset anywhere? Could you give me a link? Thanks.

@xuetf
Owner

xuetf commented Jul 12, 2020 via email

@ucasiggcas
Author

It seems your code doesn't use underexpose_user_feat.csv. Is this file not provided officially?
It includes another file named underexpose_user_feat.csv, the columns of which are: user_id, user_age_level, user_gender, user_city_level

@ucasiggcas
Author

A question about how online_topk is obtained after setting offline:

online_total_click = pd.DataFrame()
for c in range(now_phase + 1):
    print('phase:', c)
    click_train = pd.read_csv('{}/{}-{}.csv'.format(online_train_path, train_file_prefix, c), header=None,
                              names=['user_id', 'item_id', 'time'])
    
    phase_test_path = "{}/{}-{}".format(test_path, test_file_prefix, c)
    click_test = pd.read_csv('{}/{}-{}.csv'.format(phase_test_path, test_file_prefix, c), header=None,
                             names=['user_id', 'item_id', 'time'])

    all_click = click_train.append(click_test)
    all_click['phase'] = c
    online_total_click = online_total_click.append(all_click)
print(online_total_click.shape)
online_total_click = online_total_click.drop_duplicates(['user_id', 'item_id', 'time'])
print(online_total_click.shape)
# the 50 most-clicked items over all phases, joined into a comma-separated string
online_top50_click_np = online_total_click['item_id'].value_counts().index[:50].values
online_top50_click = ','.join([str(i) for i in online_top50_click_np])

But this merges the online training data with the offline test data. Is it OK to do that?

@ucasiggcas
Author

Hi, I'm confused about the normalization:

def process_item_feat(item_feat_df):
    processed_item_feat_df = item_feat_df.copy()
    # norm
    txt_item_feat_np = processed_item_feat_df[txt_dense_feat].values
    img_item_feat_np = processed_item_feat_df[img_dense_feat].values
    txt_item_feat_np = txt_item_feat_np / np.linalg.norm(txt_item_feat_np, axis=1, keepdims=True)
    img_item_feat_np = img_item_feat_np / np.linalg.norm(img_item_feat_np, axis=1, keepdims=True)
    processed_item_feat_df[txt_dense_feat] = pd.DataFrame(txt_item_feat_np, columns=txt_dense_feat)
    processed_item_feat_df[img_dense_feat] = pd.DataFrame(img_item_feat_np, columns=img_dense_feat)

    return processed_item_feat_df

The normalization here is done row by row, but each column is a feature, so why normalize along rows? For example:

>>> xx=np.random.randn(3,4)
>>> xx
array([[ 0.18874834,  0.37971162,  0.8287003 , -0.95896989],
       [-0.07977954,  0.04206023, -0.23647192, -0.36731412],
       [ 1.77722951,  0.68746666, -1.77812892,  0.54136854]])
>>> np.linalg.norm(xx, axis=1, keepdims=True)
array([[1.33647832],
       [0.4460633 ],
       [2.66194994]])

which shows that an L2 norm is computed for each row.

@xuetf
Owner

xuetf commented Jul 14, 2020 via email

@ucasiggcas
Author

I did see that; my point is that you normalize along rows (axis=1),
but each column is a feature, so why not normalize along columns (axis=0)?
Thanks.
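
As an illustration of the difference, a minimal numpy sketch; the interpretation that row-wise normalization is meant to give each item a unit-length feature vector (so inner products become cosine similarities, e.g. for the faiss-based similarity mentioned earlier) is an assumption, not something confirmed in this thread:

import numpy as np

x = np.random.randn(3, 4)   # 3 items, 4 feature dimensions

# axis=1: each row (one item's feature vector) is scaled to unit L2 norm,
# so the inner product of two rows equals their cosine similarity
row_normed = x / np.linalg.norm(x, axis=1, keepdims=True)
print(np.linalg.norm(row_normed, axis=1))   # -> [1. 1. 1.]

# axis=0: each column (one feature) is scaled to unit L2 norm across items,
# i.e. per-feature rescaling rather than per-item unit vectors
col_normed = x / np.linalg.norm(x, axis=0, keepdims=True)
print(np.linalg.norm(col_normed, axis=0))   # -> [1. 1. 1. 1.]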

@ucasiggcas
Author

What does this part of the code mean?

    def cal_occ(sentence):
        for i, word in enumerate(sentence):
            hist_len = len(sentence)
            co_occur_dict.setdefault(word, {})
            for j in range(max(i - window, 0), min(i + window, hist_len)):
                if j == i or word == sentence[j]: continue
                loc_weight = (0.9 ** abs(i - j))
                co_occur_dict[word].setdefault(sentence[j], 0)
                co_occur_dict[word][sentence[j]] += loc_weight

In particular, this part:

            for j in range(max(i - window, 0), min(i + window, hist_len)):
                if j == i or word == sentence[j]: continue
                loc_weight = (0.9 ** abs(i - j))
                co_occur_dict[word].setdefault(sentence[j], 0)
                co_occur_dict[word][sentence[j]] += loc_weight

Could you take a look? Thanks.
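
For context, a minimal self-contained sketch of what this snippet does; window=2 and the sample sequence are arbitrary values for the example, not values taken from the repo:

co_occur_dict = {}
window = 2  # arbitrary window size for this example

def cal_occ(sentence):
    for i, word in enumerate(sentence):
        hist_len = len(sentence)
        co_occur_dict.setdefault(word, {})
        # look at neighbouring items within `window` positions of position i
        for j in range(max(i - window, 0), min(i + window, hist_len)):
            if j == i or word == sentence[j]:
                continue
            loc_weight = 0.9 ** abs(i - j)   # weight decays with distance between the two items
            co_occur_dict[word].setdefault(sentence[j], 0)
            co_occur_dict[word][sentence[j]] += loc_weight

# a click sequence of item ids: items that appear close together in the
# sequence accumulate larger co-occurrence weights
cal_occ(['a', 'b', 'c', 'a'])
print(co_occur_dict)   # e.g. co_occur_dict['b'] -> {'a': 0.9, 'c': 0.9}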

@ucasiggcas
Author

Hello, is this function for filling in items that lack txt/img features?
def fill_item_feat(processed_item_feat_df, item_content_vec_dict):
If every item already has these features, is the filling step unnecessary?

@ucasiggcas
Author

Hello, can I put the data from phases 7, 8 and 9 together for prediction?
That is, without distinguishing phases, recommend items for all users directly from the training set. Is that feasible?

@ucasiggcas
Author

Hi,
faiss is never imported, so why is there no error? It seems very strange.
How did you make that work?
This is in the notebook file Rush_0615.ipynb.

@xuetf
Owner

xuetf commented Jul 21, 2020 via email

@xuetf
Owner

xuetf commented Jul 21, 2020 via email

@ucasiggcas
Author

After combining everything into a single phase, the results are as follows:

score=[0.03093603 0.04635601 0.082798   0.10687023]
score=0.03093603439629078,
hitrate_50_full=0.08279800415039062,
ndcg_50_full=0.03093603439629078,
hitrate_50_half=0.10687022656202316, 
ndcg_50_half=0.04635601490736008

Is this normal?

@ucasiggcas
Author

I see that the users in the training set and the test set are separate and different. What should I do to get recommendations for all users?

@ucasiggcas
Author

What causes this, and do I need to tune some parameters?

2020-07-26 20:29:47,003 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000
2020-07-26 20:29:47,962 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000
2020-07-26 20:29:48,702 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000
2020-07-26 20:29:49,436 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000
2020-07-26 20:29:50,184 main:INFO:Test Loss: nan  @50, Recall: 0.0039  MRR: 0.0002
2020-07-26 20:29:50,910 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000
2020-07-26 20:29:51,680 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000
2020-07-26 20:29:52,311 main:INFO:Test Loss: nan  @50, Recall: 0.0000  MRR: 0.0000

@ucasiggcas
Author

Hello,
Does your improved version of SR-GNN use negative sampling?
Thanks.

@ucasiggcas
Author

Hello, training seems rather slow. Is there anything I can do about it?

@ucasiggcas
Author

Another odd issue: if I train without a validation set and run inference directly after training, the results are wrong.
For example, user_A's full click sequence is items = [a, b, c, d, f].
As I understand it, your setup trains on items[:-2], validates on items[:-1], and then infers, i.e. predicts the last item.
If instead I train on items[:-1], do no testing during training, and infer the last item directly after training, the resulting HR and NDCG are all 0.
That seems very strange. Why would that be? Thanks.
