lcnn.box.BoxKeyError error while training #15

Closed
velutis opened this issue Jan 8, 2020 · 13 comments
Labels
enhancement New feature or request question Further information is requested

Comments


velutis commented Jan 8, 2020

Hi, I am trying to train this model as described in the manual, but I am getting this error:
lcnn.box.BoxKeyError: "'Box' object has no attribute 'image'"
I use the pre-processed dataset wireframe.tar.xz from Google Drive. Any hints on how to fix this? Thanks.

@zhou13 zhou13 added the question Further information is requested label Jan 8, 2020
Owner

zhou13 commented Jan 8, 2020

Do you properly specify the config file?

Author

velutis commented Jan 8, 2020

I use the default, original config checked out from GitHub, config/wireframe.yaml.
My command is the same as in the manual: python ./train.py -d 0 --identifier baseline config/wireframe.yaml

Owner

zhou13 commented Jan 8, 2020

Does cat config/wireframe.yaml output anything?

Author

velutis commented Jan 8, 2020

yes of course:)

io:

  logdir: logs/
  datadir: data/wireframe/
  resume_from:
  num_workers: 4
  tensorboard_port: 0
  validation_interval: 24000

model:
  image:
      mean: [109.730, 103.832, 98.681]
      stddev: [22.275, 22.124, 23.229]

  batch_size: 6

  # backbone multi-task parameters
  head_size: [[2], [1], [2]]
  loss_weight:
    jmap: 8.0
    lmap: 0.5
    joff: 0.25
    lpos: 1
    lneg: 1

  # backbone parameters
  backbone: stacked_hourglass
  depth: 4
  num_stacks: 2
  num_blocks: 1

  # sampler parameters
  ## static sampler
  n_stc_posl: 300
  n_stc_negl: 40

  ## dynamic sampler
  n_dyn_junc: 300
  n_dyn_posl: 300
  n_dyn_negl: 80
  n_dyn_othr: 600

  # LOIPool layer parameters
  n_pts0: 32
  n_pts1: 8

  # line verification network parameters
  dim_loi: 128
  dim_fc: 1024

  # maximum junction and line outputs
  n_out_junc: 250
  n_out_line: 2500

  # additional ablation study parameters
  use_cood: 0
  use_slop: 0
  use_conv: 0

  # junction threshold for evaluation (See #5)
  eval_junc_thres: 0.008

optim:
  name: Adam
  lr: 4.0e-4
  amsgrad: True
  weight_decay: 1.0e-4
  max_epoch: 24
  lr_decay_epoch: 10

Owner

zhou13 commented Jan 8, 2020

Do you have pyyaml installed?

Owner

zhou13 commented Jan 8, 2020

If that still does not work, you may want to debug the script by printing C and M in the main() of train.py. It should output the content of your yaml.

Author

velutis commented Jan 8, 2020

It was not installed, but after installing it the result is the same. OK, I'll try to debug.

Owner

zhou13 commented Jan 8, 2020

Then you are likely not following the installation instructions exactly with conda. You can also try that first if that is the case. BTW, if you didn't have pyyaml installed, it is unlikely you could import yaml in line 30 of train.py. Maybe you have some package conflicts.

Author

velutis commented Jan 8, 2020

Well, I recreated my environment, but it is still the same. Here are the C and M values:

C: {'io': {'logdir': 'logs/', 'datadir': 'data/wireframe/', 'resume_from': None, 'num_workers': 4, 'tensorboard_port': 0, 'validation_interval': 24000}, 'model': {'image': {'mean': [109.73, 103.832, 98.681], 'stddev': [22.275, 22.124, 23.229]}, 'batch_size': 6, 'head_size': [[2], [1], [2]], 'loss_weight': {'jmap': 8.0, 'lmap': 0.5, 'joff': 0.25, 'lpos': 1, 'lneg': 1}, 'backbone': 'stacked_hourglass', 'depth': 4, 'num_stacks': 2, 'num_blocks': 1, 'n_stc_posl': 300, 'n_stc_negl': 40, 'n_dyn_junc': 300, 'n_dyn_posl': 300, 'n_dyn_negl': 80, 'n_dyn_othr': 600, 'n_pts0': 32, 'n_pts1': 8, 'dim_loi': 128, 'dim_fc': 1024, 'n_out_junc': 250, 'n_out_line': 2500, 'use_cood': 0, 'use_slop': 0, 'use_conv': 0, 'eval_junc_thres': 0.008}, 'optim': {'name': 'Adam', 'lr': 0.0004, 'amsgrad': True, 'weight_decay': 0.0001, 'max_epoch': 24, 'lr_decay_epoch': 10}}

M: {'image': {'mean': [109.73, 103.832, 98.681], 'stddev': [22.275, 22.124, 23.229]}, 'batch_size': 6, 'head_size': [[2], [1], [2]], 'loss_weight': {'jmap': 8.0, 'lmap': 0.5, 'joff': 0.25, 'lpos': 1, 'lneg': 1}, 'backbone': 'stacked_hourglass', 'depth': 4, 'num_stacks': 2, 'num_blocks': 1, 'n_stc_posl': 300, 'n_stc_negl': 40, 'n_dyn_junc': 300, 'n_dyn_posl': 300, 'n_dyn_negl': 80, 'n_dyn_othr': 600, 'n_pts0': 32, 'n_pts1': 8, 'dim_loi': 128, 'dim_fc': 1024, 'n_out_junc': 250, 'n_out_line': 2500, 'use_cood': 0, 'use_slop': 0, 'use_conv': 0, 'eval_junc_thres': 0.008}

The line where the code throws the error is image = (image - M.image.mean) / M.image.stddev (File "lcnn\\lcnn\\datasets.py", line 35). In M printed from train.py I can see the image key, so I tried printing M in datasets.py and got an empty value, '{}'.

PS: I don't know whether this can be the cause, but I'm trying this in a Windows environment.
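
The symptom described above is consistent with M being empty at the point of access. As a minimal sketch, assuming a hypothetical AttrDict class as a stand-in for the Box type (this is not Box's actual implementation), attribute access on an empty mapping fails in the same way, while a populated one works:

```python
class AttrDict(dict):
    """Hypothetical stand-in for Box: a dict whose keys are readable as attributes."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(
                f"'AttrDict' object has no attribute {name!r}"
            ) from None
        # Wrap nested dicts so chained access such as M.image.mean works.
        return AttrDict(value) if isinstance(value, dict) else value


# Values copied from config/wireframe.yaml as quoted in this thread.
populated = AttrDict({"image": {"mean": [109.730, 103.832, 98.681]}})
empty = AttrDict()

print(populated.image.mean)
try:
    empty.image
except AttributeError as exc:
    print(exc)
```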

Owner

zhou13 commented Jan 8, 2020

Could you add print(hex(id(M))) after the line from lcnn.config import C, M in both files and see if they are the same?

Author

velutis commented Jan 9, 2020

The id is the same at the beginning, but then they get printed again before the exception.
Here is all of the command-line output:

./train.py -d 0 --identifier baseline  config/wireframe.yaml
hex from datasets
0x265ce4a1c48
hex from train
0x265ce4a1c48
debug C
{'io': {'logdir': 'logs/', 'datadir': 'data/wireframe/', 'resume_from': None, 'num_workers': 4, 'tensorboard_port': 0, 'validation_interval': 24000}, 'model': {'image': {'mean': [109.73, 103.832, 98.681], 'stddev': [22.275, 22.124, 23.229]}, 'batch_size': 6, 'head_size': [[2], [1], [2]], 'loss_weight': {'jmap': 8.0, 'lmap': 0.5, 'joff': 0.25, 'lpos': 1, 'lneg': 1}, 'backbone': 'stacked_hourglass', 'depth': 4, 'num_stacks': 2, 'num_blocks': 1, 'n_stc_posl': 300, 'n_stc_negl': 40, 'n_dyn_junc': 300, 'n_dyn_posl': 300, 'n_dyn_negl': 80, 'n_dyn_othr': 600, 'n_pts0': 32, 'n_pts1': 8, 'dim_loi': 128, 'dim_fc': 1024, 'n_out_junc': 250, 'n_out_line': 2500, 'use_cood': 0, 'use_slop': 0, 'use_conv': 0, 'eval_junc_thres': 0.008}, 'optim': {'name': 'Adam', 'lr': 0.0004, 'amsgrad': True, 'weight_decay': 0.0001, 'max_epoch': 24, 'lr_decay_epoch': 10}}
debug M
{'image': {'mean': [109.73, 103.832, 98.681], 'stddev': [22.275, 22.124, 23.229]}, 'batch_size': 6, 'head_size': [[2], [1], [2]], 'loss_weight': {'jmap': 8.0, 'lmap': 0.5, 'joff': 0.25, 'lpos': 1, 'lneg': 1}, 'backbone': 'stacked_hourglass', 'depth': 4, 'num_stacks': 2, 'num_blocks': 1, 'n_stc_posl': 300, 'n_stc_negl': 40, 'n_dyn_junc': 300, 'n_dyn_posl': 300, 'n_dyn_negl': 80, 'n_dyn_othr': 600, 'n_pts0': 32, 'n_pts1': 8, 'dim_loi': 128, 'dim_fc': 1024, 'n_out_junc': 250, 'n_out_line': 2500, 'use_cood': 0, 'use_slop': 0, 'use_conv': 0, 'eval_junc_thres': 0.008}
END debug
{   'io': {   'datadir': 'data/wireframe/',
              'logdir': 'logs/',
              'num_workers': 4,
              'resume_from': None,
              'tensorboard_port': 0,
              'validation_interval': 24000},
    'model': {   'backbone': 'stacked_hourglass',
                 'batch_size': 6,
                 'depth': 4,
                 'dim_fc': 1024,
                 'dim_loi': 128,
                 'eval_junc_thres': 0.008,
                 'head_size': <BoxList: [[2], [1], [2]]>,
                 'image': {   'mean': <BoxList: [109.73, 103.832, 98.681]>,
                              'stddev': <BoxList: [22.275, 22.124, 23.229]>},
                 'loss_weight': {   'jmap': 8.0,
                                    'joff': 0.25,
                                    'lmap': 0.5,
                                    'lneg': 1,
                                    'lpos': 1},
                 'n_dyn_junc': 300,
                 'n_dyn_negl': 80,
                 'n_dyn_othr': 600,
                 'n_dyn_posl': 300,
                 'n_out_junc': 250,
                 'n_out_line': 2500,
                 'n_pts0': 32,
                 'n_pts1': 8,
                 'n_stc_negl': 40,
                 'n_stc_posl': 300,
                 'num_blocks': 1,
                 'num_stacks': 2,
                 'use_conv': 0,
                 'use_cood': 0,
                 'use_slop': 0},
    'optim': {   'amsgrad': True,
                 'lr': 0.0004,
                 'lr_decay_epoch': 10,
                 'max_epoch': 24,
                 'name': 'Adam',
                 'weight_decay': 0.0001}}
Let's use 1 GPU(s)!
ntrain: 19988
nvalid: 462
outdir: logs/200109-081549-0082f17-baseline
TensorFlow installation not found - running with reduced feature set.
TensorBoard 1.15.0 at http://VAIDAS-PC:58769/ (Press CTRL+C to quit)
hex from datasets
0x292825abfa8
hex from train
0x292825abfa8
hex from datasets
0x1d31e91afa8
hex from train
0x1d31e91afa8
hex from datasets
0x246271bd048
hex from train
0x246271bd048
hex from datasets
0x24ea739c0a8
hex from train
0x24ea739c0a8
Traceback (most recent call last):....

Owner

zhou13 commented Jan 9, 2020

From your log, it seems that there are multiple ids: 0x265ce4a1c48, 0x292825abfa8, 0x246271bd048. This shouldn't happen.

We have never tested our algorithm on Windows. You remind me that Python on Windows has a different implementation of multiprocessing that does not inherit the parent's variables: https://rhodesmill.org/brandon/2010/python-multiprocessing-linux-windows/. Could you change num_workers in the yaml to 0 to disable multiprocessing and re-test it?
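
The difference between start methods can be sketched with the standard multiprocessing module alone. This is only an illustration under stated assumptions: CONFIG, load_config, report_keys and keys_seen_by_worker are hypothetical names, with CONFIG standing in for a module-level object such as M that is empty at import time and filled in later by the parent process.

```python
import multiprocessing as mp

# Hypothetical stand-in for a module-level config such as M: empty at import time.
CONFIG = {}


def load_config():
    """Fill CONFIG at runtime, the way a training script might after parsing YAML."""
    CONFIG.update({"image": {"mean": [109.730, 103.832, 98.681]}})


def report_keys(queue):
    """Runs in the worker: send back whatever keys this process sees in CONFIG."""
    queue.put(sorted(CONFIG))


def keys_seen_by_worker(start_method):
    """Start one worker with the given start method and return the keys it sees."""
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=report_keys, args=(queue,))
    proc.start()
    keys = queue.get()
    proc.join()
    return keys
```

When this is run from a script under an if __name__ == "__main__": guard, calling load_config() and then keys_seen_by_worker("spawn") would be expected to return an empty list, because a spawned worker re-imports the module and never runs load_config(). With "fork" (where that start method is available) the worker starts as a copy of the parent and sees the populated dict. That matches the empty M and the changing ids in the log above.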

Author

velutis commented Jan 9, 2020

Yes that was the problem. Thank You!
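
For anyone who hits the same error, the workaround could also be applied programmatically rather than by editing the yaml. The following is a rough sketch, assuming a hypothetical helper safe_num_workers that is not part of the repository: it keeps the requested worker count only when worker processes are forked, and otherwise falls back to 0 so that data loading stays in the parent process.

```python
import multiprocessing as mp


def safe_num_workers(requested, start_method=None):
    """Return `requested` only when workers are forked; otherwise fall back to 0.

    Hypothetical helper, not part of the repository.
    """
    if start_method is None:
        # Default start method of the current platform and Python version.
        start_method = mp.get_start_method()
    return requested if start_method == "fork" else 0


print(safe_num_workers(4, start_method="fork"))   # 4
print(safe_num_workers(4, start_method="spawn"))  # 0
```

This trades data-loading throughput for correctness. A fuller fix would probably pass the configuration to the workers explicitly instead of relying on a module-level object filled in by the parent.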

@velutis velutis closed this as completed Jan 9, 2020
@zhou13 zhou13 added the enhancement New feature or request label Jan 10, 2020