
performance issue #7

Open
ujin0415 opened this issue Oct 26, 2024 · 3 comments

ujin0415 commented Oct 26, 2024

Hi @VlSomers! I'm very interested in your research, so I ran your default code with the SOLIDER backbone on the Occluded-Duke dataset twice. I noticed an odd performance jump after the first training: the mAP was 67.45% in the first run and 73.47% in the second, which is still slightly below the number reported in the paper. Convergence was also much faster in the second run, even though the settings were identical in both trials. I'm wondering how this can happen. Are there any settings in the training process that could explain it?

VlSomers (Owner) commented

Hi @ujin0415, thank you for your interest, I would be happy to help you with your issue! I have never experienced what you are describing: there is of course some variance from one run to the next (maybe +/- 1%), but never on that scale. Can you make a third run to see whether training is more stable now? Unfortunately, there are no settings in the training process that I can think of that would explain your issue. However, as explained in the repo, the codebase underwent a major refactoring before the public release, especially the configuration system, so there may be some differences between the released configs and the ones I used for the experiments in the paper. If you cannot reproduce the performance, please share your configs and logs and I will check whether any parameters are misconfigured. Let me know if you solve your issue!
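
One way to check whether a gap of this size comes from nondeterminism rather than from the configs is to pin every random seed before training. The snippet below is a minimal sketch assuming a generic PyTorch training entry point; `seed_everything` and the seed value 42 are illustrative and not part of this repository's configuration system.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Pin the common sources of randomness so repeated runs start identically."""
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy RNG (sampling, augmentation)
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # all GPU RNGs
    # Inherited by DataLoader worker subprocesses.
    os.environ["PYTHONHASHSEED"] = str(seed)

    # Trade some speed for reproducible cuDNN kernel selection.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


if __name__ == "__main__":
    seed_everything(42)
    # ... launch training here; with identical seeds, two runs on the same
    # machine and environment should produce (near-)identical curves.
```

If two seeded runs still diverge by several mAP points, the cause is more likely a config or environment difference than ordinary training variance.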


ujin0415 commented Oct 29, 2024

Thank you for your kind reply! I tried several more runs, but the instability is still there.
Unfortunately, we didn't record the first two runs, so I'm attaching the log file for the latest run (it only contains 49 epochs :( ).
output.txt

Thank you so much!

VlSomers (Owner) commented

Can you share the full logs of an unstable run? Did you also try the Swin ImageNet version? Fine-tuning the SOLIDER backbone has always been more difficult than fine-tuning the ImageNet one, so I'm wondering whether your instabilities are related to SOLIDER. Unfortunately I'm very busy right now, so I won't have time to reproduce your experiments until late November. However, I recently made some runs with a private fork of this public repository and everything went smoothly, so this might also be an issue related to your environment.
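
To rule out an environment mismatch, it can help to compare the exact library versions on both machines. A minimal sketch using only standard PyTorch introspection calls, nothing specific to this repository:

```python
import platform

import torch

# Print the version information that most often explains training differences
# between two machines: Python, PyTorch, CUDA, cuDNN and the GPU model.
print("Python:        ", platform.python_version())
print("PyTorch:       ", torch.__version__)
print("CUDA (build):  ", torch.version.cuda)
print("cuDNN:         ", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:           ", torch.cuda.get_device_name(0))
```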
