Closed
Description
opened on Jul 21, 2023
All the CUDA jobs across several PRs (e.g. #5997, #5999) started failing yesterday, with the following errors.
The following packages were automatically installed and are no longer required:
clamav clamav-base clamav-freshclam libclamav9 libllvm3.9 libtfm1
linux-azure-5.4-cloud-tools-5.4.0-1031
linux-azure-5.4-cloud-tools-5.4.0-1032
linux-azure-5.4-cloud-tools-5.4.0-1034
linux-azure-5.4-cloud-tools-5.4.0-1035
linux-azure-5.4-cloud-tools-5.4.0-1036
linux-azure-5.4-cloud-tools-5.4.0-1039
linux-azure-5.4-cloud-tools-5.4.0-1040
linux-azure-5.4-cloud-tools-5.4.0-1041
linux-azure-5.4-cloud-tools-5.4.0-1043
linux-azure-5.4-cloud-tools-5.4.0-1044
linux-azure-5.4-cloud-tools-5.4.0-1046
linux-azure-5.4-cloud-tools-5.4.0-1047
linux-azure-5.4-cloud-tools-5.4.0-1048
linux-azure-5.4-cloud-tools-5.4.0-1051
linux-azure-5.4-cloud-tools-5.4.0-1055
linux-azure-5.4-cloud-tools-5.4.0-1056
linux-azure-5.4-cloud-tools-5.4.0-1058
linux-azure-5.4-cloud-tools-5.4.0-1059
linux-azure-5.4-cloud-tools-5.4.0-1061
linux-azure-5.4-cloud-tools-5.4.0-1062
linux-azure-5.4-cloud-tools-5.4.0-1063
linux-azure-5.4-cloud-tools-5.4.0-1064
linux-azure-5.4-cloud-tools-5.4.0-1065
linux-azure-5.4-cloud-tools-5.4.0-1067
linux-azure-5.4-cloud-tools-5.4.0-1068
linux-azure-5.4-cloud-tools-5.4.0-1069
linux-azure-5.4-cloud-tools-5.4.0-1070
linux-azure-5.4-cloud-tools-5.4.0-1072
linux-azure-5.4-cloud-tools-5.4.0-1073
linux-azure-5.4-cloud-tools-5.4.0-1074
linux-azure-5.4-cloud-tools-5.4.0-1077
linux-azure-5.4-cloud-tools-5.4.0-1078
linux-azure-5.4-cloud-tools-5.4.0-1080
linux-azure-5.4-cloud-tools-5.4.0-1083
linux-azure-5.4-cloud-tools-5.4.0-1085
linux-azure-5.4-cloud-tools-5.4.0-1086
linux-azure-5.4-cloud-tools-5.4.0-1089
... truncated ...
linux-azure-5.4-tools-5.4.0-1091 nvidia-kernel-source-515 nvidia-utils-515
xserver-xorg-video-nvidia-515
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 115 not upgraded.
3 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Setting up azure-mdsd (1.8.0-build.master.189) ...
Configuration file '/etc/default/mdsd'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
*** mdsd (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package azure-mdsd (--configure):
end of file on stdin at conffile prompt
Setting up auoms (2.7.0.11) ...
Configuration file '/etc/opt/microsoft/auoms/auoms.conf'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
*** auoms.conf (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package auoms (--configure):
end of file on stdin at conffile prompt
dpkg: dependency problems prevent configuration of azsec-monitor:
azsec-monitor depends on auoms (>= 2.4.5); however:
Package auoms is not configured yet.
Version of auoms on system, provided by auoms:amd64, is <none>.
dpkg: error processing package azsec-monitor (--configure):
dependency problems - leaving unconfigured
No apport report written because the error message indicates its a followup error from a previous failure.
Errors were encountered while processing:
azure-mdsd
auoms
azsec-monitor
E: Sub-process /usr/bin/dpkg returned an error code (1)
Error: Process completed with exit code 100.
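To make the log above easier to triage, here is a minimal sketch that groups the failed packages by the reason dpkg printed on the following line. The function name `summarize_dpkg_failures` and the `SAMPLE_LOG` excerpt are illustrative (not part of LightGBM's CI); the excerpt is copied from the log in this issue.

```python
import re
from typing import Dict, List

# Matches the dpkg line naming a package whose configuration step failed.
ERROR_RE = re.compile(r"dpkg: error processing package (\S+) \(--configure\):")

# Excerpt of the log from this issue (hypothetical constant name).
SAMPLE_LOG = """\
*** mdsd (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package azure-mdsd (--configure):
 end of file on stdin at conffile prompt
*** auoms.conf (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package auoms (--configure):
 end of file on stdin at conffile prompt
dpkg: dependency problems prevent configuration of azsec-monitor:
dpkg: error processing package azsec-monitor (--configure):
 dependency problems - leaving unconfigured
"""


def summarize_dpkg_failures(log: str) -> Dict[str, List[str]]:
    """Group packages that failed to configure by the reason on the next line."""
    summary: Dict[str, List[str]] = {
        "conffile_prompt": [],
        "dependency": [],
        "other": [],
    }
    lines = log.splitlines()
    for i, line in enumerate(lines):
        match = ERROR_RE.search(line)
        if match is None:
            continue
        # dpkg prints the reason on the line right after the error header.
        reason = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if "end of file on stdin at conffile prompt" in reason:
            summary["conffile_prompt"].append(match.group(1))
        elif "dependency problems" in reason:
            summary["dependency"].append(match.group(1))
        else:
            summary["other"].append(match.group(1))
    return summary


if __name__ == "__main__":
    for cause, packages in summarize_dpkg_failures(SAMPLE_LOG).items():
        print(cause, packages)
```

Read this way, `azure-mdsd` and `auoms` failed because dpkg asked a conffile question with no stdin attached, and `azsec-monitor` only failed as a follow-up because it depends on `auoms`. If that reading is right, dpkg's `--force-confdef` and `--force-confold` options (passed through apt as `-o Dpkg::Options::=...`) are the usual way to avoid such prompts, though whether they suit these runners is an assumption.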
Reproducible example
This is happening on master and all PRs.
Additional Comments
Some resources that might be helpful: