[ci] CUDA jobs failing when installing packages #6001

Closed
@jameslamb

Description

All the CUDA jobs across several PRs (e.g. #5997, #5999) started failing yesterday, with the following errors.

The following packages were automatically installed and are no longer required:
  clamav clamav-base clamav-freshclam libclamav9 libllvm3.9 libtfm1
  linux-azure-5.4-cloud-tools-5.4.0-1031
  linux-azure-5.4-cloud-tools-5.4.0-1032
  linux-azure-5.4-cloud-tools-5.4.0-1034
  linux-azure-5.4-cloud-tools-5.4.0-1035
  linux-azure-5.4-cloud-tools-5.4.0-1036
  linux-azure-5.4-cloud-tools-5.4.0-1039
  linux-azure-5.4-cloud-tools-5.4.0-1040
  linux-azure-5.4-cloud-tools-5.4.0-1041
  linux-azure-5.4-cloud-tools-5.4.0-1043
  linux-azure-5.4-cloud-tools-5.4.0-1044
  linux-azure-5.4-cloud-tools-5.4.0-1046
  linux-azure-5.4-cloud-tools-5.4.0-1047
  linux-azure-5.4-cloud-tools-5.4.0-1048
  linux-azure-5.4-cloud-tools-5.4.0-1051
  linux-azure-5.4-cloud-tools-5.4.0-1055
  linux-azure-5.4-cloud-tools-5.4.0-1056
  linux-azure-5.4-cloud-tools-5.4.0-1058
  linux-azure-5.4-cloud-tools-5.4.0-1059
  linux-azure-5.4-cloud-tools-5.4.0-1061
  linux-azure-5.4-cloud-tools-5.4.0-1062
  linux-azure-5.4-cloud-tools-5.4.0-1063
  linux-azure-5.4-cloud-tools-5.4.0-1064
  linux-azure-5.4-cloud-tools-5.4.0-1065
  linux-azure-5.4-cloud-tools-5.4.0-1067
  linux-azure-5.4-cloud-tools-5.4.0-1068
  linux-azure-5.4-cloud-tools-5.4.0-1069
  linux-azure-5.4-cloud-tools-5.4.0-1070
  linux-azure-5.4-cloud-tools-5.4.0-1072
  linux-azure-5.4-cloud-tools-5.4.0-1073
  linux-azure-5.4-cloud-tools-5.4.0-1074
  linux-azure-5.4-cloud-tools-5.4.0-1077
  linux-azure-5.4-cloud-tools-5.4.0-1078
  linux-azure-5.4-cloud-tools-5.4.0-1080
  linux-azure-5.4-cloud-tools-5.4.0-1083
  linux-azure-5.4-cloud-tools-5.4.0-1085
  linux-azure-5.4-cloud-tools-5.4.0-1086
  linux-azure-5.4-cloud-tools-5.4.0-1089
  ... truncated ...
  linux-azure-5.4-tools-5.4.0-1091 nvidia-kernel-source-515 nvidia-utils-515
  xserver-xorg-video-nvidia-515
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 115 not upgraded.
3 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Setting up azure-mdsd (1.8.0-build.master.189) ...

Configuration file '/etc/default/mdsd'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** mdsd (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package azure-mdsd (--configure):
 end of file on stdin at conffile prompt
Setting up auoms (2.7.0.11) ...

Configuration file '/etc/opt/microsoft/auoms/auoms.conf'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
*** auoms.conf (Y/I/N/O/D/Z) [default=N] ? dpkg: error processing package auoms (--configure):
 end of file on stdin at conffile prompt
dpkg: dependency problems prevent configuration of azsec-monitor:
 azsec-monitor depends on auoms (>= 2.4.5); however:
  Package auoms is not configured yet.
  Version of auoms on system, provided by auoms:amd64, is <none>.

dpkg: error processing package azsec-monitor (--configure):
 dependency problems - leaving unconfigured
No apport report written because the error message indicates its a followup error from a previous failure.
Errors were encountered while processing:
 azure-mdsd
 auoms
 azsec-monitor
E: Sub-process /usr/bin/dpkg returned an error code (1)
Error: Process completed with exit code 100.
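
The log above shows dpkg stopping at a conffile prompt ("end of file on stdin at conffile prompt") because the CI runner has no interactive stdin. One possible workaround, offered only as a sketch and not as the fix this project adopted, is to tell dpkg up front how to answer those prompts. The helper name `apt_get_noninteractive`, the `DRY_RUN` switch, and the package name `some-package` are hypothetical and exist only for illustration; `DEBIAN_FRONTEND=noninteractive`, `--force-confdef`, and `--force-confold` are standard debconf and dpkg options.

```shell
#!/bin/sh
# Hypothetical helper: print (DRY_RUN=1) or run an apt-get command that
# cannot block on a dpkg conffile prompt.
#   DEBIAN_FRONTEND=noninteractive  suppresses debconf questions
#   --force-confdef                 take dpkg's default action when there is one
#   --force-confold                 otherwise keep the currently installed file
apt_get_noninteractive() {
    # Rebuild the positional parameters as the full command line.
    set -- env DEBIAN_FRONTEND=noninteractive apt-get \
        -o Dpkg::Options::=--force-confdef \
        -o Dpkg::Options::=--force-confold \
        --yes "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # show the command without touching the system
    else
        sudo "$@"      # actually run it (needs root on the runner)
    fi
}

# Dry run only: prints the command that would be executed.
# "some-package" is a placeholder, not a package named in this issue.
DRY_RUN=1
apt_get_noninteractive install --no-install-recommends some-package
```

With both `--force-confdef` and `--force-confold` set, dpkg keeps the existing `/etc/default/mdsd` and `auoms.conf` (the `[default=N]` choice shown in the log) instead of asking, so the configure step for `azure-mdsd`, `auoms`, and `azsec-monitor` would not need stdin.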

Reproducible example

This is happening on master and all PRs.

(example build link)

Additional Comments

Some resources that might be helpful:
