Closed
Description
Currently, if a user program is killed for running out of memory (OOM), the job logs contain no information about the failure reason, and no return code is shown.
We should improve the UX for this with the following steps:
- Include the necessary information in `sky logs`, e.g., the OOM cause and the return code.
- Improve `sky queue` to show additional failure reasons instead of just a `FAILED` state for the job.
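As a rough illustration of the kind of failure reason that could be surfaced, here is a minimal sketch that turns a process return code into a short description. It is not SkyPilot's implementation: `describe_returncode` is a hypothetical helper, and it assumes a POSIX system. Note that a `SIGKILL` exit only suggests an OOM kill; confirming it would need extra evidence such as kernel logs.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Hypothetical helper: map a return code to a short failure reason."""
    if returncode == 0:
        return 'SUCCEEDED'
    # Python's subprocess reports death-by-signal as a negative number,
    # while shells report it as 128 + the signal number.
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return f'FAILED (returncode {returncode})'
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a known signal number on this platform.
        return f'FAILED (returncode {returncode})'
    if signum == signal.SIGKILL:
        # The kernel OOM killer sends SIGKILL, but so can other things.
        return f'FAILED (returncode {returncode}, {name}; possibly OOM-killed)'
    return f'FAILED (returncode {returncode}, {name})'


if __name__ == '__main__':
    for code in (0, 1, 137, -9):
        print(code, '->', describe_returncode(code))
```

A string like this could be attached to the job's log tail or shown next to the job status, so that the return code and a likely cause are visible without inspecting the machine.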
Version & Commit info:
- `sky -v`: PLEASE_FILL_IN
- `sky -c`: PLEASE_FILL_IN