Using preprocess_phi_3_new in LAVIS/open_flamingo/train/sft_data_utils.py produces labels that are all -100. #776

Open
@JHW5981

Description

Hello, thank you for your wonderful work.

I ran into a problem while re-implementing LazySupervisedDataset and am stuck at the step that produces the training labels: every label is -100.

image

Below is a screenshot of my dataset:

image

I reuse your LazySupervisedDataset unchanged. Initializing it with data_path, tokenizer, image_processor, and args works without any issues. However, when I inspect the labels it generates, the tensor is entirely -100.
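To make the symptom concrete, a check along these lines can be run on any sample. It is only a sketch: `supervised_fraction` is a hypothetical helper, not part of the repository, and the token ids are made up.

```python
IGNORE_INDEX = -100  # label value that marks a position as excluded from the loss

def supervised_fraction(labels):
    """Return the share of label positions that are not IGNORE_INDEX."""
    labels = list(labels)
    if not labels:
        return 0.0
    kept = sum(1 for token_id in labels if token_id != IGNORE_INDEX)
    return kept / len(labels)

# A healthy sample keeps its response tokens; a broken one keeps nothing.
healthy = [-100, -100, 306, 4966, 2]   # made-up ids
broken = [-100] * 5
print(supervised_fraction(healthy))
print(supervised_fraction(broken))
```

For a label tensor, passing `labels.tolist()` should work. A value of 0.0 means the sample contributes nothing to the loss.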

I debugged this strange behavior and found that the issue occurs because of the following piece of code:

image

First, by the time the if-clause above is reached in the “user round,” cur_len is never equal to total_len, so the line target[:] = IGNORE_INDEX always executes.

image
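The effect of such a length guard can be reproduced in isolation. The following is a toy sketch, not the code from sft_data_utils.py: `mask_instructions`, its parameters, and the token ids are all hypothetical, and the per-round token counts are passed in directly instead of coming from a tokenizer. It assumes the usual pattern of accumulating cur_len round by round and discarding the sample when the final count disagrees with total_len.

```python
IGNORE_INDEX = -100

def mask_instructions(input_ids, rounds, start_offset):
    """Toy per-round label masking with a length-mismatch guard.

    rounds: list of (instruction_len, response_len) token counts, in order.
    start_offset: tokens assumed to precede the first round (1 if a bos
    token is expected, 0 otherwise).
    """
    target = list(input_ids)
    total_len = len(input_ids)
    cur_len = start_offset
    target[:cur_len] = [IGNORE_INDEX] * cur_len
    for instruction_len, response_len in rounds:
        # Hide the instruction part of the round; the response stays supervised.
        end = min(cur_len + instruction_len, total_len)
        target[cur_len:end] = [IGNORE_INDEX] * (end - cur_len)
        cur_len += instruction_len + response_len
    # The guard: any disagreement in the bookkeeping wipes the whole sample.
    if cur_len != total_len:
        target[:] = [IGNORE_INDEX] * total_len
    return target

ids = list(range(10, 20))   # 10 made-up token ids with no bos token in front
rounds = [(3, 2), (2, 3)]   # two rounds, 10 tokens in total

ok = mask_instructions(ids, rounds, start_offset=0)
bad = mask_instructions(ids, rounds, start_offset=1)
print(ok)
print(bad)
```

With `start_offset=0` the counts agree and only the instruction tokens are masked. With `start_offset=1` the running count ends one token past total_len, so the guard masks everything, which matches the all -100 labels described here.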

Second, the code at line 226 does not skip the bos token but instead skips the "<|user|>" token. I don’t understand the reasoning behind this behavior.

image
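One way to avoid skipping a real prompt token is to derive the initial offset from the tokenized sequence itself instead of hard-coding it. This is a sketch under assumptions: `leading_tokens_to_skip` is a hypothetical helper, and the ids below are placeholders, not values read from the Phi-3 tokenizer.

```python
def leading_tokens_to_skip(input_ids, bos_token_id):
    """Return 1 if the sequence starts with the bos token, else 0."""
    if bos_token_id is not None and input_ids and input_ids[0] == bos_token_id:
        return 1
    return 0

BOS_ID = 1        # placeholder id for the bos token
USER_ID = 9001    # placeholder id standing in for "<|user|>"

with_bos = [BOS_ID, USER_ID, 42, 43]
without_bos = [USER_ID, 42, 43]

print(leading_tokens_to_skip(with_bos, BOS_ID))
print(leading_tokens_to_skip(without_bos, BOS_ID))
```

If the tokenizer does not prepend a bos token, the offset is 0 and the "<|user|>" token is counted as part of the first round instead of being skipped.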
