Using preprocess_phi_3_new in LAVIS/open_flamingo/train/sft_data_utils.py gets labels all -100. #776
Description
Hello, thank you for your wonderful work.
I am having trouble re-implementing LazySupervisedDataset and am stuck at the step that produces the training labels: every label is -100.
Below is a screenshot of my dataset:
I reuse your LazySupervisedDataset unchanged. Initializing it with data_path, tokenizer, image_processor, and args works without any issues. However, when I inspect the labels it generates, the tensor is entirely -100.
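For anyone who wants to reproduce the check, here is a minimal sketch of how the labels can be inspected. The helper name `supervised_fraction` is hypothetical and not part of the repository; it only assumes that ignored positions carry the value -100.

```python
IGNORE_INDEX = -100  # label value the loss is expected to ignore


def supervised_fraction(labels):
    """Return the fraction of label positions that are not IGNORE_INDEX.

    `labels` is a flat sequence of ints, for example the result of
    calling .tolist() on a 1-D label tensor.
    """
    labels = list(labels)
    if not labels:
        return 0.0
    kept = sum(1 for value in labels if value != IGNORE_INDEX)
    return kept / len(labels)


if __name__ == "__main__":
    # Toy label rows, not taken from the real dataset.
    print(supervised_fraction([-100, -100, -100, -100]))
    print(supervised_fraction([-100, 5, 7, -100]))
```

A result of 0.0 for every sample means no token contributes to the loss. On a real sample this could be called as `supervised_fraction(sample["labels"].tolist())`, assuming the dataset returns a dict with a `labels` tensor.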
I debugged this strange behavior and found that the issue occurs because of the following piece of code:
First, when the if-clause above reaches the “user round,” cur_len is never equal to total_len, so the line target[:] = IGNORE_INDEX is always executed.
Second, the code at line 226 does not skip the bos token but instead skips the "<|user|>" token. I don’t understand the reasoning behind this behavior.
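To illustrate how the two observations could be related, here is a toy sketch of round-based label masking. It is not the code from sft_data_utils.py: the function `build_labels`, its parameters, and the token ids are all made up for illustration. It assumes the masking starts by skipping a fixed number of leading tokens (intended to be BOS) and falls back to masking everything when cur_len and total_len disagree.

```python
IGNORE_INDEX = -100  # label value the loss is expected to ignore


def build_labels(input_ids, rounds, skip_first):
    """Toy round-based masking (hypothetical, not the repository's code).

    input_ids:  token ids of the whole conversation.
    rounds:     list of (user_len, assistant_len) token counts per round.
    skip_first: number of leading tokens assumed to precede the first
                round, e.g. 1 if a BOS token is expected.
    """
    target = list(input_ids)
    total_len = len(target)

    # Mask the assumed prefix (meant to be the BOS token).
    cur_len = skip_first
    for i in range(min(cur_len, total_len)):
        target[i] = IGNORE_INDEX

    for user_len, assistant_len in rounds:
        # Mask the user part of the round; the assistant part stays supervised.
        for i in range(cur_len, min(cur_len + user_len, total_len)):
            target[i] = IGNORE_INDEX
        cur_len += user_len + assistant_len

    # Any length mismatch discards the whole sample.
    if cur_len != total_len:
        target[:] = [IGNORE_INDEX] * total_len
    return target


if __name__ == "__main__":
    rounds = [(3, 2)]  # one round: 3 user tokens, 2 assistant tokens
    # Tokenizer prepends BOS (id 1): the skip of 1 is correct.
    print(build_labels([1, 10, 11, 12, 20, 21], rounds, skip_first=1))
    # Tokenizer adds no BOS: the skip of 1 consumes the first user token.
    print(build_labels([10, 11, 12, 20, 21], rounds, skip_first=1))
```

In the second call the first token is already part of the user turn, so cur_len ends at 6 while total_len is 5 and every label becomes -100. Whether this is what happens in preprocess_phi_3_new depends on whether the tokenizer actually prepends a BOS token, which I have not confirmed.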