[Tracking issue] General dataset support

The aim is for all trainers to apply the same procedure in their init function:

- if needed, apply the chat template, then
- if needed, tokenize.

## Support todo:

### Standard dataset

  - [x] `BCOTrainer`
  - [x] `CPOTrainer`
  - [x] `DPOTrainer`
  - [ ] `GKDTrainer` (same as `SFTTrainer`)
  - [ ] `IterativeSFTTrainer`
  - [x] `KTOTrainer`
  - [x] `NashMDTrainer`
  - [x] `OnlineDPOTrainer`
  - [x] `ORPOTrainer`
  - [ ] `PPOTrainer`
  - [x] `RewardTrainer` #2102
  - [ ] `RLOOTrainer`
  - [x] `SFTTrainer` (could be previously achieved via `"dataset_text_field"`) #2078; #2405
  - [x] `XPOTrainer`

### Conversational dataset

  - [x] `BCOTrainer` #2107
  - [x] `CPOTrainer` #2144
  - [x] `DPOTrainer` #2131
  - [ ] `GKDTrainer`
  - [ ] `IterativeSFTTrainer`
  - [x] `KTOTrainer` #2248
  - [x] `NashMDTrainer` #2075
  - [x] `OnlineDPOTrainer` #2075
  - [x] `ORPOTrainer` #2184
  - [ ] `PPOTrainer`
  - [x] `RewardTrainer` #2102
  - [ ] `RLOOTrainer`
  - [ ] `SFTTrainer` (yes, via `get_formatting_func_from_dataset` for now, needs refactoring); refactor in #2405
  - [x] `XPOTrainer` #2075


### Misc

- [ ] Update `docs/dataset_format.mdx`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Tracking issue] General dataset support #2071

Support todo:

Standard dataset

Conversational dataset

Misc

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development