add 10b experiment to flava native and fix checkpoint wrapper (#309)
Summary: Adds a 10B experiment config to the FLAVA native training script and fixes activation checkpointing issues related to kwargs and re-entrant checkpointing.
Pull Request resolved: #309
Test Plan: `torchrun --nproc_per_node=8 -m flava.native.train config=flava/native/configs/10b.yaml`
Fixes #{issue number}
Reviewed By: ankitade
Differential Revision: D39563955
Pulled By: edward-io
fbshipit-source-id: 93d1003c6a238e5f756581ca9507501edf2aa4df
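The summary does not show the checkpoint wrapper fix itself. As background, PyTorch's re-entrant activation checkpointing rejects keyword arguments intended for the wrapped function, which is one way kwargs and re-entrant mode can collide. The sketch below illustrates a common workaround, binding keyword arguments with `functools.partial` before the call. It is dependency-free: `positional_only_checkpoint`, `checkpoint_with_kwargs`, and `layer_forward` are hypothetical names invented for this illustration, not PyTorch or FLAVA APIs, and this is not necessarily what the commit does.

```python
from functools import partial


def positional_only_checkpoint(fn, *args, **kwargs):
    """Hypothetical stand-in for a checkpoint API that rejects keyword arguments."""
    if kwargs:
        raise ValueError(f"Unexpected keyword arguments: {sorted(kwargs)}")
    # A real implementation would discard activations here and recompute them
    # during the backward pass; this stand-in simply calls the function.
    return fn(*args)


def checkpoint_with_kwargs(fn, *args, **kwargs):
    """Bind kwargs to fn first so only positional arguments reach the checkpoint call."""
    return positional_only_checkpoint(partial(fn, **kwargs), *args)


def layer_forward(hidden, attention_mask=None, scale=1.0):
    """Toy layer: scale the inputs and zero out masked positions."""
    if attention_mask is None:
        attention_mask = [1] * len(hidden)
    return [h * scale * m for h, m in zip(hidden, attention_mask)]


result = checkpoint_with_kwargs(
    layer_forward, [1.0, 2.0, 3.0], attention_mask=[1, 0, 1], scale=2.0
)
print(result)
```

PyTorch also offers a non-re-entrant mode (`use_reentrant=False` in `torch.utils.checkpoint.checkpoint`) that accepts keyword arguments directly, so a wrapper like this is only one of several possible approaches.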
1 parent e5e9d4f, commit 53ab78b
Showing 8 changed files with 194 additions and 34 deletions.
@@ -0,0 +1,5 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
@@ -0,0 +1,80 @@
training:
  strategy: fsdp # can be changed to ddp or fsdp
  seed: 1337

  batch_size: 8
  num_workers: 4
  prefetch_factor: 3

  optimizer:
    learning_rate: 1e-3
    adam_eps: 1e-8
    adam_weight_decay: 0.1
    adam_betas: [0.9, 0.999]

  warmup_steps: 10000
  max_steps: 100000

  validation_steps: 5000
  log_interval: 10

  enable_tf32: True
  enable_amp: True
  half_precision_format: "bfloat16" # or float16
  enable_half_reduce_in_fsdp: True # handles the reduction across devices in half precision

  activation_checkpointing: True

datasets:
  _target_: flava.definitions.TrainingDatasetsInfo
  selected:
    - image
    - vl
    - text
  image:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: imagenet-1k
        subset: default
  text:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: wikitext
        subset: wikitext-103-raw-v1
    datamodule_extra_kwargs:
      text_columns: ["text"]
  vl:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: red_caps
        subset: backpacking
        rename_columns:
          - ["caption", "text"]
    val:
      - _target_: flava.definitions.HFDatasetInfo
        key: red_caps
        subset: backpacking
        rename_columns:
          - ["caption", "text"]
        split_key_mapping:
          validation: train


model:
  image_num_hidden_layers: 64
  image_hidden_size: 2048
  image_intermediate_size: 10240
  image_num_attention_heads: 16

  text_num_hidden_layers: 64
  text_hidden_size: 2048
  text_intermediate_size: 10240
  text_num_attention_heads: 16

  multimodal_num_hidden_layers: 40
  multimodal_hidden_size: 2048
  multimodal_intermediate_size: 10240
  multimodal_num_attention_heads: 16
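As a rough sanity check on the model section above, the parameter count can be estimated from the layer counts and widths. This sketch assumes each encoder is a stack of standard transformer layers and counts only the four attention projections and the two MLP matrices, ignoring biases, layer norms, and embeddings. The helper `transformer_params` is a hypothetical name for illustration, not part of the FLAVA code.

```python
def transformer_params(num_layers, hidden_size, intermediate_size):
    """Approximate weight count for a stack of standard transformer layers.

    Counts only the attention projections (4 * h * h) and the two MLP
    matrices (2 * h * i); biases, layer norms and embeddings are ignored.
    """
    per_layer = 4 * hidden_size * hidden_size + 2 * hidden_size * intermediate_size
    return num_layers * per_layer


# (num_hidden_layers, hidden_size, intermediate_size) copied from the model section above.
encoders = {
    "image": (64, 2048, 10240),
    "text": (64, 2048, 10240),
    "multimodal": (40, 2048, 10240),
}

counts = {name: transformer_params(*shape) for name, shape in encoders.items()}
total = sum(counts.values())
head_dim = 2048 // 16  # hidden_size / num_attention_heads

print(f"total ~ {total / 1e9:.2f}B parameters, head_dim = {head_dim}")
```

Under these assumptions the three encoders come to roughly 9.87 billion weights, which is consistent with a config described as 10B.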
@@ -0,0 +1,79 @@
training:
  strategy: fsdp # can be changed to ddp or fsdp
  seed: 1337

  batch_size: 12
  num_workers: 4
  prefetch_factor: 3

  optimizer:
    learning_rate: 1e-3
    adam_eps: 1e-8
    adam_weight_decay: 0.1
    adam_betas: [0.9, 0.999]

  warmup_steps: 10000
  max_steps: 100000

  validation_steps: 5000
  log_interval: 10

  enable_tf32: True
  enable_amp: True
  half_precision_format: "bfloat16" # or float16
  enable_half_reduce_in_fsdp: True # handles the reduction across devices in half precision

  activation_checkpointing: True

datasets:
  _target_: flava.definitions.TrainingDatasetsInfo
  selected:
    - image
    - vl
    - text
  image:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: imagenet-1k
        subset: default
  text:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: wikitext
        subset: wikitext-103-raw-v1
    datamodule_extra_kwargs:
      text_columns: ["text"]
  vl:
    _target_: flava.definitions.TrainingSingleDatasetInfo
    train:
      - _target_: flava.definitions.HFDatasetInfo
        key: red_caps
        subset: backpacking
        rename_columns:
          - ["caption", "text"]
    val:
      - _target_: flava.definitions.HFDatasetInfo
        key: red_caps
        subset: backpacking
        rename_columns:
          - ["caption", "text"]
        split_key_mapping:
          validation: train

model:
  image_num_hidden_layers: 48
  image_hidden_size: 1664
  image_intermediate_size: 8192
  image_num_attention_heads: 16

  text_num_hidden_layers: 48
  text_hidden_size: 1664
  text_intermediate_size: 8192
  text_num_attention_heads: 16

  multimodal_num_hidden_layers: 24
  multimodal_hidden_size: 1664
  multimodal_intermediate_size: 8192
  multimodal_num_attention_heads: 16
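The `_target_` keys in the datasets section appear to follow the Hydra-style convention in which a dotted path names the class to construct and the sibling keys become its keyword arguments. The sketch below shows that idea with only the standard library. The `instantiate` function here is a simplified, hypothetical stand-in written for illustration (the project presumably relies on its config library for this), and `types.SimpleNamespace` replaces `flava.definitions.HFDatasetInfo` so the example runs without the flava package.

```python
import importlib


def instantiate(node):
    """Recursively build objects from dicts that carry a `_target_` dotted path."""
    if isinstance(node, list):
        return [instantiate(item) for item in node]
    if not isinstance(node, dict):
        return node
    # Resolve children first so nested `_target_` entries become objects too.
    kwargs = {key: instantiate(value) for key, value in node.items() if key != "_target_"}
    if "_target_" not in node:
        return kwargs
    module_name, _, attr = node["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr)
    return target(**kwargs)


# Stand-in target from the standard library (assumption: the real config
# points at flava.definitions.HFDatasetInfo instead).
cfg = {
    "_target_": "types.SimpleNamespace",
    "key": "red_caps",
    "subset": "backpacking",
    "rename_columns": [["caption", "text"]],
}
info = instantiate(cfg)
print(info.key, info.subset, info.rename_columns)
```

Because the function recurses through lists and plain dicts, a structure such as `{"train": [cfg]}` resolves to a dict whose `train` list holds constructed objects, mirroring how the `train` and `val` lists are laid out in the config.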