[zero] add unit test for low-level zero init #2474

Merged: 2 commits into hpcaitech:main on Jan 15, 2023

Conversation

@1SAA (Contributor) commented on Jan 13, 2023

No description provided.

@github-actions (Contributor)

Code Coverage

Package Line Rate Complexity Health
. 0% 0
colossalai 74% 0
colossalai._C 100% 0
colossalai.amp 96% 0
colossalai.amp.apex_amp 92% 0
colossalai.amp.naive_amp 81% 0
colossalai.amp.naive_amp.grad_scaler 86% 0
colossalai.amp.torch_amp 66% 0
colossalai.auto_parallel 100% 0
colossalai.auto_parallel.checkpoint 0% 0
colossalai.auto_parallel.meta_profiler 55% 0
colossalai.auto_parallel.meta_profiler.meta_registry 33% 0
colossalai.auto_parallel.passes 7% 0
colossalai.auto_parallel.pipeline_shard 100% 0
colossalai.auto_parallel.tensor_shard 57% 0
colossalai.auto_parallel.tensor_shard.deprecated 49% 0
colossalai.auto_parallel.tensor_shard.deprecated.op_handler 61% 0
colossalai.auto_parallel.tensor_shard.node_handler 56% 0
colossalai.auto_parallel.tensor_shard.node_handler.experimental 27% 0
colossalai.auto_parallel.tensor_shard.node_handler.strategy 50% 0
colossalai.auto_parallel.tensor_shard.solver 25% 0
colossalai.auto_parallel.tensor_shard.utils 49% 0
colossalai.builder 78% 0
colossalai.cli 0% 0
colossalai.cli.benchmark 0% 0
colossalai.cli.check 0% 0
colossalai.cli.launcher 0% 0
colossalai.communication 79% 0
colossalai.context 95% 0
colossalai.context.process_group_initializer 99% 0
colossalai.context.random 91% 0
colossalai.device 31% 0
colossalai.engine 85% 0
colossalai.engine.gradient_accumulation 73% 0
colossalai.engine.gradient_handler 84% 0
colossalai.engine.schedule 48% 0
colossalai.fx 21% 0
colossalai.fx.codegen 5% 0
colossalai.fx.passes 53% 0
colossalai.fx.passes.algorithms 17% 0
colossalai.fx.passes.experimental 17% 0
colossalai.fx.profiler 14% 0
colossalai.fx.profiler.experimental 87% 0
colossalai.fx.profiler.experimental.profiler_function 63% 0
colossalai.fx.profiler.experimental.profiler_module 42% 0
colossalai.fx.tracer 39% 0
colossalai.fx.tracer.bias_addition_patch 100% 0
colossalai.fx.tracer.bias_addition_patch.patched_bias_addition_function 62% 0
colossalai.fx.tracer.bias_addition_patch.patched_bias_addition_module 90% 0
colossalai.fx.tracer.meta_patch 100% 0
colossalai.fx.tracer.meta_patch.patched_function 89% 0
colossalai.fx.tracer.meta_patch.patched_module 95% 0
colossalai.gemini 85% 0
colossalai.gemini.chunk 88% 0
colossalai.gemini.memory_tracer 69% 0
colossalai.gemini.ophooks 66% 0
colossalai.gemini.paramhooks 95% 0
colossalai.kernel 100% 0
colossalai.kernel.cuda_native 24% 0
colossalai.kernel.jit 0% 0
colossalai.logging 72% 0
colossalai.nn 52% 0
colossalai.nn._ops 84% 0
colossalai.nn.layer 67% 0
colossalai.nn.layer.colossalai_layer 85% 0
colossalai.nn.layer.moe 69% 0
colossalai.nn.layer.parallel_1d 69% 0
colossalai.nn.layer.parallel_2d 65% 0
colossalai.nn.layer.parallel_2p5d 70% 0
colossalai.nn.layer.parallel_3d 22% 0
colossalai.nn.layer.parallel_sequence 38% 0
colossalai.nn.layer.utils 90% 0
colossalai.nn.layer.vanilla 60% 0
colossalai.nn.layer.wrapper 30% 0
colossalai.nn.loss 74% 0
colossalai.nn.lr_scheduler 45% 0
colossalai.nn.metric 54% 0
colossalai.nn.optimizer 63% 0
colossalai.nn.parallel 83% 0
colossalai.nn.parallel.layers 34% 0
colossalai.nn.parallel.layers.cache_embedding 52% 0
colossalai.pipeline 48% 0
colossalai.pipeline.middleware 49% 0
colossalai.pipeline.middleware.adaptor 93% 0
colossalai.pipeline.rpc 15% 0
colossalai.registry 79% 0
colossalai.tensor 79% 0
colossalai.testing 90% 0
colossalai.trainer 68% 0
colossalai.trainer.hooks 43% 0
colossalai.utils 58% 0
colossalai.utils.checkpoint 100% 0
colossalai.utils.checkpoint_io 95% 0
colossalai.utils.data_sampler 83% 0
colossalai.utils.model 82% 0
colossalai.utils.multi_tensor_apply 78% 0
colossalai.utils.profiler 0% 0
colossalai.utils.profiler.legacy 0% 0
colossalai.utils.rank_recorder 0% 0
colossalai.utils.tensor_detector 14% 0
colossalai.zero 95% 0
colossalai.zero.init_ctx 97% 0
colossalai.zero.shard_utils 95% 0
colossalai.zero.sharded_model 60% 0
colossalai.zero.sharded_optim 84% 0
colossalai.zero.sharded_optim.bookkeeping 91% 0
colossalai.zero.sharded_param 97% 0
colossalai.zero.utils 90% 0
op_builder 53% 0
tests 100% 0
tests.components_to_test 97% 0
tests.components_to_test.utils 87% 0
tests.test_amp 97% 0
tests.test_auto_parallel 100% 0
tests.test_auto_parallel.test_tensor_shard 37% 0
tests.test_auto_parallel.test_tensor_shard.test_deprecated 57% 0
tests.test_auto_parallel.test_tensor_shard.test_deprecated.test_deprecated_op_handler 71% 0
tests.test_auto_parallel.test_tensor_shard.test_gpt 31% 0
tests.test_auto_parallel.test_tensor_shard.test_metainfo 44% 0
tests.test_auto_parallel.test_tensor_shard.test_node_handler 27% 0
tests.test_autochunk 42% 0
tests.test_autochunk.evoformer 27% 0
tests.test_comm 85% 0
tests.test_config 100% 0
tests.test_context 77% 0
tests.test_context.configs 100% 0
tests.test_data 97% 0
tests.test_data_pipeline_tensor_parallel 72% 0
tests.test_ddp 98% 0
tests.test_device 72% 0
tests.test_engine 97% 0
tests.test_fx 96% 0
tests.test_fx.test_ckpt_solvers 31% 0
tests.test_fx.test_codegen 29% 0
tests.test_fx.test_meta 51% 0
tests.test_fx.test_pipeline.test_hf_model 38% 0
tests.test_fx.test_pipeline.test_timm_model 38% 0
tests.test_fx.test_pipeline.test_topo 97% 0
tests.test_fx.test_pipeline.test_torchvision 41% 0
tests.test_fx.test_profiler 29% 0
tests.test_fx.test_tracer 99% 0
tests.test_fx.test_tracer.test_hf_model 81% 0
tests.test_fx.test_tracer.test_timm_model 87% 0
tests.test_fx.test_tracer.test_torchaudio_model 78% 0
tests.test_fx.test_tracer.test_torchrec_model 93% 0
tests.test_fx.test_tracer.test_torchvision_model 96% 0
tests.test_gemini 95% 0
tests.test_gemini.update 98% 0
tests.test_layers 53% 0
tests.test_layers.test_1d 97% 0
tests.test_layers.test_1d.checks_1d 99% 0
tests.test_layers.test_2d 91% 0
tests.test_layers.test_2d.checks_2d 78% 0
tests.test_layers.test_2p5d 98% 0
tests.test_layers.test_2p5d.checks_2p5d 100% 0
tests.test_layers.test_3d 45% 0
tests.test_layers.test_3d.checks_3d 4% 0
tests.test_layers.test_sequence 64% 0
tests.test_moe 97% 0
tests.test_ops 96% 0
tests.test_optimizer 96% 0
tests.test_pipeline 35% 0
tests.test_tensor 78% 0
tests.test_tensor.common_utils 75% 0
tests.test_tensor.core 88% 0
tests.test_tensor.model 56% 0
tests.test_trainer 84% 0
tests.test_trainer.test_pipeline 98% 0
tests.test_utils 80% 0
tests.test_utils.test_checkpoint 39% 0
tests.test_utils.test_checkpoint_io 95% 0
tests.test_zero 93% 0
tests.test_zero.low_level_zero 97% 0
Summary 58% (32979 / 57103) 0

from colossalai.zero import LowLevelZeroOptimizer


class TestModel(nn.Module):
Contributor

can we use the common model from tests/components_to_test?

Contributor (Author)

It's just an initialization test.
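
As context for this exchange, a minimal throwaway model of the kind an initialization-only test needs might look like the sketch below. This is illustrative only: the layer names and sizes are assumptions, not the PR's actual TestModel.

import torch.nn as nn


class TestModel(nn.Module):

    def __init__(self):
        super().__init__()
        # two plain linear layers are enough to exercise optimizer initialization
        self.linear1 = nn.Linear(32, 128)
        self.linear2 = nn.Linear(128, 32)

    def forward(self, x):
        return self.linear2(self.linear1(x))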



def run_dist(rank, world_size, port):
    config_dict = dict(parallel=dict(tensor=dict(size=2, mode='1d')))
Contributor

can we use tp_degree to replace 2 here?

Contributor (Author)

We just force the dp degree and tp degree here.



def exam_zero_init():
    dp_2_tp_2_pg = ProcessGroup(tp_degree=2)
Contributor

can we use tp_degree to replace 2 here?

Contributor (Author)

We just force the dp degree and tp degree here.
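
To make the thread easier to follow, here is a minimal sketch of how the reviewed pieces might fit together in a low-level zero init test. It is a sketch under assumptions rather than the PR's actual file: the launch/spawn boilerplate, the world size of 4 (dp degree 2 x tp degree 2), the toy model, and the bare LowLevelZeroOptimizer call are illustrative, and the reviewed excerpt does not show how the ProcessGroup is ultimately wired into the optimizer.

from functools import partial

import pytest
import torch
import torch.multiprocessing as mp
import torch.nn as nn

import colossalai
from colossalai.tensor import ProcessGroup
from colossalai.testing import rerun_if_address_is_in_use
from colossalai.utils import free_port
from colossalai.zero import LowLevelZeroOptimizer


def exam_zero_init():
    # force dp degree 2 and tp degree 2, as discussed above; how this group is
    # attached to the optimizer is not shown in the reviewed excerpt
    dp_2_tp_2_pg = ProcessGroup(tp_degree=2)

    model = nn.Linear(32, 32).cuda()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # wrap the torch optimizer with the low-level ZeRO optimizer; extra keyword
    # arguments are omitted because they are not part of the reviewed excerpt
    zero_optimizer = LowLevelZeroOptimizer(optimizer)
    assert zero_optimizer is not None


def run_dist(rank, world_size, port):
    # force tensor parallel size 2 in '1d' mode; with 4 ranks this also forces dp degree 2
    config_dict = dict(parallel=dict(tensor=dict(size=2, mode='1d')))
    colossalai.launch(config=config_dict, rank=rank, world_size=world_size,
                      host='localhost', port=port, backend='nccl')
    exam_zero_init()


@pytest.mark.dist
@rerun_if_address_is_in_use()
def test_zero_init():
    world_size = 4    # assumed: dp degree 2 x tp degree 2
    run_func = partial(run_dist, world_size=world_size, port=free_port())
    mp.spawn(run_func, nprocs=world_size)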

@github-actions (Contributor)

Code Coverage

Summary 58% (32979 / 57101) 0

@feifeibear merged commit 21c8822 into hpcaitech:main on Jan 15, 2023
Cypher30 added a commit that referenced this pull request on Jan 27, 2023:
* init

* rename and remove useless func

* basic chunk

* add evoformer

* align evoformer

* add meta

* basic chunk

* basic memory

* finish basic inference memory estimation

* finish memory estimation

* fix bug

* finish memory estimation

* add part of index tracer

* finish basic index tracer

* add doc string

* add doc str

* polish code

* polish code

* update active log

* polish code

* add possible region search

* finish region search loop

* finish chunk define

* support new op

* rename index tracer

* finish codegen on msa

* redesign index tracer, add source and change compute

* pass outproduct mean

* code format

* code format

* work with outerproductmean and msa

* code style

* code style

* code style

* code style

* change threshold

* support check_index_duplicate

* support index duplicate and update loop

* support output

* update memory estimate

* optimise search

* fix layernorm

* move flow tracer

* refactor flow tracer

* format code

* refactor flow search

* code style

* adapt codegen to prepose node

* code style

* remove abandoned function

* remove flow tracer

* code style

* code style

* reorder nodes

* finish node reorder

* update run

* code style

* add chunk select class

* add chunk select

* code style

* add chunksize in emit, fix bug in reassign shape

* code style

* turn off print mem

* add evoformer openfold init

* init openfold

* add benchmark

* add print

* code style

* code style

* init openfold

* update openfold

* align openfold

* use max_mem to control strategy

* update source add

* add reorder in mem estimator

* improve reorder efficiency

* support ones_like, add prompt if fit mode search fails

* fix a bug in ones_like, don't gen chunk if dim size is 1

* fix bug again

* update min memory strategy, reduce mem usage by 30%

* last version of benchmark

* refactor structure

* restructure dir

* update test

* rename

* take apart chunk code gen

* close mem and code print

* code format

* rename ambiguous variable

* separate flow tracer

* separate input node dim search

* separate prepose_nodes

* separate non chunk input

* separate reorder

* rename

* add reorder graph

* separate trace flow

* code style

* code style

* fix typo

* set benchmark

* rename test

* update codegen test

* Fix state_dict key missing issue of the ZeroDDP (#2363)

* Fix state_dict output for ZeroDDP duplicated parameters

* Rewrite state_dict based on get_static_torch_model

* Modify get_static_torch_model to be compatible with the lower version (ZeroDDP)

* update codegen test

* update codegen test

* add chunk search test

* code style

* add available

* [hotfix] fix gpt gemini example (#2404)

* [hotfix] fix gpt gemini example

* [example] add new assertions

* remove autochunk_available

* [workflow] added nightly release to pypi (#2403)

* add comments

* code style

* add doc for search chunk

* [doc] updated readme regarding pypi installation (#2406)

* add doc for search

* [doc] updated kernel-related optimisers' docstring (#2385)

* [doc] updated kernel-related optimisers' docstring

* polish doc

* rename trace_index to trace_indice

* rename function from index to indice

* rename

* rename in doc

* [polish] polish code for get_static_torch_model (#2405)

* [gemini] polish code

* [testing] remove code

* [gemini] make more robust

* rename

* rename

* remove useless function

* [workflow] added coverage test (#2399)

* [workflow] added coverage test

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* add doc for trace indice

* [docker] updated Dockerfile and release workflow (#2410)

* add doc

* update doc

* add available

* change imports

* add test in import

* [workflow] refactored the example check workflow (#2411)

* [workflow] refactored the example check workflow

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* Update parallel_context.py (#2408)

* [hotfix] add DISTPAN argument for benchmark (#2412)

* change the benchmark config file

* change config

* revert config file

* rename distpan to distplan

* [workflow] added precommit check for code consistency (#2401)

* [workflow] added precommit check for code consistency

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* polish code

* adapt new fx

* [workflow] added translation for non-english comments (#2414)

* [setup] refactored setup.py for dependency graph (#2413)

* change import

* update doc

* [workflow] auto comment if precommit check fails (#2417)

* [hotfix] add norm clearing for the overflow step (#2416)

* [examples] adding tflops to PaLM (#2365)

* [workflow]auto comment with test coverage report (#2419)

* [workflow]auto comment with test coverage report

* polish code

* polish yaml

* [doc] added documentation for CI/CD (#2420)

* [doc] added documentation for CI/CD

* polish markdown

* polish markdown

* polish markdown

* [example] removed duplicated stable diffusion example (#2424)

* [zero] add inference mode and its unit test (#2418)

* [workflow] report test coverage even if below threshold (#2431)

* [example] improved the clarity of the example readme (#2427)

* [example] improved the clarity of the example readme

* polish workflow

* polish workflow

* polish workflow

* polish workflow

* polish workflow

* polish workflow

* [ddp] add is_ddp_ignored (#2434)

[ddp] rename to is_ddp_ignored

* [workflow] make test coverage report collapsable (#2436)

* [autoparallel] add shard option (#2423)

* [fx] allow native ckpt trace and codegen. (#2438)

* [cli] provided more details if colossalai run fail (#2442)

* [autoparallel] integrate device mesh initialization into autoparallelize (#2393)

* [autoparallel] integrate device mesh initialization into autoparallelize

* add megatron solution

* update gpt autoparallel examples with latest api

* adapt beta value to fit the current computation cost

* [zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443)

* [ddp] add is_ddp_ignored

[ddp] rename to is_ddp_ignored

* [zero] fix state_dict and load_state_dict

* fix bugs

* [zero] update unit test for ZeroDDP

* [example] updated the hybrid parallel tutorial (#2444)

* [example] updated the hybrid parallel tutorial

* polish code

* [zero] add warning for ignored parameters (#2446)

* [example] updated large-batch optimizer tutorial (#2448)

* [example] updated large-batch optimizer tutorial

* polish code

* polish code

* [example] fixed seed error in train_dreambooth_colossalai.py (#2445)

* [workflow] fixed the on-merge condition check (#2452)

* [workflow] automated the compatibility test (#2453)

* [workflow] automated the compatibility test

* polish code

* [autoparallel] update binary elementwise handler (#2451)

* [autoparallel] update binary elementwise handler

* polish

* [workflow] automated bdist wheel build (#2459)

* [workflow] automated bdist wheel build

* polish workflow

* polish readme

* polish readme

* Fix False warning in initialize.py (#2456)

* Update initialize.py

* pre-commit run check

* [examples] update autoparallel tutorial demo (#2449)

* [examples] update autoparallel tutorial demo

* add test_ci.sh

* polish

* add conda yaml

* [cli] fixed hostname mismatch error (#2465)

* [example] integrate autoparallel demo with CI (#2466)

* [example] integrate autoparallel demo with CI

* polish code

* polish code

* polish code

* polish code

* [zero] low level optim supports ProcessGroup (#2464)

* [example] update vit ci script (#2469)

* [example] update vit ci script

* [example] update requirements

* [example] update requirements

* [example] integrate seq-parallel tutorial with CI (#2463)

* [zero] polish low level optimizer (#2473)

* polish pp middleware (#2476)

Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>

* [example] update gpt gemini example ci test (#2477)

* [zero] add unit test for low-level zero init (#2474)

* [workflow] fixed the skip condition of example weekly check workflow (#2481)

* [example] stable diffusion add roadmap

* add dummy test_ci.sh

* [example] stable diffusion add roadmap (#2482)

* [CI] add test_ci.sh for palm, opt and gpt (#2475)

* polish code

* [example] titans for gpt

* polish readme

* remove license

* polish code

* update readme

* [example] titans for gpt (#2484)

* [autoparallel] support origin activation ckpt on autoprallel system (#2468)

* [autochunk] support evoformer tracer (#2485)

Support the full evoformer tracer, which is a main module of AlphaFold; previously we only supported a simplified version of it.
1. support some of evoformer's ops in fx
2. support evoformer tests
3. add repos for test code

* [example] fix requirements (#2488)

* [zero] add unit testings for hybrid parallelism  (#2486)

* [hotfix] gpt example titans bug #2493

* polish code and fix dataloader bugs

* [hotfix] gpt example titans bug #2493 (#2494)

* [fx] allow control of ckpt_codegen init (#2498)

* [fx] allow control of ckpt_codegen init

Currently in ColoGraphModule, ActivationCheckpointCodeGen is set automatically in __init__, so no other codegen can be used.
This adds an argument to control whether ActivationCheckpointCodeGen is set in __init__.

* code style

* [example] dreambooth example

* add test_ci.sh to dreambooth

* [autochunk] support autochunk on evoformer (#2497)

* Revert "Update parallel_context.py (#2408)"

This reverts commit 7d5640b.

* add avg partition (#2483)

Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>

* [auto-chunk] support extramsa (#3) (#2504)

* [utils] lazy init. (#2148)

* [utils] lazy init.

* [utils] remove description.

* [utils] complete.

* [utils] finalize.

* [utils] fix names.

* [autochunk] support parsing blocks (#2506)

* [zero] add strict ddp mode (#2508)

* [zero] add strict ddp mode

* [polish] add comments for strict ddp mode

* [zero] fix test error

* [doc] update opt and tutorial links (#2509)

* [workflow] fixed changed file detection (#2515)

Co-authored-by: oahzxl <xuanlei.zhao@gmail.com>
Co-authored-by: eric8607242 <e0928021388@gmail.com>
Co-authored-by: HELSON <c2h214748@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: Haofan Wang <haofanwang.ai@gmail.com>
Co-authored-by: Jiarui Fang <fangjiarui123@gmail.com>
Co-authored-by: ZijianYY <119492445+ZijianYY@users.noreply.github.com>
Co-authored-by: YuliangLiu0306 <72588413+YuliangLiu0306@users.noreply.github.com>
Co-authored-by: Super Daniel <78588128+super-dainiu@users.noreply.github.com>
Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Ziyue Jiang <ziyue.jiang97@gmail.com>
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
Co-authored-by: oahzxl <43881818+oahzxl@users.noreply.github.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
Co-authored-by: アマデウス <kurisusnowdeng@users.noreply.github.com>