Closed
Description
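Background
While rolling out and validating unit tests for the target ("ideal") state, the dynamic-to-static team found through the test_lstm.py unit test that setting default_main_program().random_seed does not propagate the seed in PIR mode. As a result, the seed setting has no effect under PIR. See #60343 for details.
Concrete example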
    with enable_to_static_guard(to_static):
        paddle.static.default_main_program().random_seed = 1001
        paddle.static.default_startup_program().random_seed = 1001
        net = paddle.jit.to_static(Net(12, 2))
        x = paddle.zeros((2, 10, 12))
        y = net(x)
        return y.numpy()
After the change:
    with enable_to_static_guard(to_static):
        paddle.seed(1001)
        net = paddle.jit.to_static(Net(12, 2))
        x = paddle.zeros((2, 10, 12))
        y = net(x)
        return y.numpy()
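Additional notes:
paddle.static.default_main_program().random_seed is a method shared by dynamic and static graph modes. Both dynamic graph mode and the legacy static graph read the random_seed attribute, but PIR does not. In addition, program.random_seed was already slated for deprecation three years ago; use paddle.seed instead.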
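To find the remaining assignments that this cleanup targets, one option is to scan a test file's syntax tree for any `<expr>.random_seed = ...` statement. The following is a minimal sketch using only the Python standard library. The helper name `find_random_seed_assignments` and the `SAMPLE` snippet are hypothetical and are not part of Paddle or of this issue's tooling.

```python
import ast


def find_random_seed_assignments(source: str) -> list:
    """Return the line numbers of statements that assign to `<expr>.random_seed`.

    Hypothetical helper for illustration: the source is only parsed,
    never executed, so Paddle does not need to be installed.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, (ast.AugAssign, ast.AnnAssign)):
            targets = [node.target]
        else:
            continue
        for target in targets:
            # Matches program.random_seed, default_main_program().random_seed, etc.
            if isinstance(target, ast.Attribute) and target.attr == "random_seed":
                hits.append(node.lineno)
    return sorted(hits)


# Illustrative input modelled on the example in this issue.
SAMPLE = '''
import paddle

paddle.static.default_main_program().random_seed = 1001
paddle.static.default_startup_program().random_seed = 1001
paddle.seed(1001)
'''

print(find_random_seed_assignments(SAMPLE))  # [4, 5]
```

Each reported line is a candidate for replacement with a single `paddle.seed(...)` call, as in the "after" example above. Whether the seed value and call site can be merged still needs to be checked per test.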
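Task list
✅: Fully migrated; all unit tests pass.
🟢: Reviewed and awaiting merge; fully migrated once merged.
🔵: Available to claim.
🟡: No further effort needed at this stage; to be continued in the next stage (mostly precision issues).
🚧: Migration in progress; unit tests are not yet passing and review is not yet complete.
The normal flow is roughly:
🔵 -> 🚧 -> 🟢 -> ✅
The exceptional flow is:
🔵 -> 🚧 -> 🟡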
Stage 1 (overall progress: 0/18)
Cleanup of files under the test directory
No. | File location | Claimed by | PR |
---|---|---|---|
🔵1 | dist_allreduce_op.py dist_ctr.py dist_fleet_ctr.py dist_fleet_ctr_ps_gpu.py dist_fleet_heter_pipeline_ctr.py | | |
🔵2 | dist_fleet_raw_program_optimizer.py dist_fleet_simnet_bow.py dist_fleet_sync_batch_norm.py dist_mnist.py dist_mnist_dgc.py | | |
🔵3 | dist_mnist_fp16_allreduce.py dist_mnist_lars.py dist_se_resnext.py dist_sharding_save.py dist_word2vec.py | | |
🔵4 | fused_attention_pass_with_mp.py ir_memory_optimize_net_base.py op_test_ipu.py test_cond.py test_custom_leaky_relu_ipu.py | | |
🔵5 | test_desc_clone.py test_detection.py test_dist_base.py test_dist_data_parallel_ipu.py test_dist_pod128_sample.py | | |
🔵6 | test_dist_sample.py test_dist_transpiler.py test_dropout_nd_op.py test_eager_deletion_dynamic_rnn_base.py test_eager_deletion_padding_rnn.py | | |
🔵7 | test_eval_model_ipu.py test_fused_attention_op.py test_fused_attention_op_xpu.py test_fused_attention_pass.py test_fused_bias_dropout_residual_layer_norm_op.py | | |
🔵8 | test_fused_ec_moe_op.py test_fused_feedforward_op.py test_fused_feedforward_op_xpu.py test_fused_feedforward_pass.py test_fused_multi_transformer_int8_op.py | | |
🔵9 | test_fused_multi_transformer_op.py test_fused_resnet_basic_block_op_xpu.py test_fused_transformer_encoder_layer.py test_identity_loss_ipu.py test_imperative_deepcf.py | | |
🔵10 | test_imperative_hook_for_layer.py test_imperative_mnist.py test_imperative_mnist_sorted_gradient.py test_imperative_out_scale.py test_imperative_ptq.py | | |
🔵11 | test_imperative_qat.py test_imperative_qat_amp.py test_imperative_qat_lsq.py test_imperative_qat_matmul.py test_imperative_qat_user_defined.py | | |
🔵12 | test_imperative_recurrent_usage.py test_inference_model_io_ipu.py test_initializer.py test_initializer_nn.py test_lambv2_op.py | | |
🔵13 | test_layers.py test_llm_int8_linear.py test_metrics.py test_model_parallel_ipu.py test_modelruntime_ipu.py | | |
🔵14 | test_multiprocess_dataloader_dataset.py test_multiprocess_dataloader_dynamic.py test_multiprocess_dataloader_iterable_dataset_dynamic.py test_multiprocess_dataloader_iterable_dataset_static.py test_multiprocess_dataloader_static.py | | |
🔵15 | test_optimizer_in_control_flow.py test_optimizer_ipu.py test_program.py test_quantization_mkldnn_pass.py test_quantization_pass.py | | |
🔵16 | test_quantization_scale_pass.py test_rnn_decode_api.py test_save_inference_model.py test_seq2seq.py test_static_save_load.py | | |
🔵17 | test_static_save_load_bf16.py test_sync_batch_norm_op.py test_trt_conv_quant_dequant_pass.py test_trt_fc_fuse_quant_dequant_pass.py test_trt_matmul_quant_dequant.py | | |
🔵18 | test_user_defined_quantization.py test_weight_decay.py test_weight_only_linear.py test_word2vec.py test_yolov3.py | | |
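How to claim
Please claim tasks by leaving a comment, for example:
【报名】:1、3、12-13
(【报名】 means "sign up".) Separate multiple tasks with the enumeration comma "、". A run of consecutive tasks can be written with a hyphen, e.g. 2-5.
PR title format: start the PR title with 【Cleanup random_seed No.】 and include the task number.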
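The claim format above is regular enough to parse mechanically. The following is a minimal sketch, not the maintainers' actual tooling: the function name `parse_claim` is hypothetical, and accepting both the full-width and the ASCII colon is an assumption.

```python
import re


def parse_claim(comment: str) -> list:
    """Expand a claim such as '【报名】:1、3、12-13' into sorted task numbers.

    Hypothetical helper: items are separated by '、' and 'a-b' denotes
    an inclusive range, as described in the claiming rules.
    """
    match = re.search(r"【报名】[::]\s*(.+)", comment)
    if match is None:
        return []
    tasks = set()
    for item in match.group(1).split("、"):
        item = item.strip()
        range_match = re.fullmatch(r"([0-9]+)\s*-\s*([0-9]+)", item)
        if range_match:
            start, end = int(range_match.group(1)), int(range_match.group(2))
            tasks.update(range(start, end + 1))
        elif re.fullmatch(r"[0-9]+", item):
            tasks.add(int(item))
        # Anything else is silently ignored in this sketch.
    return sorted(tasks)


print(parse_claim("【报名】:1、3、12-13"))  # [1, 3, 12, 13]
```

A set is used so that overlapping claims such as `2-5、4` do not produce duplicates.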
Dashboard
Total tasks | 🔵 Available | 🚧 In progress | 🟢 Awaiting merge | ✅ Done | 🟡 Deferred to next stage | Completion rate |
---|---|---|---|---|---|---|
18 | 18 | 0 | 0 | 0 | 0 | 0.0% |
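The dashboard row can be derived from the per-task status markers. The sketch below assumes the completion rate is the number of ✅ tasks divided by the total, which is consistent with the 0/18 and 0.0% shown here but is not stated explicitly in the issue. The names `summarize` and `board` are hypothetical.

```python
from collections import Counter

# Status markers in the same order as the dashboard columns.
STATUSES = ["🔵", "🚧", "🟢", "✅", "🟡"]


def summarize(task_status: dict) -> dict:
    """Tally tasks per status and compute the completion rate (assumed: done / total)."""
    counts = Counter(task_status.values())
    total = len(task_status)
    row = {"total": total}
    for status in STATUSES:
        row[status] = counts[status]
    rate = counts["✅"] / total * 100 if total else 0.0
    row["completion_rate"] = f"{rate:.1f}%"
    return row


# Initial state of this issue: all 18 tasks are available to claim.
board = {number: "🔵" for number in range(1, 19)}
print(summarize(board))
```

For the initial state this reproduces the row above: 18 tasks, all 🔵, and a completion rate of 0.0%.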
Issue log
- test/collective/fleet/fused_attention_pass_with_mp.py
  - Insufficient precision
- test/collective/fleet/test_fused_attention_op_xpu.py
  - Insufficient precision