The final pipeline of the given task involves two main parts: video frame prediction and mask segmentation.
In the first part, a previously trained model is loaded using the --model1_path
argument.
This model is used to predict the 22nd frame of a test dataset, which is provided
using the --data_root
argument.
Once the 22nd frame is predicted, the second part of the pipeline involves mask segmentation.
A saved model, which has been trained specifically for this task,
is loaded using the --model2_path
argument.
The predicted 22nd frame is then passed through this model to generate a set of predicted masks.
Finally, the predicted masks are saved as a .npy file in the ./results/
folder.
To reproduce the results, you need to run the command:
python main.py
--model1_path
"Path to the checkpoint of Model 1" --model2_path
"Path to the checkpoint of Model 2" --data_root
"Path to the test dataset folder"
For example:
python main.py --model1_path '/scratch/sxp8182/SimVP-Simpler-yet-Better-Video-Prediction/results_5/Debug/checkpoint.pth' --model2_path '/scratch/sxp8182/model2504_new.pt' --data_root '/scratch/sxp8182/hidden'
The results can be found in the ./results/numpy_y_pred_masks.npy
file.
Overall, this pipeline demonstrates a powerful application of deep learning in video processing tasks, such as predicting future frames and segmenting objects within those frames.