Hey 👋🏽, I'm cpuimage
Hi, I am ZhiHan Gao, living in Shantou, China.
I specialize in developing audio, video, and image processing algorithms, and I share my open-source projects on GitHub. If you find my projects useful, please consider buying me a coffee. Your support is greatly appreciated!
Professional Experience
- 👨🏽‍💻 I have worked at leading tech companies, including Baidu and KingSoft.
- 📱 Developed algorithms for multiple applications:
- 💡 Provided AI-based technical customization services and successfully delivered several AI projects.
Research Progress and Achievements
- 🌱 Here are some of my past research endeavors and achievements in deep learning and statistical algorithms:
- Deep Learning
- A Trimap-Free Solution for Real-Time Automatic Portrait Matting on Mobile Devices
- A Robust Optimizer With Normalized Accelerated Convergence Capability in Deep Learning
- A General and Adaptive Robust Loss Structure Scheme
- A Robust Loss Weighting Solution For Learning Long-Tail Data
- Image Synthesis and Semantic Manipulation Using Stable Diffusion Networks
- Stable Diffusion Architecture Optimization And Deployment On Mobile Devices
- A Robust Solution For Accelerated Training Convergence And Learning Long-Tail Data
- An Arbitrary Resolution Super Resolution Solution for the Real World
- Accelerate Stable Diffusion FP16 Inference Deployment Optimization with TensorRT
- Port Stable Diffusion X4 Upscaler To TensorFlow And Support FP16 Inference Deployment
- Port Stable Diffusion PromptGen (GPT2) To TensorFlow And Support ONNX Inference Deployment
- Improve Batch Normalization for Robust Training and Inference
- Stable Diffusion Architectural Distillation
- Content-aware 3-view synthesis based on Stable Diffusion in Game Art
- Super Resolution Solution based on Stable Diffusion
- Video Editing techniques based on Stable Diffusion
- Port Stable Diffusion XL 1.0 To TensorFlow And Support FP16 Inference Deployment
- A Plug-And-Play Algorithm For Asynchronous Inference With Frequency-Domain Decomposable Reconstruction For Arbitrary Visual Scenes
- Stable Diffusion Inference With PyTorch Weights And More Features Like Stable Diffusion Web UI In Keras 3.x
- FLUX.1 Support FP16 Inference Deployment and Low Memory Lora Training In PyTorch
- LLM from Scratch with PyTorch
- Enhanced FaceFusion: Decoupled Modules and Optimized Inference for Visual Performance
- Ultra High-Resolution Portrait Retouching
- Training-Free Universal High-Resolution Synthesis for Any Video Model (in progress)
- Statistical Algorithms
- Real-time and embedded implementation of speech enhancement algorithms based on Minimum Mean-Square Error Short-Time Spectral Amplitude estimation (MMSE-STSA)
- Deep Learning
Collaboration and Contact
- 👯 I’m looking to collaborate on audio and image algorithms
- 💬 I’m available for paid technical services and solution consulting