
Mask2Former Optimization Pipeline
PyTorch → ONNX → INT8 quantization; edge-ready segmentation.
Problem
Pretrained segmentation models such as Mask2Former ran too slowly for inference on low-power edge devices.
Approach
- Converted pretrained Mask2Former from PyTorch to ONNX
- Planned INT8 calibration set and engine build in TensorRT
- Defined latency/accuracy targets and benchmark protocol
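A minimal sketch of what the latency half of such a benchmark protocol could look like, using only the Python standard library. The function name `benchmark_latency`, the warm-up and iteration counts, and the choice of mean/median/p95 statistics are illustrative assumptions, not the project's actual protocol. The callable passed in stands for one inference call on whichever backend is being measured.

```python
import math
import statistics
import time
from typing import Callable, Dict


def benchmark_latency(run_once: Callable[[], object],
                      warmup: int = 10,
                      iters: int = 50) -> Dict[str, float]:
    """Time a zero-argument callable; return latency statistics in milliseconds."""
    if iters < 1:
        raise ValueError("iters must be at least 1")

    # Warm-up calls are discarded so one-off costs (lazy init, cold caches)
    # do not skew the timed samples.
    for _ in range(warmup):
        run_once()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = math.ceil(0.95 * len(samples_ms)) - 1
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
    }


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace with a real inference call.
    stats = benchmark_latency(lambda: sum(range(100_000)), warmup=5, iters=30)
    print(stats)
```

Running the same function on the baseline and the optimized model, on the same device and inputs, keeps the two numbers comparable. On GPU backends the callable should block until the result is ready; otherwise only the launch overhead is timed.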
Impact
- Target: ~40% lower inference latency than the unoptimized model, with ≤2% accuracy loss
- Repeatable export → optimize → deploy workflow for edge devices
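The two targets above can be turned into a pass/fail gate, sketched below. This assumes the ~40% figure is a relative latency reduction against the baseline and that the 2% accuracy bound is an absolute drop on a metric scaled to 0..1; the source does not say which interpretation is intended. The function name `check_targets` and its parameters are hypothetical.

```python
from typing import Dict


def check_targets(baseline_ms: float, optimized_ms: float,
                  baseline_acc: float, optimized_acc: float,
                  min_latency_drop: float = 0.40,
                  max_acc_drop: float = 0.02) -> Dict[str, object]:
    """Compare an optimized model against its baseline using the stated targets."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")

    # Relative latency reduction: 0.40 means the optimized model is 40% faster.
    latency_drop = (baseline_ms - optimized_ms) / baseline_ms
    # Assumption: accuracy is on a 0..1 scale and the bound is an absolute drop.
    acc_drop = baseline_acc - optimized_acc

    latency_ok = latency_drop >= min_latency_drop
    accuracy_ok = acc_drop <= max_acc_drop
    return {
        "latency_drop": latency_drop,
        "acc_drop": acc_drop,
        "latency_ok": latency_ok,
        "accuracy_ok": accuracy_ok,
        "passed": latency_ok and accuracy_ok,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only; these are not measured results.
    report = check_targets(baseline_ms=100.0, optimized_ms=55.0,
                           baseline_acc=0.50, optimized_acc=0.49)
    print(report["passed"])
```

Keeping both thresholds as parameters makes it easy to tighten or relax the gate once real measurements exist.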