Real-time semantic segmentation is widely used in the field of autonomous driving and robotics. Most previous networks achieved great accuracy based on a complicated model involving mass computing. The existing lightweight networks generally reduce the parameter sizes by sacrificing the segmentation accuracy. It is critical to balance the parameters and accuracy for real-time semantic segmentation tasks. In this paper, we introduce a Lightweight-Multiscale-Feature-Fusion Network (LMFFNet) mainly composed of three types of components: Split-Extract-Merge Bottleneck (SEM-B) block, Features Fusion Module (FFM), and Multiscale Attention Decoder (MAD). The SEM-B block extracts sufficient features with fewer parameters. FFMs fuse multiscale semantic features to effectively improve the segmentation accuracy. The MAD well recovers the details of the input images through the attention mechanism. Two networks combined with different components are proposed based on the LMFFNet model. Without pretraining, the smaller network of LMFFNet-S achieves 72.7% mIoU on Cityscapes test set at the 512×1024 resolution with only 1.1 M parameters at a reference speed of 98.9 fps running on a GTX1080Ti GPU while the larger version of LMFFNet-L achieves 74.7% mIoU with 1.4 M parameters at 89.6 fps. Besides, 67.7% mIoU at 208.9 fps and 70.3% mIoU at 72.4 fps are respectively achieved for 360 × 480 and 720 × 960 resolutions on CamVid test set using LMFFNet-S while LMFFNet--L achieves 68.1% mIoU at 182.9 fps and 71.0% mIoU at 66.5 fps, correspondingly. The proposed LMFFNets make an adequate trade-off between accuracy and parameter size for real-time inference for semantic segmentation tasks.
Stars
20
Forks
3
Watchers
20
Open Issues
4
Overall repository health assessment
No package.json found
This might not be a Node.js project
40
commits