Abstract:
To address the low detection accuracy, high false positive and false negative rates, and poor trajectory continuity exhibited by traditional image processing methods and deep learning methods in target detection and tracking under complex traffic scenarios, a collaborative optimization framework integrating the advantages of both approaches is proposed. First, foreground extraction is improved by optimizing the background image and the extracted contours, mitigating the sensitivity of traditional methods to interference. Second, by fusing the foreground pixels extracted by traditional image processing with the YOLO detection boxes, a minimum bounding rectangle calibration mechanism is proposed to dynamically adjust the fit of the detection boxes and eliminate false detections. Finally, the foreground pixel information is integrated into the SORT tracking framework, and tracking continuity is enhanced via optical flow field compensation to mitigate the trajectory fragmentation and identity switching prevalent in occlusion scenarios. Experiments on a roadside dataset of complex scenarios show that the proposed model achieves a detection Intersection over Union of 97.46% and an accuracy of 95.32%; compared with YOLOv7 and YOLOv11, it significantly improves detection accuracy and bounding box alignment. The comprehensive tracking metric, Multiple Object Tracking Accuracy, rises to 85.33%, while the identity switching rate and trajectory fragmentation fall to 13.21% and 27.38%, respectively, a marked improvement over the original SORT and DeepSORT algorithms. These results demonstrate that, by integrating the strengths of traditional methods and deep learning, the model substantially improves detection and tracking performance in complex traffic scenarios and holds practical value for application and widespread adoption.