BIT makes new progress in processing noisy and imbalanced data for machine vision classification

News Source: School of Optics and Photonics

Editor: News Agency of BIT

Translator: Guo Yating, News Agency of BIT

Beijing Institute of Technology, August 1st, 2022: Xu Tingfa's research team at the School of Optics and Photonics, BIT, has made new progress in processing noisy and imbalanced data for machine vision classification. The results were published at AAAI-2022 (CCF A) under the title "Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data". AAAI is one of the most influential international academic conferences in the field of artificial intelligence. Categorized as a Class A conference by the China Computer Federation (CCF), it mainly publishes and reports on the latest research progress and technologies in computational intelligence, artificial intelligence, data science and deep learning. The first author of this work is Jiang Shenwang, a postdoctoral fellow at BIT, and the corresponding authors are Li Jianan, a special associate researcher, and Professor Xu Tingfa, both at BIT.

With the development of artificial intelligence, how to train machine learning models on noisy and imbalanced data has attracted much attention. However, most current methods assume an idealized setting and deal with either noisy data or imbalanced data alone. To address this problem, Professor Xu Tingfa of BIT took the lead in proposing a method based on loss value curves and meta-learning that handles noisy and imbalanced data simultaneously. Extensive computational and experimental observations show that, although noisy samples and tail samples cannot be told apart by an individual loss value, the loss value curves recorded over the entire training process provide sufficient information to distinguish the two kinds of samples, as shown in Figure 1.

Figure 1. Loss value curve

On this basis, the research team proposed a processing model for noisy and imbalanced data based on the loss value curve. As shown in Figure 2, the model works in two stages: a Probing Stage and an Allocating Stage. In the Probing Stage, the model is first trained to obtain the loss value curve of each sample. In the Allocating Stage, the loss value curve is fed as a sample feature into a weight generation model, which produces a weight for each sample and is optimized with a meta-learning method.
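To make the Probing Stage more concrete, the following is a minimal sketch of how a per-sample loss curve can be recorded during ordinary training. It is an illustration only, not the team's implementation: the tiny logistic regression, the toy data, and the name `probe_loss_curves` are all assumptions introduced here, and the weight generation model of the Allocating Stage is not shown.

```python
import math


def probe_loss_curves(xs, ys, epochs=30, lr=0.5):
    """Train a 1-D logistic regression and record each sample's loss per epoch.

    Returns curves, where curves[i][t] is the cross-entropy loss of sample i
    measured at the start of epoch t. (Hypothetical helper for illustration.)
    """
    w, b = 0.0, 0.0
    n = len(xs)
    curves = [[] for _ in xs]
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for i, (x, y) in enumerate(zip(xs, ys)):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
            loss = -(y * math.log(p_safe) + (1 - y) * math.log(1.0 - p_safe))
            curves[i].append(loss)  # one more point on sample i's curve
            grad_w += (p - y) * x
            grad_b += (p - y)
        # Full-batch gradient descent step on the mean loss, with no
        # intervention on noisy or rare samples.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return curves


# Toy data: the label is 1 when x > 0; the last sample has a flipped label.
xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0, 1.8]
ys = [0, 0, 0, 1, 1, 1, 0]
curves = probe_loss_curves(xs, ys)
```

Each row of `curves` is the kind of sequence that would be passed to a weight generation model as a sample feature. In this toy setting the loss of a clean sample falls over training, while the mislabeled sample ends with a higher loss than the clean ones.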

Figure 2. A noisy and imbalanced data processing model based on loss value curves

Figure 3 shows a visual analysis of the generated weight curves. In the weight curves, clean samples and tail samples are given higher weights, while noisy samples and head samples are given lower weights. This re-weighting strategy can effectively suppress the influence of noisy data and improve the model's ability to learn from tail data.
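As a small illustration of how per-sample weights can enter a training objective, the sketch below computes a weighted mean of sample losses. The normalization by the sum of weights, the function name `weighted_mean_loss`, and the numbers are assumptions made for this example; in the work described here the weights are produced by a learned model rather than set by hand.

```python
def weighted_mean_loss(losses, weights):
    """Return the weighted mean of per-sample losses.

    A sample with weight 0 contributes nothing to the objective, while a
    larger weight makes that sample count more.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * l for w, l in zip(weights, losses)) / total


# Hypothetical values: a clean head sample, a noisy sample, a clean tail sample.
losses = [0.2, 2.5, 1.0]
weights = [0.5, 0.0, 1.0]  # down-weight head, suppress noisy, up-weight tail

reweighted = weighted_mean_loss(losses, weights)
plain = weighted_mean_loss(losses, [1.0, 1.0, 1.0])
```

With these hand-picked numbers, the noisy sample's large loss no longer affects the objective, and the tail sample contributes twice as much as the head sample.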

Figure 3. Visual analysis of weight generation for various samples

This research provides new ideas for noisy and imbalanced data processing in the field of machine vision, and lays a theoretical and technical foundation for further practical application.

Paper details: Jiang, Shenwang, et al. "Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 6. 2022.

Paper link:

About the authors:

Jiang Shenwang is a postdoctoral fellow in the School of Optics and Photonics, BIT. He received his bachelor's and doctoral degrees from BIT in 2014 and 2020 respectively, and his research interests are machine learning and computer vision. He has published 16 academic papers at the top international conference AAAI, in IEEE journals and in other venues, 4 of them as first author, with 98 total citations on Google Scholar. He has applied for 1 Chinese invention patent. His honors include first place and the best paper award at the 2nd Anti-UAV Workshop & Challenge in 2021, second prize in the College Student Mathematics Competition (Beijing), and second prize in the Student Physics Competition (Beijing).

Li Jianan, PhD, is a pre-assistant professor (special associate researcher) at the School of Optics and Photonics, BIT, and a postdoctoral fellow of the National University of Singapore. His research interests include photoelectric imaging detection and recognition, computer vision, and embedded video processing. In the past five years, he has published 30 high-level academic papers, including 9 first-author papers in top journals and conferences such as IEEE TPAMI, TVCG, CVPR and ICLR, and 1 ESI highly cited paper. His most-cited paper has been cited more than 600 times, and his total citations on Google Scholar exceed 2,400. He has presided over 3 projects of the National Natural Science Foundation of China (Youth) and the Postdoctoral Science Foundation, and has participated in 5 major research instrument development projects of the National Natural Science Foundation of China. He was selected for the Young Talent Support Project of the Beijing Association for Science and Technology, and his honors include the excellent doctoral dissertation award of the China Society of Image and Graphics and the Wang Daheng University Student Optical Award. He won first place in the ImageNet Large-scale Visual Recognition Challenge (ILSVRC-2017) and guided a team that won first place and the Best Paper Award in the ICCV 2021 "Drone Tracking" Challenge.

Xu Tingfa is a professor and PhD supervisor in optical engineering, Deputy Director of the Key Laboratory of Optoelectronic Imaging Technology and System of the Ministry of Education, and Director of the Intelligent and Big Data Technology Laboratory, Chongqing Innovation Center, BIT. In recent years, he has led his research team in deepening its work on photoelectric imaging detection and identification, computational imaging and medical photoelectric imaging. He has presided over more than 40 major scientific instrument development projects of the National Natural Science Foundation of China. He has published 135 academic papers in international and domestic journals, including more than 90 papers indexed by SCI/EI. As first inventor, he has applied for 45 national invention patents, 15 of which have been granted or published. Graduate students under his guidance have won the excellent doctoral dissertation award of the Chinese Image Society; two have won the Wang Daheng University Student Optics Award, and two have been named among the top 100 of the National Optics and Optical Engineering Doctoral Academic League.