Label denoising based on Bayesian aggregation
Year: 2016
Abstract: Label noise is a common problem in supervised learning and can produce misleading results. It has been shown that switching as few as 5% of the labels degrades performance. The true class of an instance must therefore be distinguished from its observed label. Over the past decade, classification in the presence of label noise has attracted considerable interest, and several researchers have focused on kNN-based approaches to data cleansing. These approaches are often sensitive to high label-noise rates, and they may degrade the results when a group of instances with noisy labels is present. The problem arises because such methods take only a local view of instances. An alternative is a global view, in which instances that lie far from their respective classes are flagged as noisy. A potential problem with this view is choosing the threshold: an inappropriate threshold may cause a correctly labeled instance to be flagged as noisy. This paper proposes a new method for label denoising based on Bayesian aggregation, which addresses the problems of kNN-based approaches by aggregating the local and global views of instances. Aggregating local and global information leads to more robust and accurate detection of instances with noisy labels and estimation of their true labels. The experimental results demonstrate the capabilities and robustness of the proposed method.
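The abstract does not give the formulation, so the following is only a minimal sketch of the general idea of combining a local and a global view, not the authors' method. It assumes a kNN vote as the local evidence, a soft score from the distance to each observed-class centroid as the global evidence (which avoids a hard threshold), and a naive product rule with a uniform prior as the aggregation step. All function names and parameters (`local_posterior`, `global_posterior`, `aggregate`, `denoise`, `k`) are hypothetical.

```python
import numpy as np


def local_posterior(X, y, n_classes, k=5):
    """Local view: smoothed class frequencies among the k nearest neighbours."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # an instance does not vote for itself
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbours
    votes = np.zeros((len(X), n_classes))
    for c in range(n_classes):
        votes[:, c] = (y[nn] == c).sum(axis=1)
    return (votes + 1.0) / (k + n_classes)   # Laplace smoothing, rows sum to 1


def global_posterior(X, y, n_classes):
    """Global view: soft score from squared distance to each class centroid.

    Assumes every class has at least one observed instance.
    """
    centroids = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / d2.mean()             # assumed scale: mean squared distance
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def aggregate(p_local, p_global):
    """Assumed product rule: treats both views as independent, uniform prior."""
    p = p_local * p_global
    return p / p.sum(axis=1, keepdims=True)


def denoise(X, y, n_classes, k=5):
    """Return estimated labels and a mask of instances whose label changed."""
    p = aggregate(local_posterior(X, y, n_classes, k),
                  global_posterior(X, y, n_classes))
    y_hat = p.argmax(axis=1)
    return y_hat, y_hat != y
```

Calling `denoise(X, y_observed, n_classes)` returns the estimated labels together with a boolean mask marking instances whose observed label disagrees with the estimate. The product rule is the simplest way to fuse two probability estimates; the paper's actual aggregation may differ.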
Keywords: Label noise, Mislabeled data, Bayesian aggregation, Data cleansing, Supervised learning
contributor author | پارسا باقرزاده | fa |
contributor author | هادی صدوقی یزدی | fa |
contributor author | Parsa Bagherzadeh | en |
contributor author | Hadi Sadoghi Yazdi | en |
date accessioned | 2020-06-06T13:27:04Z | |
date available | 2020-06-06T13:27:04Z | |
date issued | 2016 | |
identifier uri | http://libsearch.um.ac.ir:80/fum/handle/fum/3355361 | |
description abstract | Label noise is a common problem in supervised learning and can produce misleading results. It has been shown that switching as few as 5% of the labels degrades performance. The true class of an instance must therefore be distinguished from its observed label. Over the past decade, classification in the presence of label noise has attracted considerable interest, and several researchers have focused on kNN-based approaches to data cleansing. These approaches are often sensitive to high label-noise rates, and they may degrade the results when a group of instances with noisy labels is present. The problem arises because such methods take only a local view of instances. An alternative is a global view, in which instances that lie far from their respective classes are flagged as noisy. A potential problem with this view is choosing the threshold: an inappropriate threshold may cause a correctly labeled instance to be flagged as noisy. This paper proposes a new method for label denoising based on Bayesian aggregation, which addresses the problems of kNN-based approaches by aggregating the local and global views of instances. Aggregating local and global information leads to more robust and accurate detection of instances with noisy labels and estimation of their true labels. The experimental results demonstrate the capabilities and robustness of the proposed method. | en |
language | English | |
title | Label denoising based on Bayesian aggregation | en |
type | Journal Paper | |
contenttype | External Fulltext | |
subject keywords | Label noise | en |
subject keywords | Mislabeled data | en |
subject keywords | Bayesian aggregation | en |
subject keywords | Data cleansing | en |
subject keywords | Supervised learning | en |
journal title | International Journal of Machine Learning and Cybernetics | en |
pages | 0-0 | |
journal volume | 0 | |
journal issue | 0 | |
identifier link | https://profdoc.um.ac.ir/paper-abstract-1053118.html | |
identifier articleid | 1053118 |