Flawed Data Labels Interfere with Accuracy of Machine Learning Benchmarks

Paradise Techsoft Solutions Pvt. Ltd
Jan 6, 2022
Updated: Oct 20

The rapid rise of artificial intelligence (AI) could change healthcare for the better, leading to faster diagnoses and allowing providers to spend more time communicating directly with patients. AI in healthcare also carries risks that must be addressed. One of them is errors in the labels of datasets used for machine learning (ML) training and benchmarks (e.g., an image of a frog labeled as a cat, or an apple labeled as a T-shirt). This is troubling, because a model trained or evaluated on mislabeled data can carry those errors into decisions that affect many patients.
A new study from the Massachusetts Institute of Technology (MIT) found label errors in ten of the most-cited open-source datasets used in ML research. The ten comprise six image datasets (MNIST, CIFAR-10, CIFAR-100, Caltech-256, ImageNet, and QuickDraw), three text datasets (20news, IMDB, and Amazon Reviews), and one audio dataset (AudioSet). The researchers estimated an average error rate of 3.4% across the ten datasets.
ML diagnostic models must learn from accurate training datasets that cover a diverse set of diseases and outcomes. Faults in the labeling of those datasets could lead to flawed AI diagnostic models. Healthcare organizations must be able to articulate the complexity of the question that the ML model is meant to solve. The team responsible for labeling the ML datasets needs to understand the clinical documentation required for each label, so that the organization can arrive at a time- and cost-effective approach. Doing so allows organizations to evaluate and label multiple diseases concurrently while reducing reliance on physicians. Key factors in determining the complexity of diseases in ML include annotation requirements, imaging modality, and presentation of symptoms. Keeping these requirements in view is vital to creating accurately labeled medical images.
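To make the idea of a label error rate concrete, here is a minimal Python sketch of one way suspect labels might be flagged. It is a simplified confidence-threshold heuristic, not necessarily the procedure the MIT researchers used. The function names, the toy labels, and the probabilities are all hypothetical, and the sketch assumes you already have class probabilities from some trained classifier.

```python
from collections import defaultdict


def flag_suspect_labels(labels, probs):
    """Return indices of examples whose given label looks doubtful.

    labels: list of integer class ids as recorded in the dataset.
    probs:  list of per-example class-probability lists from a trained
            classifier (assumed input; any probabilistic model would do).
    """
    # Per-class threshold: the mean probability the model assigns to
    # class k across the examples that are labeled k.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, p in zip(labels, probs):
        sums[y] += p[y]
        counts[y] += 1
    thresholds = {k: sums[k] / counts[k] for k in counts}

    suspects = []
    for i, (y, p) in enumerate(zip(labels, probs)):
        # Other classes for which the model clears that class's threshold.
        confident_others = [
            k for k in thresholds if k != y and p[k] >= thresholds[k]
        ]
        # Flag when the model doubts the given label AND is confident
        # about a different class.
        if p[y] < thresholds[y] and confident_others:
            suspects.append(i)
    return suspects


def estimated_error_rate(labels, probs):
    """Fraction of examples flagged as suspect (a rough estimate only)."""
    return len(flag_suspect_labels(labels, probs)) / len(labels)


# Toy data (hypothetical): class 0 = "frog", class 1 = "cat".
toy_labels = [0, 0, 0, 1, 1, 1]
toy_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],  # labeled "frog", but the model strongly predicts "cat"
    [0.20, 0.80],
    [0.10, 0.90],
    [0.15, 0.85],
]

if __name__ == "__main__":
    print("suspect indices:", flag_suspect_labels(toy_labels, toy_probs))
    print("estimated error rate:", estimated_error_rate(toy_labels, toy_probs))
```

Flagged examples are only candidates. In a clinical setting they would still need review by someone qualified to confirm or correct the label.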
Recognizing these data limitations, MCC uses RemitOne™ to capture data at the point of care and ensure it is accurate and complete, paving the way for quality data to train ML models and create effective AI. RemitOne™ handles complete and accurate documentation and coding automatically, with built-in compliance, in our point-of-care AI platform. If you're concerned about the consistency and accuracy of the datasets fed into your AI, contact our team at info@mccremitone.com to find out how RemitOne™ can help you train your ML models with confidence. To see more about MCC and RemitOne™, watch our documentary segment that aired on CNBC: https://www.mccremitone.net/r1video/MCC_03.mp4
