The use of artificial intelligence in determining breast density compared with human radiologists: a retrospective observational study
Publisher
Latvijas Universitāte
Language
lav
Abstract
Title: The use of artificial intelligence in determining breast density compared with radiologist assessment: a retrospective observational study
Background: Accurate determination of breast density is important in the diagnosis of breast cancer. On mammography, dense breast tissue can obscure tumors, which can lead to an incorrect diagnosis. A radiologist's assessment of breast density on diagnostic images is subjective and can vary. In recent years, artificial intelligence (AI) has been integrated into breast imaging equipment and has advanced considerably. The aim of this study was to compare the accuracy of AI with that of human radiologists in classifying breast density using the BI-RADS classification system, and to explore the potential of AI in clinical decision-making.
Objective: To evaluate the diagnostic performance of artificial intelligence in determining breast density compared with human radiologists, using existing hospital data, and to explore its possible effect on clinical decision-making.
Material and methods: The study was conducted at Riga Eastern Clinical University Hospital and included 240 mammography examinations. All examinations were assessed by both AI (the Valpora system) and radiologists, who rated breast density using BI-RADS categories A to D. Agreement between the AI and radiologist classifications was analyzed using the Related-Samples Marginal Homogeneity (RSMH) test. A binary logistic regression model was fitted to determine whether patient age affected agreement between the density assessments. The BI-RADS categories were grouped into low-density (A+B) and high-density (C+D) categories, and a binary confusion matrix was constructed to calculate diagnostic accuracy measures (sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV)).
Results: The study included 240 cases, all women aged 40 to 75 years. The mean age was 59.81 years (SD ± 7.9). Agreement between the AI and radiologist assessments was high: 98% of cases were either exact matches (54%) or differed by only one BI-RADS category (44%). The RSMH test revealed no statistically significant difference in classification between AI and radiologists (p = 0.128). However, binary logistic regression showed that although the overall model was statistically significant (p = 0.047), age as a predictor was only marginally significant (p = 0.05, OR = 0.968, 95% CI: 0.937-1.000), suggesting that it is unlikely to be clinically relevant because it is statistically borderline. When the BI-RADS categories were regrouped into low- and high-density groups, AI showed 81.7% agreement in the low-density group and 61.2% in the high-density group, with an overall agreement of 73.3%. The diagnostic performance measures showed a sensitivity of 61.2%, a specificity of 81.7%, a PPV of 69.8%, and an NPV of 75.3%.
Conclusion: This study showed that breast density assessments made by the AI program "Valpora" and by radiologists using the BI-RADS classification are comparable, with high agreement and minor discrepancies...
Title: The use of artificial intelligence in detecting breast density compared to human radiologists: a retrospective observational study.
Background: Accurate classification of breast density plays a major role in the detection of breast cancer. On mammography, dense breast tissue can obscure tumors, which can lead to a missed diagnosis. Assessment of breast density by a human radiologist is subjective and can vary between readers, which can affect screening accuracy. In recent years, artificial intelligence (AI) has been integrated into breast imaging equipment and has advanced considerably. The aim of this study was to compare the accuracy of AI with that of human radiologists in classifying breast density using the BI-RADS category system, and to investigate AI's potential in supporting clinical decision-making.
Objective: To evaluate the diagnostic performance of artificial intelligence in detecting breast density compared to human radiologists, using existing hospital data, and to investigate its potential effect on clinical decision-making.
Material and methods: This study was performed at Riga Eastern Clinical University Hospital using 240 mammographic cases. Informed consent was waived by the ethics committee because of the retrospective nature of the study and the use of anonymized data. All cases were assessed by both AI (Valpora system) and radiologists, who rated breast density using BI-RADS categories A to D. Agreement between the AI and radiologist classifications was analyzed with the Related-Samples Marginal Homogeneity (RSMH) test. A binary logistic regression model was fitted to investigate whether the patient's age influenced the agreement. The BI-RADS categories were grouped into low density (A+B) and high density (C+D), and a binary confusion matrix was constructed to calculate the diagnostic performance metrics (sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV)).
Results: The study involved 240 cases, all women aged 40 to 75 years. The mean age was 59.81 years (SD ± 7.9). Agreement between the AI and radiologist assessments was high: 98% of cases were either exact matches (54%) or differed by only one BI-RADS category (44%). The RSMH test revealed no statistically significant difference in classification between AI and radiologists (p = 0.128). However, the binary logistic regression showed that although the overall model was statistically significant (p = 0.047), age as a predictor was only marginally significant (p = 0.05, OR = 0.968, 95% CI: 0.937-1.000), suggesting that it is unlikely to be clinically relevant because it is statistically borderline. When the BI-RADS categories were regrouped into low and high density, AI demonstrated 81.7% agreement in the low-density group and 61.2% in the high-density group, with an overall agreement of 73.3%. Diagnosti
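The exact-match and one-category-apart percentages reported in the results can be obtained by treating BI-RADS categories A to D as an ordinal scale and counting how far apart the two ratings are for each case. The following is a minimal sketch of that tabulation, not the study's analysis code. The function name, the A=0 to D=3 coding, and the ten toy rating pairs are hypothetical and are not taken from the study data.

```python
from collections import Counter

# Assumed ordinal coding of BI-RADS density categories (illustrative only).
BIRADS_ORDER = {"A": 0, "B": 1, "C": 2, "D": 3}


def agreement_by_distance(ai_ratings, radiologist_ratings):
    """Return the share of cases at each category distance (0 = exact match)."""
    if len(ai_ratings) != len(radiologist_ratings):
        raise ValueError("rating lists must have the same length")
    if not ai_ratings:
        raise ValueError("at least one rated case is required")
    distances = Counter(
        abs(BIRADS_ORDER[a] - BIRADS_ORDER[r])
        for a, r in zip(ai_ratings, radiologist_ratings)
    )
    total = len(ai_ratings)
    return {d: distances.get(d, 0) / total for d in range(4)}


# Hypothetical toy ratings, NOT the study data.
ai = ["A", "B", "B", "C", "D", "C", "B", "A", "C", "D"]
rad = ["A", "B", "C", "C", "C", "C", "B", "B", "A", "D"]

shares = agreement_by_distance(ai, rad)
within_one = shares[0] + shares[1]
print(f"exact: {shares[0]:.0%}, one apart: {shares[1]:.0%}, within one: {within_one:.0%}")
```

With the study's paired ratings in place of the toy lists, `shares[0]` and `shares[1]` would correspond to the 54% and 44% figures quoted above.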
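To illustrate how sensitivity, specificity, PPV, and NPV follow from a binary confusion matrix, the sketch below computes them from four cell counts. The abstract does not report the cell counts. The values used here (60, 38, 26, 116) are back-calculated from the reported percentages and N = 240, assuming that the radiologist rating is the reference standard and that high density (C+D) is the positive class. They should be read as an inferred reconstruction, not as the study's data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute standard 2x2 diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly identified
        "specificity": tn / (tn + fp),  # negatives correctly identified
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# ASSUMPTION: counts inferred from the reported percentages and N = 240;
# reference = radiologist, positive class = high density (C+D).
counts = {"tp": 60, "fn": 38, "fp": 26, "tn": 116}

metrics = diagnostic_metrics(**counts)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the function gives 61.2%, 81.7%, 69.8%, and 75.3% for the four metrics and 73.3% overall. This also shows why the 61.2% and 81.7% group agreement figures coincide with sensitivity and specificity: each is agreement within one radiologist-defined group.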