An NLP-driven framework for automated radiology–pathology concordance assessment in breast biopsy

dc.authorid: 0000-0002-8273-976X
dc.authorid: 0000-0003-4605-7822
dc.authorid: 0000-0002-4958-4575
dc.authorid: 0009-0000-9534-7581
dc.authorid: 0009-0008-4804-2927
dc.authorid: 0009-0000-0289-1781
dc.authorid: 0000-0001-6250-929X
dc.contributor.author: Esmerer, Emel
dc.contributor.author: Nazlı, Mehmet Ali
dc.contributor.author: Uzun-Per, Meryem
dc.contributor.author: Gümüş Değidiben, Melike
dc.contributor.author: Söyleyici, Merve
dc.contributor.author: Tahir, Eren
dc.contributor.author: Bal, Mert
dc.date.accessioned: 2026-05-18T08:19:19Z
dc.date.available: 2026-05-18T08:19:19Z
dc.date.issued: 2026
dc.department: Faculties, Faculty of Engineering and Natural Sciences, Department of Computer Engineering
dc.description.abstract: Background/Objectives: To develop and assess the feasibility of a natural language processing (NLP) framework for automated assessment of radiology-pathology concordance in breast biopsy using machine learning-based analysis of unstructured reports. Methods: This retrospective study included 766 paired radiology and pathology reports from ultrasound- or mammography-guided breast biopsies (August 2020-May 2024). Reports underwent translation, normalization, tokenization, lemmatization, and synonym expansion, followed by structured encoding of BI-RADS and pathology categories. Three models were trained: a Decision Tree, a LightGBM classifier, and a fine-tuned BioBERT model. Concordance labels were defined by multidisciplinary consensus. Performance metrics included accuracy, sensitivity, specificity, F1-score, area under the curve (AUC), and Cohen's kappa. SHapley Additive exPlanations (SHAP) analysis was used to identify influential features. Results: Among 766 cases, 707 (92.3%) were concordant and 59 (7.7%) were initially discordant. After excluding B3 lesions (n = 46), 13 true discordant cases remained (1.7%). Including B3 lesions increased clinically non-concordant or indeterminate cases from 1.7% to 7.7%, indicating that the apparent performance of the models is likely sensitive to case definition and dataset composition. BI-RADS 4a was the most common category (31.3%), and benign pathology (B2) accounted for 64.4% of biopsies. Within this dataset, LightGBM yielded the highest apparent AUC (0.999) (however, given the extremely small number of true discordant cases, this estimate is likely unstable and should be interpreted with caution), while BioBERT showed the strongest agreement with expert consensus (κ = 0.89). SHAP analysis identified clinically meaningful terms such as calcification, hypoechoic, ductal, and carcinoma as key contributors to model predictions. Given the very limited number of true discordant cases, these performance estimates are likely unstable and should be regarded as preliminary, requiring validation in larger, multi-center cohorts. Conclusions: This study presents a proof-of-concept NLP-based framework for radiology-pathology concordance assessment. The models showed promising performance in identifying potentially discordant cases; however, given the limited number of true discordant samples, these findings should be considered preliminary and require further validation in larger, multi-center datasets before clinical implementation.
dc.identifier.citation: Esmerer, E., Nazlı, M. A., Uzun-Per, M., Gümüş Değidiben, M., Söyleyici, M., Tahir, E., & Bal, M. (2026). An NLP-driven framework for automated radiology–pathology concordance assessment in breast biopsy. Diagnostics, 16(9), pp. 1-15. https://doi.org/10.3390/diagnostics16091249
dc.identifier.doi: 10.3390/diagnostics16091249
dc.identifier.endpage: 15
dc.identifier.issn: 2075-4418
dc.identifier.issue: 9
dc.identifier.pmid: PMID: 42121953
dc.identifier.scopusquality: Q2
dc.identifier.startpage: 1
dc.identifier.uri: https://doi.org/10.3390/diagnostics16091249
dc.identifier.uri: https://hdl.handle.net/20.500.13055/1483
dc.identifier.volume: 16
dc.identifier.wosquality: Q1
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.indekslendigikaynak: PubMed
dc.indekslendigikaynak.other: SCI-E - Science Citation Index Expanded
dc.institutionauthor: Uzun-Per, Meryem
dc.institutionauthorid: 0000-0002-4958-4575
dc.language.iso: en
dc.publisher: MDPI Publishing
dc.relation.ispartof: Diagnostics
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: Natural Language Processing
dc.subject: Radiology–Pathology Concordance
dc.subject: Breast Biopsy
dc.subject: Machine Learning
dc.subject: Artificial Intelligence
dc.title: An NLP-driven framework for automated radiology–pathology concordance assessment in breast biopsy
dc.type: Article
dspace.entity.type: Publication
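
As a quick sanity check on the case counts quoted in the abstract, the following Python sketch recomputes the reported percentages and shows how a Cohen's kappa value is calculated. The counts are taken from the abstract; the function names `percent` and `cohen_kappa` and the four-item toy label lists are illustrative assumptions, not the study's code or data.

```python
from collections import Counter

# Case counts as reported in the abstract.
TOTAL = 766
CONCORDANT = 707
INITIALLY_DISCORDANT = 59
B3_LESIONS = 46

# Discordant cases remaining once B3 lesions are excluded.
true_discordant = INITIALLY_DISCORDANT - B3_LESIONS


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over categories of the product of marginal rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    print("concordant %:", percent(CONCORDANT, TOTAL))
    print("initially discordant %:", percent(INITIALLY_DISCORDANT, TOTAL))
    print("true discordant:", true_discordant, "=", percent(true_discordant, TOTAL), "%")
    # Hypothetical toy labels (1 = discordant, 0 = concordant), not study data.
    model_labels = [1, 1, 0, 0]
    expert_labels = [1, 0, 0, 0]
    print("toy kappa:", cohen_kappa(model_labels, expert_labels))
```

With only 13 true discordant cases out of 766, a handful of reclassified cases would noticeably shift any agreement or AUC estimate, which is consistent with the caution expressed in the abstract.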

Files

Original bundle
Showing 1 - 1 of 1
Closed Access
Name: Tam Metin / Full Text.pdf
Size: 1.02 MB
Format: Adobe Portable Document Format