Background/Objectives: Intraoral radiographs acquired using photostimulable phosphor (PSP) plates are inherently susceptible to a wide spectrum of artifacts that can compromise diagnostic reliability and lead to unnecessary repeat exposures. Although structured taxonomies describing these artifacts have been proposed, automated methods capable of detecting and localizing multiple artifact types at the pixel level remain limited, particularly under realistic multi-class conditions. In this study, we address the problem of fine-grained, multi-class PSP artifact segmentation by systematically evaluating a deep learning-based framework and establishing a realistic baseline for this inherently challenging task. Methods: A retrospective, multi-center dataset comprising 1497 intraoral PSP radiographs (bitewing and periapical) collected from three institutions was analyzed. Pixel-level annotations were generated by expert oral and maxillofacial radiologists according to a standardized taxonomy consisting of four major artifact groups and 29 artifact classes, together with a background class. A 2D nnU-Net v2 architecture was employed as a baseline segmentation model. Model development was performed using 5-fold cross-validation, and performance was evaluated on an independent test set using Dice coefficient, Intersection over Union (IoU), Precision, and Recall. Results: Across all classes, the model achieved a mean Dice score of 0.0894 ± 0.0084 in cross-validation and 0.0952 on the independent test set, reflecting the intrinsic complexity of the task. Class-wise analysis revealed substantial variability, with higher performance in larger and visually distinctive artifacts, whereas small-scale, low-contrast, and underrepresented classes exhibited markedly reduced performance. Notably, several artifact categories were absent from the training data, resulting in a zero-shot scenario that directly constrained model generalization. 
Furthermore, segmentation performance depended strongly on class frequency, measured as each class's share of annotated pixels, underscoring the impact of severe class imbalance. Group-based evaluation showed relatively higher performance for pre-exposure and exposure-related artifacts than for post-exposure and scanner-related categories. Conclusions: These findings demonstrate that large-scale, multi-class pixel-level segmentation of PSP artifacts is a fundamentally challenging problem shaped by the combined effects of class imbalance, small object size, heterogeneous artifact morphology, and incomplete training representation. While the proposed framework confirms the feasibility of automated artifact localization, its current performance suggests greater immediate value as a quality control or screening support tool than as a fully autonomous diagnostic system. By providing a comprehensive baseline and systematic analysis, this study establishes a benchmark for future research and highlights the critical need for imbalance-aware learning strategies, hierarchical modeling, and data-centric approaches to advance this field.
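For readers less familiar with the reported metrics, the following is a minimal sketch of how per-class Dice, IoU, Precision, and Recall can be computed from integer label maps. It is illustrative only and is not the evaluation code used in this study: the function name `per_class_metrics`, the toy arrays, and the decision to return NaN for a class that appears in neither the prediction nor the reference are all assumptions made for the example.

```python
import numpy as np

def per_class_metrics(pred, ref, num_classes):
    """Compute (dice, iou, precision, recall) for each class id.

    pred, ref: integer label maps of identical shape.
    Hypothetical helper for illustration; undefined ratios are returned as NaN.
    """
    nan = float("nan")
    metrics = {}
    for c in range(num_classes):
        p = pred == c                                   # pixels predicted as class c
        r = ref == c                                    # pixels annotated as class c
        tp = int(np.logical_and(p, r).sum())            # true positives
        fp = int(np.logical_and(p, ~r).sum())           # false positives
        fn = int(np.logical_and(~p, r).sum())           # false negatives
        if tp + fp + fn == 0:
            # Class absent from both maps: every metric is undefined (assumed convention).
            metrics[c] = (nan, nan, nan, nan)
            continue
        dice = 2 * tp / (2 * tp + fp + fn)
        iou = tp / (tp + fp + fn)
        precision = tp / (tp + fp) if (tp + fp) > 0 else nan
        recall = tp / (tp + fn) if (tp + fn) > 0 else nan
        metrics[c] = (dice, iou, precision, recall)
    return metrics

# Toy 2x4 label maps with classes 0 (background), 1, 2; class 3 never occurs.
ref = np.array([[0, 0, 1, 1],
                [0, 2, 2, 2]])
pred = np.array([[0, 0, 1, 0],
                 [0, 2, 2, 1]])

results = per_class_metrics(pred, ref, num_classes=4)

# Mean Dice over the classes for which the metric is defined.
mean_dice = float(np.nanmean([m[0] for m in results.values()]))
```

How absent classes are handled (skipped, counted as zero, or counted as one) changes a class-averaged Dice noticeably when many classes are rare, so any comparison against the scores reported here should state its convention explicitly.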