A Deep Learning Approach for Cervical Pathology and HPV Detection on Digitized Pap Smear Slides

Gülfize Coşkun¹, Fikret Dirilenoğlu², İpek Erbarut Seven³, Cem Berk Türk³, Bahar Müezzinoğlu⁴, Erkan Kaplanoğlu⁵, Mustafa Caner Aküner¹

¹Department of Mechatronics Engineering, Marmara University, Türkiye
²Department of Pathology, Faculty of Medicine, Near East University, Cyprus
³Department of Pathology, Faculty of Medicine, Marmara University, Türkiye
⁴Department of Pathology, Faculty of Medicine, Istanbul Medipol University, Türkiye
⁵Department of Engineering Management and Technology, The University of Tennessee at Chattanooga, USA

Introduction:

Liquid-based cytology (LBC) is widely and effectively used worldwide to detect premalignant and malignant changes in cervical cells. Cytopathologists examine Pap smear samples manually under the microscope to detect abnormal cells, a process that is time-consuming, subjective, and prone to human error. It may also yield unclear or unsatisfactory results because of poor sampling, uneven cell distribution, improper fixation, or obscuring factors. Automated image analysis can assist cytopathologists by providing fast, accurate, and reliable diagnosis. This work aims to develop and train a deep learning model on digitized Pap smear slides to diagnose cervical pathologies, detect human papillomavirus (HPV), and perform quantitative analyses.

Materials and methods:

Liquid-based Pap smear slides and the corresponding cytopathologic diagnosis reports, covering cases negative and positive for intraepithelial lesion or malignancy (IELM) and for HPV, were requested from the archives of two research hospitals. The slides were digitized using three whole slide scanners. Pathologists classified the resulting images into four IELM classes according to the Bethesda system and into two HPV classes. A four-class single shot detector was trained and tested on the IELM dataset. A binary unsupervised deep learning model was trained and tested on the HPV dataset to classify the presence of HPV.

Figure 1. The first row shows normal cells; the second row shows images containing malignant cells with HPV-related changes (Pap smear).

Figure 2. The digital dataset of this work comprises four classes: negative, ASC-US, LSIL, and HSIL.

Figure 3. The input data consisted of 300 digitized whole slide images and 100,000 patch images of 512x512 pixels. The four-class IELM model outputs four classes: negative, ASC-US, LSIL, and HSIL. The binary HPV model outputs two classes: negative and positive. The first row shows the four-class IELM model, which was built with the single shot detector. The second row shows the HPV prediction model, which was built with an unsupervised deep learning model. Finally, the predicted cells are shown on the whole slide image that was given as input.
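The 512x512 patches mentioned in Figure 3 could be cut from a slide region in several ways. The abstract does not describe how this was done, so the following is only a minimal sketch that assumes non-overlapping tiles, a region already loaded as a NumPy array, and incomplete edge tiles being discarded. The function name `tile_image` and its parameters are hypothetical and do not come from the original work.

```python
import numpy as np


def tile_image(image, patch_size=512):
    """Split an image array into non-overlapping square patches.

    Assumptions (not stated in the abstract): tiles do not overlap,
    and partial tiles at the right and bottom edges are dropped.
    Returns a list of ((top, left), patch) tuples.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = image[top:top + patch_size, left:left + patch_size]
            patches.append(((top, left), patch))
    return patches


if __name__ == "__main__":
    # Synthetic stand-in for a slide region; real data would come from a scanner.
    region = np.zeros((1100, 1600, 3), dtype=np.uint8)
    tiles = tile_image(region)
    print(len(tiles), tiles[0][1].shape)
```

Keeping the top-left coordinate with each patch is one way to map patch-level predictions back onto the whole slide image, as the last step of Figure 3 describes.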

Figure 4. Graphical summary of the results, which are detailed below.

Results:

The four-class IELM model achieved an accuracy of 54.3%, a sensitivity of 57%, and an F1 score of 55%. The binary HPV model achieved an accuracy of 63.51% and an F1 score of 61%.
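For readers unfamiliar with these metrics, the sketch below shows one common way to compute accuracy, sensitivity, and F1 score for a four-class problem from a confusion matrix. The abstract does not say how the per-class values were averaged, so macro-averaging is an assumption here. The function names and the toy labels are hypothetical and are not the study's data.

```python
import numpy as np

CLASSES = ["negative", "ASC-US", "LSIL", "HSIL"]


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def summarize(cm):
    """Return accuracy, macro-averaged sensitivity, and macro-averaged F1."""
    tp = np.diag(cm).astype(float)
    support = cm.sum(axis=1)    # true samples per class
    predicted = cm.sum(axis=0)  # predicted samples per class
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "sensitivity": recall.mean(),  # macro average (assumed)
        "f1": f1.mean(),               # macro average (assumed)
    }


if __name__ == "__main__":
    # Toy labels for illustration only (indices into CLASSES).
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 1, 1, 1, 2, 3, 3, 3]
    print(summarize(confusion_matrix(y_true, y_pred, len(CLASSES))))
```

With imbalanced classes, macro-averaged and sample-weighted values can differ noticeably, which is why the averaging scheme is worth stating alongside the reported numbers.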

Conclusion:

This work presents a deep learning approach using multiple artificial intelligence models for automated cervical cancer screening on LBC. The proposed models can classify Pap smear images into four IELM classes and two HPV classes with moderate accuracy, sensitivity, and F1 score. They can also perform quantitative analyses such as counting cells and measuring the size and shape of nuclei. The approach can be a useful tool for cytopathologists to improve the efficiency and reliability of cervical cancer diagnosis.