This document presents a study that combines Fourier transform infrared (FTIR) spectroscopy with forward feature selection to identify optimal spectral biomarkers for diagnosing oral lesions. Tissue biopsy samples from normal oral mucosa (NOM), oral submucous fibrosis (OSF), oral leukoplakia (OLK) and oral squamous cell carcinoma (OSCC) were analyzed by FTIR spectroscopy. Forward feature selection identified the 6 best spectral markers for each pairwise lesion classification. Classification with these markers achieved sensitivities of 63.6-87.0% and specificities of 68.8-91.3%, demonstrating the potential of this approach for label-free oral lesion diagnosis.
1. Fourier Transform Infrared Spectroscopic Spectral Feature Subset Selection for Optimal Oral Lesion Diagnosis
Satarupa Banerjee*¹, Jitamanyu Chakrabarty², Mousumi Pal³, Ranjan Rashmi Paul³, Jyotirmoy Chatterjee¹
¹ School of Medical Science and Technology, Indian Institute of Technology Kharagpur, India
² Department of Chemistry, National Institute of Technology Durgapur, India
³ Department of Oral and Maxillofacial Pathology, Guru Nanak Institute of Dental Sciences and Research, Kolkata, India
3. Oral Cancer, Precancer and Carcinogenesis
Figure panels: Normal (NOM); Different Types of Pre-Cancer: Oral Submucous Fibrosis (OSF), Oral Leukoplakia (OLK); Different Grades of Oral Epithelial Dysplasia; Oral Squamous Cell Carcinoma (OSCC)
4. Worldwide Incidence, Prevalence and Mortality of OSCC
Figure panels: Incidence of OSCC; Mortality due to OSCC; Worldwide prevalence of OSCC; Prediction of OSCC for 2035
5. Global Challenge
Histopathological assessment suffers from inter- and intra-observer variability (Kujan et al. 2007) and requires 48-72 hours of processing time.
Using multiple molecular biomarkers for disease diagnosis is costly.
Early confirmatory diagnosis is needed.
In conventional FTIR data analysis with PCA-LDA based classification, the identity of individual features is lost (Banerjee et al. 2015, 2016).
Proposed Solution
Exploration of FTIR-based spectral marker selection, since Raman spectroscopy cannot be implemented in a clinical setup because of its long data acquisition time and the low efficiency of inelastic light scattering (Baker et al. 2014).
A feasibility study to assess the role of forward feature selection in label-free spectral marker identification.
6. Methodology
Samples: 57 tissue biopsy samples (7 NOM, 11 OSF, 16 OLK and 23 OSCC), as formalin-fixed, paraffin-embedded tissue sections
FTIR spectra acquisition of deparaffinized, acetone-dried sections
Histopathological validation of H&E tissue sections
Pre-processing
Dimensionality Reduction: PCA-LDA
Feature Selection: Forward Feature Selection (Wrapper), with two-class classification by linear SVM using LOOCV
6 Best Feature Subset
Performance assessment using Sensitivity and Specificity
Software Used
OMNIC Series Software - Thermo Scientific
IRootLab toolbox in MATLAB R2015a
Orange 2.7 for Classification Task
7. Goal
Choose a subset of the complete set of input features that predicts the output with accuracy comparable to that of the complete input set, at a greatly reduced computational cost.
Procedure
Forward Feature Selection (heuristic, wrapper-based search)
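The greedy, wrapper-based search named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: a nearest-centroid classifier stands in for the linear SVM so that the sketch needs no external libraries, the function names are hypothetical, and the toy matrix is invented purely to exercise the code (it is not spectral data from the study).

```python
def nearest_centroid_predict(train_x, train_y, test_row):
    """Predict the label whose class centroid is closest to test_row."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(test_row, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(x, y, features):
    """Leave-one-out cross-validated accuracy using only the given columns."""
    sub = [[row[j] for j in features] for row in x]
    hits = 0
    for i in range(len(sub)):
        train_x = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, sub[i]) == y[i]:
            hits += 1
    return hits / len(sub)


def forward_select(x, y, n_features):
    """Greedily add the feature that gives the best LOOCV accuracy."""
    selected = []
    remaining = list(range(len(x[0])))
    while remaining and len(selected) < n_features:
        # Ties are broken in favour of the lowest feature index.
        scores = [(loocv_accuracy(x, y, selected + [j]), -j) for j in remaining]
        _, neg_j = max(scores)
        selected.append(-neg_j)
        remaining.remove(-neg_j)
    return selected


# Invented toy data: only column 1 separates the two classes.
x = [
    [0.5, 0.1, 0.9],
    [0.4, 0.2, 0.1],
    [0.6, 0.0, 0.5],
    [0.5, 1.0, 0.8],
    [0.4, 0.9, 0.2],
    [0.6, 1.1, 0.4],
]
y = [0, 0, 0, 1, 1, 1]
selected = forward_select(x, y, 1)
```

In the study's setting, `n_features` would be 6 and each column would correspond to one wavenumber. Because the wrapper retrains the classifier for every candidate feature, its cost grows with the number of candidates times the number of samples, which is the price paid for keeping individual features identifiable.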
8. Result
(a1) Mean FTIR spectra of the whole region (400-4000 cm-1). (a2) Mean spectra of the whole region (400-4000 cm-1) after rubberband-like baseline correction (RBBC). (a3) Mean first-derivative spectra of the fingerprint region of NOM, OLK, OSF and OSCC after RBBC, maximum vector normalization and Savitzky-Golay differentiation. (a4) LDA scores plot of pre-processed spectra after mean centering and PCA-LDA, with confidence ellipses representing the 95% confidence interval (a.u., arbitrary unit). (b) Second derivative of average FTIR spectra of NOM, OLK, OSF and OSCC.
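Two of the pre-processing steps named in the caption, normalization and Savitzky-Golay differentiation, can be illustrated with a short Python sketch. It rests on assumptions the source does not confirm: normalization is taken to mean scaling to unit Euclidean norm, and the derivative uses a 5-point window with a quadratic fit. Baseline correction is not covered, and the function names are hypothetical.

```python
import math


def vector_normalize(spectrum):
    """Scale a spectrum to unit Euclidean (L2) norm."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in spectrum]


# Savitzky-Golay first-derivative weights for a 5-point window and a
# quadratic fit (assumed settings; the source does not state them).
SG_D1_WEIGHTS = (-2.0, -1.0, 0.0, 1.0, 2.0)
SG_D1_NORM = 10.0


def savgol_first_derivative(spectrum, spacing=1.0):
    """First derivative at interior points; two points per edge are dropped."""
    half = len(SG_D1_WEIGHTS) // 2
    out = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        acc = sum(w * v for w, v in zip(SG_D1_WEIGHTS, window))
        out.append(acc / (SG_D1_NORM * spacing))
    return out


# Sanity check on a straight line with slope 2 (not real spectral data).
ramp = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
ramp_slope = savgol_first_derivative(ramp)
unit = vector_normalize([3.0, 4.0])
```

Differentiation suppresses slowly varying baseline offsets, which is why derivative spectra are commonly used before classification; `spacing` is the wavenumber step between adjacent points.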
9. Classification Performance Assessment
Disease Classification   Sensitivity (%)   Specificity (%)   Accuracy (%)
NOM vs. OLK              68.8              78.3              74.4
OSF vs. OSCC             63.64             91.3              82.35
OLK vs. OSCC             86.96             68.75             79.49
OLK vs. OSF              81.82             81.82             81.48
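For readers who want to see how such figures follow from a two-class confusion matrix, a minimal Python sketch is given here. The counts are an assumption: the source does not list confusion matrices, and the values below (7 of 11 and 21 of 23 correctly classified) were chosen only because they reproduce the reported OSF vs. OSCC percentages with the cohort sizes of 11 OSF and 23 OSCC. The function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not given in the source): 7/11 and 21/23 correct.
sens, spec, acc = binary_metrics(tp=7, fn=4, tn=21, fp=2)
rounded = (round(sens, 2), round(spec, 2), round(acc, 2))
```

With leave-one-out cross-validation, each sample contributes exactly one entry to the confusion matrix, so the denominators equal the class sizes.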
Selected Biomarkers
Disease Classification   Spectral Markers Selected (cm-1)
NOM vs. OLK              1032, 956, 1707, 1639, 1606 and 1565
OSF vs. OSCC             1687, 1619, 1531, 1481, 1384 and 1322
OLK vs. OSCC             1782, 1713, 1665, 1545, 1409 and 1161
OLK vs. OSF              1670, 1306, 1757, 1723, 1611 and 1554
10. Take Home Message
Label-free FTIR-based spectral markers can delineate oral lesions, mainly OLK and OSF, with high sensitivity and specificity.
Features selected for each disease classification can be used in any pattern recognition system.
The suggested computational technique is capable of theoretically relevant peak picking for optimal classification-based subjective oral disease diagnosis.
Chemical alteration in the diseases is mainly due to protein phosphorylation, as evident from the selected spectra.
A low-cost, readily available spectral marker selection technique was proposed.
11. References
S Banerjee, S Chatterjee, A Anura, J Chakrabarty, M Pal, B Ghosh, RR Paul, D Sheet, and J Chatterjee. "Global Spectral and Local Molecular Connects with Optical Coherence Tomography Features to Classify Oral Lesions towards Unraveling Quantitative Imaging Biomarkers." RSC Advances 6.9 (2016): 7511-7520.
S Banerjee, M Pal, J Chakrabarty, C Petibois, RR Paul, A Giri, and J Chatterjee. "Fourier-transform-infrared-spectroscopy based spectral-biomarker selection towards optimum diagnostic differentiation of oral leukoplakia and cancer." Analytical and Bioanalytical Chemistry 407.26 (2015): 7935-7943.
S Banerjee and J Chatterjee. "Molecular Pathology Signatures in Predicting Malignant Potentiality of Dysplastic Oral Pre-cancers." Springer Science Reviews 3.2 (2015): 127-136.
Kujan, Omar, et al. "Why oral histopathology suffers inter-observer variability on grading oral epithelial dysplasia: an attempt to understand the sources of variation." Oral Oncology 43.3 (2007): 224-231.
Baker, Matthew J., et al. "Using Fourier transform IR spectroscopy to analyze biological materials." Nature Protocols 9.8 (2014): 1771-1791.