Feature Selection and Classification in Supporting Report-Based
Self-Management for People with Chronic Pain
Authors: Yan Huang, Huiru Zheng, Chris Nugent, Paul McCullagh, Norman Black,
Kevin E. Vowles, and Lance McCracken




Source: IEEE Transactions on Information Technology in Biomedicine,
Jan. 2011
Advisor: Ben-Jye Chang
Student: YU-HSIEN CHO
OUTLINE
   I. Introduction
   II. Issue
   III. Motivations
   IV. Approaches
   V. Numerical results
   VI. Conclusion
Introduction


 The older population has grown, and two thirds of
  people who reached retirement age had at
  least two chronic conditions.
Introduction


 A machine learning approach is applied to self-reporting
  data collected from integrated
  biopsychosocial treatment, in order to identify
  an optimal set of features for supporting
  self-management.
Introduction




Fig. 1. Assessment interface of the PSMS and the assessment workflow
                          for self-management
Introduction


 We assess the feasibility of applying
  automated classification techniques to
  identify "low" and "better" health status levels
  from self-reporting data and explore an
  appropriate classification algorithm.
OUTLINE
   I. Introduction
   II. Issue
   III. Motivations
   IV. Approaches
   V. Numerical results
   VI. Conclusion
Issue

 How the number of selected questions affects the
  classification performance for a person's health
  status level.

 Which ranking method and which classification
  model perform best.
OUTLINE
   I. Introduction
   II. Issue
   III. Motivations
   IV. Approaches
   V. Numerical results
   VI. Conclusion
Motivations

 Traditional health care is expensive, consumes
  significant resources, and is inconvenient.


 For PWCP, self-management of their health care has
  been shown to be effective in
  improving their QoL.
PWCP: People With Chronic Pain
QoL: Quality of Life
OUTLINE
   I. Introduction
   II. Issue
   III. Motivations
   IV. Approaches
   V. Numerical results
   VI. Conclusion
Approaches
A. Dataset
187 subjects who suffered from chronic pain
8 types of questionnaire
 Total number of questions: 329; answers had values

 Pretreatment stage labeled as "low" health level,
 posttreatment stage labeled as "better" health level

16 (8.6%) of the patients withdrew,
 171 (91.4%) of the patients completed the treatment

Training set: 114 patients; testing set: 57 patients
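
The figures on this slide can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the variable names are illustrative, and the count of training instances assumes that each training patient contributes one pretreatment ("low") and one posttreatment ("better") record, which is consistent with the "114 instances per class" stated later in the deck.

```python
# Sanity check of the dataset figures quoted on the slide.
enrolled = 187
withdrew = 16
completed = enrolled - withdrew

pct_withdrew = round(100 * withdrew / enrolled, 1)
pct_completed = round(100 * completed / enrolled, 1)

train_patients, test_patients = 114, 57

# Assumption: one "low" and one "better" record per training patient.
train_instances = 2 * train_patients
n_questions = 329
train_cells = train_instances * n_questions  # answers in the training matrix

print(completed, pct_withdrew, pct_completed)
print(train_patients + test_patients == completed)
print(train_instances, train_cells)
```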
Approaches
B. Methods
Four feature selection methods are used to rank the questions.

1. SVM-RFE (Support Vector Machine with Recursive
Feature Elimination):
The ranking criterion for feature i:


Methods: Step 1: Train an SVM on the dataset.
         Step 2: Rank features according to the criterion c.
         Step 3: Eliminate the lowest-ranked feature.
         Step 4: If more than one feature remains, return to step 1.
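
The four steps above can be sketched in plain Python. Two assumptions are made here because the slide does not preserve them: the criterion is taken to be c_i = (w_i)^2, the usual choice for linear SVM-RFE, and the SVM is trained with a simple Pegasos-style subgradient loop standing in for a real solver. The function names `train_linear_svm` and `svm_rfe` and the toy data are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style linear SVM without bias; labels must be -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]      # regularization shrink
            if margin < 1.0:                               # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def svm_rfe(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:                              # Step 4
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)                     # Step 1
        scores = [wj * wj for wj in w]                     # Step 2 (assumed c_i = w_i^2)
        worst = min(range(len(scores)), key=scores.__getitem__)
        eliminated.append(remaining.pop(worst))            # Step 3
    eliminated.append(remaining[0])
    return eliminated[::-1]

# Toy data: feature 0 equals the label, feature 1 is constant, feature 2 is noise.
X_toy = [[1, 0, 0.1], [1, 0, -0.1], [-1, 0, 0.1], [-1, 0, -0.1]]
y_toy = [1, 1, -1, -1]
ranking = svm_rfe(X_toy, y_toy)
print(ranking)
```

In this toy case the informative feature is ranked first and the constant feature last. For real questionnaire data a library SVM would normally replace the training loop.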
Approaches
2. OneR: 1-level decision tree

Steps:
For each feature fi
  For each value v from the domain of fi
    Select the set of instances where feature fi has value v
    Let c = the most frequent class in that set
    Add the clause "if feature fi has value v then the class is c" to
    the rule for feature fi
Output the rule with the highest classification accuracy.

Q.1
 1   21
 2   16
 3   22
 4   20
 5   35
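
The OneR steps above translate almost line for line into Python. The sketch below is illustrative only: `one_r_rules` and `one_r_scores` are hypothetical names, and scoring each question by the training accuracy of its one-feature rule is an assumption about how OneR is used for ranking.

```python
from collections import Counter, defaultdict

def one_r_rules(X, y, f):
    """Build the rule {value: majority class} for feature f and its accuracy."""
    by_value = defaultdict(Counter)
    for row, label in zip(X, y):
        by_value[row[f]][label] += 1          # instances where feature f has value v
    rule = {v: counts.most_common(1)[0][0]    # c = most frequent class in that set
            for v, counts in by_value.items()}
    correct = sum(counts[rule[v]] for v, counts in by_value.items())
    return rule, correct / len(y)

def one_r_scores(X, y):
    """Accuracy of the one-feature rule for every feature (higher is better)."""
    return [one_r_rules(X, y, f)[1] for f in range(len(X[0]))]

# Toy data: two questions, four records.
X_toy = [[1, 5], [1, 4], [2, 5], [2, 4]]
y_toy = ["low", "low", "better", "better"]

scores = one_r_scores(X_toy, y_toy)
best_feature = max(range(len(scores)), key=scores.__getitem__)
best_rule, best_acc = one_r_rules(X_toy, y_toy, best_feature)
print(scores, best_feature, best_rule)
```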
Approaches

3. Information Gain: based on Shannon's information
theory; it can be calculated from (1)-(3)




 A represents a feature (question) of an instance, which has n values

 Two classes (pre- and posttreatment), each with 114 instances
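
Equations (1)-(3) are not preserved in this transcript. As a hedged illustration, the sketch below uses the textbook definition of information gain (class entropy minus the entropy remaining after splitting on the values of feature A), which may differ in notation from the slide. The function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the classes minus the expected entropy after splitting on A."""
    n = len(labels)
    subsets = {}
    for v, label in zip(feature_values, labels):
        subsets.setdefault(v, []).append(label)
    conditional = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - conditional

labels = ["low", "low", "better", "better"]
gain_informative = information_gain([1, 1, 2, 2], labels)    # answers follow the class
gain_uninformative = information_gain([5, 4, 5, 4], labels)  # answers ignore the class
print(gain_informative, gain_uninformative)
```

With two equally sized classes, as in the dataset described here, the class entropy is 1 bit, so a question's gain lies between 0 and 1.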
Approaches

4. χ² Statistic:




 m: number of answers for one question (feature)
 ni: frequency of answer i
 Pi: probability of answer i
 n: total frequency over the answers to all questions,
 228 × 329
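
The χ² formula itself did not survive in this transcript. The sketch below therefore uses the standard Pearson χ² statistic on the contingency table of answer value against class, which is a common way to score a feature but is an assumption here and may not match the slide's exact formulation in terms of ni, Pi and n. The function name and toy data are hypothetical.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Pearson chi-square statistic between one question's answers and the class."""
    n = len(labels)
    value_counts = Counter(feature_values)
    class_counts = Counter(labels)
    observed = Counter(zip(feature_values, labels))
    stat = 0.0
    for v, nv in value_counts.items():
        for c, nc in class_counts.items():
            expected = nv * nc / n            # count expected under independence
            stat += (observed[(v, c)] - expected) ** 2 / expected
    return stat

labels = ["low", "low", "better", "better"]
chi_informative = chi_square([1, 1, 2, 2], labels)
chi_uninformative = chi_square([5, 4, 5, 4], labels)
print(chi_informative, chi_uninformative)
```

A larger statistic indicates a stronger association between the question and the health status level, so questions can be ranked in descending order of this value.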
Approaches

C. Classification Performance Assessment
 Purpose: classify the person's appropriate
  health status

 Classifiers: C4.5, Naive Bayes, SVM, MLP
Approaches

1. Overall accuracy:




2. Area Under the ROC Curve (AUC):
Suggested as a tool for evaluating
the performance of the classification
algorithm
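
Both measures can be computed directly from predictions. The sketch below is a minimal illustration under stated assumptions: overall accuracy is taken as the fraction of correctly classified instances, and AUC is computed through its rank interpretation (the probability that a randomly chosen positive instance scores higher than a randomly chosen negative one, ties counting one half). Coding "better" as 1 and "low" as 0 is an arbitrary choice for the example.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores, positive=1):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = "better", 0 = "low"; scores are classifier confidences for 1.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = overall_accuracy(y_true, y_pred)
area = auc(y_true, scores)
print(acc, area)
```

The example shows why the two measures are reported together: at a 0.5 threshold only half of the instances are classified correctly, yet three of the four positive/negative pairs are ordered correctly.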
OUTLINE
   I. Introduction
   II. Issue
   III. Motivations
   IV. Approaches
   V. Numerical results
   VI. Conclusion
Numerical results
There were no significant differences between the feature
 ranking methods in overall classification accuracy.
(any of the four feature ranking methods can be used)


There were significant differences between the
 classifiers for each ranking method.


The MLP classifier was identified as the best option
 for building the classification model for PSMS, since
 both its overall accuracy and its AUC were very high.
OUTLINE
   I. Introduction
   II. Issue
   III. Motivations
   IV. Approaches
   V. Numerical results
   VI. Conclusion
Conclusion

 The classification provides patients with feedback
  information for their self-management.

 Patients can change their behavior, lifestyle, and care
  plan in order to achieve effective
  self-management of their chronic condition.