Biological patterns (electronic nose) data classification and
recognition: machine learning approaches using GPGPU
Pavels Kartasevs
MSc Applied Bioinformatics course
Cranfield University
Contents
● Electronic nose
● SVM and ANN
● Comparison of developed solution
● Heterogeneous processing
● Results
● Further improvements and conclusions
Problem description
● Prediction tools make it possible to analyze information
from different sources
● Application: meat spoilage prediction
● Meat spoilage problem (from manufacturer to
producer)
● Need: a fast enough solution and availability of free
software
Meat spoilage
● A problem that can impact health
● Cause: many different bacteria
● Disadvantages of sensory panel/laboratory
analysis
● Automatic analysis tools
Electronic nose
● Broad, emerging field of low-cost analysis devices
● Can be used in food science
● Automatic food quality determination
Electronic nose in prediction of meat
spoilage
● Electronic nose generates data
● Low cost of the device
● Fast result
● E-nose results interpretation
SVM and Neural networks
● SVM
Support vector machines are a relatively new
form of supervised machine learning.
● Artificial neural
networks
The model of an artificial neural network mimics
the structure of the human brain.
Difference between SVM and ANN
● SVM is fast
● SVM must perform a grid search to find the optimum solution
● SVM constructs a mathematical model of the problem
● ANN learns, in contrast to SVM
● ANN can work more efficiently than SVM
● ANN processing speed depends on neuron count
SVM Performance comparison
[Bar chart: run time in minutes of ensembleSVM_Count_ALL.R and the CPP_BIO program on two machines: "Intel Xeon(R) CPU X5492 @ 3.40GHz × 8, DDR2 800 MHz, Ubuntu 64-bit" and "Core i5-3210M / 4 GB DDR3". Data labels: 112, 10, 308, 19.]
Implementation
● To reach this speed, the whole application/algorithm
was reimplemented in C/C++, a compiled language
that typically runs much faster than interpreted
R code
● LibSVM C/C++ library
Prediction performance of R SVM
1 iteration; C parameter from 1 to 50
with step 1; gamma from 0.1 to 10 with
step 0.1; 80 SVMs
Time: 60 min. Time: 6.5 min.
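The parameter sweep described on this slide can be sketched as a plain grid enumeration. This is a minimal illustration only, not the code used in the project: `best_parameters` and its `score` argument are hypothetical names, and `score` stands in for whatever quality measure (for example cross-validated accuracy) is used to compare SVMs.

```python
from itertools import product

# Integer stepping avoids floating-point drift when building the gamma axis.
c_values = list(range(1, 51))                    # C = 1, 2, ..., 50
gamma_values = [i / 10 for i in range(1, 101)]   # gamma = 0.1, 0.2, ..., 10.0

# Every (C, gamma) pair that the grid search has to evaluate.
grid = list(product(c_values, gamma_values))


def best_parameters(grid, score):
    """Return the (C, gamma) pair with the highest score.

    `score` is a hypothetical callable taking (C, gamma) and returning a
    number, e.g. the cross-validated accuracy of an SVM trained with them.
    """
    return max(grid, key=lambda pair: score(*pair))


print(len(c_values), len(gamma_values), len(grid))
```

With these ranges the grid holds 50 × 100 = 5000 parameter pairs, and each pair requires its own SVM training run. That is why the run time of a single training run dominates the total.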
GPU as co-processor
● GPU is good at
parallel computations
● GPU memory latency
● GPU library call
latency
GPU libraries results
[Bar chart: processing time in seconds for the 2 MB beef_fillets_fitr data, by library: Easy-cpu, Easy-gpu, Svm-train (cpu), Gpusvm-0.2.]
Why is GPU slower?
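One way to reason about this question is a toy cost model in which the GPU pays a fixed overhead (memory transfer and library call latency) before doing any work. The sketch below is an assumption-laden illustration, not a measurement: the function names and every number passed to them are hypothetical.

```python
def cpu_time(n_items, per_item):
    """Time for the CPU to process n_items with no fixed start-up cost."""
    return n_items * per_item


def gpu_time(n_items, per_item, fixed_overhead):
    """Time for the GPU: a fixed transfer/call overhead plus per-item work."""
    return fixed_overhead + n_items * per_item


def break_even_items(cpu_per_item, gpu_per_item, fixed_overhead):
    """Number of items above which the GPU becomes faster than the CPU.

    Returns None when the GPU is not faster per item, because the fixed
    overhead can then never be recovered.
    """
    if gpu_per_item >= cpu_per_item:
        return None
    return fixed_overhead / (cpu_per_item - gpu_per_item)


# Hypothetical costs in arbitrary time units.
print(break_even_items(cpu_per_item=4.0, gpu_per_item=1.0, fixed_overhead=300.0))
```

Under this model a small data set stays below the break-even point, so the fixed overhead dominates and the GPU run takes longer than the CPU run.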
GPU Ensemble
● Because the data set is small, running a single SVM
on the GPU is inefficient
● However, given the GPU's structure, it makes sense to
run an ensemble of SVMs on the GPU in parallel
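The ensemble idea can be sketched as independent members whose predictions are combined by majority vote. Because the members do not depend on each other, they can be evaluated concurrently. This sketch runs on CPU threads rather than on a GPU, and the members are hypothetical threshold rules standing in for trained SVMs. `make_threshold_member` and `ensemble_predict` are illustrative names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_threshold_member(threshold):
    """Build a stand-in classifier: labels a sample by its first feature."""
    def member(sample):
        return "spoiled" if sample[0] > threshold else "fresh"
    return member


def ensemble_predict(members, sample):
    """Evaluate all members on `sample` concurrently; return the majority label."""
    with ThreadPoolExecutor() as pool:
        votes = list(pool.map(lambda member: member(sample), members))
    return Counter(votes).most_common(1)[0][0]


# An odd number of members avoids ties between the two labels.
members = [make_threshold_member(t) for t in (0.3, 0.5, 0.7)]
print(ensemble_predict(members, [0.6]))
```

On a GPU, each member would map to its own group of threads instead of a CPU thread. The voting step stays the same.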
Re-implementation of libsvm on the
GPU
● 2 different approaches:
● Target NVIDIA “FERMI” GPU
● Target ALL NVIDIA GPU
GPU re-implementation results
● Full GPU processing time: 1 minute vs 12
seconds on the CPU
● As accurate as CPU
GPU re-implementation results (2)
● Heterogeneous GPU processing
[Bar chart: time in seconds; cpu 1.8, gpu 2.]
GPU implementation is slower by about 10%
GPU re-implementation results (3)
[Pie chart: CPU time spent calculating the SVM matrix; SVM kernel calculation 30%, other computing 70%.]
Calculating the SVM kernel matrix on the GPU saves ~30% of the CPU time; the CPU
is free to do other calculations
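To show what the offloaded step computes, here is a minimal sketch of a kernel matrix calculation. It assumes an RBF kernel: the slides mention a gamma parameter but do not name the kernel. The function name `rbf_kernel_matrix` is illustrative and is not taken from libsvm.

```python
import math


def rbf_kernel_matrix(samples, gamma):
    """Return the matrix K with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The matrix is symmetric, so compute the upper triangle and mirror it.
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            value = math.exp(-gamma * sq_dist)
            matrix[i][j] = value
            matrix[j][i] = value
    return matrix


# Hypothetical two-feature samples, for illustration only.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
kernel = rbf_kernel_matrix(samples, gamma=0.5)
print(kernel[0][1], kernel[0][2], kernel[1][2])
```

Each entry depends only on one pair of samples, so the entries can be computed independently. That independence makes this step a natural candidate for offloading to a GPU.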
Solution for Graphical User Interface
[Figure: screenshot of the graphical user interface with numbered callouts.]
Future improvements
● Further improvements of the solution might
include:
● Re-implementing the solution fully in Java
to make it portable and independent of libraries
and platform
● Adding a web interface to the solution
● Writing an installer so the solution is easy to
install
Conclusions
● The implemented solution is 10 times faster than
the existing R framework solution
● Graphical interface implemented
● Different analysis types
● Heterogeneous computing
Thank you.
Any questions?
