MicrosoftML
Łukasz Grala
lukasz@tidk.pl | lukasz.grala@cs.put.poznan.pl
Łukasz Grala
 Senior architect of Data Platform, Business Intelligence, and Advanced Analytics solutions at TIDK
 Creator of Data Scientist as a Service
 Microsoft Certified Trainer and university lecturer
 Author of advanced training courses and workshops, as well as numerous publications and webcasts
 Recipient of the Microsoft Data Platform MVP award since 2010
 PhD student at Poznań University of Technology (Politechnika Poznańska), Faculty of Computing (areas: databases, data mining, machine learning)
 Speaker at numerous conferences in Poland and worldwide
 Holds numerous certifications (MCT, MCSE, MCSA, MCITP, ...)
 Board member of the Polish Information Processing Society (Polskie Towarzystwo Informatyczne), Wielkopolska Branch
 Member and leader of the Polish SQL Server User Group (PLSSUG)
 Passionate about data analysis, storage, and processing; jazz lover
Email: lukasz@tidk.pl, lukasz.grala@cs.put.poznan.pl | Blog: grala.it
MicrosoftML = Scalable R + World Class ML
Microsoft R Server
Enterprise-grade
Scalable/Distributed
Portable
Microsoft ML Toolkit
Fast, Scalable Learners
Battle-tested
State of the art
Key Benefits
Learners
Algorithm             Strengths
rxFastLinear          Fast, accurate linear learner with automatic L1 & L2 regularization
rxLogisticRegression  Logistic regression with L1 & L2 regularization
rxFastTrees           Boosted decision trees from Bing. Competitive with XGBoost. Most accurate learner in most cases
rxFastForest          Random forest
rxNeuralNet           GPU-accelerated Net# DNNs with convolutions
rxOneClassSvm         Anomaly detection or unbalanced binary classification
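To make the "L1 & L2" entries above concrete, here is a minimal pure-Python sketch of logistic regression with a combined L1 and L2 penalty, fitted by proximal gradient descent. It is an illustration only and not how the MicrosoftML learners are implemented. The function names, the toy data, and the penalty strengths are all assumptions made for this example.

```python
import math


def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink toward zero, snap small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_l1_l2(rows, labels, l1=0.05, l2=0.01, lr=0.5, epochs=500):
    """Hypothetical helper: batch proximal gradient descent on log-loss + L1 + L2."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        for j in range(n_features):
            # Gradient step on the smooth part (log-loss plus L2 penalty) ...
            step = weights[j] - lr * (grad_w[j] / n + l2 * weights[j])
            # ... followed by the L1 proximal step, which produces exact zeros.
            weights[j] = soft_threshold(step, lr * l1)
        bias -= lr * grad_b / n
    return weights, bias


# Toy data (made up): feature 0 separates the classes, feature 1 is weak noise.
rows = [[1.0, 0.3], [0.9, -0.2], [0.8, 0.1],
        [-1.0, 0.2], [-0.9, -0.3], [-0.8, -0.1]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic_l1_l2(rows, labels)
print("weights:", weights, "bias:", bias)
```

On this toy data the L1 term drives the weight of the noise feature to exactly zero, while the L2 term keeps the weight of the informative feature from growing without bound.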
Learners - Scalability
 Streaming (not RAM bound)
 Billions of features
 Multi-proc
 GPU acceleration for DNNs
 Distributed on Hadoop/Spark via ensembling
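The two ideas of streaming and ensembling from the list above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the MicrosoftML implementation: the class `StreamingCentroidClassifier`, the helper functions, and the data are all hypothetical. Each model sees its partition one chunk at a time and keeps only running sums, and the per-partition models are then combined by majority vote.

```python
from collections import Counter


def stream_chunks(records, chunk_size):
    """Yield successive chunks; a real pipeline would read them from disk instead."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


class StreamingCentroidClassifier:
    """Hypothetical 1-D learner whose state is only per-class running sums and counts."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def partial_fit(self, chunk):
        # Update the running statistics; the chunk can be discarded afterwards.
        for value, label in chunk:
            self.sums[label] = self.sums.get(label, 0.0) + value
            self.counts[label] = self.counts.get(label, 0) + 1

    def predict(self, value):
        # Assign the class whose mean is closest to the value.
        centroids = {lab: self.sums[lab] / self.counts[lab] for lab in self.sums}
        return min(centroids, key=lambda lab: abs(value - centroids[lab]))


def train_on_partition(partition, chunk_size=2):
    """Train one model on one partition, streaming it chunk by chunk."""
    model = StreamingCentroidClassifier()
    for chunk in stream_chunks(partition, chunk_size):
        model.partial_fit(chunk)
    return model


def ensemble_predict(models, value):
    """Combine the per-partition models by majority vote."""
    votes = Counter(model.predict(value) for model in models)
    return votes.most_common(1)[0][0]


# Three made-up partitions, standing in for data spread across cluster nodes.
partitions = [
    [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")],
    [(0.5, "low"), (1.5, "low"), (8.0, "high"), (11.0, "high")],
    [(2.5, "low"), (3.0, "low"), (9.5, "high"), (12.0, "high")],
]
models = [train_on_partition(p) for p in partitions]
print(ensemble_predict(models, 2.0), ensemble_predict(models, 9.0))
```

Because no model ever needs more than one chunk in memory and the partitions are trained independently, the same pattern extends to data that is larger than RAM or spread over several machines.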
Text Analytics Scenarios
Text Classification
 Email or support ticket routing
 Customer call triage
 Detect illegal trading activity
Sentiment Analysis
 Social media monitoring
 Call center support
Sentiment Analysis
 Pre-trained model
 Cognitive Service Parity
 Uses DNN Embedding
 Domain Adaptation
Image Featurization
 Image similarity search:
   Product catalog search
 Classification
   Plankton monitoring
   Galaxy classification
   Retina pathology detection
 Anomaly detection
   Defect detection in manufacturing
Image Featurization
Convolutional DNNs with GPU
Pre-trained Models
 ResNet-18
 ResNet-50
 ResNet-101
 AlexNet
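Once a pre-trained network has turned each image into a feature vector, image similarity search reduces to comparing vectors. The sketch below shows that last step with cosine similarity in plain Python. The four-dimensional vectors and the catalog entries are invented placeholders rather than real DNN outputs, and the function names are assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, catalog, top_k=2):
    """Return the names of the top_k catalog items closest to the query vector."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Placeholder feature vectors; a real featurizer would emit hundreds or
# thousands of dimensions per image.
catalog = {
    "red_sneaker": [0.9, 0.1, 0.0, 0.2],
    "blue_sneaker": [0.8, 0.2, 0.1, 0.3],
    "leather_boot": [0.1, 0.9, 0.3, 0.0],
    "sun_hat": [0.0, 0.1, 0.9, 0.4],
}

query_vector = [0.85, 0.15, 0.05, 0.25]  # features of the image being searched for
matches = most_similar(query_vector, catalog)
print(matches)
```

The same vectors can also feed a classifier or an anomaly detector, which is why featurization covers all three scenario groups listed on the previous slide.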
Algorithms and Transforms in MicrosoftML
Microsoft ML - State of The Art Microsoft Machine Learning - Package R
Compare features by product
1 Memory-bound: the product can only process datasets that fit into available memory.
2 Because the Intel Math Kernel Library (MKL) is included in MRO (Microsoft R Open), a generic R solution generally performs better. MKL replaces the standard R implementations of the Basic Linear Algebra Subprograms (BLAS) and the LAPACK library with multithreaded versions. As a result, calls to those low-level routines tend to execute faster on Microsoft R than on a conventional installation of R.
Questions?
lukasz@tidk.pl lukasz.grala@cs.put.poznan.pl
tidk.pl DSaaS.co facebook.com/TIDKpl grala.it
Lecture Center of Poznań University of Technology (Centrum Wykładowe Politechniki Poznańskiej)
April 22, 2017
http://gabc2017poznan.evenea.pl
Łukasz Grala, Damian Widera, Grzegorz Stolecki, Tobiasz Koprowski, Tomasz Libera, Hubert Kobierzewski, Sławek Stanek, Paweł Potasiński
May 15-17, 2017, Wrocław: 6 workshops, 61 sessions, 5 tracks
More than 30 speakers, including: Paweł Potasiński, Dariusz Porowski, Łukasz Grala, Damian Widera, Grzegorz Stolecki, Marcin Szeliga, Tobiasz Koprowski, Andrzej Kukuła, Tomasz Kopacz,
Brent Ozar, Greg Low, Chris Webb
FB: SQLDay | http://sqlday.pl | @SQLDay
Advanced analytics, data analysis in the cloud, Big Data Analytics, and data visualization
Building data platform and Business Intelligence solutions
Data platform administration
Using data platform and analytics solutions in business
