ºÝºÝߣ

ºÝºÝߣShare a Scribd company logo
Concurrency Control for 
Parallel Machine Learning 
Dimitris Papailiopoulos 
Xinghao Pan, Joseph Gonzalez, Stefanie Jegelka, Tamara Broderick, Dimitris Papailiopoulos, Joseph 
Bradley, Michael I. Jordan
Model 
State 
Data 
Serial Inference
Model 
State 
Parallel Inference 
Processor 1 
Processor 2 
Data
Model 
State 
Data 
Parallel Inference 
Processor 1 
Processor 2 
Concurrency: 
more machines = less time 
Correctness: 
serial equivalence 
?
Model 
State 
Data 
Coordination Free Parallel 
Inference 
Processor 1 
Processor 2 
? 
Ignore collisions 
Concurrency: 
(almost) free 
+ 
Speedup = #CPU 
Correctness? 
Not always...
Correctness 
Serial 
Low High
Correctness 
Concurrency 
Coordination-free 
Serial 
High 
Low High 
Low
Correctness 
Concurrency 
Coordination-free 
Serial 
High 
Low High 
Low 
Concurrency 
Control 
Database mechanisms 
o Guarantee correctness 
o Maximize concurrency 
? Mutual exclusion 
? Optimistic CC
Model 
State 
Data 
Mutual Exclusion Through 
Locking 
Processor 1 
Processor 2 
Introduce locking (scheduling) protocols to prevent 
conflicts.
Mutual Exclusion Through 
Model 
State 
Data 
Processor 1 
Processor 2 
Locking 
? 
Enforce local serialization to avoid conflicts.
Optimistic Concurrency Control 
Model 
State 
Data 
Processor 1 
Processor 2 
Allow computation to proceed without blocking. 
Kung & Robinson. On optimistic methods for concurrency 
control.
Optimistic Concurrency Control 
Model 
State 
Data 
Invalid Outcome 
? ? 
Processor 1 
Processor 2 
Validate potential conflicts. 
Kung & Robinson. On optimistic methods for concurrency 
control.
Optimistic Concurrency Control 
Model 
State 
Data 
? ? 
Processor 1 
Processor 2 
Rollback and Redo 
Take a compensating action. 
Kung & Robinson. On optimistic methods for concurrency 
control.
Concurrency Control 
14 
Coordination Free: 
Provably fast and correct under key assumptions. 
Concurrency Control: 
Provably correct and fast under key assumptions. 
Systems Ideas to 
Improve Efficiency
Machine Learning + Concurrency 
Clusteri 
ng 
Online 
Facility 
Location 
Control 
(Xinghao Pan et al.) 
Submodular 
Maximization 
Subset selection, diminishing 
marginal gains 
Max Graph 
Cut 
Set 
Cover 
Sensor Placement 
Social Network 
Influence 
Propagation 
Document 
Summarization 
Sports 
Football 
Word Series 
Giants 
Cardinals 
Politics 
Midterm 
Obama 
Democrat 
Tea 
Finance 
QE 
market 
interest 
Dow 
Topic Modelling 
Correlation 
Clustering 
Deduplication 
Community 
Detection
Machine Learning + Concurrency 
Clusteri 
ng 
Online 
Facility 
Location 
Control 
(Xinghao Pan et al.) 
Submodular 
Maximization 
Subset selection, diminishing 
marginal gains 
Max Graph 
Cut 
Set 
Cover 
Sensor Placement 
Social Network 
Influence 
Propagation 
Document 
Summarization 
Sports 
Football 
Word Series 
Giants 
Cardinals 
Politics 
Midterm 
Obama 
Democrat 
Tea 
Finance 
QE 
market 
interest 
Dow 
Topic Modelling 
Correlation 
Clustering 
Deduplication 
Community 
Detection 
Serial ML 
algorithm 
Sequence of 
transactions 
Identify potential 
conflicts 
Apply Concurrency 
Control 
mechanisms 
Parallel ML 
algorithm
Application: Deduplication 
Computer Science 
Division ¨C University of 
California Berkeley CA 
University of California at Berkeley 
Department of 
Physics Stanford 
University California 
Lawrence Berkeley National 
Labs <ref>California</ref>
Application: Deduplication
Serial Correlation Clustering 
Nir Ailon, Moses Charikar, and Alantha Newman. 
Aggregating inconsistent information: ranking and clustering. 
Journal of the ACM (JACM), 55(5):23, 2008. 
Serially process vertices
Serial Correlation Clustering 
Nir Ailon, Moses Charikar, and Alantha Newman. 
Aggregating inconsistent information: ranking and clustering. 
Journal of the ACM (JACM), 55(5):23, 2008. 
Serially process vertices
Serial Correlation Clustering 
Nir Ailon, Moses Charikar, and Alantha Newman. 
Aggregating inconsistent information: ranking and clustering. 
Journal of the ACM (JACM), 55(5):23, 2008. 
Serially process vertices
Serial Correlation Clustering 
Nir Ailon, Moses Charikar, and Alantha Newman. 
Aggregating inconsistent information: ranking and clustering. 
Journal of the ACM (JACM), 55(5):23, 2008. 
Serially process vertices
Serial Correlation Clustering 
Nir Ailon, Moses Charikar, and Alantha Newman. 
Aggregating inconsistent information: ranking and clustering. 
Journal of the ACM (JACM), 55(5):23, 2008. 
Serially process vertices 
Approximation 3 OPT (in expectation)
Parallel Correlation Clustering
Concurrency Control Correlation Clustering 
(C4) Parallel Correlation Clustering 
Cannot Resolve introduce 
by 
Mutual adjacent Exclusion 
cluster 
centers
Concurrency Control Correlation Clustering 
(C4) 
Common Resolve neighbor by 
must be 
assigned Optimistic to Concurrency 
earliest center 
Control 
? 
Optimistic Assumption 
No other new cluster created 
Resolution 
Assign common neighbor to earliest cluster
Properties of C4 
(Concurrency Control Correlation Clustering) 
Theorem: C4 is correct. 
C4 preserves same guarantees as serial algorithm (3 
OPT). 
Concurren Correctness 
Theorem: C4 has provably small overheads. 
cy 
= almost linear speedup 
Expected #blocked transactions < 2¦Ó |E| / |V|. 
¦Ó ¡Ô diff in parallel cpu¡¯s progress
Empirical Validation on Billion Edge 
Graphs 
Amazon EC2 r3.8xlarge instances 
Multicore up to 16 threads 
Real and synthetic graphs 
100 runs (10 random orderings x 10 runs) 
Graph Vertices Edges 
IT-2004 Italian web-graph 41 Million 1.14 Billion 
Webbase-2001 WebBase crawl 118 Million 1.02 Billion 
Erdos-Renyi Synthetic random 100 Million ¡Ö 1.0 Billion
C4: Cost of Coordination 
< 0.02% blocked
C4: Speed-up 
Ideal 
10x 
speedu 
p
Conclusion 
Concurrency Control 
for Parallel ML 
o Guarantee 
correctness 
o Maximize 
concurrency 
Code release in the works! 
https://amplab.cs.berkeley.edu/projects/cc 
ml/ 
xinghao@berkeley.edu 
Applications 
Correlation Clustering 
Submodular Maximization 
Clustering 
Online Facility Location 
Feature Modeling

More Related Content

Similar to Concurrency Control for Parallel Machine Learning (20)

Black-box Behavioral Model Inference for Autopilot Software Systems
Black-box Behavioral Model Inference for Autopilot Software SystemsBlack-box Behavioral Model Inference for Autopilot Software Systems
Black-box Behavioral Model Inference for Autopilot Software Systems
Mohammad Jafar Mashhadi
?
October 26, Optimization
October 26, OptimizationOctober 26, Optimization
October 26, Optimization
University of Colorado at Boulder
?
Approaches to online quantile estimation
Approaches to online quantile estimationApproaches to online quantile estimation
Approaches to online quantile estimation
Data Con LA
?
My
MyMy
My
DrAmin Dastanpour
?
2016 bioinformatics i_alignments_wim_vancriekinge
2016 bioinformatics i_alignments_wim_vancriekinge2016 bioinformatics i_alignments_wim_vancriekinge
2016 bioinformatics i_alignments_wim_vancriekinge
Prof. Wim Van Criekinge
?
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformatics
Abhishek Vatsa
?
2015 bioinformatics alignments_wim_vancriekinge
2015 bioinformatics alignments_wim_vancriekinge2015 bioinformatics alignments_wim_vancriekinge
2015 bioinformatics alignments_wim_vancriekinge
Prof. Wim Van Criekinge
?
Ashg2014 grc workshop_schneider
Ashg2014 grc workshop_schneiderAshg2014 grc workshop_schneider
Ashg2014 grc workshop_schneider
Genome Reference Consortium
?
Two methods for optimising cognitive model parameters
Two methods for optimising cognitive model parametersTwo methods for optimising cognitive model parameters
Two methods for optimising cognitive model parameters
University of Huddersfield
?
Computational Approaches to Systems Biology
Computational Approaches to Systems BiologyComputational Approaches to Systems Biology
Computational Approaches to Systems Biology
Mike Hucka
?
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
Salford Systems
?
PPT
PPTPPT
PPT
butest
?
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
butest
?
2005: A Matlab Tour on Artificial Immune Systems
2005: A Matlab Tour on Artificial Immune Systems2005: A Matlab Tour on Artificial Immune Systems
2005: A Matlab Tour on Artificial Immune Systems
Leandro de Castro
?
Cost Optimized Design Technique for Pseudo-Random Numbers in Cellular Automata
Cost Optimized Design Technique for Pseudo-Random Numbers in Cellular AutomataCost Optimized Design Technique for Pseudo-Random Numbers in Cellular Automata
Cost Optimized Design Technique for Pseudo-Random Numbers in Cellular Automata
ijait
?
International Journal on Cryptography and Information Security (IJCIS)
International Journal on Cryptography and Information Security (IJCIS)International Journal on Cryptography and Information Security (IJCIS)
International Journal on Cryptography and Information Security (IJCIS)
ijcisjournal
?
Josh Patterson MLconf slides
Josh Patterson MLconf slidesJosh Patterson MLconf slides
Josh Patterson MLconf slides
MLconf
?
20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club
20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club
20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club
Med_KU
?
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYONDIMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
Rabi Das
?
Advances in Bayesian Learning
Advances in Bayesian LearningAdvances in Bayesian Learning
Advances in Bayesian Learning
butest
?
Black-box Behavioral Model Inference for Autopilot Software Systems
Black-box Behavioral Model Inference for Autopilot Software SystemsBlack-box Behavioral Model Inference for Autopilot Software Systems
Black-box Behavioral Model Inference for Autopilot Software Systems
Mohammad Jafar Mashhadi
?
Approaches to online quantile estimation
Approaches to online quantile estimationApproaches to online quantile estimation
Approaches to online quantile estimation
Data Con LA
?
2016 bioinformatics i_alignments_wim_vancriekinge
2016 bioinformatics i_alignments_wim_vancriekinge2016 bioinformatics i_alignments_wim_vancriekinge
2016 bioinformatics i_alignments_wim_vancriekinge
Prof. Wim Van Criekinge
?
2015 bioinformatics alignments_wim_vancriekinge
2015 bioinformatics alignments_wim_vancriekinge2015 bioinformatics alignments_wim_vancriekinge
2015 bioinformatics alignments_wim_vancriekinge
Prof. Wim Van Criekinge
?
Two methods for optimising cognitive model parameters
Two methods for optimising cognitive model parametersTwo methods for optimising cognitive model parameters
Two methods for optimising cognitive model parameters
University of Huddersfield
?
Computational Approaches to Systems Biology
Computational Approaches to Systems BiologyComputational Approaches to Systems Biology
Computational Approaches to Systems Biology
Mike Hucka
?
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
Salford Systems
?
32_Nov07_MachineLear..
32_Nov07_MachineLear..32_Nov07_MachineLear..
32_Nov07_MachineLear..
butest
?
2005: A Matlab Tour on Artificial Immune Systems
2005: A Matlab Tour on Artificial Immune Systems2005: A Matlab Tour on Artificial Immune Systems
2005: A Matlab Tour on Artificial Immune Systems
Leandro de Castro
?
Cost Optimized Design Technique for Pseudo-Random Numbers in Cellular Automata
Cost Optimized Design Technique for Pseudo-Random Numbers in Cellular AutomataCost Optimized Design Technique for Pseudo-Random Numbers in Cellular Automata
Cost Optimized Design Technique for Pseudo-Random Numbers in Cellular Automata
ijait
?
International Journal on Cryptography and Information Security (IJCIS)
International Journal on Cryptography and Information Security (IJCIS)International Journal on Cryptography and Information Security (IJCIS)
International Journal on Cryptography and Information Security (IJCIS)
ijcisjournal
?
Josh Patterson MLconf slides
Josh Patterson MLconf slidesJosh Patterson MLconf slides
Josh Patterson MLconf slides
MLconf
?
20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club
20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club
20131019 ÉúÎïÎïÀíÈôÊÖ Journal Club
Med_KU
?
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYONDIMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
IMPLEMENTATION OF MACHINE LEARNING IN E-COMMERCE & BEYOND
Rabi Das
?
Advances in Bayesian Learning
Advances in Bayesian LearningAdvances in Bayesian Learning
Advances in Bayesian Learning
butest
?

More from jeykottalam (8)

AMP Camp 5 Intro
AMP Camp 5 IntroAMP Camp 5 Intro
AMP Camp 5 Intro
jeykottalam
?
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQL
jeykottalam
?
MLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning LibraryMLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning Library
jeykottalam
?
SparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at ScaleSparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at Scale
jeykottalam
?
SampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS StackSampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS Stack
jeykottalam
?
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
jeykottalam
?
COCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate AscentCOCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate Ascent
jeykottalam
?
The BDAS Open Source Community
The BDAS Open Source CommunityThe BDAS Open Source Community
The BDAS Open Source Community
jeykottalam
?
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQL
jeykottalam
?
MLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning LibraryMLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning Library
jeykottalam
?
SparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at ScaleSparkR: Enabling Interactive Data Science at Scale
SparkR: Enabling Interactive Data Science at Scale
jeykottalam
?
SampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS StackSampleClean: Bringing Data Cleaning into the BDAS Stack
SampleClean: Bringing Data Cleaning into the BDAS Stack
jeykottalam
?
Machine Learning Pipelines
Machine Learning PipelinesMachine Learning Pipelines
Machine Learning Pipelines
jeykottalam
?
COCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate AscentCOCOA: Communication-Efficient Coordinate Ascent
COCOA: Communication-Efficient Coordinate Ascent
jeykottalam
?
The BDAS Open Source Community
The BDAS Open Source CommunityThe BDAS Open Source Community
The BDAS Open Source Community
jeykottalam
?

Recently uploaded (20)

The Evolution of Microsoft Project Portfolio Management
The Evolution of Microsoft Project Portfolio ManagementThe Evolution of Microsoft Project Portfolio Management
The Evolution of Microsoft Project Portfolio Management
OnePlan Solutions
?
Coreldraw 2021 Crack Latest Version 2025
Coreldraw 2021 Crack Latest Version 2025Coreldraw 2021 Crack Latest Version 2025
Coreldraw 2021 Crack Latest Version 2025
farooq048kp
?
Digital Application Development Services
Digital Application Development ServicesDigital Application Development Services
Digital Application Development Services
daavishenry
?
Wondershare Filmora Crack 2025 For Windows Free
Wondershare Filmora Crack 2025 For Windows FreeWondershare Filmora Crack 2025 For Windows Free
Wondershare Filmora Crack 2025 For Windows Free
mohsinrazakpa43
?
Lecture2_REQUIREMENT_Process__Modelss.pptx
Lecture2_REQUIREMENT_Process__Modelss.pptxLecture2_REQUIREMENT_Process__Modelss.pptx
Lecture2_REQUIREMENT_Process__Modelss.pptx
Aqsa162589
?
6 Best AI Tools for Contract Management.pdf
6 Best AI Tools for Contract Management.pdf6 Best AI Tools for Contract Management.pdf
6 Best AI Tools for Contract Management.pdf
Anadea
?
ESET NOD32 Antivirus Crack with License Key 2025
ESET NOD32 Antivirus Crack with License Key 2025ESET NOD32 Antivirus Crack with License Key 2025
ESET NOD32 Antivirus Crack with License Key 2025
umeerbinfaizan
?
Cypress Parallel Testing Tutorial: Speed Up Your Test Runs with Ease
Cypress Parallel Testing Tutorial: Speed Up Your Test Runs with EaseCypress Parallel Testing Tutorial: Speed Up Your Test Runs with Ease
Cypress Parallel Testing Tutorial: Speed Up Your Test Runs with Ease
Shubham Joshi
?
The Open-Closed Principle - Part 1 - The Original Version
The Open-Closed Principle - Part 1 - The Original VersionThe Open-Closed Principle - Part 1 - The Original Version
The Open-Closed Principle - Part 1 - The Original Version
Philip Schwarz
?
E-commerce App Development cost in 2025.pdf
E-commerce App Development cost in 2025.pdfE-commerce App Development cost in 2025.pdf
E-commerce App Development cost in 2025.pdf
sandeepjangidimg
?
Tour Booking, Booking Service, Tour Agents, Hotel Booking in odoo
Tour Booking, Booking Service, Tour Agents, Hotel Booking in odooTour Booking, Booking Service, Tour Agents, Hotel Booking in odoo
Tour Booking, Booking Service, Tour Agents, Hotel Booking in odoo
AxisTechnolabs
?
Windows 8.1 Pro Activator Crack Version [April-2025]
Windows 8.1 Pro Activator Crack Version [April-2025]Windows 8.1 Pro Activator Crack Version [April-2025]
Windows 8.1 Pro Activator Crack Version [April-2025]
jhonjosh91
?
mORMot 2 - Pascal Cafe 2025 in Nederlands
mORMot 2 - Pascal Cafe 2025 in NederlandsmORMot 2 - Pascal Cafe 2025 in Nederlands
mORMot 2 - Pascal Cafe 2025 in Nederlands
Arnaud Bouchez
?
EMEA Virtual Marketo User Group - Adobe Summit 2025 Round Up
EMEA Virtual Marketo User Group - Adobe Summit 2025 Round UpEMEA Virtual Marketo User Group - Adobe Summit 2025 Round Up
EMEA Virtual Marketo User Group - Adobe Summit 2025 Round Up
BradBedford3
?
Typing Master Pro 12 Crack Updated Version [April-2025]
Typing Master Pro 12 Crack Updated Version [April-2025]Typing Master Pro 12 Crack Updated Version [April-2025]
Typing Master Pro 12 Crack Updated Version [April-2025]
jhonjosh91
?
Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...
Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...
Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...
KCD Guadalajara
?
Bandicut Video Cutter v3.6.8.709 Crack [April-2025]
Bandicut Video Cutter v3.6.8.709 Crack [April-2025]Bandicut Video Cutter v3.6.8.709 Crack [April-2025]
Bandicut Video Cutter v3.6.8.709 Crack [April-2025]
jackalen173
?
Microsoft Office Crack 2019 Free Download
Microsoft Office Crack 2019 Free DownloadMicrosoft Office Crack 2019 Free Download
Microsoft Office Crack 2019 Free Download
tayab01kp
?
Driver Genius 24 Crack 2025 License Key Free Download
Driver Genius 24 Crack 2025 License Key Free DownloadDriver Genius 24 Crack 2025 License Key Free Download
Driver Genius 24 Crack 2025 License Key Free Download
umeerbinfaizan
?
Oracle Database administration Security PPT
Oracle Database administration Security PPTOracle Database administration Security PPT
Oracle Database administration Security PPT
pshankarnarayan
?
The Evolution of Microsoft Project Portfolio Management
The Evolution of Microsoft Project Portfolio ManagementThe Evolution of Microsoft Project Portfolio Management
The Evolution of Microsoft Project Portfolio Management
OnePlan Solutions
?
Coreldraw 2021 Crack Latest Version 2025
Coreldraw 2021 Crack Latest Version 2025Coreldraw 2021 Crack Latest Version 2025
Coreldraw 2021 Crack Latest Version 2025
farooq048kp
?
Digital Application Development Services
Digital Application Development ServicesDigital Application Development Services
Digital Application Development Services
daavishenry
?
Wondershare Filmora Crack 2025 For Windows Free
Wondershare Filmora Crack 2025 For Windows FreeWondershare Filmora Crack 2025 For Windows Free
Wondershare Filmora Crack 2025 For Windows Free
mohsinrazakpa43
?
Lecture2_REQUIREMENT_Process__Modelss.pptx
Lecture2_REQUIREMENT_Process__Modelss.pptxLecture2_REQUIREMENT_Process__Modelss.pptx
Lecture2_REQUIREMENT_Process__Modelss.pptx
Aqsa162589
?
6 Best AI Tools for Contract Management.pdf
6 Best AI Tools for Contract Management.pdf6 Best AI Tools for Contract Management.pdf
6 Best AI Tools for Contract Management.pdf
Anadea
?
ESET NOD32 Antivirus Crack with License Key 2025
ESET NOD32 Antivirus Crack with License Key 2025ESET NOD32 Antivirus Crack with License Key 2025
ESET NOD32 Antivirus Crack with License Key 2025
umeerbinfaizan
?
Cypress Parallel Testing Tutorial: Speed Up Your Test Runs with Ease
Cypress Parallel Testing Tutorial: Speed Up Your Test Runs with EaseCypress Parallel Testing Tutorial: Speed Up Your Test Runs with Ease
Cypress Parallel Testing Tutorial: Speed Up Your Test Runs with Ease
Shubham Joshi
?
The Open-Closed Principle - Part 1 - The Original Version
The Open-Closed Principle - Part 1 - The Original VersionThe Open-Closed Principle - Part 1 - The Original Version
The Open-Closed Principle - Part 1 - The Original Version
Philip Schwarz
?
E-commerce App Development cost in 2025.pdf
E-commerce App Development cost in 2025.pdfE-commerce App Development cost in 2025.pdf
E-commerce App Development cost in 2025.pdf
sandeepjangidimg
?
Tour Booking, Booking Service, Tour Agents, Hotel Booking in odoo
Tour Booking, Booking Service, Tour Agents, Hotel Booking in odooTour Booking, Booking Service, Tour Agents, Hotel Booking in odoo
Tour Booking, Booking Service, Tour Agents, Hotel Booking in odoo
AxisTechnolabs
?
Windows 8.1 Pro Activator Crack Version [April-2025]
Windows 8.1 Pro Activator Crack Version [April-2025]Windows 8.1 Pro Activator Crack Version [April-2025]
Windows 8.1 Pro Activator Crack Version [April-2025]
jhonjosh91
?
mORMot 2 - Pascal Cafe 2025 in Nederlands
mORMot 2 - Pascal Cafe 2025 in NederlandsmORMot 2 - Pascal Cafe 2025 in Nederlands
mORMot 2 - Pascal Cafe 2025 in Nederlands
Arnaud Bouchez
?
EMEA Virtual Marketo User Group - Adobe Summit 2025 Round Up
EMEA Virtual Marketo User Group - Adobe Summit 2025 Round UpEMEA Virtual Marketo User Group - Adobe Summit 2025 Round Up
EMEA Virtual Marketo User Group - Adobe Summit 2025 Round Up
BradBedford3
?
Typing Master Pro 12 Crack Updated Version [April-2025]
Typing Master Pro 12 Crack Updated Version [April-2025]Typing Master Pro 12 Crack Updated Version [April-2025]
Typing Master Pro 12 Crack Updated Version [April-2025]
jhonjosh91
?
Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...
Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...
Migrating GitHub Actions with Nested Virtualization to Cloud Native Ecosystem...
KCD Guadalajara
?
Bandicut Video Cutter v3.6.8.709 Crack [April-2025]
Bandicut Video Cutter v3.6.8.709 Crack [April-2025]Bandicut Video Cutter v3.6.8.709 Crack [April-2025]
Bandicut Video Cutter v3.6.8.709 Crack [April-2025]
jackalen173
?
Microsoft Office Crack 2019 Free Download
Microsoft Office Crack 2019 Free DownloadMicrosoft Office Crack 2019 Free Download
Microsoft Office Crack 2019 Free Download
tayab01kp
?
Driver Genius 24 Crack 2025 License Key Free Download
Driver Genius 24 Crack 2025 License Key Free DownloadDriver Genius 24 Crack 2025 License Key Free Download
Driver Genius 24 Crack 2025 License Key Free Download
umeerbinfaizan
?
Oracle Database administration Security PPT
Oracle Database administration Security PPTOracle Database administration Security PPT
Oracle Database administration Security PPT
pshankarnarayan
?

Concurrency Control for Parallel Machine Learning

  • 1. Concurrency Control for Parallel Machine Learning Dimitris Papailiopoulos Xinghao Pan, Joseph Gonzalez, Stefanie Jegelka, Tamara Broderick, Dimitris Papailiopoulos, Joseph Bradley, Michael I. Jordan
  • 2. Model State Data Serial Inference
  • 3. Model State Parallel Inference Processor 1 Processor 2 Data
  • 4. Model State Data Parallel Inference Processor 1 Processor 2 Concurrency: more machines = less time Correctness: serial equivalence ?
  • 5. Model State Data Coordination Free Parallel Inference Processor 1 Processor 2 ? Ignore collisions Concurrency: (almost) free + Speedup = #CPU Correctness? Not always...
  • 7. Correctness Concurrency Coordination-free Serial High Low High Low
  • 8. Correctness Concurrency Coordination-free Serial High Low High Low Concurrency Control Database mechanisms o Guarantee correctness o Maximize concurrency ? Mutual exclusion ? Optimistic CC
  • 9. Model State Data Mutual Exclusion Through Locking Processor 1 Processor 2 Introduce locking (scheduling) protocols to prevent conflicts.
  • 10. Mutual Exclusion Through Model State Data Processor 1 Processor 2 Locking ? Enforce local serialization to avoid conflicts.
  • 11. Optimistic Concurrency Control Model State Data Processor 1 Processor 2 Allow computation to proceed without blocking. Kung & Robinson. On optimistic methods for concurrency control.
  • 12. Optimistic Concurrency Control Model State Data Invalid Outcome ? ? Processor 1 Processor 2 Validate potential conflicts. Kung & Robinson. On optimistic methods for concurrency control.
  • 13. Optimistic Concurrency Control Model State Data ? ? Processor 1 Processor 2 Rollback and Redo Take a compensating action. Kung & Robinson. On optimistic methods for concurrency control.
  • 14. Concurrency Control 14 Coordination Free: Provably fast and correct under key assumptions. Concurrency Control: Provably correct and fast under key assumptions. Systems Ideas to Improve Efficiency
  • 15. Machine Learning + Concurrency Clusteri ng Online Facility Location Control (Xinghao Pan et al.) Submodular Maximization Subset selection, diminishing marginal gains Max Graph Cut Set Cover Sensor Placement Social Network Influence Propagation Document Summarization Sports Football Word Series Giants Cardinals Politics Midterm Obama Democrat Tea Finance QE market interest Dow Topic Modelling Correlation Clustering Deduplication Community Detection
  • 16. Machine Learning + Concurrency Clusteri ng Online Facility Location Control (Xinghao Pan et al.) Submodular Maximization Subset selection, diminishing marginal gains Max Graph Cut Set Cover Sensor Placement Social Network Influence Propagation Document Summarization Sports Football Word Series Giants Cardinals Politics Midterm Obama Democrat Tea Finance QE market interest Dow Topic Modelling Correlation Clustering Deduplication Community Detection Serial ML algorithm Sequence of transactions Identify potential conflicts Apply Concurrency Control mechanisms Parallel ML algorithm
  • 17. Application: Deduplication Computer Science Division ¨C University of California Berkeley CA University of California at Berkeley Department of Physics Stanford University California Lawrence Berkeley National Labs <ref>California</ref>
  • 19. Serial Correlation Clustering Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008. Serially process vertices
  • 20. Serial Correlation Clustering Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008. Serially process vertices
  • 21. Serial Correlation Clustering Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008. Serially process vertices
  • 22. Serial Correlation Clustering Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008. Serially process vertices
  • 23. Serial Correlation Clustering Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008. Serially process vertices Approximation 3 OPT (in expectation)
  • 25. Concurrency Control Correlation Clustering (C4) Parallel Correlation Clustering Cannot Resolve introduce by Mutual adjacent Exclusion cluster centers
  • 26. Concurrency Control Correlation Clustering (C4) Common Resolve neighbor by must be assigned Optimistic to Concurrency earliest center Control ? Optimistic Assumption No other new cluster created Resolution Assign common neighbor to earliest cluster
  • 27. Properties of C4 (Concurrency Control Correlation Clustering) Theorem: C4 is correct. C4 preserves same guarantees as serial algorithm (3 OPT). Concurren Correctness Theorem: C4 has provably small overheads. cy = almost linear speedup Expected #blocked transactions < 2¦Ó |E| / |V|. ¦Ó ¡Ô diff in parallel cpu¡¯s progress
  • 28. Empirical Validation on Billion Edge Graphs Amazon EC2 r3.8xlarge instances Multicore up to 16 threads Real and synthetic graphs 100 runs (10 random orderings x 10 runs) Graph Vertices Edges IT-2004 Italian web-graph 41 Million 1.14 Billion Webbase-2001 WebBase crawl 118 Million 1.02 Billion Erdos-Renyi Synthetic random 100 Million ¡Ö 1.0 Billion
  • 29. C4: Cost of Coordination < 0.02% blocked
  • 30. C4: Speed-up Ideal 10x speedu p
  • 31. Conclusion Concurrency Control for Parallel ML o Guarantee correctness o Maximize concurrency Code release in the works! https://amplab.cs.berkeley.edu/projects/cc ml/ xinghao@berkeley.edu Applications Correlation Clustering Submodular Maximization Clustering Online Facility Location Feature Modeling