@russel_winder#devoxxuk #dataflowrules Copyright © 2014 Russel Winder
Dataflow: The Concurrency/Parallelism Architecture You Need
Russel Winder
@russel_winder
http://www.russel.org.uk
russel@winder.org.uk
What is Dataflow?
What are (in computing†):
Concurrency:
Structuring solution and code
such that multiple parts may
execute independently and
possibly even at the same
time.
Parallelism:
Execute multiple parts of a
system at the same time on
different processors so as to
get things working faster.
† In natural language these words have very different meanings.
What is Dataflow?
An architecture comprising channels allowing data to flow from
one operator to another, where each operator has multiple
input channels and multiple output channels, and executes
code only in response to the arrival of data on the inputs.
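A minimal sketch of one such operator in Java, assuming `BlockingQueue` stands in for a channel and a dedicated thread for the operator. The class `AdderOperator` and its methods are hypothetical names invented for this illustration; they are not from the talk. The operator does nothing until a value has arrived on each of its two input channels, then writes one result to its output channel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration: an operator with two input channels and one output channel.
final class AdderOperator {

    // Starts the operator on its own thread. It only runs code when data arrives.
    static Thread start(BlockingQueue<Integer> left,
                        BlockingQueue<Integer> right,
                        BlockingQueue<Integer> out) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    int a = left.take();   // blocks until a value arrives on the left input
                    int b = right.take();  // blocks until a value arrives on the right input
                    out.put(a + b);        // emit the new value downstream
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interruption shuts the operator down
            }
        });
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    // Feeds two pairs of values through the operator and collects the two results.
    static List<Integer> demo() {
        BlockingQueue<Integer> left = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> right = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> out = new LinkedBlockingQueue<>();
        Thread operator = start(left, right, out);
        List<Integer> results = new ArrayList<>();
        try {
            left.put(2);
            right.put(3);
            left.put(10);
            right.put(20);
            results.add(out.take());
            results.add(out.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            operator.interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [5, 30]
    }
}
```

The operator shares no state with its callers: everything it knows arrives on a channel, and everything it produces leaves on one.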
Historically
Dataflow computers:
– Values flowing between…
–…operators that calculate…
–…new values to pass to…
–…other operators.
Dataflow hardware didn't take
off, but the architecture works
at various scales.
The Manchester Prototype Dataflow Computer
J R Gurd, C C Kirkham, I Watson
CACM 28(1), 1985-01.
Dataflow diagrams have been an
integral part of analysis and design of
information systems since the 1970s
T DeMarco, Structured Analysis and System Specification,
Yourdon Press, NY, 1978.
Dataflow and Functional
Operators seem like they
might be pure functions,
but…
…they are not necessarily,
operators may have internal
state.
Operators may be referentially
transparent, but they may
not be.
Operators may even have side
effects.
Dataflow is an
event-based
architecture
Dataflow systems are
(possibly)
reactive systems.
Which would make them exceedingly
trendy even if the idea is very old.
Dataflow systems have
no† shared memory.
† Or at least should have none.
[Diagram labelling an operator and a channel]
Dataflow systems are
message passing systems.
Each operator must† be single threaded.
† Or at least should be.
Dataflow Frameworks
Scala:
– Future
Akka:
– Dataflow variables, aka Promise
– Deprecated in favour of Async
Java:
– Pre-8, Future
– 8+, CompletableFuture, aka Promise
Architectural Issue
Each of the aforementioned frameworks assumes that each
operator creates a single value. Communication is by dataflow
variables: each dataflow variable is a thread-safe single
assignment variable.
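A small sketch of a dataflow variable in Java 8+, using `CompletableFuture` from the standard library. The class name `DataflowVariables` and the values are made up for illustration. `z` depends on `x` and `y` and is bound only once both have been assigned; a second assignment to `x` is rejected, which is the single-assignment property.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration: CompletableFuture used as a single-assignment dataflow variable.
final class DataflowVariables {

    static int demo() {
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();

        // z is computed as soon as both x and y have been bound, in whatever order.
        CompletableFuture<Integer> z = x.thenCombine(y, Integer::sum);

        x.complete(10);
        y.complete(32);

        // A second assignment has no effect: complete() returns false once a value is bound.
        boolean rebound = x.complete(99);
        if (rebound) {
            throw new IllegalStateException("a dataflow variable must be single assignment");
        }
        return z.join(); // waits for z if necessary, then returns its value
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

Each variable carries exactly one value, which is why an operator built this way can produce a result only once, not a stream of results.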
GPars…
Has dataflow variables
(promises) and tasks and so
can do everything Akka and
Java can offer.
Has DataflowQueue, and so
can create real dataflow
networks.
One does like to code…
…doesn't one?
We need a problem…
A Problem
Calculate mean and standard deviation of a data sample.
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
\[ s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} \]
Amend the Problem
\[ s = \sqrt{\frac{1}{n-1} \left( \left( \sum_{i=1}^{n} x_i^2 \right) - n \bar{x}^2 \right)} \]
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
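The amended form needs only the count, the sum, and the sum of squares, so the three can be accumulated independently in one pass over the data. The sketch below is one possible Java arrangement, not the code shown in the talk: the names `DataflowStats`, `reducer`, and `meanAndStdDev` are hypothetical, `BlockingQueue` stands in for a channel, and an empty `Optional` is an assumed end-of-stream marker.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.DoubleBinaryOperator;

// Hypothetical illustration: mean and standard deviation as a small dataflow network.
final class DataflowStats {

    // A single-threaded operator: folds every value arriving on `in` into an accumulator
    // and binds the total to a dataflow variable when the end-of-stream marker arrives.
    static CompletableFuture<Double> reducer(BlockingQueue<Optional<Double>> in,
                                             DoubleBinaryOperator step) {
        CompletableFuture<Double> result = new CompletableFuture<>();
        Thread worker = new Thread(() -> {
            double acc = 0.0;
            try {
                for (Optional<Double> m = in.take(); m.isPresent(); m = in.take()) {
                    acc = step.applyAsDouble(acc, m.get());
                }
                result.complete(acc);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                result.completeExceptionally(e);
            }
        });
        worker.setDaemon(true);
        worker.start();
        return result;
    }

    // Returns {mean, sample standard deviation}; assumes at least two data values.
    static double[] meanAndStdDev(double[] data) {
        BlockingQueue<Optional<Double>> toCount = new LinkedBlockingQueue<>();
        BlockingQueue<Optional<Double>> toSum = new LinkedBlockingQueue<>();
        BlockingQueue<Optional<Double>> toSquares = new LinkedBlockingQueue<>();

        CompletableFuture<Double> n = reducer(toCount, (acc, x) -> acc + 1);
        CompletableFuture<Double> sum = reducer(toSum, (acc, x) -> acc + x);
        CompletableFuture<Double> sumSq = reducer(toSquares, (acc, x) -> acc + x * x);

        // Downstream operators fire once the values they depend on are bound.
        CompletableFuture<Double> mean = sum.thenCombine(n, (s, c) -> s / c);
        CompletableFuture<Double> nMeanSq = n.thenCombine(mean, (c, m) -> c * m * m);
        CompletableFuture<Double> sumSqDev = sumSq.thenCombine(nMeanSq, (ss, t) -> ss - t);
        CompletableFuture<Double> stdDev =
                sumSqDev.thenCombine(n, (d, c) -> Math.sqrt(d / (c - 1)));

        // Broadcast each datum to the three accumulating operators, then end the streams.
        for (double x : data) {
            toCount.add(Optional.of(x));
            toSum.add(Optional.of(x));
            toSquares.add(Optional.of(x));
        }
        toCount.add(Optional.empty());
        toSum.add(Optional.empty());
        toSquares.add(Optional.empty());

        return new double[] { mean.join(), stdDev.join() };
    }

    public static void main(String[] args) {
        double[] r = meanAndStdDev(new double[] {2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println("mean = " + r[0] + ", s = " + r[1]);
    }
}
```

The one-pass form subtracts two potentially large, nearly equal numbers, so it can lose precision on data with a large mean and a small spread; the two-pass form of the original problem does not have that weakness.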
Code Example
Switch to using an IDE for this.
Summary
Dataflow is an architecture:
Event-driven, single-threaded
operators communicating by
message passing using
channels.
Dataflow is an easement:
Synchronization is inherent in
the model, and there is no
shared memory, so all
deadlocks are trivial.
Dataflow is a way of harnessing
concurrency and parallelism
in easy to program ways.
GPars is usable from Java
as well as Groovy.
Testing is really Groovy with Spock.
Dataflow is an architecture of
code you need to know.
Q&A