Politecnico di Milano, DEIB
Portoroz - 2017 - Riccardo Tommasini - @rictomm - Politecnico di Milano
Marco Balduini, Riccardo Tommasini, Emanuele Della Valle
SLD Revolution: A Cheaper, Faster yet more Accurate Streaming Linked Data Framework
ESWC
RSP is Great!
Why RSP?
- offers a unified view over streams and static data
- enables query answering across heterogeneous sources
- allows creating and publishing new streams or graphs
The RSP Idea, in short
CQL Model
Streams
Relations
Streams-to-Relations
Relations-to-Streams
Relations-to-Relations
The CQL continuous query language - Arvind Arasu, Shivnath Babu, Jennifer Widom, 2006, VLDBJ
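The three operator classes listed for CQL can be illustrated with a small sketch. It is only an approximation under stated assumptions: a stream is a list of (timestamp, value) pairs, a relation is a Python set, and the helper names `time_window`, `select_where` and `istream` are hypothetical, not the syntax of CQL or the API of any engine.

```python
from typing import Callable, Iterable, List, Set, Tuple

Element = Tuple[int, str]  # (timestamp, value); this representation is an assumption


def time_window(stream: Iterable[Element], now: int, width: int) -> Set[str]:
    """Stream-to-relation: values whose timestamp lies in (now - width, now]."""
    return {value for ts, value in stream if now - width < ts <= now}


def select_where(relation: Set[str], predicate: Callable[[str], bool]) -> Set[str]:
    """Relation-to-relation: a plain relational selection."""
    return {value for value in relation if predicate(value)}


def istream(previous: Set[str], current: Set[str], now: int) -> List[Element]:
    """Relation-to-stream: emit what entered the relation since the last evaluation."""
    return [(now, value) for value in sorted(current - previous)]


# Made-up input data, evaluated at time 2 and again at time 4.
stream = [(1, "a"), (2, "b"), (3, "c"), (4, "b")]
r2 = select_where(time_window(stream, now=2, width=2), lambda v: v != "a")
r4 = select_where(time_window(stream, now=4, width=2), lambda v: v != "a")
new_results = istream(r2, r4, now=4)
print(new_results)
```

Only the window function looks at timestamps; the selection works on a timeless relation, and the last step puts a timestamp back on the results.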
RSP-QL Model
RDF Streams
Solution Mappings
S2R operators
R2S operators
R2R operators
A First Step Towards Stream Reasoning - E. Della Valle, S. Ceri, D. Barbieri, D. Braga, A. Campi, 2008, FIS
RSP in a Nutshell, on RDF Streams
RDF Stream-to-RDF
RDF-to-RDF (solution mappings)
RDF-to-RDF Stream
RSP in Practice, with SLD
A Social Media Example: Two Information Needs
- How many micro-posts occur over time?
- How often does a hashtag appear in the micro-post stream?
Streaming Linked Data Server
[Architecture diagram. Components shown: Sources, Raw Stream, Adapter, RDF Stream Bus, Publisher, Visualizer, Recorder, Re-player, Analyser, Decorator]
An Important Optimisation
REGISTER STREAM sstr AS
CONSTRUCT { ?id sma:twitterCount ?tc }
FROM STREAM <social> [RANGE 1m STEP 1m]
WHERE {
  SELECT (uuid() AS ?id) ?tc
  WHERE {
    SELECT (COUNT(DISTINCT ?mp) AS ?tc)
    WHERE { ?mp a sma:Tweet }
  }
}
Using C-SPARQL
REGISTER STREAM countT AS
CONSTRUCT { ?uid sma:twitterCount ?tot . }
FROM STREAM <sstr> [RANGE 15m STEP 1m]
WHERE {
  SELECT (uuid() AS ?uid) (SUM(?tc) AS ?tot)
  WHERE { ?id sma:twitterCount ?tc }
}
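The two registered queries split the work: `sstr` carries one count per one-minute tumbling window, and `countT` sums the counts of the last 15 minutes every minute, so raw tweets are never re-counted. The Python sketch below mirrors that idea outside any RSP engine. It assumes arrival times in seconds, treats every timestamp as a distinct tweet, and uses hypothetical helper names (`per_minute_counts`, `sliding_totals`) and made-up data.

```python
from collections import Counter, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple


def per_minute_counts(timestamps: Iterable[int]) -> Dict[int, int]:
    """First stage: count items per 1-minute tumbling window (key = minute index)."""
    return dict(Counter(ts // 60 for ts in timestamps))


def sliding_totals(
    counts: Dict[int, int], last_minute: int, width: int = 15
) -> Iterator[Tuple[int, int]]:
    """Second stage: every minute, sum the partial counts of the latest `width` minutes."""
    window: Deque[int] = deque(maxlen=width)  # oldest partial count drops out automatically
    for minute in range(last_minute + 1):
        window.append(counts.get(minute, 0))
        yield minute, sum(window)


# Made-up arrival times in seconds.
tweets = [5, 30, 61, 62, 200, 1000]
counts = per_minute_counts(tweets)
totals = dict(sliding_totals(counts, last_minute=17))
print(totals[3], totals[15], totals[16])
```

The second stage touches at most 15 small integers per evaluation, regardless of how many tweets arrived in the window.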
Is RSP always great?
Observations on SLD
- It is flexible. :)
- It forces RDF even though query results are often relational. :(
- It is not optimal, e.g. RSP-QL vs. SQL vs. path queries.
Revolutionising SLD
A "Lazy" Processing Model on Streams
- Stream operators can be applied to generic data items.
- QL-specific operators require a particular data type.
- Postpone the data transformation for as long as possible.
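A small sketch can make the "postpone the transformation" point concrete. It compares translating every raw item before filtering with filtering the raw items first and translating only the survivors. Everything here is an assumption made for illustration: the dictionary item format, the `to_triples` translation, and the function names `eager` and `lazy`; none of it is taken from the SLD Revolution code base.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Item = Dict[str, str]  # raw item as received, e.g. parsed JSON (assumption)


def to_triples(item: Item) -> List[Triple]:
    """Hypothetical translation of one raw item into RDF-like triples."""
    subject = f"post:{item['id']}"
    return [(subject, "rdf:type", "sma:Tweet"), (subject, "sma:hashtag", item["tag"])]


def eager(
    items: Iterable[Item], keep: Callable[[List[Triple]], bool]
) -> Tuple[List[List[Triple]], int]:
    """Translate every item first, then filter on the translated form."""
    translated = [to_triples(item) for item in items]
    return [t for t in translated if keep(t)], len(translated)


def lazy(
    items: Iterable[Item], keep: Callable[[Item], bool]
) -> Tuple[List[List[Triple]], int]:
    """Filter on the raw items, then translate only the survivors."""
    survivors = [item for item in items if keep(item)]
    return [to_triples(item) for item in survivors], len(survivors)


# Made-up micro-posts.
items = [{"id": "1", "tag": "eswc"}, {"id": "2", "tag": "rsp"}, {"id": "3", "tag": "eswc"}]
eager_out, eager_cost = eager(
    items, lambda ts: any(p == "sma:hashtag" and o == "eswc" for _, p, o in ts)
)
lazy_out, lazy_cost = lazy(items, lambda item: item["tag"] == "eswc")
print(eager_cost, lazy_cost)
```

Both variants return the same triples; the second number returned by each function is how many items had to be translated to get there.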
Generic Programming, an old idea
Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters.
A New Processing Model
Generic Streams<T>
Generic Instantaneous<T>
S2I<T>
I2S<T>
I2I<T>
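One way to read these names is as type-parameterised counterparts of the CQL operator classes. The sketch below expresses them with Python's `typing.Generic`. The class and function names, the list-based representation, and the choice to let the instantaneous-to-instantaneous step change the item type are all assumptions made for illustration, not the framework's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass
class Stream(Generic[T]):
    """A time-ordered sequence of (timestamp, item) pairs for any item type T."""
    elements: List[Tuple[int, T]]


@dataclass
class Instantaneous(Generic[T]):
    """A finite collection of items of type T, valid at one time instant."""
    time: int
    items: List[T]


def s2i(stream: Stream[T], now: int, width: int) -> Instantaneous[T]:
    """S2I<T>: a time window; it reads timestamps only, never the items."""
    return Instantaneous(now, [x for ts, x in stream.elements if now - width < ts <= now])


def i2i(data: Instantaneous[T], fn: Callable[[List[T]], List[U]]) -> Instantaneous[U]:
    """I2I<T>: apply a type-specific function to the windowed items."""
    return Instantaneous(data.time, fn(data.items))


def i2s(data: Instantaneous[T]) -> Stream[T]:
    """I2S<T>: stamp every item with the evaluation time and emit it."""
    return Stream([(data.time, x) for x in data.items])


# Made-up stream of hashtags as plain strings.
raw: Stream[str] = Stream([(1, "#eswc"), (2, "#rsp"), (3, "#eswc")])
window = s2i(raw, now=3, width=2)
lengths = i2i(window, lambda xs: [len(x) for x in xs])
out = i2s(lengths)
print(out)
```

Because `s2i` and `i2s` never inspect the items, the same window and emit logic works whether `T` is a string, a JSON object or an RDF graph; only the function passed to `i2i` depends on the concrete type.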
Lazy Transformation by Generic Programming, on streams<T>
stream-to-instantaneous<T>
instantaneous-to-instantaneous<T>
instantaneous-to-stream<T>
RSP in Practice, with SLD Revolution
Streaming Linked Data Revolution Server
[Architecture diagram. Components shown: Sources, Stream, Receiver, Generic Stream Bus, Translator, Stream, Sink, Recorder, Re-player, Processor, Decorator]
SLD vs SLD Revolution: Let's be quantitative
It is faster, cheaper yet more accurate than SLD.
[Chart: Median Engine Memory (MB) and Median CPU Load (%) for SLD and SLD Revolution; axis ticks 30, 300, 3000 and 1, 10, 100; exponential trend line for SLD, linear trend line for SLD Revolution; R² = 0.96413 and R² = 0.99891]
Discussion & Conclusion
Observations on SLD Revolution
- It is faster, cheaper yet more accurate than SLD. :)
- It requires knowledge of EPL, SPARQL, and JSON path queries. :(
- It is optimised and, thus, not flexible. :(
Open Problems
- RSP-QL is not always the best solution in terms of cost/performance.
- Can we identify an optimum?
- Can we define a cost model for RSP-QL?
Questions?
Email: riccardo.tommasini@polimi.it
Twitter: @rictomm
Email: marco.balduini@polimi.it
Twitter: @balducci85
Pablo Picasso, Les Demoiselles d'Avignon, 1907. Museum of Modern Art (MoMA), New York City, NY, US
