Second RSP workshop, co-located with the 14th Extended Semantic Web Conference.
This is joint work by Marco Balduini, Emanuele Della Valle and Riccardo Tommasini, DEIB, Politecnico di Milano, Milano, Italy.
SLD Revolution: A Cheaper, Faster yet more Accurate Streaming Linked Data Framework
1. Politecnico di Milano, DEIB
Portoroz - 2017 - Riccardo Tommasini - @rictomm - Politecnico di Milano
Marco Balduini, Riccardo Tommasini, Emanuele Della Valle
A Cheaper, Faster yet more Accurate Streaming Linked Data Framework
2. RSP is Great!
3. Why RSP?
- offers a generic overview over streams and static data
- enables query answering across heterogeneous sources
- allows creating and publishing new streams or graphs
4. The RSP Idea in short
5. CQL Model
Streams
Relations
Streams-to-Relations
Relations-to-Streams
Relations-to-Relations
The CQL continuous query language
- Arvind Arasu, Shivnath Babu, Jennifer Widom, 2006, VLDBJ
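The three operator classes of the CQL model can be illustrated with a minimal Python sketch. It is not CQL itself: streams are modelled as lists of (timestamp, item) pairs, relations as sets, and the function names, the window bounds and the sample hashtags are all assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

# Assumed toy model: a stream is a list of (timestamp, item) pairs,
# a relation is a plain set of items.
Stream = List[Tuple[int, str]]
Relation = Set[str]


def s2r_window(stream: Stream, now: int, width: int) -> Relation:
    """Stream-to-relation: time-based sliding window covering (now - width, now]."""
    return {item for ts, item in stream if now - width < ts <= now}


def r2r_filter(relation: Relation, keep: Callable[[str], bool]) -> Relation:
    """Relation-to-relation: an ordinary relational selection."""
    return {item for item in relation if keep(item)}


def r2s_istream(previous: Relation, current: Relation, now: int) -> Stream:
    """Relation-to-stream: emit only the items that entered the relation."""
    return [(now, item) for item in sorted(current - previous)]


def evaluate(stream: Stream, instants: List[int], width: int,
             keep: Callable[[str], bool]) -> Stream:
    """Chain S2R -> R2R -> R2S at every evaluation instant."""
    out: Stream = []
    previous: Relation = set()
    for now in instants:
        current = r2r_filter(s2r_window(stream, now, width), keep)
        out.extend(r2s_istream(previous, current, now))
        previous = current
    return out


posts = [(1, "#eswc"), (2, "#rsp"), (4, "#eswc"), (7, "#sld")]
result = evaluate(posts, instants=[3, 6, 9], width=3, keep=lambda tag: tag != "#rsp")
print(result)
```

The insert-only output operator is one of several relation-to-stream choices; it was picked here only because it keeps the sketch short.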
6. RSP-QL Model
RDF Streams
Solution Mappings
S2R operators
R2S operators
R2R operators
A First Step Towards Stream Reasoning
- E. Della Valle, S. Ceri, D. Barbieri, D. Braga, A. Campi, 2008, FIS
7. RSP in a Nutshell
RDF Stream-to-RDF
RDF-to-RDF (solution mappings)
RDF-to-RDF Stream
on RDF Streams
8. RSP in Practice with SLD
9. A Social Media Example: Two Information Needs
How many micro-posts occur over time?
How often does a hashtag appear in the micro-post stream?
10. Streaming Linked Data Server
Sources, Raw Stream
Adapter, RDF Stream Bus, Publisher
Visualizer
Recorder, Re-player, Analyser, Decorator
12. An Important Optimisation
REGISTER STREAM sstr AS
CONSTRUCT { ?id sma:twitterCount ?tc }
FROM STREAM <social> [RANGE 1m STEP 1m]
WHERE {
  # Attach a fresh identifier to each per-window count
  SELECT (uuid() AS ?id) ?tc
  WHERE {
    # Count the distinct tweets seen in each 1-minute tumbling window
    SELECT (COUNT(DISTINCT ?mp) AS ?tc)
    WHERE { ?mp a sma:Tweet }
  }
}
15. Using C-SPARQL
REGISTER STREAM countT AS
CONSTRUCT { ?uid sma:twitterCount ?tot . }
FROM STREAM <sstr> [RANGE 15m STEP 1m]
WHERE {
  # Sum the per-minute counts over a 15-minute window that slides every minute
  SELECT (uuid() AS ?uid) (SUM(?tc) AS ?tot)
  WHERE { ?id sma:twitterCount ?tc }
}
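The two registered queries split one aggregation into two stages: per-minute counts on <social>, then a 15-minute sum over <sstr>. The Python sketch below illustrates why the two stages can agree with a direct 15-minute count. It assumes each tweet id falls into exactly one 1-minute bucket, so that summing distinct counts does not double-count. The function names, the (seconds, id) tuples and the whole-minute bucketing are illustrative assumptions, not part of SLD or C-SPARQL.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Tweet = Tuple[int, str]  # assumed shape: (timestamp in seconds, tweet id)


def per_minute_counts(tweets: List[Tweet]) -> Dict[int, int]:
    """Stage 1: distinct tweet ids per 1-minute tumbling bucket."""
    ids_by_minute = defaultdict(set)
    for ts_seconds, tweet_id in tweets:
        ids_by_minute[ts_seconds // 60].add(tweet_id)
    return {minute: len(ids) for minute, ids in ids_by_minute.items()}


def sliding_total(counts: Dict[int, int], end_minute: int, range_minutes: int = 15) -> int:
    """Stage 2: sum the partial counts of the last `range_minutes` buckets."""
    start = end_minute - range_minutes + 1
    return sum(c for minute, c in counts.items() if start <= minute <= end_minute)


def direct_count(tweets: List[Tweet], end_minute: int, range_minutes: int = 15) -> int:
    """Single-stage baseline: distinct tweet ids over the whole range."""
    start = end_minute - range_minutes + 1
    return len({tid for ts, tid in tweets if start <= ts // 60 <= end_minute})


tweets = [(5, "t1"), (30, "t2"), (70, "t3"), (70, "t3"), (900, "t4"), (1000, "t5")]
counts = per_minute_counts(tweets)
for end in (14, 16):
    print(end, sliding_total(counts, end), direct_count(tweets, end))
```

The second stage touches at most 15 small partial results per evaluation instead of every tweet in the 15-minute range, which is the intuition behind pre-aggregating.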
16. Is RSP always great?
17. Observations on SLD
It is flexible. :)
It forces RDF when query results are often relational. :(
It is not optimal, e.g. RSP-QL vs SQL vs path queries.
18. Revolutionising SLD
19. A "Lazy" Processing Model
Stream operators can be applied on generic data items.
QL-specific operators require a particular data type.
Postpone the data transformation as late as possible.
on streams
20. Generic Programming: an old idea
Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters.
21. A new Processing Model
Generic Streams<T>
Generic Instantaneous<T>
S2I<T>
I2S<T>
I2I<T>
22. Lazy Transformation by Generic Programming
stream-to-instantaneous<T>
instantaneous-to-instantaneous<T>
instantaneous-to-stream<T>
on streams<T>
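One way to picture the generic operators is the Python sketch below, offered under stated assumptions rather than as the SLD Revolution implementation. The windowing and counting work on raw JSON strings, and the translation to an RDF-like triple happens only for the final result. The function names, the tumbling-window policy, the JSON field "type" and the blank-node labels are hypothetical; only the sma:twitterCount property comes from the queries in this deck.

```python
import json
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def s2i(stream: Iterable[Tuple[int, T]], width: int) -> Iterator[Tuple[int, List[T]]]:
    """stream-to-instantaneous<T>: tumbling windows, assuming non-decreasing timestamps."""
    current_window = None
    items: List[T] = []
    for ts, item in stream:
        window = ts // width
        if current_window is not None and window != current_window:
            yield (current_window + 1) * width, items  # close the previous window
            items = []
        current_window = window
        items.append(item)
    if current_window is not None:
        yield (current_window + 1) * width, items  # flush the last window


def i2i(instantaneous: List[T], fn: Callable[[List[T]], U]) -> U:
    """instantaneous-to-instantaneous<T>: apply any function to the window content."""
    return fn(instantaneous)


def i2s(ts: int, value: U) -> Tuple[int, U]:
    """instantaneous-to-stream<T>: timestamp the result to form an output item."""
    return ts, value


def count_tweets(items: List[str]) -> int:
    """Works directly on raw JSON strings; no RDF is involved yet."""
    return sum(1 for raw in items if json.loads(raw).get("type") == "tweet")


def to_triple(ts: int, count: int) -> Tuple[str, str, str]:
    """Late translation: only the final, small result is turned into a triple."""
    return (f"_:w{ts}", "sma:twitterCount", str(count))


raw_stream = [
    (5, '{"type": "tweet"}'),
    (20, '{"type": "tweet"}'),
    (65, '{"type": "like"}'),
    (70, '{"type": "tweet"}'),
]
triples = [to_triple(*i2s(ts, i2i(items, count_tweets))) for ts, items in s2i(raw_stream, 60)]
print(triples)
```

Because s2i, i2i and i2s never inspect the items, the same operators could be reused unchanged on relational rows or on RDF graphs; only count_tweets and to_triple are tied to a data type.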
23. RSP in Practice with SLD Revolution
24. Streaming Linked Data Revolution Server
Sources, Stream, Sink
Receiver, Generic Stream Bus, Translator
Recorder, Re-player, Processor, Decorator
26. SLD vs SLD Revolution: Let's be quantitative
27. It is faster, cheaper yet more accurate than SLD.
[Figure: Median Engine Memory (MB) against Median CPU Load (%) for SLD and SLD Revolution, with an exponential trend line for SLD and a linear trend line for SLD Revolution; reported fits are R² = 0.96413 and R² = 0.99891.]
28. Discussion & Conclusion
29. Observations on SLD Revolution
It is faster, cheaper yet more accurate than SLD. :)
It requires knowing EPL, SPARQL and JSON path queries. :(
It is optimised and, thus, not flexible. :(
30. Open Problems
RSP-QL is not always the best solution in terms of cost/performance
Can we identify an optimum?
Can we define a cost model for RSP-QL?
31. Questions?
Email: riccardo.tommasini@polimi.it
Twitter: @rictomm
Email: marco.balduini@polimi.it
Twitter: @balducci85
Pablo Picasso, Les Demoiselles d'Avignon, 1907.
Museum of Modern Art (MoMA), New York City, NY, US