WSMED provides a web service query mediation system that allows users to query web services using SQL without programming. It automatically parallelizes queries by creating an adaptive parallel query plan using AFF_APPLYP. This operator builds a process tree to invoke web services in parallel and adapts the tree at runtime based on monitoring execution times to optimize performance without a static cost model. Experimental results show the adaptive parallelization can speed up expensive queries over 4x compared to sequential execution. Future work includes handling more complex query plans and establishing benchmarks to further evaluate parallel web service invocation strategies.
Web Service Query Service
1. Web Service Query Service Manivasakan Sabesan and Tore Risch Uppsala DataBase Laboratory Dept. of Information Technology Uppsala University Sweden
2. Outline WSMED Research Area Adaptive Query Parallelization Conclusion & Future work
3. WSMED (Web Service MEDiator) System: WSMED provides general query capabilities over data-providing web services. Users only need to provide the WSDL URLs of web services. WSMED automatically creates SQL views for each web service operation, making every web service operation queryable without any programming. Users can pose any SQL query over the automatically created SQL views.
4. Service Oriented Architecture of WSMED [Figure: the WSMED server imports WSDL metadata 1..n from web services WS 1..WS n, holds SQL views 1..m over their operations, and invokes the operations through SOAP calls. The WSMED web service interface offers the operations INIT, IMPORTWSDL, TABLEINFO, AUTHENTICATION, QUERY, and EXIT_S.]
5. WSMED Demo: WSMED provides a web service query service. The WSMED demo is accessible from a web browser. JavaScript is used to invoke the WSMED web service directly.
6. Outline WSMED Research Area Adaptive Query Parallelization Conclusion & Future work
7. Research Problems: Queries calling data-providing web services have a similar pattern: dependent calls. Web service calls incur high latency and high message setup cost. A naïve implementation of an application making these calls sequentially is time consuming. A challenge here is to develop methods to speed up such queries with dependent web service calls. [Figure: chain of dependent calls WS 1 -> WS 2 -> WS 3 -> ... -> WS n]
8. Outline WSMED Research Area Adaptive Query Parallelization Conclusion & Future work
9. Example Query: select gl.City, gl.TypeId from GetAllStates gs, GetPlacesWithin gp, GetPlaceList gl where gs.state=gp.state and gp.distance=15.0 and gp.placeTypeToFind='City' and gp.place='Atlanta' and gl.placeName=gp.ToPlace + ' ,' + gp.ToState and gl.MaxItems=100 and gl.imagePresence='true' The query finds information about places located within 15 km from each city named Atlanta in all US states. It invokes 300 web service calls and returns a stream of 360 tuples <City, TypeId>. [Figure: data flow GetAllStates -> <state> -> GetPlacesWithin (with <15, City, Atlanta>) -> <ToPlace, ToState> -> GetPlaceList (with <100, true>)]
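The dependent-call pattern of the example query can be sketched as nested loops. This is only an illustration under stated assumptions: the three functions are hypothetical in-memory stubs standing in for the real web service operations, and their signatures, the fake data, and the ", " separator are invented here. The point is that each inner call needs an output of the outer call, so a naive client issues the calls one after another.

```python
# Hypothetical stubs: the real operations would be high-latency SOAP calls.
CALLS = {"n": 0}  # counts how many "web service calls" are made

def get_all_states():
    CALLS["n"] += 1
    return ["S1", "S2"]  # fake state codes

def get_places_within(place, state, distance, place_type):
    CALLS["n"] += 1
    fake = {"S1": [("PlaceA", "S1")],
            "S2": [("PlaceB", "S2"), ("PlaceC", "S2")]}
    return fake.get(state, [])

def get_place_list(place_name, max_items, image_presence):
    CALLS["n"] += 1
    city = place_name.split(",")[0]
    return [(city, "FakeTypeId")][:max_items]

def run_sequentially():
    """Naive plan: every call waits for the one it depends on."""
    results = []
    for state in get_all_states():
        for to_place, to_state in get_places_within("Atlanta", state, 15.0, "City"):
            name = f"{to_place}, {to_state}"  # the concat step of the query
            results.extend(get_place_list(name, 100, True))
    return results

rows = run_sequentially()
```

With the fake data the run makes 1 + 2 + 3 calls, one level per dependency; the real query follows the same shape with far more calls per level, which is why sequential execution is slow.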
10. Query Processing in WSMED [Figure: two-phase pipeline (Phase 1, Phase 2) from an SQL query to a parallel query plan. Components shown: calculus generator, non-parallel plan optimizer, non-parallel plan, plan splitter, plan function generator, parallel pipeliner.]
11. Non-Parallel Plan [Figure: γ GetAllStates() -> <state> -> γ GetPlacesWithin(Atlanta, state, 15.0, City) -> <city, state2> -> γ concat(city, ', ', state2) -> <str> -> γ GetPlaceList(str, 100, true) -> <City, TypeId>. Split point 1 and split point 2 divide the plan into the plan functions PF 1 and PF 2.]
12. Adaptive Parallel Plan [Figure: γ GetAllStates() -> <state> -> AFF_APPLYP(PF 1, state) -> <str> -> AFF_APPLYP(PF 2, str) -> <City, TypeId>]
13. Parallel Process Tree [Figure: the coordinator q0 (Query, GetAllStates), level 1 processes running PF 1 (GetPlacesWithin), and level 2 processes running PF 2 (GetPlaceList); processes q0 to q8 are shown.] q i: query process (i = 0, 1, ..., n). PF j: plan function (j = 1, ..., m).
14. Adaptive First Finished Apply in Parallel (AFF_APPLYP): AFF_APPLYP(Function PF, Stream pstream) -> Stream result. PF: plan function. pstream: stream of parameter values p i. result: stream of results r i. AFF_APPLYP is an asynchronous operator. [Figure: child processes q3, q4, q5 each run PF. They first receive p 1, p 2, p 3; as results arrive in the order r 1, r 3, r 2, the next pending parameters p 4, p 5, p 6 are handed out.]
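The first-finished behaviour of the operator can be sketched with a thread pool. This is not the WSMED implementation: threads stand in for query processes, the fanout is fixed rather than adaptive, and `first_finished_apply` and `slow_pf` are hypothetical names introduced for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def first_finished_apply(pf, pstream, fanout=2):
    """Apply plan function pf to each parameter in pstream using `fanout`
    workers, yielding result tuples in the order the calls finish."""
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # All parameters are queued up front (fine for a finite stream);
        # a worker that finishes picks up the next pending parameter.
        futures = [pool.submit(lambda p=p: list(pf(p))) for p in pstream]
        for done in as_completed(futures):
            yield from done.result()

def slow_pf(p):
    """Hypothetical plan function simulating a web service call."""
    time.sleep(0.01 * (p % 3))  # calls take different amounts of time
    yield (p, p * p)

results = list(first_finished_apply(slow_pf, range(1, 7), fanout=3))
```

The result order depends on which call finishes first, so a consumer should not rely on it matching the parameter order.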
15. Functionalities of AFF_APPLYP 1. AFF_APPLYP initially forms a binary process tree by always setting the fanout to 2 (the init stage). [Figure: binary process tree with coordinator q0, level 1 processes q1 and q2, and level 2 processes q3 to q6]
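As a small illustration of the init stage, the binary process tree can be modelled as a mapping from each process to its children. The function name and the breadth-first numbering of processes are assumptions made for this sketch.

```python
def build_process_tree(levels, fanout=2):
    """Return {process: [children]} for a tree with `levels` levels
    below the coordinator q0, numbering processes breadth-first."""
    tree = {"q0": []}
    frontier = ["q0"]
    next_id = 1
    for _ in range(levels):
        new_frontier = []
        for parent in frontier:
            for _ in range(fanout):
                child = f"q{next_id}"
                next_id += 1
                tree[parent].append(child)
                tree[child] = []
                new_frontier.append(child)
        frontier = new_frontier
    return tree

tree = build_process_tree(levels=2)
```

With two levels and fanout 2 this gives seven processes in total: the coordinator, two level 1 processes, and four level 2 processes.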
16. (continued) 2. A monitoring cycle for a non-leaf query process is completed when the number of received end-of-call messages equals the number of children. 2.1 After the first monitoring cycle, AFF_APPLYP adds p new child processes (an add stage). 3. When an added node has several levels of children, the init stages of the AFF_APPLYPs in the children will produce a binary subtree. [Figure: process tree after an add stage, showing processes q0 to q11 on levels 1 and 2]
17. (continued) 4. AFF_APPLYP records, per monitoring cycle i, the average time t i to produce an incoming tuple from the children. 4.1 If t i decreases by more than a threshold (25%), the add stage is rerun. 4.2 If t i increases, we either add no more children or run a drop stage that drops one child and its children. [Figure: adapted process tree with coordinator q0 and processes on levels 1 and 2]
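The adaptation rule in steps 2.1 to 4.2 can be sketched as a small decision function. Several details are assumptions made here and not stated on the slides: the function names, stopping when the improvement is below the threshold, and ending adaptation after a drop stage.

```python
def next_stage(prev_t, cur_t, threshold=0.25, use_drop=False):
    """Decide what to do after a monitoring cycle, given the average
    per-tuple time of the previous cycle (None for the first cycle)."""
    if prev_t is None:
        return "add"   # step 2.1: always add after the first cycle
    if cur_t < prev_t * (1.0 - threshold):
        return "add"   # step 4.1: time fell by more than the threshold
    if cur_t > prev_t and use_drop:
        return "drop"  # step 4.2 with the drop stage enabled
    return "stop"      # step 4.2 without drop; also small gains (assumption)

def simulate_fanout(cycle_times, p=2, threshold=0.25, use_drop=False):
    """Replay measured cycle times and return the resulting fanout."""
    fanout, prev_t = 2, None  # the init stage starts with fanout 2
    for cur_t in cycle_times:
        stage = next_stage(prev_t, cur_t, threshold, use_drop)
        if stage == "add":
            fanout += p
        elif stage == "drop":
            fanout -= 1       # drop one child (and its subtree)
            break             # assumption: adaptation ends after a drop
        else:
            break
        prev_t = cur_t
    return fanout

final_fanout = simulate_fanout([1.0, 0.6, 0.55], p=2)
```

For the invented times 1.0, 0.6, 0.55 the fanout grows from 2 to 4 to 6 and then stays there, because the last improvement is below 25%.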
19. AFF_APPLYP observations. For the example query: The execution time with p=4 and no drop stage is the best. It is more than 4 times faster than sequential (non-parallel) execution. The execution time with p=2 and no drop stage is reasonably close to the best execution time (80%). The drop stage makes insignificant changes in the execution time. The fanout of each level of a process tree depends on the execution time of the web service invoked on that level. AFF_APPLYP finds the optimized fanout for each level.
20. Outline WSMED Research Area Adaptive Query Parallelization Conclusion & Future work
21. Related work. Similar to WSMS (U. Srivastava, J. Widom, K. Munagala, and R. Motwani, Query Optimization over Web Services, VLDB 2006), WSMED also invokes parallel web service calls. In contrast, WSMED supports automated adaptive parallelization. In contrast to WSQ/DSQ (R. Goldman and J. Widom, WSQ/DSQ: a practical approach for combined querying of databases and the Web, SIGMOD 2000), WSMED produces non-materialized adaptive parallel plans based on parameter streams. Runtime optimization techniques (A. Gounaris, et al., Robust runtime optimization of data transfer in queries over Web Services, ICDE 2008) investigate adaptation of buffer sizes in web service calls, but do not deal with adaptive parallelism of web service calls.
22. Conclusion. WSMED can be accessed through a URL, http://udbl2.it.uu.se/WSMED/wsmed.html, without installing any software. Queries are expressed in SQL to dynamically compose data-providing web services without any programming. WSMED makes any web service queryable with SQL. AFF_APPLYP automatically parallelizes web service calls and adapts the process tree at runtime, based on the flow of the result stream, without any static cost model. The adaptive parallel plan with AFF_APPLYP makes it possible to run expensive queries.
23. Future work. Generalize the strategy to queries that mix dependent and independent web service calls, as well as bushy trees (ongoing work). Investigate different process arrangement strategies with the algebra operators. Set up a benchmark to simulate the parallel invocation of web services.
24. Thank you for your attention ? The un-queried life is not worth living
Editor's Notes
#4: A common need is to search for information through data-providing web services that have no side effects and return a set of objects for a given set of parameters.
#5: A web service is a software system designed to support machine-to-machine interaction over a network. We have developed a system, WSMED, that provides general query capabilities over data accessible through web services by reading WSDL metadata descriptions. A WSDL URL is given to import metadata into its local store. While importing the metadata, it automatically creates SQL views to make the web service operations queryable. WSMED itself provides a web service to query arbitrary data-providing web services. INIT initializes a user session. IMPORTWSDL: to consume a web service, the user gives its WSDL URL, and OWFs for the web service operations are automatically created. TABLEINFO provides information about the SQL view over a given web service operation: inputs, outputs, and data types. AUTHENTICATION provides authentication information for web service operations that require it. QUERY accepts SQL queries over the generated views: users can pose SQL queries over these views, calling any data-providing web service without any programming. EXIT_S exits a user session. Users need not install any software or set up any hardware to use the web service.
#6: To illustrate the system we have developed the WSMED demo. It follows the everything-as-a-service paradigm. Demo steps: wsmed.wsdl (show all operations); IMPORTWSDL (place lookup); TABLEINFO and AUTHENTICATION; QUERY: select name from GetAllStates.
#10: The views can be queried with SQL. GetAllStates and GetPlacesWithin come from the GeoPlaces web service; GetPlaceList comes from the TerraService web service. Our queries concern data from data-providing web services; SQL is quite natural for expressing such queries and is still popular. Go to the demo: import TerraService and execute the query.
#11: Central plan: a heuristic cost model based on web service signatures, assuming that a web service call is expensive. Sequential execution is slow.
#12: γ applies a plan function for a given parameter tuple.
#14: Multilevel execution plans are generated with several layers of parallelism (process tree, fanout, central query plan to parallel query plan). The coordinator initiates communication between child processes and ships plan functions. Then it streams different parameter tuples to them; results are delivered as streams from the child processes.
#20: For different queries, p and the fanout values may vary according to the execution times of the web service operations involved. Therefore this adaptive approach is very useful.