Web Service Query Service. Manivasakan Sabesan and Tore Risch, Uppsala DataBase Laboratory, Dept. of Information Technology, Uppsala University, Sweden
Outline: WSMED; Research Area; Adaptive Query Parallelization; Conclusion & Future work
WSMED (Web Service MEDiator) System: WSMED provides general query capabilities over data-providing web services. Users only need to provide the WSDL URLs of the web services. WSMED automatically creates an SQL view for each web service operation, which makes every web service operation queryable without any programming. Users can pose any SQL query over the automatically created SQL views.
Service Oriented Architecture of WSMED [Diagram: the WSMED Server imports WSDL metadata 1 to n from web services WS 1 to WS n, each offering several WS operations, and holds SQL views 1 to m over those operations, which it invokes through SOAP calls. The WSMED Web Service Interface offers the operations INIT, IMPORTWSDL, TABLEINFO, AUTHENTICATION, QUERY, and EXIT_S.]
WSMED Demo: WSMED provides a web service query service. The WSMED demo is accessible from a web browser. JavaScript is used to invoke the WSMED web service directly.
Research Problems: Queries calling data-providing web services have a similar pattern: dependent calls. Web service calls incur high latency and high message setup cost. A naïve implementation of an application making these calls sequentially is time-consuming. The challenge is to develop methods to speed up such queries with dependent web service calls. [Diagram: web services WS 1, WS 2, WS 3, ..., WS n]
Example Query: select gl.City, gl.TypeId from GetAllStates gs, GetPlacesWithin gp, GetPlaceList gl where gs.state=gp.state and gp.distance=15.0 and gp.placeTypeToFind='City' and gp.place='Atlanta' and gl.placeName=gp.ToPlace + ' ,' + gp.ToState and gl.MaxItems=100 and gl.imagePresence='true'. The query finds information about places located within 15 km from each City named Atlanta in all US states. It invokes 300 web service calls and returns a stream of 360 tuples <City, TypeId>. [Diagram: GetAllStates passes <state> to GetPlacesWithin, which also receives the constants <15, City, Atlanta>; GetPlacesWithin passes <ToPlace, ToState> to GetPlaceList, which also receives the constants <100, true>.]
Query Processing in WSMED [Diagram: in two phases, an SQL query is compiled first into a non-parallel plan and then into a parallel query plan. Components shown: Calculus Generator, Non-parallel plan optimizer, Plan splitter, Plan function generator, Parallel pipeliner.]
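Non-Parallel Plan [Diagram, bottom-up: γ GetAllStates() produces <state>; γ GetPlacesWithin(Atlanta, state, 15.0, City) produces <city, state2>; γ concat(city, ', ', state2) produces <str>; γ GetPlaceList(str, 100, true) produces <City, TypeId>. Split point 1 and split point 2 mark where the plan is cut into the plan functions PF 1 and PF 2.]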
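The non-parallel plan can be pictured as a chain of apply operators, each calling one operation per incoming parameter tuple. The following is a minimal sketch under stated assumptions, not WSMED's implementation: `gamma` is a hypothetical helper, the three `get_*` functions are hypothetical stand-ins that return tiny fake data instead of making SOAP calls, and the `", "` separator is assumed from the concat step of the plan.

```python
from typing import Callable, Iterable, Iterator


def gamma(fn: Callable[..., Iterable[tuple]],
          params: Iterable[tuple]) -> Iterator[tuple]:
    """Apply operator: call fn once per parameter tuple, stream its result tuples."""
    for p in params:
        yield from fn(*p)


# Hypothetical stand-ins for the web service operations (fake data, no network).
def get_all_states() -> Iterable[tuple]:
    return [("Georgia",), ("Texas",)]


def get_places_within(place: str, state: str, distance: float,
                      place_type: str) -> Iterable[tuple]:
    fake = {
        "Georgia": [("Decatur", "GA")],
        "Texas": [("Queen City", "TX"), ("Linden", "TX")],
    }
    return fake.get(state, [])


def get_place_list(place_name: str, max_items: int,
                   image_presence: bool) -> Iterable[tuple]:
    return [(place_name, "CityTown")][:max_items]


def non_parallel_plan() -> Iterator[tuple]:
    """Sequential plan: every dependent call waits for the one before it."""
    states = gamma(get_all_states, [()])
    places = gamma(
        lambda state: get_places_within("Atlanta", state, 15.0, "City"),
        states,
    )
    # concat(city, ', ', state2) -> <str>
    names = ((city + ", " + state2,) for city, state2 in places)
    return gamma(lambda s: get_place_list(s, 100, True), names)


if __name__ == "__main__":
    for row in non_parallel_plan():
        print(row)
```

Because each generator pulls from the one below it, every call is made strictly one after another, which is the behaviour the parallel plan is meant to avoid.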
Adaptive Parallel Plan [Diagram, bottom-up: γ GetAllStates() produces <state>; AFF_APPLYP(PF 1, state) produces <str>; AFF_APPLYP(PF 2, str) produces <City, TypeId>.]
Parallel Process Tree: q_i - query process (i = 0, 1, ..., n); PF_j - plan function (j = 1, ..., m). [Diagram: the coordinator q0 runs the query and GetAllStates; the query processes q1 to q8 are arranged below it on level 1, running PF 1 (GetPlacesWithin), and level 2, running PF 2 (GetPlaceList).]
Adaptive First Finished Apply in Parallel (AFF_APPLYP): AFF_APPLYP(Function PF, Stream pstream) -> Stream result, where PF is a plan function, pstream is a stream of parameter values p_i, and result is a stream of results r_i. It is an asynchronous operator. [Diagram: AFF_APPLYP first sends p1, p2, p3 to the child processes q3, q4, q5, each running PF; as the results r1, r3, r2 arrive, the next parameters p4, p5, p6 are sent to the children that have finished.]
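The first-finished part of the operator can be sketched with a thread pool. This is an illustrative sketch only: `first_finished_apply` is a hypothetical name, threads stand in for WSMED's query processes, and the fanout is fixed, so the adaptive part of AFF_APPLYP is deliberately left out.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Iterable, Iterator, List

_END = object()  # sentinel marking the end of the parameter stream


def first_finished_apply(pf: Callable[[object], List[object]],
                         pstream: Iterable[object],
                         fanout: int = 3) -> Iterator[object]:
    """Apply pf to every parameter using `fanout` workers.

    Results are emitted in the order the calls finish, not in parameter
    order, and a worker that finishes is immediately given the next
    pending parameter.
    """
    params = iter(pstream)
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        pending = set()
        # Hand one parameter to each worker to start with.
        for _ in range(fanout):
            p = next(params, _END)
            if p is _END:
                break
            pending.add(pool.submit(pf, p))
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            results: List[object] = []
            for fut in done:
                results.extend(fut.result())
                # Refill the worker that just became free.
                p = next(params, _END)
                if p is not _END:
                    pending.add(pool.submit(pf, p))
            yield from results


if __name__ == "__main__":
    # Hypothetical plan function: one result per parameter.
    for r in first_finished_apply(lambda p: [p * 10], range(6), fanout=3):
        print(r)
```

Since results arrive in completion order, a caller that needs a stable order has to sort or tag them itself.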
Functionalities of AFF_APPLYP: 1. AFF_APPLYP initially forms a binary process tree by always setting the fanout to 2 (the init stage). [Diagram: binary process tree with coordinator q0 and the processes q1 to q6 on level 1 and level 2]
Functionalities of AFF_APPLYP (continued): 2. A monitoring cycle for a non-leaf query process is complete when the number of received end-of-call messages equals the number of its children. 2.1 After the first monitoring cycle, AFF_APPLYP adds p new child processes (an add stage). 3. When an added node has several levels of children, the init stages of the AFF_APPLYPs in the children produce a binary subtree. [Diagram: process tree after an add stage, with coordinator q0 and the processes q1 to q11 on level 1 and level 2]
Functionalities of AFF_APPLYP (continued): 4. AFF_APPLYP records, per monitoring cycle i, the average time t_i to produce an incoming tuple from the children. 4.1 If t_i decreases by more than a threshold (25%), the add stage is rerun. 4.2 If t_i increases, either no more children are added or a drop stage is run that drops one child and its children. [Diagram: process tree after further adaptation, with coordinator q0 and the processes q1 to q6 and q10 to q12 on level 1 and level 2]
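The add and drop rules in steps 2.1, 4.1, and 4.2 can be sketched as a small decision function replayed over recorded cycle times. This is a sketch under assumptions, not the WSMED code: `next_stage` and `adapt_fanout` are hypothetical names, a decrease smaller than the threshold is assumed to stop adaptation (the slides do not say), and adaptation is assumed to end after the first non-add decision.

```python
from typing import List, Optional


def next_stage(prev_t: Optional[float], curr_t: float,
               threshold: float = 0.25, allow_drop: bool = False) -> str:
    """Decide the stage after a monitoring cycle: 'add', 'drop', or 'stop'."""
    if prev_t is None:
        return "add"  # step 2.1: always add after the first monitoring cycle
    if curr_t < prev_t * (1.0 - threshold):
        return "add"  # step 4.1: t_i decreased by more than the threshold
    if curr_t > prev_t and allow_drop:
        return "drop"  # step 4.2, drop variant
    return "stop"  # step 4.2 without drop, or (assumed) a too-small decrease


def adapt_fanout(cycle_times: List[float], p: int,
                 threshold: float = 0.25,
                 allow_drop: bool = False) -> List[int]:
    """Replay recorded per-cycle average times t_i and return the fanout history."""
    fanout = 2  # init stage: binary tree
    history = [fanout]
    prev: Optional[float] = None
    for t in cycle_times:
        stage = next_stage(prev, t, threshold, allow_drop)
        if stage == "add":
            fanout += p
        elif stage == "drop":
            fanout = max(1, fanout - 1)
        history.append(fanout)
        if stage != "add":
            break  # assumption: adaptation ends after the first non-add decision
        prev = t
    return history


if __name__ == "__main__":
    print(adapt_fanout([1.0, 0.6, 0.55], p=2))
    print(adapt_fanout([1.0, 0.6, 0.7], p=2, allow_drop=True))
```

In a real run the times t_i would depend on the fanout chosen in earlier cycles; replaying a fixed list only illustrates the decision rule.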
Adaptive Results - Example Query
AFF_APPLYP observations for the example query: The execution time with p = 4 and no drop stage is the best; it is more than 4 times faster than sequential (non-parallel) execution. The execution time with p = 2 and no drop stage is reasonably close to the best execution time (80%). The drop stage makes insignificant changes in the execution time. The fanout of each level of a process tree depends on the execution time of the web service invoked on that level. AFF_APPLYP finds the optimized fanout for each level.
Related work: Similar to WSMS (U. Srivastava, J. Widom, K. Munagala, and R. Motwani, Query Optimization over Web Services, VLDB 2006), WSMED also invokes parallel web service calls. In contrast, WSMED supports automated adaptive parallelization. In contrast to WSQ/DSQ (R. Goldman and J. Widom, WSQ/DSQ: a practical approach for combined querying of databases and the Web, SIGMOD 2000), WSMED produces non-materialized adaptive parallel plans based on parameter streams. Runtime optimization techniques (A. Gounaris, et al., Robust runtime optimization of data transfer in queries over Web Services, ICDE 2008) investigate adaptation of buffer sizes in web service calls and do not deal with adaptive parallelism of web service calls.
Conclusion: WSMED can be accessed through the URL http://udbl2.it.uu.se/WSMED/wsmed.html without installing any software. Queries are expressed in SQL to dynamically compose data-providing web services without any programming. WSMED makes any web service queryable with SQL. AFF_APPLYP automatically parallelizes web service calls and adapts the process tree at runtime, based on the flow of the result stream, without any static cost model. An adaptive parallel plan with AFF_APPLYP makes it possible to run expensive queries.
Future work: Generalize the strategy to queries that mix dependent and independent web service calls, as well as bushy trees (ongoing work). Investigate different process arrangement strategies with the algebra operators. Set up a benchmark to simulate the parallel invocation of web services.
Thank you for your attention. The un-queried life is not worth living.

Editor's Notes

  • #4: There is a common need to search for information through data-providing web services, which have no side effects and return a set of objects for a given set of parameters.
  • #5: A web service is a software system designed to support machine-to-machine interaction over a network. We have developed a system, WSMED, that provides general query capabilities over data accessible through web services by reading WSDL metadata descriptions. A WSDL URL is given to import the metadata into the local store. While importing the metadata, WSMED automatically creates SQL views that make the web service operations queryable. WSMED itself provides a web service for querying arbitrary data-providing web services. INIT initializes a user session. IMPORTWSDL: to consume a web service, the user gives its WSDL URL, and OWFs for the web service operations are created automatically. TABLEINFO provides information about the SQL view over a given web service operation (inputs, outputs, data types). AUTHENTICATION provides authentication information for web service operations that require it. QUERY accepts SQL queries over the generated views: users can pose SQL queries over these views, calling any data-providing web service without any programming. EXIT_S exits a user session. Users need not install any software or set up any hardware to use the web service.
  • #6: To illustrate the system, we have developed the WSMED demo. It conforms to the everything-as-a-service paradigm. Demo steps: wsmed.wsdl (show all operations); IMPORTWSDL (placelookup); TABLEINFO, AUTHENTICATION; QUERY: select name from GetAllStates.
  • #10: The views can be queried with SQL. GetAllStates and GetPlacesWithin come from the GeoPlaces web service; GetPlaceList comes from the Terra web service. Our queries concern data from data-providing web services, so SQL is a quite natural way to express them, and it is still popular. Go to the demo: import terraservice and execute the query.
  • #11: The central plan is produced with a heuristic cost model based on the web service signatures, assuming that a web service call is expensive. Sequential execution is slow.
  • #12: γ applies a plan function for a given parameter tuple.
  • #14: Multilevel execution plans are generated with several layers of parallelism, forming a process tree with a fanout per level. The central query plan is transformed into a parallel query plan. The coordinator initiates communication between the child processes and ships the plan functions. It then streams different parameter tuples to them, and the results are delivered as streams from the child processes.
  • #15: End of call message
  • #20: For different queries, p and the fanout values may vary according to the execution time of the web service operations involved. Therefore this adaptive approach is very useful.