Index
-  What are psets?
-  Creation of psets.
-  What is the analysis_level parameter?
-  Achieving data lineage for generic graphs using psets.
-  Physical datasets and logical EME datasets.
-  Parameters to handle parallel running jobs calling the same graph.
-  Capturing job statistics details in the EME when using generic graphs.
What are psets and how are they created
A pset (parameter set) is a saved set of input parameter and value pairs for a graph. You create one with the Input Values Editor in the Edit menu, which lets you specify a set of values for the graph's formal parameters and save it as a separate .pset file in any of the directories under the private sandbox.
Steps:
a. Select Edit > Input Values... from the GDE menu. The dialog looks the same as the graph parameter editor, with two columns: the parameter name and the value.
b. For each formal parameter, enter the required value in the value field.
c. Select File > Save As and save the value set as <graph name>.pset under the private sandbox's pset directory.
Note: The editor defaults to the project's mp directory as the location of the new .pset file, so you need to navigate to the pset directory in the sandbox.
Along with the existing formal parameters of the generic graph, define a formal parameter called analysis_level and set its value to none.
Check in the generic graph from the common sandbox to the EME.
Dependency analysis will not be performed on the generic graph because of the analysis_level parameter's value.
Each separate set of input values you create in this step represents a separate instance of the graph. To enable job tracking of the generic graph for these different value sets, check the .pset files into the EME data store. The graph instance represented by each .pset file is analyzed and saved in the EME data store as a graph object. For a .pset file to be analyzed, set the analysis_level parameter in each parameter set to expand. This was mandatory in Ab Initio V-13.
NOTE: Ab Initio V-14 automatically expands psets when they are checked in.
Achieving data lineage for generic graphs using psets
Distinct values of logical EME datasets are passed from different psets to the same generic graph. This is done to achieve data lineage. When psets are checked in, they are expanded and dependency analysis takes place. Different instances of the generic graph then show up in the EME, each with its own values of the logical datasets.
EME view of distinct instances of the generic graph: the two instances of the same graph show different data lineage in the EME.
Physical dataset names override the logical EME dataset names passed from psets. The physical dataset names are set in the wrapper and then passed, via the pset, when the graph is executed from within the wrapper. For example: (1) exporting the physical datasets; (2) calling the graph and passing parameters.
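The override described above can be illustrated with a small shell sketch. It is an analogy only and does not call any Ab Initio command: the names PSET_INPUT_DATASET and INPUT_DATASET, and the file path, are hypothetical, and shell default expansion stands in for the graph's own parameter resolution.

```shell
#!/bin/sh
# Logical dataset name as it might be stored in the pset (hypothetical name).
PSET_INPUT_DATASET="customer_input"

# Physical dataset name exported by the wrapper before the run (hypothetical path).
INPUT_DATASET="/data/in/customer_20240101.dat"
export INPUT_DATASET

# Stand-in for parameter resolution: when the wrapper has exported a physical
# name, it takes precedence over the logical name carried by the pset.
RESOLVED_DATASET="${INPUT_DATASET:-$PSET_INPUT_DATASET}"
echo "graph would read: ${RESOLVED_DATASET}"
```

If the wrapper did not export INPUT_DATASET, the same expansion would fall back to the logical name from the pset.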
Handling multiple instances of a graph running concurrently
AB_JOB_PREFIX
To avoid problems when multiple instances of a graph run concurrently in the same directory, you can make the AB_JOB value unique by exporting the AB_JOB_PREFIX configuration variable. AB_JOB_PREFIX should be assigned a dynamic value, for example the process id (PID=$$). Alternatively, a date timestamp in YYYYMMDDHHMISS format can be assigned to it. Setting this variable makes AB_JOB resolve to ${AB_JOB_PREFIX}${AB_JOB}, so the recovery files are also created with different names.
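A minimal wrapper sketch of the AB_JOB_PREFIX setting described above. Only AB_JOB_PREFIX and AB_JOB come from the slide; the trailing underscore, the fallback job name my_generic_graph, and the final echo are illustrative assumptions.

```shell
#!/bin/sh
# Use the wrapper's process id as the dynamic part of the prefix.
PID=$$
AB_JOB_PREFIX="${PID}_"
export AB_JOB_PREFIX

# Alternative from the slide: a date timestamp instead of the process id.
# AB_JOB_PREFIX="$(date +%Y%m%d%H%M%S)_"

# Illustration only: AB_JOB is normally set elsewhere; the fallback value
# here is a hypothetical name so the sketch runs on its own.
AB_JOB="${AB_JOB:-my_generic_graph}"
echo "effective job name: ${AB_JOB_PREFIX}${AB_JOB}"
```

Two wrappers started at the same moment have different process ids, so the process id is the safer choice when jobs can start within the same second.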
Capturing job statistics details in the EME when using generic graphs
AB_AIR_JOB_GRAPH
Specifies the graph/application being run so that it may be linked to the job object.
-  When a generic graph is called, the job statistics are stored in the EME under the name of the generic graph. This causes confusion and discrepancies when tracking statistics in the EME, because a generic graph may be used in multiple projects. The objective is to store job statistics under the pset name so that they can be correlated with the logical use of the generic graph.
-  This parameter needs to be set in the calling script/program so that a generic graph reposits its tracking to the .graph (pset version) of the graph.
-  If the graph is generic, you should set AB_AIR_JOB_GRAPH, because you want the job to be associated with the pset instance of the graph, which performs the specific task according to the values passed through the pset.
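A sketch of setting AB_AIR_JOB_GRAPH in the calling script, as the points above describe. The project path, the pset name, and the assumption that the value is an EME path to the pset are all hypothetical; the EME documentation for your version defines the exact value the variable expects.

```shell
#!/bin/sh
# Hypothetical names: replace with your own EME project and pset.
EME_PROJECT="/Projects/sales"
PSET_NAME="load_customer.pset"

# Assumption: the variable takes the location of the pset instance, so that
# job tracking is linked to it rather than to the generic graph.
AB_AIR_JOB_GRAPH="${EME_PROJECT}/pset/${PSET_NAME}"
export AB_AIR_JOB_GRAPH

echo "AB_AIR_JOB_GRAPH=${AB_AIR_JOB_GRAPH}"
```

The export has to happen before the graph is invoked, in the same wrapper that runs it, so that the running job inherits the variable.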
In Co>Operating System 2.14 and above
Benefits:
-  Job statistics are reposited with the logical use of the graph.
-  The statistics are accurately reported by the appropriate job group or project.
-  Graph execution time improves.
Please read the documents below for more detail:
/opt/abinitio/abinitio-V2-15-5-0/doc/EME_Developer_Guide.pdf
/opt/abinitio/abinitio-V2-15-5-0/doc/EME_Reference.pdf
THANK YOU


Generic Graph And Psets
