Sahara/EDP Sequences DefineJobAndExecuteTxt

Revision as of 20:23, 18 July 2013 by Tmckay
@startuml
actor client
note right client
The client in this idealized diagram performs all the steps 
necessary to define and execute a job in a new, empty environment.

In an environment where resources have already been defined, a
client might only select and execute a job.
end note
participant "Job Manager Comp" as JM
participant "Job Source Comp" as JS
participant "Data Discovery Comp" as DD
participant "Savanna DB" as DB
participant "Job Code Storage" as JC
note right JC
  Job code storage could be a mechanism
completely outside of Savanna, such as
<b>git</b>, <b>svn</b>, or something else.

  If the internal Savanna DB is used to store
raw job code, there is probably another API to
write the raw code to the Savanna DB.
end note
client -->  JS: POST Add job source
note right
This might be an administrative function.
end note
JS --> DB: Store source type\n and location
DB --> JS: Success
JS --> client: JSON job source object
note right
This may contain some id 
for the job source object
end note
client --> JC: Write job source code
note right
At some point the raw job code
must be written to the job code storage
end note
client --> DD: POST Register data source
note right
This might be an administrative function.
end note
DD --> DB: Store data source object
DB --> DD: Success
DD --> client: JSON data source object
client --> JM: POST Create job
note right
  This step defines the job in the
Savanna DB. Once the job is defined,
it can be retrieved and run as needed.

  The job object includes an identifier
for a job source object, which allows the
actual job source code to be stored separately.
end note
JM --> JM: Generate a job id
JM --> DB: Store the job object
DB --> JM: Success
JM --> client: JSON job object
note right
This has the job id filled in
end note
client --> JM: GET List of jobs
note right
Maybe the client has defined multiple 
jobs at this point, so it asks for a list.
end note
JM --> DB: Request the job list
DB --> JM: Return the job list
JM --> client: JSON job list
client --> JM: POST Execute job
note right
The job is specified by id
end note
JM --> DB: Request the job object
DB --> JM: Return the job object 
JM --> JS: Get the job source object
note right
The job source object identifier is 
specified by a field in the job object
end note
JS --> DB: Request the job source object
DB --> JS: Return the job source object
JS --> JM: Return the job source object
JM --> JM: Combine information from the job object\n and job source object to find the job code
note right
  The Job Manager component needs some 
method to combine information from the job object 
(name, type, maybe additional path information?)
with location information from the job source object 
to uniquely identify and retrieve the raw job code.

  How this is done will depend on the interface to 
the Job Source Component and the Job Source plugins,
and it may vary by job source type.
end note
JM --> JC: Request the raw job code
JC --> JM: Return the raw job code
JM --> DD: Request job and cluster configuration\nbased on the specified data source
note right
The job execution request specifies a data source.
The job and/or the cluster may need to be configured
for the job to run and make use of the specified source.

This needs more definition. What happens here?
end note
DD ->]: Cluster configuration?
DD --> JM: Success
note left
Is something returned here?
end note
JM ->]: Submit to cluster 
JM --> client: JSON job execution object
@enduml
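
The "Execute job" portion of the diagram, in particular the step where the Job Manager combines the job object with the job source object to find the raw job code, can be illustrated with a small in-memory sketch. Everything here is hypothetical and for illustration only: the class names, the field names (job_source_id, path), the example location, and the rule of joining location, path, and name are assumptions, not a Savanna API. The data source configuration and cluster submission steps are left out because the diagram itself leaves them undefined.

```python
import itertools
import posixpath
from dataclasses import dataclass


@dataclass
class JobSource:
    id: int
    source_type: str   # hypothetical values, e.g. "git" or "internal-db"
    location: str      # base location of the raw job code


@dataclass
class Job:
    id: int
    name: str
    job_type: str
    job_source_id: int  # identifier of the job source object
    path: str = ""      # optional additional path information


class InMemoryDB:
    """Stand-in for the Savanna DB; not a real Savanna interface."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.job_sources = {}
        self.jobs = {}

    def add_job_source(self, source_type, location):
        source = JobSource(next(self._ids), source_type, location)
        self.job_sources[source.id] = source
        return source

    def create_job(self, name, job_type, job_source_id, path=""):
        job = Job(next(self._ids), name, job_type, job_source_id, path)
        self.jobs[job.id] = job
        return job


def resolve_job_code_location(job, source):
    """Combine job and job source information into one code location.

    Assumed rule: <source location>/<job path>/<job name>. A real
    implementation would likely vary this by job source type.
    """
    return posixpath.join(source.location, job.path, job.name)


def execute_job(db, code_storage, job_id):
    """Follow the diagram: job object, job source object, then raw code."""
    job = db.jobs[job_id]
    source = db.job_sources[job.job_source_id]
    location = resolve_job_code_location(job, source)
    raw_code = code_storage[location]  # code_storage is a plain dict here
    # Data source configuration and cluster submission are omitted.
    return {"job_id": job.id, "code_location": location, "raw_code": raw_code}
```

In this sketch a plain dictionary keyed by location plays the role of the job code storage, so calling execute_job(db, storage, job.id) returns a dictionary standing in for the JSON job execution object.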