Difference between revisions of "Sahara/EDP Sequences DefineJobAndExecuteTxt"
(Created page with " <nowiki> @startuml actor client participant "Job Manager Comp" as JM participant "Job Source Comp" as JS participant "Savanna DB" as DB participant "Job Code Storage" as JC n...") |
m (Sergey Lukjanov moved page Savanna/EDP Sequences DefineJobAndExecuteTxt to Sahara/EDP Sequences DefineJobAndExecuteTxt: Savanna project was renamed due to the trademark issues.) |
(4 intermediate revisions by one other user not shown) | |||
Line 2: | Line 2: | ||
@startuml | @startuml | ||
actor client | actor client | ||
+ | note right client | ||
+ | The client in this idealized diagram performs all the steps | ||
+ | necessary to define and execute a job in a new, empty environment. | ||
+ | |||
+ | In an environment where resources have already been defined, a | ||
+ | client might only select and execute a job. | ||
+ | end note | ||
participant "Job Manager Comp" as JM | participant "Job Manager Comp" as JM | ||
− | participant "Job | + | participant "Job Origin Comp" as JO |
+ | participant "Data Discovery Comp" as DD | ||
participant "Savanna DB" as DB | participant "Savanna DB" as DB | ||
participant "Job Code Storage" as JC | participant "Job Code Storage" as JC | ||
note right JC | note right JC | ||
− | + | Job code storage could be a mechanism | |
− | completely outside of savanna | + | completely outside of savanna such as |
− | such as <b>git</b> or <b>svn</b>. | + | <b>git</b> or <b>svn</b> or something else. |
− | If the internal savanna db | + | If the internal savanna db is used to store |
− | is used to store raw job code, | + | raw job code, there probably is another API to |
− | there probably is another API | + | write the raw code to the savanna db. |
− | |||
end note | end note | ||
− | client --> | + | client --> JO: POST Add job origin |
− | |||
− | |||
− | |||
note right | note right | ||
− | This may contain some id for the job | + | This might be an administrative function. |
+ | end note | ||
+ | JO --> DB: Store source type\n and location | ||
+ | DB --> JO: Success | ||
+ | JO --> client: JSON job origin object | ||
+ | note right | ||
+ | This may contain some id | ||
+ | for the job origin object | ||
end note | end note | ||
client --> JC: Write job source code | client --> JC: Write job source code | ||
Line 28: | Line 39: | ||
must be written to the job code storage | must be written to the job code storage | ||
end note | end note | ||
+ | client --> DD: POST Register data source | ||
+ | note right | ||
+ | This might be an administrative function. | ||
+ | end note | ||
+ | DD --> DB: Store data source object | ||
+ | DB --> DD: Success | ||
+ | DD --> client: JSON data source object | ||
client --> JM: POST Create job | client --> JM: POST Create job | ||
− | note right | + | note right |
− | + | This step defines the job in the | |
− | + | savanna DB. Once the job is defined, | |
− | + | it can be retrieved and run as needed. | |
− | |||
− | The | + | The job object includes an identifier |
− | + | for a job origin object which allows the | |
− | object | + | actual job source code to be stored separately. |
− | |||
− | |||
end note | end note | ||
+ | JM --> JM: Generate a job id | ||
+ | JM --> DB: Store the job object | ||
+ | DB --> JM: Success | ||
JM --> client: JSON job object | JM --> client: JSON job object | ||
note right | note right | ||
Line 47: | Line 65: | ||
client --> JM: POST Get list of jobs | client --> JM: POST Get list of jobs | ||
note right | note right | ||
− | Maybe the client has multiple jobs at | + | Maybe the client has defined multiple |
− | this point, so it asks for a list | + | jobs at this point, so it asks for a list. |
end note | end note | ||
+ | JM --> DB: Request the job list | ||
+ | DB --> JM: Return the job list | ||
JM --> client: JSON job list | JM --> client: JSON job list | ||
client --> JM: POST Execute job | client --> JM: POST Execute job | ||
Line 55: | Line 75: | ||
The job is specified by id | The job is specified by id | ||
end note | end note | ||
− | JM --> | + | JM --> DB: Request the job object |
− | + | DB --> JM: Return the job object | |
− | JM --> | + | JM --> JO: GET request the job source code |
− | + | note right | |
− | + | The Job Manager passes the job origin id | |
− | JM --> | + | specified in the job object and a destination path |
+ | (for example, hdfs). The Job Origin component uses | ||
+ | plugins to copy the binary job code from the storage | ||
+ | location to the destination path. | ||
+ | end note | ||
+ | JO --> JO: Copy the job source code from job code storage\nto the specified destination | ||
+ | JO --> JM: JSON success, probably returns destination path | ||
+ | JM --> DD: Request job and cluster configuration\nbased on the specified data source | ||
+ | note right | ||
+ | The job execution request specifies a data source. | ||
+ | The job and/or the cluster may need to be configured | ||
+ | for the job to run and make use of the specified source. | ||
+ | |||
+ | This needs more definition. What happens here? | ||
+ | end note | ||
+ | DD ->]: Cluster configuration? | ||
+ | DD --> JM: Success | ||
+ | note left | ||
+ | Is something returned here? | ||
+ | end note | ||
JM ->]: Submit to cluster | JM ->]: Submit to cluster | ||
JM --> client: JSON job execution object | JM --> client: JSON job execution object | ||
+ | ..."<size:12><b><i>Some time goes by, the client checks on jobs and decides to stop one</b></i></size>"... | ||
+ | client --> JM: GET List job instances | ||
+ | JM --> client: JSON list of job instance objects | ||
+ | note right | ||
+ | What is a job instance object? A Job Execution? Something else? | ||
+ | end note | ||
+ | client --> JM: GET job instance status | ||
+ | JM --> client: JSON job instance status | ||
+ | note right | ||
+ | What is a job instance status? | ||
+ | end note | ||
+ | client --> JM: POST terminate job instance | ||
+ | JM ->]: Do something on the cluster to end the job | ||
+ | JM --> client: JSON job instance status | ||
@enduml | @enduml | ||
</nowiki> | </nowiki> |
Latest revision as of 15:41, 7 March 2014
@startuml
actor client
note right client
The client in this idealized diagram performs all the steps
necessary to define and execute a job in a new, empty environment.

In an environment where resources have already been defined, a
client might only select and execute a job.
end note
participant "Job Manager Comp" as JM
participant "Job Origin Comp" as JO
participant "Data Discovery Comp" as DD
participant "Savanna DB" as DB
participant "Job Code Storage" as JC
note right JC
Job code storage could be a mechanism
completely outside of savanna such as
<b>git</b> or <b>svn</b> or something else.
If the internal savanna db is used to store
raw job code, there probably is another API to
write the raw code to the savanna db.
end note
client --> JO: POST Add job origin
note right
This might be an administrative function.
end note
JO --> DB: Store source type\n and location
DB --> JO: Success
JO --> client: JSON job origin object
note right
This may contain some id
for the job origin object
end note
client --> JC: Write job source code
note right
At some point the raw job code
must be written to the job code storage
end note
client --> DD: POST Register data source
note right
This might be an administrative function.
end note
DD --> DB: Store data source object
DB --> DD: Success
DD --> client: JSON data source object
client --> JM: POST Create job
note right
This step defines the job in the
savanna DB. Once the job is defined,
it can be retrieved and run as needed.

The job object includes an identifier
for a job origin object which allows the
actual job source code to be stored separately.
end note
JM --> JM: Generate a job id
JM --> DB: Store the job object
DB --> JM: Success
JM --> client: JSON job object
note right
This has the job id filled in
end note
client --> JM: POST Get list of jobs
note right
Maybe the client has defined multiple
jobs at this point, so it asks for a list.
end note
JM --> DB: Request the job list
DB --> JM: Return the job list
JM --> client: JSON job list
client --> JM: POST Execute job
note right
The job is specified by id
end note
JM --> DB: Request the job object
DB --> JM: Return the job object
JM --> JO: GET request the job source code
note right
The Job Manager passes the job origin id
specified in the job object and a destination path
(for example, hdfs). The Job Origin component uses
plugins to copy the binary job code from the storage
location to the destination path.
end note
JO --> JO: Copy the job source code from job code storage\nto the specified destination
JO --> JM: JSON success, probably returns destination path
JM --> DD: Request job and cluster configuration\nbased on the specified data source
note right
The job execution request specifies a data source.
The job and/or the cluster may need to be configured
for the job to run and make use of the specified source.

This needs more definition. What happens here?
end note
DD ->]: Cluster configuration?
DD --> JM: Success
note left
Is something returned here?
end note
JM ->]: Submit to cluster
JM --> client: JSON job execution object
..."<size:12><b><i>Some time goes by, the client checks on jobs and decides to stop one</b></i></size>"...
client --> JM: GET List job instances
JM --> client: JSON list of job instance objects
note right
What is a job instance object? A Job Execution? Something else?
end note
client --> JM: GET job instance status
JM --> client: JSON job instance status
note right
What is a job instance status?
end note
client --> JM: POST terminate job instance
JM ->]: Do something on the cluster to end the job
JM --> client: JSON job instance status
@enduml
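The sequence above can be walked through with a small in-memory model, which may help when reasoning about which component owns which piece of state. This is only a sketch under stated assumptions and is not the Savanna/Sahara API: every class name, method name, id format (`origin-1`, `job-1`, `exec-1`), the status strings `RUNNING` and `KILLED`, and the destination path `/jobs/<job id>` are hypothetical. The cluster configuration step handled by the Data Discovery component is left out because the diagram itself marks it as undefined.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class SavannaDB:
    """In-memory stand-in for the "Savanna DB" participant (hypothetical)."""
    origins: dict = field(default_factory=dict)
    data_sources: dict = field(default_factory=dict)
    jobs: dict = field(default_factory=dict)


class JobOriginComp:
    """Stand-in for "Job Origin Comp": records where job code lives."""

    def __init__(self, db, code_storage):
        self.db = db
        self.code_storage = code_storage  # dict: location -> raw job code
        self._ids = itertools.count(1)

    def add_origin(self, source_type, location):
        # "Store source type and location", then return the origin object.
        origin = {"id": f"origin-{next(self._ids)}",
                  "type": source_type, "location": location}
        self.db.origins[origin["id"]] = origin
        return origin

    def copy_code(self, origin_id, destination, cluster_fs):
        # Copy the job code from job code storage to the destination path.
        location = self.db.origins[origin_id]["location"]
        cluster_fs[destination] = self.code_storage[location]
        return {"status": "success", "path": destination}


class DataDiscoveryComp:
    """Stand-in for "Data Discovery Comp": registers data sources only."""

    def __init__(self, db):
        self.db = db
        self._ids = itertools.count(1)

    def register(self, url):
        source = {"id": f"source-{next(self._ids)}", "url": url}
        self.db.data_sources[source["id"]] = source
        return source


class JobManagerComp:
    """Stand-in for "Job Manager Comp": defines, lists, runs, stops jobs."""

    def __init__(self, db, origin_comp, cluster_fs):
        self.db = db
        self.origin_comp = origin_comp
        self.cluster_fs = cluster_fs  # dict standing in for e.g. hdfs
        self.executions = {}
        self._job_ids = itertools.count(1)
        self._exec_ids = itertools.count(1)

    def create_job(self, name, origin_id):
        # Generate a job id, store the job object, return it with the id set.
        job = {"id": f"job-{next(self._job_ids)}",
               "name": name, "origin_id": origin_id}
        self.db.jobs[job["id"]] = job
        return job

    def list_jobs(self):
        return list(self.db.jobs.values())

    def execute(self, job_id, data_source_id):
        # Look up the job, have its code copied, then "submit" it.
        job = self.db.jobs[job_id]
        copied = self.origin_comp.copy_code(
            job["origin_id"], f"/jobs/{job_id}", self.cluster_fs)
        execution = {"id": f"exec-{next(self._exec_ids)}", "job_id": job_id,
                     "data_source_id": data_source_id,
                     "code_path": copied["path"], "status": "RUNNING"}
        self.executions[execution["id"]] = execution
        return execution

    def terminate(self, execution_id):
        self.executions[execution_id]["status"] = "KILLED"
        return self.executions[execution_id]


def run_scenario():
    """Play the client's role from the diagram, top to bottom."""
    db, code_storage, cluster_fs = SavannaDB(), {}, {}
    origins = JobOriginComp(db, code_storage)
    discovery = DataDiscoveryComp(db)
    manager = JobManagerComp(db, origins, cluster_fs)

    origin = origins.add_origin("git", "repo/wordcount.jar")
    code_storage["repo/wordcount.jar"] = b"job code"  # write job source code
    source = discovery.register("hdfs://example/input")
    job = manager.create_job("wordcount", origin["id"])
    listed = manager.list_jobs()
    execution = manager.execute(job["id"], source["id"])
    running = dict(execution)  # snapshot before termination
    stopped = manager.terminate(execution["id"])
    return job, listed, running, stopped, cluster_fs
```

Calling `run_scenario()` follows the same order as the diagram: add a job origin, write the code, register a data source, create the job, list jobs, execute, then terminate. Keeping the job object separate from the origin object mirrors the diagram's note that the job only holds an identifier for its origin, so the code itself can live elsewhere.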