Restartability mechanisms
Hi,
May I know what options are available for building restartability into a server/parallel job?
The ones I know of are:
1) checkpoints in a sequencer
2) using a HashJobInfo file and routine.
Which one is more efficient, and what is the difference between them?
Thanks,
pandeeswaran
-
- Premium Member
- Posts: 730
- Joined: Tue Nov 04, 2008 10:14 am
- Location: Bangalore
Re: Restartability mechanisms
pandeesh wrote: 2)using HashJobInfo file and routine.
Could you explain the HashJobInfo file and the routine you are talking about?
Actually, we are using a hashed file called HashJobInfo in which the status of each job is recorded.
Before the next run of the batch job, it opens that hashed file and looks up the status.
If it finds that the last job aborted, it will reset that job and continue from there.
If the last run finished fine, it will start the jobs from the beginning.
This is usually accomplished by BASIC code in the batch jobs.
Let me know if you need any more info.
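The logic described above might be sketched as follows. This is only an illustration in Python, not the BASIC routine actually used: a JSON file stands in for the hashed file, and the names `run_batch`, `run_job` and `state_path` are hypothetical.

```python
import json
import os


def run_batch(jobs, run_job, state_path):
    """Run jobs in order, skipping those an earlier aborted run already finished.

    jobs       -- ordered list of job names
    run_job    -- callable taking a job name, returning True on success
    state_path -- JSON file standing in for the hashed file (an assumption)
    Returns the list of job names that were actually executed.
    """
    # Load the statuses left behind by the previous run, if there are any.
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            state = json.load(fh)

    executed = []
    for name in jobs:
        if state.get(name) == "FINISHED":
            continue  # completed before the earlier abort, so skip it

        executed.append(name)
        ok = run_job(name)

        # Record the outcome straight away so a crash does not lose it.
        state[name] = "FINISHED" if ok else "ABORTED"
        with open(state_path, "w") as fh:
            json.dump(state, fh)

        if not ok:
            # Keep the state file: the next run resumes at this job.
            return executed

    # Everything finished: clear the state so the next run starts from the beginning.
    if os.path.exists(state_path):
        os.remove(state_path)
    return executed
```

With three jobs where the second one fails, the first call executes the first two and leaves the state file in place; the next call resumes at the second job and removes the file once the last job finishes.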
pandeeswaran
The restartability you are talking about, as I understand it, is resetting the job on failure so that it is ready for the next run.
If I am right, then making the sequencer a checkpointed one does not by itself make the jobs within it restartable unless you use the option "Reset if required, then run". Checkpointing a sequencer makes the sequence restartable: if something goes wrong, you need to fix that part, compile it, and then rerun the sequence.
The hashed file logic you are using is just one way of doing it, where you reset the job if it has failed.
Either way, the same thing happens at the back end: the job is reset. Whether you do it from the Director, a UNIX command or a BASIC function is up to you.
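As a rough illustration of the "reset only if required, then run" idea done from a script, the sketch below builds the dsjob command lines for one job. The function name `commands_for_run` and the status label "ABORTED" are hypothetical; a real script would have to map the label from whatever its status query returns. The commands are only built here, not executed.

```python
# Hypothetical status labels that call for a reset before the next run.
# Only "ABORTED" is taken from the discussion; extend the set as needed.
NEEDS_RESET = {"ABORTED"}


def commands_for_run(project, job, last_status):
    """Return the dsjob command lines to issue for one job, in order."""
    commands = []
    if last_status in NEEDS_RESET:
        # Reset first, and wait for the reset to complete before running.
        commands.append(["dsjob", "-run", "-mode", "RESET", "-wait", project, job])
    # Normal run; -jobstatus makes dsjob wait and report the job's status.
    commands.append(["dsjob", "-run", "-jobstatus", project, job])
    return commands
```

Each returned list could be handed to something like `subprocess.run`; keeping command construction separate from execution makes the decision easy to check without a DataStage server.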
pandeesh wrote: if it finds the last job is aborted , it llreset and continue form there. Or if the last run is fine. it ll start the jobs from beginning.
I'm curious what the difference is between the two other than the Reset... between 'from there' and 'from beginning'.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Actually, if any intermediate job aborts, the hashed file won't be cleared, and if we trigger the batch again it will run from that aborted job.
If no job aborts and everything finishes fine, we clear the hashed file at the end, so that the next run starts from the beginning.
Am I clear?
pandeeswaran
What if we are using a sequencer and a script created using the run model (using ln -s)?
In this case, if we trigger the sequencer using that script, it will reset the sequencer and start the jobs from the beginning.
Is there any way to overcome this?
Is it possible to disable the reset feature in that script (which is created using the DS run model)?
Actually, I am not able to edit the script. I have seen there is a function inside the script for reset.
One way I can think of is to use our own script using dsjob.
But if we use our own script, is it possible to read the parameters from the ini file (without giving all the parameters in the script itself)?
Thanks
pandeeswaran
pandeesh wrote: Is there any way to overcome this?
Be more selective in your resetting. Make it conditional on the job type and if in a 'Restartable' status.
pandeesh also wrote: But if we use our own script ,is it possible to read the parameters from the ini file?(Without giving all the parameters in the script itself?)
All things are possible if you roll your own. A script can certainly read the file and build the appropriate list of "-param" options for the dsjob command. Parameter Sets with value files in the 8.x release can simplify this.
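A minimal sketch of that roll-your-own approach, in Python rather than shell. It assumes the ini file holds plain `name=value` lines with `#` or `;` comments; that format, and the names `read_params` and `build_run_command`, are assumptions and not anything the run-model script defines.

```python
def read_params(lines):
    """Parse name=value lines, ignoring blanks, comments and [section] headers."""
    params = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        name, sep, value = line.partition("=")
        if sep:  # keep only lines that really contain '='
            params[name.strip()] = value.strip()
    return params


def build_run_command(project, job, params):
    """Build a dsjob run command with one -param option per parameter."""
    cmd = ["dsjob", "-run"]
    for name, value in params.items():
        cmd += ["-param", f"{name}={value}"]
    cmd += ["-jobstatus", project, job]
    return cmd
```

For example, `build_run_command("proj", "SeqLoad", read_params(open("job.ini")))` would yield an argument list suitable for `subprocess.run`, where `job.ini`, `proj` and `SeqLoad` are placeholder names.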
-craig
"You can never have too many knives" -- Logan Nine Fingers
pandeesh wrote: Can you explain this little more?
Plenty of information out there. For instance here.
pandeesh also wrote: Is there any way to tweak the script which is generated using run model?
Don't see why not.
-craig
"You can never have too many knives" -- Logan Nine Fingers