Workload Management for ISD

pbttbis · Post by **pbttbis** » Fri Feb 05, 2016 5:59 am

Hi,

We have a lightweight ISD job that runs for 4-5 seconds. The majority of this time is spent waiting for a response to an application request sent via a REST step from the Hierarchical stage.

We are expected to handle roughly 300 000 calls to this job a day. I have asked the question about peak workloads.

With regards to workload management I see two areas to configure:

1) Options that can specifically be set in the information server console for the services.

2) DataStage workload management: queue priority, number of active jobs, etc.

Are there any best practices around setting up the workload management that I can apply to the above scenario?

Thanks,

Shaun

eostic · Post by **eostic** » Fri Feb 05, 2016 6:32 am

Does the Job have the Hierarchical Stage as its source? Most of that 4 - 5 seconds is probably in initialization. See if you can write and test it as an intermediate Stage -- one that uses an ISDinput Stage....that could dramatically increase your throughput because the Job only gets initialized once, and then each invocation is merely sending new "rows" to that instance of the Job.

Another thing that dramatically speeds up the initialization and runtime of Hierarchical Stage is to use the "Schema View" feature and focus only on the parts of the payload that you need (this is in the library manager and alters what you select as the "root" once you get to the json or xml parser step). If you have a tiny schema you won't see "that" much difference, but if the schema is large, it could really help.

Another possibility is to consider a Server Job. The Hierarchical Stage has limitations there, so you may not be able...but if you can get it working in a Server Job, that may also shorten your initialization time, as Server Jobs historically have vastly faster init times.

The best solution, performance wise, if you can get it to work, is to use an ISDinput Stage in the Job -- although a recent poster had issues there with the way the http communications was working....but still, it is worth a try to see if it works in your particular situation.

Ernie

pbttbis · Post by **pbttbis** » Fri Feb 05, 2016 10:02 am

hahaha Ernie, I believe I am the poster you are referring to w.r.t. the HTTP plumbing issue I had.

Currently in our DataStage test environment I have the ISD service set to generate 3 instances of a job "ready and waiting" to process requests.

The job is, simply put, ISD_Input --> Hierarchical (REST Step) --> ISD_Output.

The vast majority of the 4-5 seconds is the REST Step waiting for the external application to return a response, so the DataStage portion itself is very lightweight. We are currently running it on one node with all stages set to sequential.

So for me one of the questions is how many instances of the job I should have running, given that we expect each call to take 4-5 seconds. The workload is 300 000 requests a day that need to be handled by the job, with a peak workload of around 600 a minute. The requests need to be returned by the job in under 30 seconds. I would think we should design for 600 plus a percentage to give us some wiggle room above the current peak workloads seen.

There is a financial impact if requests are not returned in under 30 seconds.
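As a rough back-of-the-envelope check on those numbers, Little's law (average requests in flight = arrival rate x time per request) gives a first estimate of the concurrency the service has to sustain. The sketch below assumes the worst-case 5 seconds per call from this thread and a hypothetical 25% headroom; the function names are illustrative only. Note that this is the number of requests in flight, not the number of job instances. How many requests one instance can carry has to be measured.

```python
import math


def in_flight_requests(arrivals_per_minute: float, seconds_per_request: float) -> float:
    """Little's law: average concurrency = arrival rate * time in system."""
    return (arrivals_per_minute / 60.0) * seconds_per_request


def with_headroom(concurrency: float, headroom: float) -> int:
    """Round up after adding a fractional safety margin (0.25 means 25%)."""
    return math.ceil(concurrency * (1.0 + headroom))


# Figures taken from the thread: 600 requests/minute at peak, 4-5 s per call.
peak = in_flight_requests(600, 5)

# Daily average, assuming the 300 000 calls were spread evenly over 24 hours.
average = in_flight_requests(300_000 / (24 * 60), 5)

# The 25% headroom is an assumption, not a figure from the thread.
target = with_headroom(peak, 0.25)

print(f"peak in flight: {peak:.0f}, daily average: {average:.1f}, design target: {target}")
```

At the stated peak this works out to about 50 requests in flight at any moment, compared with roughly 17 on a flat daily average, which is why the peak figure rather than the daily total should drive the sizing.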

eostic · Post by **eostic** » Fri Feb 05, 2016 12:46 pm

ah. ok...

Then you need a test harness. It's the only way you'll figure out the best settings. I like to start with "3" instances, and then make sure that the pipeline and operation buffers are fairly deep, and then create (write or use one from another tool) a test harness that can:

a) spin up "n" independent client connections (users)
b) fire "n" requests on a per-minute or per-second basis.

...and then measure my throughput and increase instances as needed to find the best number of instances to meet the high water mark, and then make that the minimum instances and lock it in.
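A minimal sketch of such a harness, assuming a thread-per-client model, might look like the following. Everything here is illustrative: `run_load`, `summarize` and `fake_send` are hypothetical names, and `fake_send` is a stand-in that only sleeps. In a real test it would be replaced by an HTTP call to the deployed ISD service endpoint, which is not given in this thread.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send, clients, requests_per_client, interval_s):
    """Start `clients` independent workers. Each worker calls send()
    `requests_per_client` times, pausing `interval_s` seconds between calls.
    Returns every observed latency in seconds."""

    def worker(_client_id):
        latencies = []
        for _ in range(requests_per_client):
            start = time.perf_counter()
            send()
            latencies.append(time.perf_counter() - start)
            time.sleep(interval_s)
        return latencies

    with ThreadPoolExecutor(max_workers=clients) as pool:
        per_client = list(pool.map(worker, range(clients)))
    return [lat for client_lats in per_client for lat in client_lats]


def summarize(latencies, limit_s=30.0):
    """Report count, worst case, 95th percentile (nearest rank) and the
    number of calls that exceeded the response-time limit."""
    ordered = sorted(latencies)
    p95 = ordered[max(0, math.ceil(0.95 * len(ordered)) - 1)]
    return {
        "count": len(ordered),
        "max": ordered[-1],
        "p95": p95,
        "over_limit": sum(1 for lat in ordered if lat > limit_s),
    }


def fake_send():
    """Placeholder for the real service call; it only simulates a short wait."""
    time.sleep(0.01)


if __name__ == "__main__":
    results = run_load(fake_send, clients=10, requests_per_client=5, interval_s=0.05)
    print(summarize(results))
```

Raising `clients` and lowering `interval_s` step by step while watching `over_limit` and `p95` is one way to find the point where the current number of instances stops keeping up. Each worker waits for its response before sending the next request, so the achieved rate drops as the service slows down. A harness that must hold a fixed arrival rate would need to schedule sends independently of responses.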

With a Job that has ISDinput, and several stages, one single instance of the Job is able to serve many requests concurrently, so you may not need "that" many instances. As you ramp up the test harness, a cheap and easy review, if you are getting responses in the right time frame, is to watch the DataStage Monitor within the DS Director...see if all the instances are processing requests as things ramp up. You probably won't even see the second and third instance get requests (rows) until your rates get fairly high....however, your delay in the service call may cause your requests to back up more quickly and go over to another instance.

I haven't used it in a long time, but I used to really enjoy using Actional SOAP tester (they got purchased by the same folks that bought DataDirect and also Sonic...vendor starts with a P...)....I also thought that SoapUI had such features, as did a legacy testing tool called LoadRunner, if it still exists.

Ernie

qt_ky · Post by **qt_ky** » Sat Feb 06, 2016 9:32 am

Just curious... Would it be possible to load test the service by using a multi-instance job that calls the service, perhaps from a separate Information Server? Any drawbacks to that approach?

eostic · Post by **eostic** » Sat Feb 06, 2016 5:29 pm

I have done that.....it will work too........but it is very difficult in that pattern to ramp up as high as you might need.....like having 100+ concurrent clients each sending 20 requests per minute......

a small footprint java based (probably) testing tool is going to be far more effective at being that test harness.

Ernie

pbttbis · Post by **pbttbis** » Sun Feb 14, 2016 6:56 am

I have been making use of curl and a shell script to run my test harness.

Have settled now on the number of instances that will be needed.

I have been making use of the Operations Console to monitor resources during my testing and notice that currently I am limited by memory and not CPU. This kinda makes sense to me as the job is only doing two singleton inserts into a database. The majority of the time is spent waiting for the REST step call to return.

Where can I monitor queued requests hitting the ISD service? The job itself logs when it receives the requests, but I would like to monitor requests waiting to be served up to the ISD jobs.

I also noticed that if I try to start up all the instances at once, the DS server hangs. I am getting around this by setting a workload parameter to only start x jobs in x seconds. Is the behaviour I am experiencing normal? Is there a way to set this workload parameter only for ISD jobs?