Resource Tracker

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am


Post by sjfearnside »

I have searched for information on the Resource Tracker and set it up during the install according to the documentation. However, I have not seen any documentation on how to use the file it produces: machineLog.8001.rtxet01.xxx.

Can anyone tell me if there is a function in DataStage that can use this file to produce a report? Or how else can this information be used, other than by viewing the file?
sima79
Premium Member
Posts: 38
Joined: Mon Jul 16, 2007 8:12 am
Location: Melbourne, Australia

Post by sima79 »

The files produced are used by the performance analysis tool, new in version 8. Selecting the option "Record job performance data" will write out information to an XML file containing:
- Record throughput
- Utilization of your machine (CPU, disk, memory)

You can then view this information with the "Performance Analysis" option in Designer (File->Performance Analysis...).

It's handy for analyzing your job and identifying performance bottlenecks.
sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

The file produced by the Resource Tracker does not appear to be XML. Is there a native function in DataStage that consumes the machinelog... files and converts the information to XML? I thought the performance analysis tool was something different. I have used it before, but I am not sure it is related to these files. Please clarify; maybe I am misunderstanding.

Thanks :?
sima79
Premium Member
Posts: 38
Joined: Mon Jul 16, 2007 8:12 am
Location: Melbourne, Australia

Post by sima79 »

Do you have a sample of the machine log that you can post?

Mine looks like:

Code:

<?xml version="1.0" encoding="UTF-8" ?>
<machine_resource_output version="0.1" start_date="2009-03-31 18:10:12" framework_revision="IBM WebSphere DataStage Enterprise Edition 8.0.1.4665 ">
<machine_description>
        <host name="tcolapp21" domain=""/>
        <platform name="AIX" compiler="" version="5.3"/>
        <cpus count="8" model="PowerPC_POWER5 2147MHz"/>
        <memory totalRAM="10485760" totalSwap="21495808"/>
</machine_description>
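
In case anyone wants to read this header programmatically, here is a minimal Python sketch that pulls the attributes out of the <machine_description> block shown above. It assumes, going only by this sample, that the file as a whole is not a well-formed XML document, so it cuts out just that one block before parsing. The names read_machine_description and SAMPLE_HEADER are made up for illustration. Nothing here is a DataStage API.

```python
import xml.etree.ElementTree as ET

# Header fragment copied from the sample machine log in this thread.
SAMPLE_HEADER = """<machine_description>
        <host name="tcolapp21" domain=""/>
        <platform name="AIX" compiler="" version="5.3"/>
        <cpus count="8" model="PowerPC_POWER5 2147MHz"/>
        <memory totalRAM="10485760" totalSwap="21495808"/>
</machine_description>"""


def read_machine_description(text):
    """Return {tag: attributes} for each child of <machine_description>.

    Only the <machine_description> block is parsed, because the rest of
    the file is assumed not to be well-formed XML.
    """
    end_tag = "</machine_description>"
    start = text.index("<machine_description>")
    end = text.index(end_tag) + len(end_tag)
    block = ET.fromstring(text[start:end])
    # Attribute values stay as strings; units are not stated in the file.
    return {child.tag: dict(child.attrib) for child in block}


if __name__ == "__main__":
    for tag, attrs in read_machine_description(SAMPLE_HEADER).items():
        print(tag, attrs)
```

The attribute values are left as strings on purpose, since the sample does not say what units totalRAM and totalSwap are in.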
Then further down in the file are the performance statistics:

Code:

2009-03-31 18:10:12,4,0,0,-,0,0,100,0,0,0,0,0,0,0,[5]hdisk1,0,0,0,0|hdisk4,0,0,0,0|hdisk3,0,0,0,0|hdisk2,0,0,0,0|hdisk0,0,0,0,0|
2009-03-31 18:10:16,4,0,0,-,0.309693,0.371632,99.3187,0,0,0,0,0,0,0,[5]hdisk1,0,1,0,4|hdisk4,0,0,0,0|hdisk3,0,0,0,0|hdisk2,0,0,0,0|hdisk0,0,0,0,0|
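
If you want to do more than view the file, the statistics lines above are regular enough to split apart in a few lines of Python. The sketch below is based only on the two sample lines shown here. It assumes each line is a timestamp, a run of comma-separated values, then a bracketed count followed by "|"-separated per-disk entries. What each column means is not documented in this thread, so the values are kept unlabeled. The function name parse_stat_line is hypothetical.

```python
from datetime import datetime


def parse_stat_line(line):
    """Split one statistics line into (timestamp, scalars, disks).

    scalars: list of raw strings (meanings unknown; "-" can appear).
    disks:   {disk name: [numeric values]} from the "[N]name,...|" section.
    """
    head, _, disk_part = line.strip().partition("[")
    fields = head.rstrip(",").split(",")
    timestamp = datetime.strptime(fields[0], "%Y-%m-%d %H:%M:%S")
    scalars = fields[1:]

    disks = {}
    if disk_part:
        count_text, _, entries = disk_part.partition("]")
        for entry in entries.split("|"):
            if not entry:
                continue  # skip the empty piece after the trailing "|"
            name, *values = entry.split(",")
            disks[name] = [float(v) for v in values]
        # Assumption: the bracketed number is the count of disk entries.
        if len(disks) != int(count_text):
            raise ValueError("disk count does not match bracketed value")
    return timestamp, scalars, disks


if __name__ == "__main__":
    sample = ("2009-03-31 18:10:16,4,0,0,-,0.309693,0.371632,99.3187,"
              "0,0,0,0,0,0,0,[5]hdisk1,0,1,0,4|hdisk4,0,0,0,0|"
              "hdisk3,0,0,0,0|hdisk2,0,0,0,0|hdisk0,0,0,0,0|")
    ts, scalars, disks = parse_stat_line(sample)
    print(ts, scalars, disks["hdisk1"])
```

From there the rows could be written out as CSV or loaded into a spreadsheet for a simple report, once the column meanings are confirmed against the product documentation.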
sjfearnside
Premium Member
Posts: 278
Joined: Wed Oct 03, 2007 8:45 am

Post by sjfearnside »

You are correct that the heading of the machinelog file declares it to be an XML file. The job performance function in the Designer/Director in V8.0.1 harvests an XML file that is created by selecting "Record job performance data" in the job properties in Designer. This creates an XML file under the ..../ASBNode/conf/etc/XmlFiles directory.

The machinelog file is created by installing the Resource Tracker and is a different file. The job performance file above is created as <job name>.xml, whereas the Resource Tracker file is called machinelog.<nodename>.YYCCMMDDXXXXXX.

This is the reason for my statement/confusion.

Your default file names and locations may differ depending on your installation, but I hope I was able to distinguish the two files. If not, please let me know and I will try to clarify.

Thanks for taking the time to help me with this question. :)
sima79
Premium Member
Posts: 38
Joined: Mon Jul 16, 2007 8:12 am
Location: Melbourne, Australia

Post by sima79 »

The XML files in the directory IBM\InformationServer\ASBNode\conf\etc\XmlFiles are the operational metadata files. These can be imported into the metadata repository using the run import utility (RunImportStart.sh). This information can then be displayed in Metadata Workbench. Have a look at the document "Guide to Managing Operational Metadata from Job Runs".

The Resource Tracker logs, by contrast, record the processor, memory, and I/O usage on each computer that runs parallel jobs, in files named machinelog.<nodename>.YYCCMMDDXXXXXX.

Hopefully this clears up any confusion.