Issue with Dataset performance in RedHat Linux
Posted: Fri Jan 01, 2016 9:45 am
by John Corbin
We have a very similar issue in our shop: writing to or reading from datasets seems to take forever.
As an extreme example, I had a job that read 145 rows from a sequential file and wrote them to a dataset.
We use an 8-node config file and run DS 8.5.
seqFile------------------>Transformer-------------------->Dataset
2 columns on seq file
COL1 VARCHAR(4)
COL2 VARCHAR(4)
Transformer
Does nothing. I did not write this job, or it would not be there.
2 columns on Dataset
COL1 VARCHAR(4)
COL2 VARCHAR(4)
Here are the Director job start and end times. Not kidding, either...
Starts at 2015-12-19 11:26:19 AM
Ends at 2015-12-19 04:01:25 PM
No warnings.
Nothing captured/ignored in a message handler at the job or project level.
On our older Solaris server, the job ran fine in seconds.
I also tried this variant:
RowGenerator------------------>Transformer-------------------->Dataset
It took seconds to run for 145 rows.
Some other info...
In July 2015, we migrated from a Solaris server to a server running Red Hat Linux 2.6.32-431.5.1.el6.x86_64.
When I ran the RowGenerator job, it was during a quiet time on our production server, so this may explain why the job ran fast.
When the original job ran, it was during the busy time on the production server.
Could Linux be trying to implement some sort of resource management when the box is busy?
Posted: Fri Jan 01, 2016 11:02 pm
by ray.wurlod
No, that's just a weird result. Can you reproduce it? How big are the Data Set segment files?
Posted: Sat Jan 02, 2016 7:54 am
by John Corbin
Ray,
It has happened every week since we moved to Red Hat Linux. Other jobs are experiencing the same performance issue; the one I wrote about is an extreme example.
I'm not sure how to tell how big each segment is...
Is this info from Dataset Management useful?
##I IIS-DSEE-TFCN-00001 08:47:18(000) <main_program>
IBM WebSphere DataStage Enterprise Edition 8.5.0.6152
Copyright (c) 2001, 2005-2008 IBM Corporation. All rights reserved
##I IIS-DSEE-TUTL-00031 08:47:18(001) <main_program> The open files limit is 16384; raising to 32768.
##I IIS-DSEE-TFCN-00006 08:47:18(002) <main_program> conductor uname: -s=Linux; -r=2.6.32-431.3.1.el6.x86_64; -v=#1 SMP Fri Dec 13 06:58:20 EST 2013; -n=EC24LP4060; -m=x86_64
##I IIS-DSEE-TFSC-00001 08:47:19(000) <main_program> APT configuration file: /disk/temp/datastage/ADW_GST_AUDIT/TMPDIR/aptoa6660cc0f71fc
##I IIS-DSEE-TOIX-00059 08:47:19(000) <APT_RealFileExportOperator in APT_FileExportOperator,0> Export complete; 101 records exported successfully, 0 rejected.
Name: /disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds
Version: ORCHESTRATE V8.5.0 DM Block Format 6.
Time of Creation: 12/19/2015 14:30:39
Number of Partitions: 8
Number of Segments: 1
Valid Segments: 1
Preserve Partitioning: false
Segment Creation Time:
0: 12/19/2015 14:30:39
Partition 0
node : node1
records: 19
blocks : 1
bytes : 168
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0000.0000.aea.d83efb5f.0000.0527aae4 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0000.0001.aea.d83efb5f.0001.ef31cece 0 bytes
total : 131072 bytes
Partition 1
node : node2
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0001.0000.aea.d83efb5f.0002.6e1bf330 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0001.0001.aea.d83efb5f.0003.0f1ca6f1 0 bytes
total : 131072 bytes
Partition 2
node : node3
records: 18
blocks : 1
bytes : 158
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0002.0000.aea.d83efb5f.0004.0b748e6b 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0002.0001.aea.d83efb5f.0005.287499f0 0 bytes
total : 131072 bytes
Partition 3
node : node4
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0003.0000.aea.d83efb5f.0006.d8c1bf43 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0003.0001.aea.d83efb5f.0007.b0f249f6 0 bytes
total : 131072 bytes
Partition 4
node : node5
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0004.0000.aea.d83efb5f.0008.fbbfcb0d 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0004.0001.aea.d83efb5f.0009.432e22ca 0 bytes
total : 131072 bytes
Partition 5
node : node6
records: 18
blocks : 1
bytes : 158
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0005.0000.aea.d83efb5f.000a.8edb13b2 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0005.0001.aea.d83efb5f.000b.7505a734 0 bytes
total : 131072 bytes
Partition 6
node : node7
records: 18
blocks : 1
bytes : 162
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0006.0000.aea.d83efb5f.000c.3e74c98b 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0006.0001.aea.d83efb5f.000d.be714afc 0 bytes
total : 131072 bytes
Partition 7
node : node8
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0007.0000.aea.d83efb5f.000e.48465cc3 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0007.0001.aea.d83efb5f.000f.0a10ea2d 0 bytes
total : 131072 bytes
Totals:
records : 145
blocks : 8
bytes : 1286
filesize: 1048576
min part: 131072
max part: 131072
Schema:
record
( ADW_Office_Value: string;
ORG_office: string;
)
##I IIS-DSEE-TFSC-00010 08:47:19(001) <main_program> Step execution finished with status = OK.
Posted: Sat Jan 02, 2016 3:18 pm
by ray.wurlod
Yes, your segment files are minimally sized (128KB each). Therefore that's not the problem. Data are moved to/from data sets in units of not less than 32KB, so you should be seeing very few I/O operations.
Time to involve your official support provider, methinks.
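For reference, the arithmetic behind that reading of the listing can be checked with a short sketch. Python is used purely for illustration; the per-partition numbers are copied from the Dataset Management output above, and nothing here comes from DataStage itself.

```python
# Per-partition payload bytes and record counts, copied from the listing above.
payload_bytes = [168, 160, 158, 160, 160, 158, 162, 160]
records = [19, 18, 18, 18, 18, 18, 18, 18]

# Size reported for each non-empty segment file in the listing.
SEGMENT_FILE_BYTES = 131072

total_records = sum(records)
total_payload = sum(payload_bytes)
total_on_disk = SEGMENT_FILE_BYTES * len(payload_bytes)

print("records      :", total_records)
print("payload bytes:", total_payload)
print("bytes on disk:", total_on_disk)
print("segment size :", SEGMENT_FILE_BYTES // 1024, "KB")
```

The sums agree with the Totals section of the listing (145 records, 1286 bytes, filesize 1048576), so the actual data volume is about 1.3 KB spread over eight 128KB files.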
Posted: Mon Jan 04, 2016 4:28 am
by PaulVL
APT file on a temp disk... are you running on a GRID?
Posted: Tue Jan 05, 2016 6:20 am
by John Corbin
Just confirmed with our support area.. we are not on a GRID
Posted: Tue Jan 05, 2016 9:09 am
by ArndW
Do you know what filesystem is used on "/disk/data/datastage/" and does that reside on a SAN or mounted/remote disk?
Posted: Wed Jan 06, 2016 6:23 am
by John Corbin
Red Hat Enterprise Linux on mounted/remote disk.
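For anyone else trying to answer the filesystem question on their own box, here is a minimal sketch of one way to look it up, assuming a Linux host where the mount table is readable as text in /proc/mounts format. The function name `find_mount` and the sample mount entries are hypothetical, made up for illustration only.

```python
import os

def find_mount(path, mounts_text):
    """Return (device, mount_point, fs_type) for the mount that contains path.

    mounts_text is text in /proc/mounts format: device, mount point,
    filesystem type, then options, separated by whitespace.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fs_type = fields[0], fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # The longest mount point that is a prefix of the path wins.
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) > len(best[1]):
                best = (device, mount_point, fs_type)
    return best

# Hypothetical mount table, for illustration only.
sample_mounts = (
    "/dev/sda1 / ext4 rw 0 0\n"
    "nas01:/export /disk/data nfs rw 0 0\n"
)
print(find_mount("/disk/data/datastage", sample_mounts))
```

On a real server you would pass the contents of /proc/mounts instead of the sample text. A filesystem type such as nfs would indicate remote storage.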
Update...
Posted: Sat Feb 24, 2018 4:49 pm
by John Corbin
Update...
We have just upgraded to 11.5. We were on 8.5 without support, as we were a year past our license expiry date.
We contacted IBM as soon as we were on 11.5. Here is what we were told:
Datasets make use of a Linux system call named fsync. As a test, IBM told our support area how to disable calls to fsync. Jobs then ran in seconds without fail.
Sadly, we cannot disable fsync permanently, but this proved the issue is not with DataStage itself but rather with our setup.
I did not know it at the time, but we are also running VMware underneath Linux, and VMware is interfering with calls to fsync.
Upshot: we will soon move to hardware running Linux but without VMware.
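For anyone wanting to see whether fsync is slow on their own storage, here is a rough sketch of a timing check. It is not something IBM supplied; the function name `time_fsync` is made up, and the 32KB block size is only chosen to echo the transfer unit mentioned earlier in the thread.

```python
import os
import tempfile
import time

def time_fsync(directory, writes=20, block_size=32 * 1024):
    """Write blocks to a temp file in directory, calling fsync after each one.

    Returns a list with the seconds spent in each fsync call.
    """
    timings = []
    block = b"\0" * block_size
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(writes):
            f.write(block)
            f.flush()  # push Python's buffer to the OS first
            start = time.perf_counter()
            os.fsync(f.fileno())  # ask the OS to commit to the device
            timings.append(time.perf_counter() - start)
    return timings

# Uses the system temp directory here; point it at the dataset
# resource disk to test the storage that actually matters.
timings = time_fsync(tempfile.gettempdir())
print("max fsync: %.4f s, mean: %.4f s"
      % (max(timings), sum(timings) / len(timings)))
```

Running it once at a quiet time and once at a busy time, against the directory that holds the dataset segment files, gives a rough picture of how long each fsync takes on that storage.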
Posted: Sat Feb 24, 2018 7:52 pm
by chulett
Thanks for the update!
Posted: Wed May 16, 2018 9:13 am
by thompsonp
John - do you have any further details or a case number from IBM that you could share?
Posted: Sun Apr 28, 2019 6:03 pm
by John Corbin
Apologies for reviving this thread...
Sorry, no case number from IBM, only because I am not in the support group that would deal with them.
I asked my support area how they fixed it, and this is what I was told:
The group that maintains our DataStage installation added this line to the dsenv file:
APT_DATASET_FLUSH_NOSYNC=1; export APT_DATASET_FLUSH_NOSYNC
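If you want to confirm that a dsenv file actually contains that assignment, a small scan like the one below would do. This is only a sketch: `find_assignment` is a made-up helper, it handles simple unquoted `NAME=value` assignments only, and the sample text is just the line quoted above.

```python
import re

def find_assignment(dsenv_text, name):
    """Return the value last assigned to name in shell-style text, or None."""
    pattern = re.compile(r"(?:^|[;\s])" + re.escape(name) + r"=([^;\s]*)")
    value = None
    for line in dsenv_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        match = pattern.search(stripped)
        if match:
            value = match.group(1)  # a later assignment overrides an earlier one
    return value

sample = "APT_DATASET_FLUSH_NOSYNC=1; export APT_DATASET_FLUSH_NOSYNC\n"
print(find_assignment(sample, "APT_DATASET_FLUSH_NOSYNC"))
```

To check a real installation, read the dsenv file into a string and pass it in place of `sample`. A result of None means no assignment was found.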