Project File system Location

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

PeteM2
Premium Member
Posts: 44
Joined: Thu Dec 15, 2011 9:17 am
Location: uk

Project File system Location

Post by PeteM2 »

We are currently experiencing I/O bottlenecks on the UNIX file system that contains all our DataStage projects.

Is there any benefit in creating separate file systems for each project?
thanks
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Maybe and possibly.

It depends on whether that I/O load is related to the DataStage projects or to something else on those file systems.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

It depends on your disk type as well. If your server has local physical disks and all projects are on one physical disk, then change your configuration. If you're on a SAN you shouldn't be seeing this issue; if you are, the SAN administrators should be able to fix it for you transparently.
Choose a job you love, and you will never have to work a day in your life. - Confucius
PeteM2
Premium Member
Posts: 44
Joined: Thu Dec 15, 2011 9:17 am
Location: uk

Post by PeteM2 »

The DataStage projects reside in one file system, and nothing else is on that file system.
thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It can also depend on the capabilities of the disks themselves and how their filesystem is configured. I had one client where we found that DataStage had been installed on (their words) "the crappy disks" and performance suffered horribly for it; simply moving it elsewhere helped. Others have needed to "tune" the filesystem, which involved the SAs monitoring usage and then tweaking parameters related to reads vs. writes, caching, etc. until we saw performance improvements.

So what O/S and filesystem are we talking about here? If applicable.
Last edited by chulett on Fri Mar 30, 2012 5:29 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Just make sure they're not on "the crappy disks" then! :lol:

Our servers don't have any local disks (no crappy disks). All SAN. Works very well...
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly... forgot they specifically said "crappy internal disks". :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
PeteM2
Premium Member
Posts: 44
Joined: Thu Dec 15, 2011 9:17 am
Location: uk

Post by PeteM2 »

The disks are on SAN and their performance is fine.

The issue is that on the AIX server side we are seeing queuing while the disks respond to each request for data.

The queuing is caused by batch jobs contending for the same disks: when you have 40 batch jobs running at the same time, they all want data from the same disks.
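The contention described above can be sketched in miniature. This is a purely illustrative Python simulation (not DataStage itself, and not real disk I/O): each "disk" is a lock that services one request at a time, so jobs spread across more disks spend less total time queuing.

```python
import threading
import time

def run_jobs(num_jobs, num_disks, io_time=0.01):
    """Simulate batch jobs contending for a limited number of disks.

    Each disk services one request at a time (modeled as a lock);
    jobs queue behind whichever disk holds their data.
    """
    disks = [threading.Lock() for _ in range(num_disks)]
    start = time.monotonic()

    def job(i):
        disk = disks[i % num_disks]   # each job's data lives on one disk
        with disk:                    # queue here if the disk is busy
            time.sleep(io_time)       # the I/O request itself

    threads = [threading.Thread(target=job, args=(i,))
               for i in range(num_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start

# 40 jobs hammering 1 disk vs. the same 40 jobs spread over 4 disks
one_disk = run_jobs(40, 1)
four_disks = run_jobs(40, 4)
print(f"1 disk: {one_disk:.2f}s, 4 disks: {four_disks:.2f}s")
```

With one disk the 40 requests serialize completely; with four they only serialize within each disk's queue, which is the same effect spreading projects (or their data) across more spindles would have.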
thanks
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I still think that the SAN / disk admin people should be able to fix it for you transparently, assuming those are all read requests.

In any case, ask them how many physical disk spindles that particular file system is spread across.

Also ask for details of any memory cache that's built into the SAN. Ask to see and understand those statistics.

If your 40 jobs are all trying to write to the same file at once and queuing up to wait for exclusive write access, then you probably have to address the bottleneck by changing DataStage job designs or schedules.
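The exclusive-access queuing mentioned above can be demonstrated with ordinary file locking. A hedged sketch: DataStage's own locking is internal and not literally `flock`, but the shape of the problem is the same — while one writer holds an exclusive lock, every other writer must wait.

```python
import fcntl
import os
import tempfile

# Two handles to the same file stand in for two batch jobs wanting
# exclusive write access at the same time.
path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "w") as writer_a, open(path, "w") as writer_b:
    fcntl.flock(writer_a, fcntl.LOCK_EX)   # job A gets the file
    try:
        # job B tries a non-blocking exclusive lock on the same file
        fcntl.flock(writer_b, fcntl.LOCK_EX | fcntl.LOCK_NB)
        blocked = False
    except BlockingIOError:
        blocked = True                      # job B would have to queue
    fcntl.flock(writer_a, fcntl.LOCK_UN)

print("second writer blocked:", blocked)
```

Scale that second writer up to 39 more jobs and you have a queue that no amount of SAN tuning can remove; only the job designs or the schedule can.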

Your original question was about the benefit of creating separate file systems per project. It would give you a controlled allocation of disk space per project, so if a job or developer produces a runaway job log, it will only fill the disk of that one project and limit the risk of corruption. So yes, there is a benefit.
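That space-isolation benefit is easy to reason about with a quick check. A sketch (the per-project mount point such as `/Projects/projectA` is hypothetical): with one filesystem per project, the free space reported for a project's mount point is the most a runaway log there can ever consume.

```python
import shutil

def free_gb(mount_point):
    """Free space, in GiB, on the filesystem holding mount_point."""
    return shutil.disk_usage(mount_point).free / 1024**3

# On a per-project layout you would pass e.g. '/Projects/projectA'
# (hypothetical path); a runaway job log there can fill at most this
# much space before erroring out, leaving the other projects intact.
print(f"free: {free_gb('/'):.1f} GiB")  # '/' used so the sketch runs anywhere
```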

Whether or not creating separate file systems per project will help a performance issue is going to depend on how each file system is allocated across physical disk spindles (back to your SAN people). One project may land on a disk all to itself; a critical project may be mapped automatically to a few disks that are already hammered.

Please share what you find out next.
Choose a job you love, and you will never have to work a day in your life. - Confucius