| Author |
Message |
DKostelnik
Participant
Joined: 30 Jan 2007
Posts: 31
Location: Central Florida
Points: 278
|
|
| DataStage® Release: 7x |
| Job Type: Parallel |
| OS: Unix |
|
Greetings!
The management where I work has made an arbitrary decision to consolidate all the various "work" volumes on our AIX system into two file systems. The rules they are about to put in place are these:
1) All files in the production work file system are to be purged after 3 days.
2) All files in the development work file system are purged after 7 days.
This means that all DS work files older than 3 days that are generated by production processes will be purged without any consideration.
What I need to know is: Do I really need to worry about this rule? I can't tell from the documentation whether files sent to the tmpdir file system (from DS) are truly ALWAYS useless after the job completes. What things, if any, should I worry about with these rules from a DataStage point of view?
|
_________________ RadToy
AAA ACS
Listen to:
Porcupine Tree
Nosound
Days Between Stations |
|
|
|
 |
Sreedhar
Participant
Joined: 30 Oct 2006
Posts: 186
Points: 1330
|
Hi,
Welcome to DSXchange!
You can find files by age with the following commands:
1) find . -mtime -n -print
2) find . -ctime -n -print # c refers to the inode status-change time, not the creation time.
where n is a number of days.
Note that -n lists the files modified (or changed) less than n days ago; to list files older than n days, use +n instead.
Ideally, the files in the tmpdir will be of no use once the job has completed.
Hope this helps.
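The age tests in the find commands above can be tried safely on a scratch directory before pointing them at a real work file system. This is only a sketch: the directory and file names are made up for the demonstration, and it assumes mktemp is available. The exact day-boundary rounding of -mtime can differ between find implementations, so check the man page on your own system.

```shell
# Hypothetical scratch directory, used only for this demonstration.
demo_dir=$(mktemp -d)

# One file backdated to 1 January 2000, one created just now.
touch -t 200001010000 "$demo_dir/old_file"
touch "$demo_dir/new_file"

# +3: last modified more than 3 days ago (what a 3-day purge would select).
older=$(find "$demo_dir" -type f -mtime +3 -print)

# -3: last modified less than 3 days ago (what a 3-day purge would leave alone).
newer=$(find "$demo_dir" -type f -mtime -3 -print)

echo "older than 3 days: $older"
echo "newer than 3 days: $newer"

# Remove the scratch directory again.
rm -rf "$demo_dir"
```

Only old_file should appear in the first list and only new_file in the second.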
|
_________________ Regards,
Shree
785-816-0728 |
|
|
|
 |
DKostelnik
|
I am really looking for any solid ramifications of deleting DataStage files in the filesystem defined to TMPDIR.
|
 |
DSguru2B
 since February 2006
Group memberships: Premium Members, Heartland Usergroup
Joined: 09 Feb 2005
Posts: 6855
Location: Houston, TX
Points: 35663
|
It's not good. Say you have a stream of 50 jobs, all dependent upon each other. The 49th job creates a staging file that's needed by the 50th job. The 50th job fails for some reason on Friday. If you are not able to fix the problem within the next 72 hours, the file will be gone. You need to allow enough time, and 3 days is not long enough.
|
_________________
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
|
|
|
|
 |
chulett
 since January 2006
Group memberships: Premium Members, Inner Circle, Server to Parallel Transition Group
Joined: 12 Nov 2002
Posts: 36583
Location: Denver, CO
Points: 186480
|
Let's take a step back. Temp is temp. Anything that goes there is fair game for nukage after the process that created it... ends. If you are 'staging' data there then you are asking for trouble.
Here's what the documentation says on the subject:
| Quote: |
|
TMPDIR.
This defaults to /tmp. It is used for miscellaneous internal temporary data, including FIFO queues and Transformer temporary storage. As a minor optimization, it can be better to ensure that it is set to a file system separate to the DataStage install directory. |
Once the process that creates them completes, anything of a 'miscellaneous internal temporary' nature can be deleted. That's at least my considered opinion; if you are looking for an iron-clad guarantee, you'll need to pose this question to IBM.
And if your jobs take days to run and trouble-shoot, you've got other problems.
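The documentation's advice about keeping TMPDIR on a file system separate from the install directory can be checked with a few lines of shell. This is a rough sketch, not anything from IBM: the two helper functions are made-up names, and it assumes the DSHOME variable points at the install directory (if it is unset, the current directory is used as a stand-in).

```shell
# Print the mount point of the file system holding the given directory.
# df -P prints a header line plus one data line; the mount point is the last field.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Succeed (exit status 0) when both directories are on the same file system.
same_filesystem() {
    [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]
}

tmp_location=${TMPDIR:-/tmp}    # /tmp is the documented default
install_dir=${DSHOME:-.}        # assumption: DSHOME points at the install directory

if same_filesystem "$tmp_location" "$install_dir"; then
    echo "TMPDIR ($tmp_location) shares a file system with $install_dir"
else
    echo "TMPDIR ($tmp_location) is on a separate file system from $install_dir"
fi
```

Mount points containing spaces would confuse the awk step, so treat the result as a quick check rather than a definitive answer.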
|
_________________ -craig
It's a scheme of devices to get at low prices all goods from cough mixtures to cables
Which tickled the sailors by treating retailers as though they were all veg-e-tables
|
|
|
|
 |
DSguru2B
|
We have a temp directory and we stage our files there. We have a cleanup process that cleans up these files after 30 days. We could even call it TempStg. Regardless of the name, if you are staging files in that particular directory, don't get rid of them that soon. It's better to move or archive them into an archive folder and then clear out the compressed files after a considerable number of days. If it's a true /tmp folder as Craig explained, then it shouldn't be a problem to clean it up every few days.
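The move, compress, and clean-up routine described above might look roughly like this. It is only a sketch under assumptions: the directories are throwaway ones created with mktemp, the file names are invented, and the 3-day and 30-day thresholds are examples rather than recommendations.

```shell
# All paths, names, and retention periods here are hypothetical examples.
stage_dir=$(mktemp -d)      # stands in for the staging directory
archive_dir=$(mktemp -d)    # stands in for the archive folder

# Demo data: one staging file backdated to 2000, one created just now.
touch -t 200001010000 "$stage_dir/old_extract.dat"
touch "$stage_dir/fresh_extract.dat"

# Step 1: move staging files older than 3 days into the archive, then compress them.
find "$stage_dir" -type f -mtime +3 -exec mv {} "$archive_dir"/ \;
find "$archive_dir" -type f ! -name '*.gz' -exec gzip {} \;
after_archive=$(ls "$archive_dir")

# Step 2: delete compressed archives older than 30 days.
find "$archive_dir" -type f -name '*.gz' -mtime +30 -exec rm -f {} \;
after_purge=$(ls "$archive_dir")

remaining=$(ls "$stage_dir")
echo "archived: $after_archive"
echo "left in archive after purge: $after_purge"
echo "left in staging: $remaining"

rm -rf "$stage_dir" "$archive_dir"
```

One thing to watch: gzip keeps the original modification time on the compressed file, so the 30-day clock runs from when the file was last written, not from when it was archived. In the demo the backdated file is therefore archived in step 1 and purged straight away in step 2.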
|
 |
chulett
|
Right - this wasn't a generic 'temporary storage' question, but rather a very specific TMPDIR setting related question.
At least I hope it was... maybe we're both off the mark. Hold on, I'm sure we'll find out soon enough. Perhaps /TMPDIR (missed the slash in the first go around) is something specific to AIX that I'm not aware of.
|
 |
DKostelnik
|
Yes, it is specific to the setting of TMPDIR. Sorry for any confusion. In my environment, the variable TMPDIR is set to /worktmp. /worktmp is used by processes other than DataStage.
I am not worried about any files that get staged there because someone chose the directory set for their temporary/work files - that is their fault.
I am only worried about the ramifications of deleting files created "internally" by the DataStage product and placed in the TMPDIR. A couple of our jobs are huge and take up to a week to complete the entire run from beginning to end.
I do have a PMR open with IBM and am waiting for a reply. I wanted to get the user community input as well (sometimes IBM misses things).
|
 |
DSguru2B
|
I would say that any OS-level activities, such as sorting, use the tmp directory by default. Other processes use specific directories under the project home directory.
|
 |
DKostelnik
|
The IBM official response:
Once a job finishes, the files in the /tmp are not required. You should have no problem purging them after 3 days.
|
 |
DSguru2B
|
So now you have to make sure your jobs finish within 3 days.
|