Hi all,
I am facing a strange issue that I've never come across before, involving hash file creation. I have a process that does two SQL extracts from a database, performs some basic transformation and outputs to a hash file. Here is a screenshot of the job:
The problem is that the other day one of the transforms (the bottom one in the image above) wrote individual records to separate files, and I cannot make sense of it. Instead of the transformer writing records to a single hash file, it ended up creating roughly 150,000 individual files in the hash directory (the 176K in the above image is from the latest run; this count read around 150K on the problematic run in question). The impact was that we ran out of inodes on our UNIX box, because the file creation pushed us over the inode limit. As a result, other DS jobs fell over since there was no space left to write logs, temp directories, etc.
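For anyone wanting to catch this condition before it takes the box down, a quick sketch using standard UNIX tools is below. The path is a placeholder, not the poster's actual location; substitute your own hashed file directory:

```shell
#!/bin/sh
# Placeholder path -- substitute your project's hashed file directory.
HASHDIR="${HASHDIR:-/path/to/hashed_file}"

# 1. Filesystem inode headroom: a high IUse% column means you are
#    approaching the inode exhaustion described above.
df -i "$HASHDIR" 2>/dev/null || df -i

# 2. Entry count inside the hashed file directory. A healthy dynamic
#    (type 30) hashed file normally contains just DATA.30, OVER.30
#    and the hidden .Type30 marker; tens of thousands of entries
#    indicate the runaway file creation.
count=$(ls -1A "$HASHDIR" 2>/dev/null | wc -l)
echo "entries in $HASHDIR: $count"
```

Wiring a check like this into a monitoring cron job would at least give an alert before every other job on the box starts failing.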
After a recompile and restart, the job ran as normal, creating a typical hash file with an 85MB DATA.30 and a 26MB OVER.30.
If anyone has encountered a similar issue, I'd really like to hear what you found to be the cause and how you ensured it didn't happen again. I'm hesitant to keep this job running (even though it is seemingly running fine now) in case it topples every job running in production.
Thanks in advance.
Hash file job - individual records going to separate files
Check that you have exactly the same file name on both input links to the Hashed File stage - not only identically spelled, but also identically cased. Incidentally, your link row count is 1.7 million, not 176K.
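One way to automate this check (assuming a UNIX account directory and GNU coreutils, neither of which is confirmed in the thread) is to list directory entries that collide only when case is ignored:

```shell
# From the directory that holds the hashed files (path is an
# assumption), print names that are identical except for case --
# exactly the spelled-alike-but-cased-differently collision above.
ls -1 | sort -f | uniq -di
```

An empty result means no case-only collisions among the entries; any line printed is a candidate pair worth inspecting in the job design.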
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi Ray, my apologies - it actually aborted at around 150K records (I should have left out the bit about the record count in the image).
L_CSTID_TF_O outputs to a hashed file named ac_ar_lookup.
L_CST_MKT_SEGMENT_TF_O outputs to a hashed file named cst_mkt_segment_id_lookup.
It is the former that is having the issue. I can confirm that the spelling and casing are both OK.
I should add that this job has been in production since 2008 and this is the first time I, or anyone in my team, has come across this problem.
Cheers