Wait for file issues

UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

I am trying to do something super simple. I have a shell script that I call from a sequence job via an "execute command" stage. The script redirects its output to a file. Then, in a "job activity", I want to read that file with a sequential file stage and proceed. The sequence fails because the file is not there when the job activity fires, so I have tried inserting a "wait for file" stage, and I have tried inserting a routine stage that calls the built-in wait for file, and no matter what I try with the triggers it does not work. I have read a dozen threads on the wait for file functionality and I still do not get it.

I do not have a lot of experience with the triggers, and I suspect that is the problem. Can someone explain what the trigger setting should be for this task? All I want it to do is wait for the file to exist, and then proceed. The file has 2-3 lines and about 20 bytes, so if it exists, it's ready to go.

Thanks!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The WFF stage is pretty straightforward. When you tell it to wait for a specific file, it will 'fire' the success link once the file is found. Any failure link will fire if the file does not show up within the time you've told it to wait.
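
For anyone following along, the found-or-timed-out behaviour described above can be sketched in a few lines of C. This is only an illustration of the idea, not how the stage itself is implemented; the function name `wait_for_file` and the one-second polling interval are assumptions made for the example.

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not a DataStage API):
 * returns 1 if path shows up within timeout_sec seconds, 0 if it does not. */
int wait_for_file(const char *path, int timeout_sec)
{
    time_t deadline = time(NULL) + timeout_sec;

    for (;;) {
        if (access(path, F_OK) == 0)
            return 1;               /* file found: the "success link" case */
        if (time(NULL) >= deadline)
            return 0;               /* gave up: the "failure link" case */
        sleep(1);                   /* assumed polling interval: one second */
    }
}
```

A caller would branch on the return value the same way a sequence branches on the two links. Note that this only tests for existence, not for whether the writer has finished.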

Make sure you're not using a wildcard in the name of the file you are waiting for as they are not supported.
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

Thanks.
No, it is a job parameter for a file name, which is very specific. I thought the stage would be easy to use as well, but it's not working.

I am seeing something really odd. I wrote my own wait for file in C and pushed that into a routine. It also does not work (it works fine standalone but fails in the sequence). I can see the file appear in the unix shell. I can see my wait for file get stuck forever and not register this happening. And my next stage, if I just "sleep" for a bit to let the file appear, still fails to open it in the sequential file stage with "file does not exist".

Something to do with the file creation is not registering; it's like DataStage is taking a snapshot of the folder upon launching the job and changes made after that are not seen. I even manually created the file (echo blah > filename) and it still never registered. So maybe the "wait for file" stage is actually working and something to do with creating the file the way I am doing it is not...
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Like I said, it's pretty straightforward, so I'm guessing that where you are looking for the file and where you are putting it may be two different places. I would make sure you have fully pathed the filename in the WFF stage; double-check that.
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

Right, I wish it were that simple.

It's a fully qualified path and even more: the jobs work at the parallel job level. Each individual piece works when run one at a time by hand. Putting them together in a sequence, it does not work -- it behaves as if the file (which I can see appear in the correct folder using a console/putty) does not exist (but it does!).

As far as I can tell, since I wrote my own wait for file and it does the same thing, it's not the wait for file stage or my use of it; it's some issue either with our environment or with something I did that is failing to register the file's creation. I can force it to work with a sleep instead of a wait for file, but that is not robust.

I appreciate the suggestions. Still looking into it...
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Be curious what you find. Perhaps you should involve support because when push comes to shove it really should be that simple, so perhaps there's an issue with the stage in your platform / version. Never mind that it seems to me that we are way past remote suggestions being helpful, sounds like it really needs boots on the ground, a different set of eyeballs looking at the same thing you are.

Sorry I couldn't help more. :(
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

I wonder if I am seeing THIS

http://www-01.ibm.com/support/docview.w ... wg21605954

I tried doing a "printenv TEMP" both from the console and from DataStage and got nothing (TEMP does not exist in the environment), but the root cause may still be something like this post. It's an old post though... The symptoms match but the solution does not.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sorry but how do 'the symptoms match' here? Is this an ISD job failing with the error that they mention? I thought it a normal Sequence job that just waited... forever... without any kind of failure. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

If I remove the wait for file stage (let's say I use a routine that runs a C++ sleep function for 10 seconds), I get the same error after the sleep (file not found) even though the file exists (visible to ls in the unix console). DataStage cannot see the file that is created from "execute command" doing something simple like "ls filename >file.txt" as the command. This is the root cause really -- I am convinced now that wait for file is not the real error but a symptom of this root cause.

It may not exactly match the article, but it's at least very similar.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Could it be a file ownership / permissions issue? Are you certain the file is being created by the same ID as the one executing the job and that the ID has permissions to see the file?
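
One way to check this from inside the failing process rather than from a separate console is a small diagnostic like the sketch below. It is only an illustration: `report_file_access` is a made-up name, and it assumes a POSIX system. It prints the file's owner and mode next to the ID of the calling process, so a mismatch between the creating ID and the reading ID is visible in the log.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical diagnostic (not part of DataStage):
 * returns 0 if the calling ID can read path, -1 otherwise. */
int report_file_access(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        /* Reports e.g. a missing file, or no search permission
         * on one of the directories in the path. */
        perror(path);
        return -1;
    }

    printf("owner uid=%ld, mode=%o, my effective uid=%ld\n",
           (long)st.st_uid,
           (unsigned)(st.st_mode & 07777),
           (long)geteuid());

    /* Note: access() checks against the real uid, not the effective uid. */
    return access(path, R_OK) == 0 ? 0 : -1;
}
```

Calling it from a routine at the point where the job reports "file does not exist" would show whether that process can see and read the file.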
Choose a job you love, and you will never have to work a day in your life. - Confucius
rschirm
Premium Member
Posts: 27
Joined: Fri Dec 13, 2002 2:53 pm

Post by rschirm »

The WFF stage is not intended to monitor for the file that a job is supposed to execute. Many times the file may have been created and populated, but the process that wrote it has not completely closed it yet, so a job trying to pick it up cannot open it because the file is still held open by another process. Now, that might not be your issue. Can you send me an export of the sequence job? I will look at it.
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

qt_ky wrote: Could it be a file ownership / permissions issue?
Yes, I am sure it is not permissions. Again, the stages can read the file when executed manually one by one, but not in a sequence.
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

Edit: the dsx was small, so I tried to send it via email here as plain text.


rschirm - I will create a simpler job with the problem and send you that. I have made a large mess chopping things into more and more jobs trying to isolate the issue.

I know the file might not be "done"... hence the "wait for file". If nothing else, though, waiting 10 seconds should be enough for it to finish and close. It's literally like 30 bytes.

All I want to do is:
- run a unix command and redirect its output to a file (this will be a cksum command)
- open said file
- look at its value
- issue a failure notification if the checksum does not match.
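
As a rough illustration of the open / look / compare steps in the list above, here is a minimal C sketch. It assumes the file holds the standard cksum output line (checksum, byte count, file name) and that the expected value is known to the caller; the name `cksum_matches` is made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: compares the checksum stored in result_path
 * with expected_crc.
 * Returns 1 on match, 0 on mismatch, -1 if the file cannot be
 * opened or does not start with a number. */
int cksum_matches(const char *result_path, unsigned long expected_crc)
{
    unsigned long crc;
    int parsed;
    FILE *f = fopen(result_path, "r");

    if (!f)
        return -1;

    /* cksum prints "<crc> <byte count> <name>"; only the first field is needed. */
    parsed = fscanf(f, "%lu", &crc);
    fclose(f);

    if (parsed != 1)
        return -1;
    return crc == expected_crc;
}
```

A return of 0 or -1 is where the failure notification would be raised.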


How exactly do I send you the export? I will cook something up and send it in the next 1/2 hour or so.

If the wait for file stage is incapable, I can write C that can do it. I already have messed with it a little and it seems to work:

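#include <stdio.h>
#include <unistd.h>

/* Block until fname can be opened for reading, polling twice a second. */
int ds_WFF(char* fname)
{
    FILE *inf = 0;
    while(!inf)
    {
        inf = fopen(fname,"r");
        if(!inf)
            usleep(500000); /* sleep() takes whole seconds, so sleep(0.5) becomes sleep(0) and spins */
    }
    fclose(inf); /* only testing that the file can be opened; do not leak the handle */
    return 0;
}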
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm not really sure what Rick meant by his "not intended" comment, hoping he'll clarify when next he swings by.

As we all know, "wait for file" doesn't mean anything like wait for it to be complete; it means wait for it to appear (or disappear), with no regard for whether it is in transit or not. I don't recall having any locking issues in the past, but I do recall getting bitten by larger files starting to arrive and triggering the waiting process, and then the consumer falling over dead when it hit the incomplete trailing edge of the file. That is one reason that many years ago we switched from waiting for the actual file being transferred to waiting for an empty trailing semaphore / flag / go file that is transferred once everything before it has completed.
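
To make the flag-file idea concrete, here is a minimal sketch of the producer side in C. It is an illustration under assumptions, not what we actually ran: `publish_with_flag` and its parameters are made-up names. The point is only the ordering: the data file is fully written and closed before the empty flag file is created, so a consumer that waits on the flag never sees a partial data file.

```c
#include <stdio.h>

/* Hypothetical producer: writes contents to data_path, closes it,
 * and only then creates an empty flag file at flag_path.
 * Returns 0 on success, -1 on any failure. */
int publish_with_flag(const char *data_path, const char *flag_path,
                      const char *contents)
{
    FILE *flag;
    FILE *data = fopen(data_path, "w");

    if (!data)
        return -1;
    if (fputs(contents, data) == EOF) {
        fclose(data);
        return -1;
    }
    if (fclose(data) != 0)      /* data is flushed and closed first */
        return -1;

    flag = fopen(flag_path, "w");   /* zero-byte "go" file */
    if (!flag)
        return -1;
    return fclose(flag) == 0 ? 0 : -1;
}
```

The consumer then waits for the flag file rather than the data file, and removes the flag once it has finished reading.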

All that being said, it doesn't sound like you need any kind of WFF functionality for your "all I want to do" list of steps. You are creating the file, you're not waiting for it to show up at some unknown time from an external source, so why wait? Create the file and then in the next step, open said file, etc etc. Or am I missing something here?
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

I totally agree. I waited because my sequential file reader stage was crashing on "file not found". So I tried a wait, which waited forever. I tried a sleep, and that failed on file not found again after finishing. All the while, the file was there to be seen in my unix console faster than I could refresh the command to look at it. The wait was just in case the nanosecond it took to write the file was longer than it took to try to open it in the next stage --- turns out that is not really the issue.

The issue seems to be that while the file appears, DataStage can't see it. Ever, apparently.

It gets weirder. I started completely over on the whole thing and it is working now. The code is virtually identical to what I was doing before. I am not sure what the difference is, but the original works piecemeal if I run each piece manually one by one and fails in its sequence. The new version is working in the sequence. There is some tiny difference in them, probably a setting somewhere, that breaks the original effort. If I figure out what it is (I am going to try) I will post it here. Otherwise I am just going to proceed now that, 2 days later, this 5 min task is finally done.

I appreciate everyone trying to help. This has certainly been a weird and grueling ordeal for such a simple thing.