I have used DataStage for several years now, but this is actually my first foray into the Complex Flat File (CFF) stage. Up until now, we have simply used sequential file stages and workarounds. I am trying to see if there is a more efficient way to handle multiple record formats in a single file.
So, with that said, I am doing a simple test. I have the following file layout with a header record and detail records (the CFD below is actually generated from the CFF stage):
I have heard rumors that the CFF stage can do redefines. In fact, I have defined this layout already. I would like the header record to go to one dataset and the detail records to go to another. How do I do this?
Okay, I never got this example working back then and let it sit until now. Still can't get it working... at least not the way I want it to.
The job is simple. The CFF uses the file layout from above and has 3 outputs: reject, header and data (write directly to datasets).
The file imports and outputs to the datasets just fine. I get 0 records in the rejects (expected), but get all records in BOTH the header and data datasets.
How does the CFF know which record to send to which link? I would have expected all but 1 record to go to the data link, and 1 record to go to the header link.
One other note. When I look at the schemas for the data and header datasets, they do have the correct layouts. The header just has all records in it...
I'm not at a site where I can test this, but I recall that you need a record-id column. It is used, in a manner similar to a switch stage, to identify the column position that differentiates the record types. I know I've gotten it working, but I did struggle with it.
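To illustrate the switch-like routing described above outside of DataStage, here is a minimal Python sketch. It is not CFF stage configuration: the record-ID values "H" and "D", the ID position in column 1, and the sample rows are all assumptions made up for the example, since the actual layout is not shown in this thread.

```python
from collections import defaultdict

# Hypothetical record-ID values; the real file's values are not shown in the thread.
ROUTES = {"H": "header", "D": "data"}


def route_records(lines, id_pos=0, id_len=1):
    """Send each record to exactly one output, chosen by its record-ID field."""
    outputs = defaultdict(list)
    for line in lines:
        # Read the record ID from a fixed position, like a switch selector.
        record_id = line[id_pos:id_pos + id_len]
        # Unknown IDs fall through to the reject output.
        outputs[ROUTES.get(record_id, "reject")].append(line)
    return dict(outputs)


# Made-up sample rows: one header, two details, one unrecognised record.
sample = ["H20240101FILE01", "D0001WIDGET", "D0002GADGET", "X????"]
routed = route_records(sample)
```

The point of the sketch is that each output needs a condition on the record ID. Without one, nothing distinguishes the links, which would be consistent with every record landing on both the header and data outputs.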
Sure, many of us occasionally read the manuals but make sure we are not watched or disturbed in the process. I will usually only admit under duress that I even own a manual, and only the utmost coercion will force me to admit having actually read from it :D
If I recall correctly, we used an @INROWNUM=0 constraint on the Header output link, but I can't recall whether that was in a server job or a parallel job. I don't have access to 7.5.2 right now.
Also, on the Header file input link, we selected the 'first row column names' option.
arndt is a stud so I'll defer to him. But I will say that in our organization, if the input has completely different record formats (say an 'a' in column 1 means a header record in format x, and a 'b' in column 1 means a detail record in format y), we split them out and process each separately. The CFF stage can definitely handle small redefines, and it is a technical marvel, but one transformer on the other side of the CFF stage handling two completely different input record formats seems very confusing. Think about how you would do this with ASCII rows: each row would have a completely different format.
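As a rough sketch of the split-then-process approach, here is what it might look like in Python rather than in a DataStage job. The 'a' and 'b' markers come from the post above; the two fixed-width layouts and the sample rows are purely hypothetical.

```python
# Hypothetical fixed-width layouts as (field name, start, end) tuples.
# These are illustrative only, not the poster's real file layout.
HEADER_LAYOUT = [("rec_type", 0, 1), ("file_date", 1, 9)]
DETAIL_LAYOUT = [("rec_type", 0, 1), ("item_id", 1, 5), ("item_name", 5, 15)]


def parse(line, layout):
    """Slice one fixed-width line into named fields, trimming padding."""
    return {name: line[start:end].strip() for name, start, end in layout}


def split_and_parse(lines):
    """Split rows by the marker in column 1, then parse each with its own layout."""
    headers, details = [], []
    for line in lines:
        if line.startswith("a"):
            headers.append(parse(line, HEADER_LAYOUT))
        elif line.startswith("b"):
            details.append(parse(line, DETAIL_LAYOUT))
        # Rows with any other marker are ignored in this sketch.
    return headers, details


headers, details = split_and_parse(
    ["a20240101", "b0001widget    ", "b0002gadget    "]
)
```

Each record type gets its own layout and its own downstream logic, so no single step has to understand both formats.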