Page 1 of 1

Dynamic file structure

Posted: Tue Sep 11, 2018 9:58 pm
by drkumar
Hi All
I have Source file which is pipe delimited, it's load to Teradata table, Here my source file structure will change dynamically its should handle below scenarios


1. My ETL Job should not fail it has to process with existing columns.

2. If any changes in source file structure, Need to get EMAIL notification.


Thanks
Ratna Kumar

Posted: Wed Sep 12, 2018 5:55 am
by chulett
:?

In what world does a source file "change dynamically" and can you give us some ideas as to what exactly that may mean? Specifically wondering what kind of changes you are expecting. With the mention of "existing columns" are we talking about possible new columns being added to the end of the record? That's something that can be both checked for and fairly easily handled. But if we go full dynamic here - columns can be swapped around, new columns added in random spots, those kind of things - then that's a whole different kettle of fish.

Please clarify for us.

Posted: Thu Sep 13, 2018 7:00 pm
by ray.wurlod
How do you propose to detect a change?

That will affect how your processing runs.

Posted: Fri Sep 14, 2018 8:42 am
by drkumar
Thank you Craig..
The new columns will be added at end of the record.

Posted: Fri Sep 14, 2018 9:06 am
by chulett
Then it seems to me you will need to have a pre-check process for the presence of new columns and it sounds like that's as simple as counting the number of pipe delimiters in any record. There are multiple ways to handle it, could be a DataStage job but seems to me a script would be perfectly acceptable as well. When it finds more than the expected number of pipes, sound the alarm.

Then continue to run the job with the expected number of columns and see what kind of warnings / errors you get when there is a "short read" i.e. there are additional columns not included in the metadata. From what I recall, there was a checkbox on the Server side to suppress them, not sure how PX handles it but I would guess not gracefully. Build a small test harness to see and post your findings... unless someone chimes in with an answer for you before you get that together. :wink:

Posted: Mon Sep 24, 2018 5:30 am
by rrcr
Like chulett suggested we need to have before job script in which we can check for the number of columns . If the columns are more/less than the expected then send a mail notification.

coming to column definations we have to use schema file with RCP.
the schema file needs to be updated in before job script based on the number of columns

thanks,
Ramireddy Ch