Page 1 of 1

Multiple records based on value in column

Posted: Wed Feb 14, 2018 7:01 am
by esivaprasad
Hi ,

we are using datastage 8.1.
my requirement is :

Input is like this
customer|Count
A|4
B|1
C|2
Based on customer count value , need to generate that many output columns for the same customer.

output required as
Column1 | Column2
A|1
A|1
A|1
A|1
B|1
C|1
C|1
please help me how to get output as requested format.

Thank you.

Posted: Wed Feb 14, 2018 7:53 am
by chulett
Well... first thing I would have suggested was 'transformer looping' but I'm pretty sure it's not available in your ancient version. What is the source - a file, a database table or something you are willing to load into a table? Thinking there's a SQL solution available if so. Or perhaps someone else has a clever solution... perhaps something written in C++ by UCDI? Command line filter Perl shenanigans? :wink:

Posted: Wed Feb 14, 2018 8:11 am
by esivaprasad
My source and target are both CSV file format.
Tranformer looping option not available in 8.1 version.


Thank you.

Posted: Wed Feb 14, 2018 8:56 am
by chulett
Right. Ancient. :wink:

Still curious if a database is an option, if that's where your skills are - perhaps even an Oracle external table. I'd also be curious what you've tried so far.

Posted: Wed Feb 14, 2018 9:53 am
by UCDI
Im open to C if you need it, but I would personally attack this one in excel if at all possible (to the point that I might even consider elimination of datastage entirely depending on the circumstances).

Another idea.. .what is the largest # that can appear? If its less than 10, you can just split out a transformer and manufacture up to that many rows simply. A little fragile if the # can get big in the future, but sometimes, you know it won't...?

I have found that the VB transformer using a VB routine is pretty much equally fast to C for simple tasks. C is marginally faster on encrypt or bit manipulation or hard math, but string processing, its a wash.

Posted: Thu Feb 15, 2018 10:41 am
by chulett
Unless I'm misremembering how things work (or just confused) I think you meant:

- The Filter stage limits the output using the condition COUNT <= ROWNUM

Yes?

Posted: Thu Feb 15, 2018 11:17 am
by Thomas.B
We need to send the record to the output unless the row number is higher than the number of rows to generate, in my head, my logic is right.

Anyway, Siva will test it if he decides that is the right solution.

Posted: Thu Feb 15, 2018 12:03 pm
by chulett
Ah... guess I had the two values backwards in my head. D'oh.

Posted: Fri Feb 16, 2018 9:57 am
by rkashyap
Another option could be leveraging Unix ... invoke following command in either External Source stage or using Filter option of Sequential file stage

Code: Select all

awk 'BEGIN {FS="|";RS="\n"};  (NR> 1) {for(i=0; i<$2; ++i) {print $1}}' <inputfile>