Hi,
I need to implement a scenario that writes to a file with one header and multiple footer records, like:
Input:
emp_id
100
100
200
200
200
300
Output:
Header--------------
100
2(count)
200
3(count)
300
1(count)
Your suggestions are appreciated!
Thanks in Advance
Footer record for each group by value
Declare your output format as a single VarChar column. In your job, detect the value group changes using either stage variables in a transform or within a sort stage. When your job detects a group change it outputs an additional record with the footer values as indicated in your example.
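The stage-variable logic described above can be sketched in plain Python (illustrative only, not DataStage syntax — the header text and "(count)" suffix follow the example in the original post):

```python
def build_report(emp_ids):
    """Emit a header, then for each run of equal emp_ids emit the
    value followed by its count, mimicking group-change detection
    with stage variables."""
    lines = ["Header--------------"]
    prev, count = None, 0
    for emp_id in emp_ids:
        if prev is not None and emp_id != prev:
            # Group change detected: output the footer for the group just ended.
            lines.append(prev)
            lines.append(f"{count}(count)")
            count = 0
        prev = emp_id
        count += 1
    if prev is not None:  # flush the final group
        lines.append(prev)
        lines.append(f"{count}(count)")
    return lines

print("\n".join(build_report(["100", "100", "200", "200", "200", "300"])))
```

The key point is the same as in the job: the footer row is emitted at the moment the key value changes, plus once more at end-of-data for the last group.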
Hi Ravi,
Sort on emp_id.
Use the Sort stage again to create a key change column. The result will be as below:
Change_key  emp_id
1           100
0           100
1           200
0           200
0           200
1           300
Now use stage variables in a Transformer to group the matching records.
Let me know if you need more details.
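The Sort stage's key change column can be emulated like this (a Python sketch of the behaviour, not DataStage code — it flags 1 on the first row of each emp_id group and 0 on the rest):

```python
def add_change_key(emp_ids):
    """Emulate the Sort stage's 'create key change column' option:
    1 on the first row of each emp_id group, 0 on subsequent rows.
    Input is assumed to be already sorted on emp_id."""
    out = []
    prev = None
    for emp_id in emp_ids:
        out.append((1 if emp_id != prev else 0, emp_id))
        prev = emp_id
    return out

for change_key, emp_id in add_change_key(["100", "100", "200", "200", "200", "300"]):
    print(change_key, emp_id)
```

Downstream, the Transformer only needs to test the flag: a 1 means a new group has started, so the footer for the previous group can be written out.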
ravireddy25 wrote: Hi,
I need to implement a scenario that writes to a file with one header and multiple footer records, like:
Input:
emp_id
100
100
200
200
200
300
Output:
Header--------------
100
2(count)
200
3(count)
300
1(count)
Your suggestions are appreciated!
Thanks in Advance
Thanks
Swapnil
"Whenever you find whole world against you just turn around and Lead the world"
Hi,
ArndW wrote: Declare your output format as a single VarChar column. In your job, detect the value group changes using either stage variables in a transform or within a sort stage. When your job detects a group change it outputs an additional record with the footer values as indicated in your example.
Thanks for your reply. I can use a single VarChar column, but how do I detect the value groups in the Sort stage or in a Transformer, and how is an additional record with the count added after each group? Can you explain in detail, please?
Thanks in advance
Ravi
You could have multiple ordered sources with the following format:
Key1 Key2 RowContents
Src 1 (header):
-1 0 Header-----------------
The -1 is just to make sure the header comes first. You could have another key that comes first in the order that does nothing more than order the sources (eg: 0 for the header and 1 for the other 2).
The other sources come from the data.
Start with an Aggregator to get the values (eg: 100, 200, etc) and counts, then run it through a Transformer to create the split.
Src2 (1 Transformer output to create the value lines)
Inp.Value 0 Inp.Value
Src3 (2nd Transformer output to create the count lines)
Inp.Value 1 Inp.Count : "(count)"
Now take the 3 sources and funnel them, then sort on Key1 asc, Key2 asc.
For Src2 and Src3, Key1 uses the Value out of the Aggregator, and Key2 is an artificial value to make sure that for each value, the value line is ordered before the count line.
I'm sure there are better ways to do this, but this is one way that would work.
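The three-source funnel-and-sort design above can be sketched as follows (a Python illustration of the keying scheme, assuming the Aggregator has already produced (value, count) pairs):

```python
def funnel_with_keys(counts):
    """counts: list of (value, count) pairs from the Aggregator.
    Build the three sources with (Key1, Key2) sort keys, funnel them,
    then sort on Key1 asc, Key2 asc."""
    # Src1: the header row; Key1 = -1 guarantees it sorts first.
    rows = [(-1, 0, "Header-----------------")]
    for value, n in counts:
        rows.append((int(value), 0, value))           # Src2: value line, Key2 = 0
        rows.append((int(value), 1, f"{n}(count)"))   # Src3: count line, Key2 = 1
    rows.sort(key=lambda r: (r[0], r[1]))             # sort Key1 asc, Key2 asc
    return [contents for _, _, contents in rows]

print("\n".join(funnel_with_keys([("200", 3), ("100", 2), ("300", 1)])))
```

Key2 is the artificial tie-breaker from the post: for each value, it forces the value line (0) ahead of its count line (1).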
Easy Solution using 2 jobs:
Job 1: Read the input file, process your data, and write the header row to File1.
Job 2: Read the input file and use an Aggregator stage with Count Rows on the key column. This gives you the total number of rows for each group (in your case, emp_id). Now write this into File1 created in Job 1. When writing, make sure to use Append to the existing File1, otherwise it will wipe out your previously written header row.
Note: You can do the same thing in one job by controlling the flow (first write the header, then write the counts) via link ordering in the Transformer, but in parallel mode this is not advisable.
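The two-job approach reduces to this logic (a Python sketch, not DataStage — Job 1 writes the header fresh, Job 2 aggregates and appends so the header survives):

```python
from collections import Counter

def job2_footer_lines(emp_ids):
    """Aggregator step: one value line plus one count line per emp_id."""
    counts = Counter(emp_ids)
    lines = []
    for emp_id in sorted(counts):
        lines.append(emp_id)
        lines.append(f"{counts[emp_id]}(count)")
    return lines

def run_both_jobs(path, emp_ids):
    with open(path, "w") as f:                 # Job 1: create file, write header
        f.write("Header--------------\n")
    with open(path, "a") as f:                 # Job 2: APPEND, don't overwrite
        f.write("\n".join(job2_footer_lines(emp_ids)) + "\n")
```

The append mode ("a") is the crucial detail the post warns about: opening in write mode in Job 2 would truncate the file and lose the header.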
stuartjvnorton wrote:You could have multiple ordered sources with the following format:
Key1 Key2 RowContents
Src 1 (header):
-1 0 Header-----------------
The -1 is just to make sure the header comes first. You could have another key that comes first in the order that does nothing more than order the sources (eg: 0 for the header and 1 for the other 2).
The other sources come from the data.
Start with an Aggregator to get the values (eg: 100, 200, etc) and counts, then run it through a Transformer to create the split.
Src2 (1 Transformer output to create the value lines)
Inp.Value 0 Inp.Value
Src3 (2nd Transformer output to create the count lines)
Inp.Value 1 Inp.Count : "(count)"
Now take the 3 sources and funnel them, then sort on Key1 asc, Key2 asc.
For Src2 and Src3, Key1 uses the Value out of the Aggregator, and Key2 is an artificial value to make sure that for each value, the value line is ordered before the count line.
I'm sure there are better ways to do this, but this is one way that would work.
Hi Guys,
Thanks, I will try this.
Ravi