| Author |
Message |
sarathcr

Group memberships: Premium Members
Joined: 29 May 2008
Posts: 3
Points: 28
DataStage® Release: 8x
Job Type: Parallel
OS: Windows
Hi,
I am trying to use the Filter stage to output the data. The structure of the job is SeqFile -----> Filter ----> SeqFile
My source data looks like this:
EmpNo EmpName DeptNo
1 ABC 10
2 BCD 20
1 ABC 10
In the Filter stage I set the where clause to DeptNo=10 and set the option "Output rows only once". My output file contains two records, but with that option set I expected only one. Please let me know what I am doing wrong.
Thanks in advance,
C
 |
jneasy
Participant
Joined: 29 Jan 2012
Posts: 9
Location: Australia
Points: 60
Hi sarathcr,
I'm still relatively new to DS, but it looks like the option "Output Rows Only Once", when set to true, outputs each row only to the first where clause it matches; it won't remove duplicates.
I would suggest using the "Remove Duplicates" stage with DeptNo as the key.
_________________ Joseph Neasy
rxp Services
 |
jwiles

Group memberships: Premium Members
Joined: 14 Nov 2004
Posts: 1236
Points: 9942
That is correct. The "Output Rows Only Once" option refers only to individual rows: when it is selected, a row is output only once, no matter how many where clauses it matches. The Filter stage cannot identify or remove duplicate rows, which is apparently what you are looking for. A combination of Sort and Remove Duplicates, or the Sort stage alone, can accomplish this for you.
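To make the difference concrete, here is a rough sketch in plain Python. It is not DataStage code; the function names `filter_rows` and `sort_and_remove_duplicates` are made up for illustration, and the choice of all three columns as the duplicate key is an assumption. It only models the idea that a filter tests each row in isolation, while removing duplicates requires comparing rows with each other.

```python
# Illustrative model only: plain Python, not DataStage code.
# Sample data taken from the original question.
rows = [
    {"EmpNo": 1, "EmpName": "ABC", "DeptNo": 10},
    {"EmpNo": 2, "EmpName": "BCD", "DeptNo": 20},
    {"EmpNo": 1, "EmpName": "ABC", "DeptNo": 10},
]


def filter_rows(rows, predicate):
    """Row-at-a-time filter: each row is tested without looking at any other row."""
    return [row for row in rows if predicate(row)]


def sort_and_remove_duplicates(rows, key_columns):
    """Sort on the key columns, then keep only the first row of each key group."""
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in key_columns))
    result = []
    previous_key = None
    for row in ordered:
        key = tuple(row[c] for c in key_columns)
        if key != previous_key:  # first row seen for this key
            result.append(row)
            previous_key = key
    return result


# The filter alone keeps both identical DeptNo=10 rows.
filtered = filter_rows(rows, lambda r: r["DeptNo"] == 10)

# Assumption: a duplicate means all three columns match.
deduplicated = sort_and_remove_duplicates(filtered, ["EmpNo", "EmpName", "DeptNo"])

print(len(filtered), len(deduplicated))
```

In this model the filter returns two rows and the sort-then-deduplicate step reduces them to one, which mirrors the Sort plus Remove Duplicates combination suggested above.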
Regards,
_________________ - james wiles
All generalizations are false, including this one - Mark Twain.
Last edited by jwiles on Fri Jun 01, 2012 8:54 am; edited 1 time in total
 |
sarathcr

Thank you, jwiles. That solves my issue. I just want to add a bit of explanation.
If the Filter stage contains two where clauses, 1) Dept=10 and 2) EmpName=ABC, and the option "Output rows only once" is set, then the output of the first where clause (Dept=10) gets two rows, but the output of the second where clause (EmpName=ABC) gets no rows.
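The two-clause case can be sketched the same way in plain Python. This is only a model of the behaviour described in this thread, not DataStage code: the `route` function is hypothetical, and it assumes each where clause feeds its own output link.

```python
# Illustrative model only: plain Python, not DataStage code.
rows = [
    {"EmpNo": 1, "EmpName": "ABC", "DeptNo": 10},
    {"EmpNo": 2, "EmpName": "BCD", "DeptNo": 20},
    {"EmpNo": 1, "EmpName": "ABC", "DeptNo": 10},
]


def route(rows, where_clauses, output_rows_only_once):
    """Send each row to the outputs of the where clauses it matches.

    Assumption: one output per where clause, evaluated in order.
    With output_rows_only_once=True, a row stops at its first match.
    """
    outputs = [[] for _ in where_clauses]
    for row in rows:
        for index, clause in enumerate(where_clauses):
            if clause(row):
                outputs[index].append(row)
                if output_rows_only_once:
                    break  # do not test the remaining clauses for this row
    return outputs


clauses = [
    lambda r: r["DeptNo"] == 10,      # clause 1: Dept=10
    lambda r: r["EmpName"] == "ABC",  # clause 2: EmpName=ABC
]

once = route(rows, clauses, output_rows_only_once=True)
every_match = route(rows, clauses, output_rows_only_once=False)

print([len(out) for out in once])         # row counts per clause, option on
print([len(out) for out in every_match])  # row counts per clause, option off
```

With the option on, both DeptNo=10 rows stop at the first clause, so the second clause receives nothing. With it off, the same two rows reach both outputs. In neither case are the two identical rows collapsed into one.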
_________________ With Regards
Sarath
 |
jwiles

Yes...that was completely understood from your original explanation. There were two rows where Dept = 10, thus you had two rows output by that filter clause. Filter operates at the individual row level ONLY: one row at a time.
Regards,