Aggregator stage constraint thus use of Remove Duplicates
Moderators: chulett, rschirm, roy
-
TonyInFrance
- Premium Member

- Posts: 288
- Joined: Tue May 27, 2008 3:42 am
- Location: Luxembourg
Aggregator stage constraint thus use of Remove Duplicates
I have this dataset of 2 columns - ID Number and Loading Date.
I need to consider the most recent Loading Date in case of duplicate IDs. However the MAX function in my aggregator stage insists on storing the result in a DOUBLE datatype whereas my source is a timestamp. Is there a workaround?
I have also tried to achieve the result by using a Remove Duplicates stage where I am first sorting my data on ID (ascending) , Loading Date (ascending) and then trying to retain the last record for each repetitive column, my key being ID. However it is not working here and it is retaining identical occurences of ID, albeit with different dates whereas I need only the most recent date for the same ID. Here's what I'm getting:
ID LOADING_DATE
0000000000001542 15/11/2008 22:30:50
0000000000001542 05/01/2009 23:01:25
0000000000002459 30/07/2009 09:51:59
0000000000002459 04/06/2009 18:54:41
0000000000005494 14/10/2008 15:00:11
0000000000005494 30/07/2009 09:51:59
0000000000005690 04/06/2009 18:54:41
0000000000005690 30/07/2009 09:51:59
0000000000007668 30/07/2009 09:51:59
0000000000007668 12/01/2009 10:54:26
I can't seem to figure out where I'm going wrong. Any ideas people?
Thanks
S. BASU
I need to consider the most recent Loading Date in case of duplicate IDs. However the MAX function in my aggregator stage insists on storing the result in a DOUBLE datatype whereas my source is a timestamp. Is there a workaround?
I have also tried to achieve the result by using a Remove Duplicates stage where I am first sorting my data on ID (ascending) , Loading Date (ascending) and then trying to retain the last record for each repetitive column, my key being ID. However it is not working here and it is retaining identical occurences of ID, albeit with different dates whereas I need only the most recent date for the same ID. Here's what I'm getting:
ID LOADING_DATE
0000000000001542 15/11/2008 22:30:50
0000000000001542 05/01/2009 23:01:25
0000000000002459 30/07/2009 09:51:59
0000000000002459 04/06/2009 18:54:41
0000000000005494 14/10/2008 15:00:11
0000000000005494 30/07/2009 09:51:59
0000000000005690 04/06/2009 18:54:41
0000000000005690 30/07/2009 09:51:59
0000000000007668 30/07/2009 09:51:59
0000000000007668 12/01/2009 10:54:26
I can't seem to figure out where I'm going wrong. Any ideas people?
Thanks
S. BASU
Tony
BI Consultant - Datastage
BI Consultant - Datastage
Re: Aggregator stage constraint thus use of Remove Duplicate
Enable the Preserve Type property.BASU wrote:However the MAX function in my aggregator stage insists on storing the result in a DOUBLE datatype whereas my source is a timestamp. Is there a workaround?
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
TonyInFrance
- Premium Member

- Posts: 288
- Joined: Tue May 27, 2008 3:42 am
- Location: Luxembourg
-
TonyInFrance
- Premium Member

- Posts: 288
- Joined: Tue May 27, 2008 3:42 am
- Location: Luxembourg
Re: Aggregator stage constraint thus use of Remove Duplicate
Craig - Where do I do this? In the aggregator stage I have Preserve Partitioning. Is this what you mean by Preserve Type property?chulett wrote:Enable the Preserve Type property.
Regards
S. BASU
Edit:
I read through this thread:
viewtopic.php?t=135260&start=0&postdays ... e+property
The problem is I can't find the subproperty preserve type.
Anything I have missing in my installation?
Tony
BI Consultant - Datastage
BI Consultant - Datastage
-
TonyInFrance
- Premium Member

- Posts: 288
- Joined: Tue May 27, 2008 3:42 am
- Location: Luxembourg
For further info:
In my Aggregator stage, Tab 'Stage' sub tab 'Properties' under Options I have only 2 possible heads - Allow Null Output (which I have set to false) and Method (which I have set to Sort since the number of groups I have is more than 1000)
There is no Preserve Type option here or under any other tab / sub tab in this stage.
In my Aggregator stage, Tab 'Stage' sub tab 'Properties' under Options I have only 2 possible heads - Allow Null Output (which I have set to false) and Method (which I have set to Sort since the number of groups I have is more than 1000)
There is no Preserve Type option here or under any other tab / sub tab in this stage.
Tony
BI Consultant - Datastage
BI Consultant - Datastage
-
PhilHibbs
- Premium Member

- Posts: 1044
- Joined: Wed Sep 29, 2004 3:30 am
- Location: Nottingham, UK
- Contact:
I'm pretty sure that Preserve Type doesn't apply to dates and timestamps. Max and Min aggregations only work on numeric data. For a Date, I think it uses the ICONV value of the date. Not sure what it uses for Timestamps. Maybe the fractional part of the Double is the proportion of a day that the time represents, which I think can be easily turned back into a Timestamp afterwards.
Phil Hibbs | Capgemini
Technical Consultant
Technical Consultant
-
TonyInFrance
- Premium Member

- Posts: 288
- Joined: Tue May 27, 2008 3:42 am
- Location: Luxembourg
Phil
Thanks for the heads up. However what I read in this thread:
viewtopic.php?t=135260&start=0&postdays ... e+property
is quite the opposite. They claim here that this property is the one to use in order to solve this problem. Apparently it is possible to see this property (Preserve Type) somewhere in the Aggregator stage.
Thanks for the heads up. However what I read in this thread:
viewtopic.php?t=135260&start=0&postdays ... e+property
is quite the opposite. They claim here that this property is the one to use in order to solve this problem. Apparently it is possible to see this property (Preserve Type) somewhere in the Aggregator stage.
Tony
BI Consultant - Datastage
BI Consultant - Datastage
-
TonyInFrance
- Premium Member

- Posts: 288
- Joined: Tue May 27, 2008 3:42 am
- Location: Luxembourg
Got it guys. Don't bother anymore.
The property exists in the right hand window.
However even setting this property to true does not work.
I have exactly the same message as before:
Aggregator_65: Error when checking operator: When binding output interface field "MAX_DAT_CHARGT" to field "MAX_DAT_CHARGT": No default type conversion from type "timestamp" to type "dfloat".
The property exists in the right hand window.
However even setting this property to true does not work.
I have exactly the same message as before:
Aggregator_65: Error when checking operator: When binding output interface field "MAX_DAT_CHARGT" to field "MAX_DAT_CHARGT": No default type conversion from type "timestamp" to type "dfloat".
Tony
BI Consultant - Datastage
BI Consultant - Datastage
Please stick to your own topic for discussion of this issue rather than fishing around for related posts made by others.
No need to repeat what is here in other posts. Also, marking this post as having a 'workaround' is probably keeping people from visiting it again...
No need to repeat what is here in other posts. Also, marking this post as having a 'workaround' is probably keeping people from visiting it again...
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
