Teradata User Query or Datastage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
maneesh_shahjee
Participant
Posts: 2
Joined: Thu Feb 03, 2005 8:25 am

Teradata User Query or Datastage

Post by maneesh_shahjee »

Hi,
While designing jobs I have stuck at one point, Whether should I use userdefined query in Teradata enterprise stage to bring columns and calculate avg, sum, top records etc.. Or first I should select columns from teradata stage and perform sum avg , get top records in transformer stage. Which one would be beneficial since source data is very high and query can be bit complex.

Any one have any idea which option should I choose ?

Thanks in Advance.
Maneesh
meena
Participant
Posts: 430
Joined: Tue Sep 13, 2005 12:17 pm

Post by meena »

Maneesh,
I think user defined query would be better instead of using transformer stage for this.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The Transformer stage would not be the appropriate stage type in any case; you would use an Aggregator stage to aggregate the data in this fashion. Sort the data as they are extracted, and use Sort method in the Aggregator stage. Or do the aggregations as part of the extraction (which means that you must use an SQL-capable stage to do so).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

It really depends on whether you have the capacity in your Teradata engine or on your DataStage server. DataStage offers a GUI inteface to do the aggregation while Teradata requires you writing and maintaining SQL and perhaps losing some of your data lineage. I would recommend asking for a Teradata view of the aggregated data so Teradata support can properly tune it, especially if it is joining tables. There is no doubt a Teradata aggregate SQL will be faster as DataStage will need to retrieve all the detail rows onto the server before aggregating it but DataStage may be more scalable if your Teradata server is under constant user stress.
Post Reply