Teradata User Query or Datastage

maneesh_shahjee · Post by **maneesh_shahjee** » Thu Mar 01, 2007 7:36 am

Hi,
While designing jobs I am stuck on one point: should I use a user-defined query in the Teradata Enterprise stage to fetch the columns and calculate avg, sum, top records, etc., or should I first select the columns in the Teradata stage and then compute the sum, avg, and top records in a Transformer stage? Which would be more beneficial, given that the source data volume is very high and the query can be a bit complex?

Does anyone have an idea which option I should choose?

Thanks in advance.
Maneesh

meena · Post by **meena** » Thu Mar 01, 2007 7:47 am

Maneesh,
I think a user-defined query would be better than using a Transformer stage for this.

ray.wurlod · Post by **ray.wurlod** » Thu Mar 01, 2007 3:54 pm

The Transformer stage would not be the appropriate stage type in any case; you would use an Aggregator stage to aggregate the data in this fashion. Sort the data as they are extracted, and use the Sort method in the Aggregator stage. Or do the aggregations as part of the extraction (which means that you must use an SQL-capable stage to do so).
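
To illustrate why sorted input matters for this kind of aggregation, here is a minimal Python sketch. It is not DataStage code and does not claim to show how the Aggregator stage is implemented; the function name `sorted_aggregate` and the sample rows are hypothetical. It only shows the general idea that, once rows arrive sorted by the grouping key, each group can be summed and averaged in a single pass while holding one group's running totals in memory at a time.

```python
from itertools import groupby
from operator import itemgetter


def sorted_aggregate(rows, key_index, value_index):
    """Yield (key, sum, avg) per group.

    Assumes rows are already sorted by the grouping key, so each
    group is contiguous and can be consumed in one streaming pass.
    """
    for key, group in groupby(rows, key=itemgetter(key_index)):
        total = 0.0
        count = 0
        for row in group:
            total += row[value_index]
            count += 1
        yield key, total, total / count


# Hypothetical sample data: (region, amount)
rows = [("east", 10.0), ("west", 5.0), ("east", 30.0), ("west", 15.0)]

# Sort by the grouping key first, as the extraction step would.
rows.sort(key=itemgetter(0))

result = list(sorted_aggregate(rows, 0, 1))
print(result)
```

If the input were not sorted, `groupby` would emit the same key more than once, which is why the sort has to happen before the aggregation in this approach.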

vmcburney · Post by **vmcburney** » Thu Mar 01, 2007 4:37 pm

It really depends on whether you have the capacity in your Teradata engine or on your DataStage server. DataStage offers a GUI interface to do the aggregation, while Teradata requires you to write and maintain SQL and perhaps lose some of your data lineage. I would recommend asking for a Teradata view of the aggregated data so Teradata support can properly tune it, especially if it is joining tables. There is no doubt that a Teradata aggregate SQL will be faster, as DataStage will need to retrieve all the detail rows onto the server before aggregating them, but DataStage may be more scalable if your Teradata server is under constant user stress.
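
The trade-off described above can be sketched in Python. This is an illustration only: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for Teradata, and the `sales` table, its columns, and the sample rows are hypothetical. It compares how many rows cross the database boundary when the aggregation is done in SQL versus when the detail rows are pulled out and aggregated by the client.

```python
import sqlite3

# In-memory SQLite database standing in for the source database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 5.0), ("west", 15.0)],
)

# Option A: aggregate in the database; only one row per group is returned.
pushed = conn.execute(
    "SELECT region, SUM(amount), AVG(amount) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Option B: pull every detail row, then aggregate on the client side.
detail = conn.execute("SELECT region, amount FROM sales").fetchall()
totals = {}
for region, amount in detail:
    running_sum, running_count = totals.get(region, (0.0, 0))
    totals[region] = (running_sum + amount, running_count + 1)
local = [(region, s, s / c) for region, (s, c) in sorted(totals.items())]

print("rows transferred with SQL aggregation:", len(pushed))
print("rows transferred with client aggregation:", len(detail))
print("same result:", pushed == local)

conn.close()
```

Both options produce the same aggregates; the difference is where the work happens and how many rows have to be moved, which is the capacity question raised above.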