I am trying to convert an sql to a datastage job. This excercise is to look for performance improvement in datastage, if any. The existing application unloads the data from informix database using unload command.
The sql is joining on 8 tables and out of these 8 tables, 3 are having million+ rows.
I have used ODBC stage to unload the data from informix for each table. Then I have joined the data using Join Stage. The join is done as below
Atable + BTable = AB.out
AB.out + CTable = ABC.out.....
and so on.....
The partitioning is set to "Auto". Joins are taking too long. My existing sql is taking around 2 minutes to execute and the joins are taking 10 minutes. Thats huge time !!.... Ne obvious performance improvements ??
One thing I can think of is that in every Join Stage operation, partitioning and sorting is taking a lot of time. But thats the requirement of Join Stage.
Any help in this regard is greatly appreciated.
Regards,
Join Stage Design Question
Moderators: chulett, rschirm, roy
