Hi Everyone,
I have a design issue.
In the source DRS I need to get data from 28 tables.The structure of the tables are same but having very huge amount of data. (all the rows are distinct)
I would kindly like to know which one will give me a better performance
My options are
1.Using a UNION in the source DRS
2.Using a UNION ALL in the source DRS
3.Using 28 sepreate DRS in the source and using LINK COLLECTOR to combine the data.
I thought it would be the UNION ALL since it merges data from the Database itself but
I read in one of the DSxchange post that its LINK COLLECTOR which will give a better performance than UNION ALL
viewtopic.php?t=113306&highlight=Link+Collector
(please refer to the post from balasumit21 )
Please advise
Thanks
George Sebastian
Better Performance
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Define "performance" in an ETL context. Only then might one be able to address whether whatever it is can be made "better" (whatever that is).
Please do not cite rows/sec as your performance metric. Search DSXchange for many rants from me about why it is meaningless.
Please do not cite rows/sec as your performance metric. Search DSXchange for many rants from me about why it is meaningless.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
union all is more performant than UNION because a UNION is a distinct.
you have a lot of screws for to find the optimal performance, there are many ways and there is not best of all for each situation. To find the best way for your case you need some investigation. Sometimes the bottleneck for performance is not the way of reading but the way of writing and its syncronisation.
you have a lot of screws for to find the optimal performance, there are many ways and there is not best of all for each situation. To find the best way for your case you need some investigation. Sometimes the bottleneck for performance is not the way of reading but the way of writing and its syncronisation.
Wolfgang Hürter
Amsterdam
Amsterdam
-
- Participant
- Posts: 83
- Joined: Tue Dec 19, 2006 8:38 am
Specifically, a UNION ALL simply returns the results of all queries. A UNION must first get the results of all queries and then so a 'SORT DISTINCT' on the results, so it obviously will take longer and more processing resources.WoMaWil wrote:union all is more performant than UNION because a UNION is a distinct.
You make the choice based on whether you need duplicates removed from the result set, not on which one is 'faster'.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers