Work-around for SPARSE lookups on Windows

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Work-around for SPARSE lookups on Windows

Post by jzparad »

I'm currently on a project using 7.5x2 EE on Windows with ODBC connections to a DB2 database on a mainframe. I have a need to do database lookups to populate a number of fact tables from an ODS which is also on DB2. The obvious solution would be to use sparse database lookups. However, since this feature doesn't appear to work on Windows I'm trying to find some work-around. Normal lookups are not an option due to the size of the tables. We simpy run out of memory. I thought of using a Server job but the performance drops from about 1700 rows/sec to 45 rows/sec. Other options are pulling complete tables across and performing joins and/or lookups from within DS but I'm not too keen on this.

Has anyone out there come up with a better solution?
Jim Paradies
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes, it's called Linux. Same hardware, sparse lookups are supported.

I don't believe there's much you can do at the software level - convince "them" that more memory is required. And more disk, because you may need to spill into the scratch disk areas.

Is it possible to perform a left outer join in the database itself, perhaps using DB2 Connect functionality? At least this will shift the blame from the ETL tool. It's not the ETL tool that's at fault; it's a matter of not being able to handle the volume of data in the reference Data Set.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jzparad
Charter Member
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

Ray,

Thanks for your response.

Trust me that if I had any sort of choice, Windows would be right down the bottom.

More memory might solve the problem for now but it will certaintly blow out after some time as these are historic tables that I'm working with. I've discounted outer joins because of the nature of the problem. Loading fact tables means that each dimension key has to be computed based on a query which may involve up to 5 different tables plus values taken from the driving table.
Jim Paradies
Post Reply