Lookup/join using something other than equal

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
rmcclure
Participant
Posts: 48
Joined: Fri Dec 01, 2006 7:50 am

Lookup/join using something other than equal

Post by rmcclure »

I am having a problem with doing a lookup where the value is <. I have searched the forum and found some suggestions but they do not apply to my situation.
DataStage stream should look like:
Source ======> Lookup1 DS========> transform =======> Lookup2 DS =======> Target
Source: gets data by joining a number of tables.
Lookup1: the SQl to generate Lookup1 calculates values using ROUND, SUM and Case to create the keys plus numeric values.
transform: Transforms the numeric values from Lookup1 depending on certain criteria

To this point I am Ok.

The problem is in Lookup2. It needs to take one of those transformed numeric values and do a lookup using > and <
LKP_MIN_VALUE<=SRC_VALUE3
AND LKP_MAX_VALUE>=SRC_VALUE3

The range option on the lookup stage is very limiting
I tried with the range on the lookup hoping to trick it with:
LKP_MIN_VALUE<=SRC_VALUE3
AND LKP_MIN_VALUE<=SRC_VALUE3
But it doesn't like the fact that I am using LKP_MIN_VALUE twice. it want a range for LKP_MIN_VALUE but I only want LKP_MIN_VALUE<=SRC_VALUE3

It is not possible to move the three Selects into one source select. The source and first lookup select are pretty big and the code for the transform would also need to be incorporated.
Using a Join stage joining on a dummy field then a filter to check the > and < works except there are 2.3 million records in the source and luckily only 83 in the lookup, but it still creates 197 million records to be filtered.

I could do the source and lookup1 in one job and write to a working table. Then join the working table to lookup2 but that would require a DBA and I am sure they would not be happy creating a working table for this issue since there may be many in the future.

What I find strange is this is really simple to do using a server job.
Does any one know some simple trick to do a lookup using >= without having to over generate records?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't think the Ranged Lookup is quite as limiting as you think, mostly because what you are needing is not a range. :wink:

So basically you are looking for a way to do "<" check in a lookup rather than an equality check?
-craig

"You can never have too many knives" -- Logan Nine Fingers
rmcclure
Participant
Posts: 48
Joined: Fri Dec 01, 2006 7:50 am

Post by rmcclure »

Yes, that is exactly what I am trying to do.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Hmmm... was hoping someone would wander along and throw us a bone. Have you had any luck with this in the interim?
-craig

"You can never have too many knives" -- Logan Nine Fingers
rmcclure
Participant
Posts: 48
Joined: Fri Dec 01, 2006 7:50 am

Post by rmcclure »

I did find a work around...sort of.

My lookup table contains the following fields:
LKP_MIN_VALUE
LKP_MAX_VALUE

My source contains:
SRC_VALUE3

I am trying to find were:
LKP_MIN_VALUE<=SRC_VALUE3
AND LKP_MAX_VALUE>=SRC_VALUE3

With the range option in the lookup stage, the range is on the lookup field not the source field:
LKP_MIN_VALUE<=somevalue
AND LKP_MIN_VALUE>=somevalue

but I don't want LKP_MIN_VALUE>=somevalue I just want the LKP_MIN_VALUE<=somevalue part.

My ugly work around is:
1) I first use another lookup stage where I do a "pre" lookup to my lookup table and get:
Select
MAX(LKP_MAX_VALUE) as MAX_MAXVALUE,
MIN(LKP_MAX_VALUE) as MIN_MINVALUE

I pass these to the next lookup stage.

Then in my real lookup my range is:
LKP_MAX_VALUE>=SRC_VALUE3
AND LKP_MAX_VALUE<=MAX_MAXVALUE (this will always be true)

LKP_MIN_VALUE<= CopyofSRC_VALUE3 (Copyof because it will not let you use the same input variable on two ranges)
AND LKP_MIN_VALUE>=MIN_MINVALUE (this will always be true)

I am lucky that my lookup table is small. This works but if there are any other solutions I would like to hear about it.
qt_ky
Premium Member
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Re: Lookup/join using something other than equal

Post by qt_ky »

Range Lookup can be a little tricky to get it to work as expected, but maybe it can be made to work...
rmcclure wrote:What I find strange is this is really simple to do using a server job.
Does any one know some simple trick to do a lookup using >= without having to over generate records?
Could you use a server job shared container in your parallel job?
Choose a job you love, and you will never have to work a day in your life. - Confucius
Post Reply