Speed up a hash Lookup (Partial Key) created and build index

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Speed up a hash Lookup (Partial Key) created and build index

Post by sarjushah »

File size 4.4 million records.
total keys 5
Type 30 dynamic.

Accessing via a ODBC connection to do a partial look key lookup using 3 keys .

Multiple row lookup set to active.

Read the forums and Created the index and build it for the 3 keys.

Approx speed

When hash file size is 100,000 rec --> 400rows/sec
When hash file size it 500,000 rec --> 40 rows/sec
When hash file size it 1,000,000 rec --> 4 rows/sec

How can I speed up things ?

Any pointers appreciated
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

Going through ODBC to do a direct hashed file lookup is slow. Using partial keys is slow. Could you give an example of the partial 3 key lookups you are doing, as you might be able to create an index on the hashed file and use that directly for the looking or at least avoid full table scans when using ODBC.
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

i did create the index.. but no speed
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You created what index? How, exactly?
-craig

"You can never have too many knives" -- Logan Nine Fingers
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

I created the index on the 3 columns that I am using as the key in my lookup using the odbc .(The hash file has a total of 5 columns defined as the key )

I excuted the following command to create and build the index

CREATE.INDEX PolicyMChashLKPnEW LKP_POLMB_ID LKP_SBNT_KEY LKP_CVRPL_ID


BUILD.INDEX PolicyMChashLKPnEW LKP_POLMB_ID LKP_SBNT_KEY LKP_CVRPL_ID
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Try the generated SQL query from the Administrator command interface, substituting a known value for each parameter marker and adding the word EXPLAIN ahead of the semi-colon. This will tell you whether or not the query will use any index.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

I did the explain plan and below is the result. It is not using th index's . How can force it to use it.

The result is Access Method : File Scan

Do you have something like a oracle hint clause to make it use the index here
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

REQUIRE.INDEX is the "hint clause" you need.

If an index can not be used your query containing this keyword will fail.

Index usage is cost-based - if the query will return more than about 30% of the rows in the table, then the index is not used because, on balance of probabilities, every page in the table will need to be touched anyway.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

A dumb question , Where do it add it.?

I tried adding it before the ";" and it failed.

Do I need to put it after the select or something else.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Your example clearly shows why hashed files should be used in equi-joins as a first and best choice. I suggest you involve a database and use work tables there.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

It is currently a OCi query and went thru a series of tuning exercises and still pretty slow so now I want to try this and give it a shor.

if this is also slow I might try the hash roll up thing described in one of Ken's posting.

Do you know where I should put the require.index in my query ?

SELECT POLMC_ID, CVRG_STRT_DT, CVRG_END_DT, LKP_POLMB_ID, LKP_SBNT_KEY, LKP_CVRPL_ID FROM PolicyMChashLKPnEW WHERE (LKP_POLMB_ID = 3126657 and LKP_SBNT_KEY = 16 and LKP_CVRPL_ID = 1733) ;
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

What makes you think you've got it in the wrong spot? Ray did say that the keyword would cause a failure if it couldn't be used. If you still think you've got a syntax problem, post the error you received.

A 'trick' to try would be to declare three keys rather than your five when building the hashed file. The first key field would be your previous three key fields concatenated together, in other words a composite of them. Then keys two and three would be your original four and five. Build an index over the first key field and you're back to an equi-join in your multi-row lookup. Make sure you build the key expression with the same logic used to build the key, including any null handling or delimiters if applicable.
-craig

"You can never have too many knives" -- Logan Nine Fingers
sarjushah
Premium Member
Premium Member
Posts: 40
Joined: Thu May 12, 2005 3:59 pm

Post by sarjushah »

Thanks chulett

Here is the Final Solution.

Hash File Combined Key (Colum1 + Column 2+ Column 3)
Other Keys Column 4 and 5

The combined key is used for the Partial lookup

defined a VOC entry

SETFILE /opt/datastage/files/dev/polmc_lkp_new2 PolicyMChashLKP2

Created index

CREATE.INDEX PolicyMChashLKP2 Key

Build the index

BUILD.INDEX PolicyMChashLKP2 Key

Created a lookup using ODBC and set multiple rows returned to yes

and Bingo -- 12000/sec for the lookup the hash file size is 4.4 mil rows

Thanks
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

8)
-craig

"You can never have too many knives" -- Logan Nine Fingers
Post Reply