Joining A sequential file and Database table

raj_konig · Post by **raj_konig** » Mon Jan 30, 2006 2:13 am

Folks,

Could you please let me know how can i join Sequential file and a Database table?

I tried using SQL Builder but couldnt get tht.

Thanks in advance

rajesh

WoMaWil · Post by **WoMaWil** » Mon Jan 30, 2006 2:17 am

without any details what you finally want to join, we cannot tell you the best way, but in any case you can fill the sequential file also in a database table and than you can join as you like.

raj_konig · Post by **raj_konig** » Mon Jan 30, 2006 2:38 am

I am usign sequential file as source and after the transformation stage i am trying to join the transformed data coming from transformer and database table then insert into the target

My intension is not to loopkup but to join the database table and the transformed data from sequential file .

thanks
rajesh

ArndW · Post by **ArndW** » Mon Jan 30, 2006 2:44 am

Rajesh,

take a look at the documentation on the "merge" stage in DataStage Server edition, you will find that this addresses your question, at least partially. You might also think about loading the table to a hashed file and doing a lookup on that (you can load it to memory if it fits to further increase speed).

raj_konig · Post by **raj_konig** » Mon Jan 30, 2006 2:57 am

Thanks Arnd.

But I am not trying to lookup but to join.

And even tried Merge. BU this is not accepting any inputs.

rajesh

loveojha2 · Post by **loveojha2** » Mon Jan 30, 2006 3:05 am

raj_konig wrote:Thanks Arnd.

But I am not trying to lookup but to join.

And even tried Merge. BU this is not accepting any inputs.

rajesh

But what is your Join criteria or is it like a cartesian product?

sudhakar_viswa · Post by **sudhakar_viswa** » Mon Jan 30, 2006 3:07 am

Hi,

In this scenario the table act as a look up.Drag the look up column from sequential stage to oracle stage in transformer.

bye,
sudhakar

ArndW · Post by **ArndW** » Mon Jan 30, 2006 3:10 am

Raj, PLEASE read the documentation on the merge stage; it does not accept input links but uses input sequential files to do the type of join you want; so you would need to get an extract from the database table into a flat file to make that work. The hashed file lookup suggestion might perform better than your other options, which is why I mentioned it. Your requirement to not use a lookup is probably based on preconceptions about performance that might not be correct.

sekr · Post by **sekr** » Mon Jan 30, 2006 3:26 am

correct if im wrong !
raj wants how to load data which comming from seq file to oracle database .

seqfile------->transformer------>Oracle .

raj_konig · Post by **raj_konig** » Mon Jan 30, 2006 3:43 am

thank you for all the replies.

Yes, my intenstion is a cartesian product.

I cannt use merge Arnd. Because I am trying to join these 2 in the mid of the mapping but not at the beggining.
Hope u guys understand the situation.

If the final soultion for this is merge only then I have no other go than to split the mapping into 2.

rajesh

ArndW · Post by **ArndW** » Mon Jan 30, 2006 3:51 am

Raj,

<sigh> you can use the merge stage, it is up to your job design on what is considered to be the "middle of mapping". If you write to two sequential files (which can also be named pipes to save on disk I/O) you can then read from those using a merge stage in the same job; this is an accepted and frequently used method of controlling your job flow.

You really should try loading your table selection to a hashed file that is used as a reference lookup; this might give you better performance than other solutions.

ray.wurlod · Post by **ray.wurlod** » Mon Jan 30, 2006 4:44 pm

A lookup IS a join.

A lookup is a left outer join, in that it returns NULL if the lookup does not succeed.

You can convert this to an inner join by constraining the Transformer output to only those rows for which the lookup succeeded.

However it is not possible to achieve a full outer join (or a Cartesian product) using a Transformer stage lookup.

A Merge stage can do a full outer join. It requires two sequential files as its inputs. Arnd's suggestion of using pipes could be useful, but using actual seqential files may not be all that slower, depending on data volume.