2 calls to Oracle function in PX

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
ds_is_fun
Premium Member
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

2 calls to Oracle function in PX

Post by ds_is_fun »

I have a function call to oracle using a sparse lookup.
The function creates an entry in an "audit trail" table with batch_id and all that info and returns the batch_id.
My default config.apt has 2 nodes. So every time the function gets called it is creating 2 entries for a batch_id in the audit trail table while it should be only one.
How do I control that?
Thanks!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You can't. Each node operates independently. You need to re-think your design and/or your parallelism.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ds_is_fun
Premium Member
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

This doesn't seem right since when we insert we don't expect it to duplicate a record. Similarly, any client :) would expect one function call from the Oracle Enterprise Stage to the function. There should be some way to control this. Thanks!
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

You do need to re-think your design. Think about it. It's fundamental parallel processing. You certainly would expect a process (i.e. function) to run on every node. Your function manufacturers a data row... it will manufacture a data row on every node.

The reason you don't see duplicates when you are inserting is because the data is partitioned. Each node process only inserts a subset of the total data.

Two quick options:
1) If the stage supports it, run it in sequential mode.
2) Pass the partition number to the function and only manufacture a row for one partition (i.e. one node).

Mike
ds_is_fun
Premium Member
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

Nice suggestions!
When we use a oracle function call and join it using a sparse lookup it is by default set to "Sequential mode". Now, if it is set to "sequential mode" how is it creating 2 records for my functionality. My default apt config files has 2 nodes running.

So, going forward if I run this in sequential mode and pass a PARTITIONNUM as an input parameter to the ORA function to manufacture a single row, I suspect it still will return/manufacutre 2 rows due the current run logic explained above.

Thanks!
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Check the execution mode of the lookup stage (not the Oracle stage).

I've also made the assumption that there is only 1 row driving the lookup on the stream input.

Regardless of whether the execution mode of the lookup stage is sequential or parallel, adding logic inside your Oracle function that makes it conditional on partition number will work as long as there is only a single driving row on the stream input.

Mike
ds_is_fun
Premium Member
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

Cool!
So that would mean once the PARTITIONNUM is passed. Manufacture a record only for the first partition number.

Partition numbers usually begin with 0. If 2 partitions then 0,1;if 4 then 0,1,2,3.

So I would manufacture a record only for PARTITIONNUM= 0.
Does that make sense or is that too static if partition numbers could differ?
Thanks!
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

That should work. There will always be a zero partition when running in parallel. Take care if you use the partition number logic and switch the lookup stage to sequential execution mode. I don't know what that does to the PARTITIONNUM variable.

Mike
ds_is_fun
Premium Member
Premium Member
Posts: 194
Joined: Fri Jan 07, 2005 12:00 pm

Post by ds_is_fun »

whoops! stuck one last time..
I'm using the below in a ORA Ent stage using a sparse lookup.
I know in transformer stage we can use a system variable @PARTITIONNUM.

How could I use that system variable in my ORA functional query call before.

1. select FNC_UTL_logfilestart(@PARTITIONNUM,sysdate,'Loading for F_FX_RATE','Loads into fact table F_FX_RATE for type T','dsgstadm','GST_F_FX_RATE','dbaxsna915','GLBSALES.F0015') as CRT_LOG_ID from dual?
That doesn't work!!
2. ??
Mike
Premium Member
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

Now that I see what you want to accomplish... why not just call sqlplus in an after job routine? Or a routine activity?

Mike
Post Reply