Job to move Data in Vertical format to Horizontal Format

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

asvictor
Participant
Posts: 31
Joined: Tue Sep 02, 2003 3:06 am
Location: Singapore
Contact:

Job to move Data in Vertical format to Horizontal Format

Post by asvictor »

Hi,

I have problem in designing a Job for the Requirements. Here the input data
is

CT_VAL_GEN_KEY CT_ELE_UNQ_KEY CT_ELEMENT_VALUE
1649 216 13
1649 182 45
1649 217 67
1649 218 87
1649 219 123
1649 220 86
1649 1 EXISTING

1650 216 15
1650 182 56
1650 217 77
1650 218 84
1650 219 145
1650 220 89
1650 1 EXISTING

Each CT_VAL_GEN_KEY will have 7 CT_ELE_UNQ_KEY and wil have 7 values for each UNQ Key.

I have to read Gen Key and based on the UNQ KEY I need to populate in the fields of the Out put file as given below


CT_VAL_GEN_KEY CT_CD_FORM CT_CD_TAX CT_DC_FORM_15
CT_ID_MODULE CT_ID_TYPE_WORK CT_ID_TYPE_WORK_1 STATUS
-------------- ---------- --------- ------------- ------------ --------------- ----------------- ------
1649 13 45 67 87 123 86 EXISTING

1650 15 56 77 84 145 89
EXISTING


IS there any way I can do it in DS? I tried to use Pivot stage and I believe it doesn't support this requirement.

Can any one suggest me how to work around??


Cheers
Victor Auxilium
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

Yeah, pivot stage only supports horizontal pivots, not vertical pivots. (Or is that the other way around?) I've managed to pivot the type of data you have by using stage variables to store up the record and output it once all values are present.

Create 7 stage variables, store in each one the CT_ELEMENT_VALUE value that row corresponds to.

Code: Select all

StageVar  Derivation
CDFORM   IF input.CT_ELE_UNQ_KEY = 216 THEN CT_ELEMENT_VALUE ELSE CDFORM
CDTAX    IF input.CT_ELE_UNQ_KEY = 182 THEN CT_ELEMENT_VALUE ELSE CDTAX
DCFORM   IF input.CT_ELE_UNQ_KEY = 217 THEN CT_ELEMENT_VALUE ELSE DCFORM
...
By the time 7 rows are read in you have the 7 element values. Add a constraint "input.CT_ELE_UNQ_KEY = 1" on your output, that ensures you only write out a record when you reach the seventh value. Set your output field derivations to the stage variables.

Code: Select all

Field             Derivation
output.CT_CD_FORM  CD_FORM
output.CT_CD_TAX    CD_TAX
...
It's an easy pivot because you have exactly 7 rows and you can easily find the 7nth row to trigger the output. It gets a lot harder when you are pivoting an unknown number of rows, but that's another story...
inter5566
Premium Member
Premium Member
Posts: 57
Joined: Tue Jun 10, 2003 1:51 pm
Location: US - Midwest

Post by inter5566 »

If your data does not allow for an easy way to detect the last row or if your data is out of order, you can resolve the pivot with summation. When you read in a row you need to write it out with key fields followed by the columns you want to pivot. Each of the pivot columns will receive either a zero or the appropriate value from the input (for integer values), or a space or the input value (for character fields). This output is then run into an agregator stage, grouped by key values and 'summed' by integer columns and maxed by character columns.

input data:
GEN_KEY UNQ_KEY ELEMENT_VALUE
1649 216 13
1649 182 45
1649 217 67
1649 218 87
1649 219 123
1649 220 86
1649 1 EXISTING

transform output:
1649 13 0 0 0 0 0 " " *quotes used to show space, not in field
1649 0 45 0 0 0 0 " "
1649 0 0 67 0 0 0 " "
1649 0 0 0 87 0 0 " "
1649 0 0 0 0 123 0 " "
1649 0 0 0 0 0 86 " "
1649 0 0 0 0 0 0 EXISTING

group by key and sum or max the columns
1649 13 45 67 97 123 86 EXISTING


For the transform just use an if then else structure on the input value for each column derivation.
i.e.
if trimb(inputlink.cd_element_value) = "216" then
trimb(inputlink.ce_element_value) else " "


Steve

OT....I think this is my first post since tools closed and I can't find the post spell check. Is this no longer available or do I need to activate it somewhere?
smovva
Participant
Posts: 4
Joined: Mon Jan 26, 2004 3:55 pm

Flattening records

Post by smovva »

Hi Everybody,

I have a file coming in with Id and page fields

ex File:

ID Page
101 10
101 11
102 19
102 20
102 30
104 70
104 72
104 73
110 1
110 2
110 3
110 4
110 55
Output requirement is :
I have to flatten out depends on the Id field and if for example for an Id if I have more than 4 pages I have to discard those records(I mean i have to consider first 4 record's pages)

Target will have
Id page1 page2 page3 page4
101 10 11 0 0
102 19 20 30 0
104 70 72 73 0
110 1 2 3 4

Please Let me know if somebody knows how to achieve this.

Thanks in advance.
Sunanda.
Last edited by smovva on Wed Jan 28, 2004 2:50 pm, edited 1 time in total.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Probably the easiest is to run the data into a UV table, with the "Page" column defined as multi-valued, normalized on the Page column.

Then use either the synthetic key @ASSOC_ROW to limit the values that you retrieve from the dynamically normalized table.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
smovva
Participant
Posts: 4
Joined: Mon Jan 26, 2004 3:55 pm

Post by smovva »

My Target is Oracle database, can I do as you suggested ?

Its an urgent requirement, please let me know if you got any ideas and I am using DataStage 5.2 and target is Oracle.

Thanks
Sunanda.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Did you even stop to ponder my solution? I have many years experience with this product, as my more than 2000 posts here suggest. This solution will work irrespective of source and target.

See also viewtopic.php?t=86415
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

If you don't like Ray's solution, here's another one:

If your source data is sorted in the order you want, then use a couple of stage variables. One stage variable to track the current group you are processing, another to track the count of how many for that group have been seen, and a third to accumulate a concatenated string of the values received to date.

On a group change, you will allow the previous row to output. Your only technical issue to resolve is to "force" an output of the last group received. This can be done by adding an extra row to the source data set to act as a trailer record that you will use in the constraint for determining whether to output the row. Basically, you output the row on a group change or when you see the trailer record, but never if @INROWNUM=1.

You can search the forum for using stage variables to accumulate data across rows. We have EXACT examples of how you need to do what you require.

If you need it urgently, I do accept payment in kind. I'd code this logic for you in trade for some authentic Tazo Chai tea, this American stuff would be better used as lawn fertilizer. 8)

By the way, Ray is a legend. If he gives you a solution, it's best to atleast try it, because he knows every trick in the book, because he wrote the book.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
smovva
Participant
Posts: 4
Joined: Mon Jan 26, 2004 3:55 pm

Post by smovva »

Hi Ray,

I didn't try your solution because, when you said use an UV table I thought I cannot because i am working on Oracle.

When you say a UV table you mean UniVerse Table, right!.
so, can I create a UV table in Oracle or in DataStage memory?

I am sorry i have never worked on UV tables.

Thanks
Sm.
Last edited by smovva on Wed Jan 28, 2004 11:13 am, edited 1 time in total.
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

Psst.

DataStage Server's engine is based on Universe.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

smovva wrote:Hi Ray,

I didn't try your solution because, when you said use an UV table I thought I cannot because i am working on Oracle.

When you say a UV table you mean UniVerse Table, right!.
so, can I create a UV table in Oracle or in DataStage memory?

I am sorry i have never worked on UV tables.

Thanks
Sm.
I used the term UV table because they're accessed via a UV stage. However, they're created by the DataStage server on its host machine.
You can not create a UV table in Oracle.
The UV stage gives you a "create table" check box, which means that your job can create the table. The only trick is that you must have defined the metadata (not unusual), but also that you must have declared the Page column to be multivalued (in the Type column). In the table definition you need to check "supports multivalued fields" to enable this capability.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
siddharthkaushik
Participant
Posts: 20
Joined: Thu Nov 27, 2003 3:45 am
Location: New Delhi, India

Post by siddharthkaushik »

It's an easy pivot because you have exactly 7 rows and you can easily find the 7nth row to trigger the output. It gets a lot harder when you are pivoting an unknown number of rows, but that's another story...
I am greatly interested in this other story as of now....

Thanks,
Siddharth
maximus
Participant
Posts: 10
Joined: Wed Mar 03, 2004 5:57 am

Post by maximus »

[quote="ray.wurlod"]Probably the easiest is to run the data into a UV table, with the "Page" column defined as multi-valued, normalized on the Page column.

Then use either the synthetic key @ASSOC_ROW to limit the values that you retrieve from the dynamically normalized table.[/quote]



Hi Ray

I was trying this out and set the type of Page as Multi valued...But how do I use This Synthetic key @ASSOC_ROW...Is it a System variable or Shud I create one and accordingly retrieve values :?: ...Cud You explain as to how I go about it...Moreover

Regards
Max
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

"Synthetic" means generated by the underlying "UV" engine (now DataStage Engine). It's like a system variable; you don't have to do anything; it's just there.

To create a column in a UV table as multi-valued, specify the keyword MULTIVALUED in the column definition. For example:

Code: Select all

CREATE TABLE MyTable (
  KeyCol INTEGER NOT NULL DEFAULT NEXT AVAILABLE,
  OtherCol VARCHAR(64) NOT NULL MULTIVALUED,
  CONSTRAINT MyTable_PKC PRIMARY KEY (KeyCol)
);
You may need to edit the DDL in the UV stage to achieve this.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
maximus
Participant
Posts: 10
Joined: Wed Mar 03, 2004 5:57 am

Post by maximus »

Hi Ray

I am trying say vertical pivoting(ID,Page)discussed above...So i created Page as a Multiple valued field...and I normalized on that field...And for the association also I gave Page...But I have one doubt

Even If I create my UV table with "Page" as Multi valued...I see only the first value when I view the data...Am I going wrong anywhere...
1.So How will the muliple values be stored??
2.wat will be the delimiter to store them??
3.If I come to know about the Delimiter we can use the field function with the position to retrieve those diff values right

Could you please Clarify
Regards
Max
Post Reply