Create multiple Records from single record

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

shivaprasad
Participant
Posts: 17
Joined: Mon Aug 15, 2005 12:57 am

Create multiple Records from single record

Post by shivaprasad »

Hi All Need a urgent help.
I have records like this

33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534923|050903010013535322


Now i want it like this
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534923
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534924
33979|21014589|1536|06/13/2005|V00020|2.1|400|400|050903010013534925
And increment will go on till the last range reaches that means 050903010013535322

And then i will have one more records like

33979|21014589|1536|06/13/2005|V00022|4.1|100|100|050303010008325718|050303010008325817
Then again same above steps have to repeated.

Now my question is how to do duplicate rows taking last two columns in that first column as starting point and then increment the no with all other columns same as it is and till it gets the last column no.

Hope every body will get clear cut idea about the problem now.

So can any body help me.
I need it urgently
Thanks in advance
Shiv
roy
Participant
Posts: 2598
Joined: Wed Jul 30, 2003 2:05 am
Location: Israel

Post by roy »

Hi,
There is a nice FAQ topic covering this here it is.

(1700)
IHTH,
Roy R.
Time is money but when you don't have money time is all you can afford.

Search before posting:)

Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Image
nag0143
Premium Member
Premium Member
Posts: 159
Joined: Fri Nov 14, 2003 1:05 am

Post by nag0143 »

search for looping a record. you will get nice examples.

this is tony's idea, it will work for you with some additions according to your requirement.

Another thing you might consider:

Write the data to a hash file. Add a unique key field (use @Outrownum as the value) and another field, F2. The derivation for F2 is a function (pass in Count as a parameter) (that you have to write) that returns a string of values from 1 to count, separated by @VM chars. When reading from this hash file, in the metadata for F2, set "Type" to "MV" (Association should go to F2), then, at the top, set "Normalize On" to F2. Now when you read the data from the hash file you'll get one row for each value in the F2 field. You can drop the key and F2 after reading from the hash file, if you so desire.

Quick, easy code and easy to implement.



Thanks
Nag
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

We don't do "urgent". You have been posting here long enough to know that.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
shivaprasad
Participant
Posts: 17
Joined: Mon Aug 15, 2005 12:57 am

Post by shivaprasad »

Urgent means i need it badly sir,
hope you help me, i have monday a delivery for this.
So pls help me.
Thanks
shiv
ray.wurlod wrote:We don't do "urgent". You have been posting here long enough to know that.
shivaprasad
Participant
Posts: 17
Joined: Mon Aug 15, 2005 12:57 am

Post by shivaprasad »

THis doesnt work sir, since my ranges are dynamic and based on last two fields. last but one field is starting range and last field is ending no.
now when u generate the nos with those two fields as ranges u will get all other fields dumplicated....

So hope you are clear now.
Thanks
Shiv


nag0143 wrote:search for looping a record. you will get nice examples.

this is tony's idea, it will work for you with some additions according to your requirement.

Another thing you might consider:

Write the data to a hash file. Add a unique key field (use @Outrownum as the value) and another field, F2. The derivation for F2 is a function (pass in Count as a parameter) (that you have to write) that returns a string of values from 1 to count, separated by @VM chars. When reading from this hash file, in the metadata for F2, set "Type" to "MV" (Association should go to F2), then, at the top, set "Normalize On" to F2. Now when you read the data from the hash file you'll get one row for each value in the F2 field. You can drop the key and F2 after reading from the hash file, if you so desire.

Quick, easy code and easy to implement.



Thanks
Nag
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

First things first. Your ability to supply a result is your responsibility. If you want fast and accurate answers, consider the price you are paying. $0 US. That's pretty cheap. We don't bill for time or answers, we just ask that you don't request URGENCY, because we have lives too.

Plus, as is the case sometimes, the URGENCY comes from someone who is unqualified for the contract position they are filling which came at the expense of a better qualified person who was asking for higher compensation. The URGENCY tends to mean that we on this forum have to help the less qualified person avoid being tossed out on their rear from the customer site when the customer realizes they are ill-prepared to do the work.

Now, to your problem.

PURE DS SOLUTION:

Your last two columns indicate two integer values that are the starting and ending range. You will need to write a simple function that is passed three arguments. The first argument is the entire row of data as you need it duplicated across the repeating rows. The second argument is the beginning range, the third argument the ending range.

Now, in this function, initialize the Ans variable to blank, that is "". You will start a loop from your beginning range to your ending range. For each pass thru the loop, append to the Ans variable the argument holding the entire row, your delimiter, the loop variable, your delimiter, and the end variable. If you are not on the last pass thru the loop, also append the linefeed (and carriage-return if doing this for DOS files) character.

When this function is finished, it will have populated the Ans variable with all of the expanded rows, delimited by linefeed(+carriage return). In your DS transformer stage, call this function, AND THIS IS IMPORTANT, output the result as the only column to a Sequential file stage. Now read that Sequential file with an output link and continue processing. You will see that when the job runs, N number of rows will process to the Sequential stage, but when reading from the Sequential file N + M rows will read out.

The magic here is that the "row" written to the Sequential file is actually a big text string, but because the string contains LF+CR, it will physically become separate rows in the text file. DO NOT TRY TO CONTINUE TO PROCESS THE ROW WITHOUT FIRST WRITING TO A FILE, that's the trick.

DATABASE SOLUTION:

Bulk load your data into a relational database work table. Create another work table holding a single integer column and populate that table with values 0 to as large a range between your beginning and ending values as you will expect. Maybe be safe and use 1 million. Something like "create table numtable (x integer)".

Join the two tables together, but use sql like this:
select a,b,c,d,e,range1+numtable.x,range2 from datatable, numtable
where datatable.range1+numtable.x <= datatable.range2
This should give you a product of the two tables, returning all rows from the datatable, as well as a cartesian product result row where range1 plus that x value is less than or equal to the range 2 value.



One more aside, these are tricks of the data integration trade. If you have an urgent problem in the future, we can provide premium support that includes a phone number, or even us actually doing the work, but it comes at a price.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
elavenil
Premium Member
Premium Member
Posts: 467
Joined: Thu Jan 31, 2002 10:20 pm
Location: Singapore

Post by elavenil »

Hi Shiv,

This can be achieved reading the input file (hope this input can be written into seq file) in the job control code and reading the records in the loop.

OpenSeq/ReadSeq/WriteSeq would help you reading/writing from/into the file. And last value to be incremented till you reach the target range.

HTWH.

Regards
Saravanan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Your range values are too large for integer arithmetic, and the leading zeroes need to be preserved. Search the forum for the code for my OpenSequentialFile routine and implement same. Then pre-process your file with the following before-stage subroutine. In its input value field supply the pathname of the original file and the pathname of another file that DataStage will subsequently process.

Code: Select all

SUBROUTINE GenerateMultipleRecords(InputArg, ErrorCode)
$COPYRIGHT "Copyright (c) 2005, Ray Wurlod.  All rights reserved.  May be used with this copyright notice intact."
* Generates N rows per input row, depending on values in final two fields.

DEFFUN OpenSequentialFile(FilePath, OpenMode, WriteMode, Logging) Calling "DSU.OpenSequentialFile"

Equate RoutineName To "GenerateMultipleRows"
Equate FieldDelimiter To "|"

ErrorCode = 0 ; * set this to a non-zero value to halt execution

ReadCount = 0
WriteCount = 0
ErrorCount = 0

* Process InputArg to determine file pathnames

Msg = ""
TrimInputArg = Trim(InputArg)
InputFilePath = Field(TrimInputArg, " ", 1, 1)
OutputFilePath = Field(TrimInputArg, " ", 2, 1)

If Len(InputFilePath) = 0 
Then
   Msg<-1> = "No input file path supplied."
End
If Len(OutputFilePath) = 0
Then
   Msg<-1> = "No output file path supplied."
End

If Len(Msg) > 0
Then
   Call DSLogWarn(Msg, RoutineName)
   ErrorCode = 1  ; * halt execution of job on return
   RETURN
End


* Open both files, one for reading, one for (over)writing.

fvInputFile = OpenSequentialFile(InputFilePath, "R", "A", "Y")
fvOutputFile = OpenSequentialFile(OutputFilePath, "W", "O", "Y")
If FileInfo(fvInputFile,0) And FileInfo(fvOutputFile,0)
Then

   * Outer loop reads input file one line at a time.

   Loop
   While ReadSeq InputLine From fvInputFile

      ReadCount += 1

      * Parse generating values from input line.

      Delimiters = Count(InputLine, FieldDelimiter )
      ConstantFields = Field(Inputline, FieldDelimiter , 1, Delimiters - 1)
      StartValue = Field(InputLine, FieldDelimiter , Delimiters, 1)
      FinalValue = Field(InputLine, FieldDelimiter , Delimiters + 1, 1)

      * Make sure each is numeric.

      If Not(Num(StartValue) And Num(FinalValue))
      Then
          Call DSLogWarn("Non-numeric values for loop in line " : ReadCount " :.  Row skipped." : @FM : InputLine, RoutineName)
          ErrorCount += 1
          Exit   ; * exit from this iteration of outer (read) loop.
      End

      If FinalValue >= StartValue
      Then

         * Inner loop generates output lines.

         FinalFieldValue = StartValue

         Loop
         While FinalFieldValue <= FinalValue

            * Generate and write line to output file.

            Line = ConstantFields : FieldDelimiter : FinalFieldValue
            WriteSeq Line To fvOutputFile
            On Error
                Call DSLogWarn("Error writing to " : OutputFilePath : ", status = " : Status(), RoutineName) 
                ErrorCode = 2
                ErrorCount += 1
                GoTo MainExit  ; * no point continuing
            End
            Then
               WriteCount += 1
            End
            Else
                Call DSLogWarn("Unable to write to " : OutputFilePath : ", status = " : Status(), RoutineName) 
                ErrorCode = 2
                ErrorCount += 1
                GoTo MainExit  ; * no point continuing
            End

            * Use string arithmetic to preserve precision, Fmt() to preserve leading zero(es).

            FinalFieldValue = Fmt(SAdd(FinalFieldValue, "1"), "R%":Len(FinalFieldValue))

         Repeat

      End
      Else

         Call DSLogWarn("Final value less than start value in line " : ReadCount : @FM : InputLine, RoutineName)
         ErrorCount += 1

      End

   Repeat

MainExit:

   * Pass back counters through system variables
   @USER0 = ReadCount
   @USER1 = WriteCount
   @USER2 = ErrorCount

   If FileInfo(fvOutputFile, 0) Then CloseSeq fvOutputFile
   If FileInfo(fvInputFile, 0) Then CloseSeq fvInputFile

End

RETURN
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
trammohan
Participant
Posts: 47
Joined: Thu Nov 13, 2003 12:47 pm

Post by trammohan »

Hi,
You can create multiple records with the help of Custom build OPD..
in the opd call Buildrecord() and 'C' control structures if and else

ram
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

In a Server job? :?

Please make sure you don't give PX answers when Server questions are being asked.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

shivaprasad wrote:Urgent means i need it badly sir,
hope you help me, i have monday a delivery for this.
So pls help me.
Thanks
So, did you get it working?????
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Olap
Participant
Posts: 1
Joined: Thu Dec 25, 2003 9:41 pm

Post by Olap »

kcbland wrote:
shivaprasad wrote:Urgent means i need it badly sir,
hope you help me, i have monday a delivery for this.
So pls help me.
Thanks
So, did you get it working?????

In your previous message, you have told that the records can be read in a single row using the routine and process that into a file. But I've one question on this.

1. Lets say there are 4 cols. AA|BB|1|5
Once we process the record into a single row using Char(10) - line feed, it appears like this "AA|BB|1
AA|BB2
AA|BB|3
AA|BB|4
AA|BB|5"

2. How do we read this into another file with only 3 cols as different rows.

Hope the question is clear??
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Welcome aboard. :D

Re-read the thread from the top, where the original line had ten columns of which the final two delimited fields specified what should occur in the output.
Your question substitutes four for ten and can be solved in exactly the same ways. Not even any requirement to change the routine (though with such small numbers you could get by with integer arithmetic rather than needing the string math function SAdd()).

Olap. That's a Scandinavian name, no? :lol: Please edit your Profile and let us know a little about yourself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

Olap wrote: 2. How do we read this into another file with only 3 cols as different rows.
The job looks like this:

Code: Select all

SEQ#1 --> XFM#1 w/routine --> SEQ#2 --> XFM#2 --> whatever
The metadata in SEQ#1 is a single column. The XFM calls the routine, passing it the entire row which is just defined as a single column. SEQ#2 is the same single column metadata on the input side. The output link actually now has all of the columns defined.

The first sequential file has to make the entire row available. The XFM#1 parses the row using FIELD(inlink.col, "|", #) functions where # is the element in the delimited col that contains the data you need to pass to the function. Since the last two columns are the ranging values, you can get them with the FIELD function.

A single hour of experimentation should be all that's required to get this to work. This is a rather simple task in DataStage.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Post Reply