Merging Records

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Merging Records

Post by pravin1581 »

Hi All,

How can we merge records from two files and load in a single file . Basically the requirement is to append the two files in a single file ?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use the UNIX command cat file1 file2 > file3

You don't need DataStage for this one.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:Use the UNIX command cat file1 file2 > file3

You don't need DataStage for this one.
It cannot be done through DataStage ?
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:Use the UNIX command cat file1 file2 > file3

You don't need DataStage for this one.
Basically the requirement is cartesian join between two files.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Cartesian join (your second post) is a different requirement from union all (your first post - "append the two files in a single file").

If file1 has N lines and file2 has M lines, the approach I suggested will yield (N + M) lines. A Cartesian join will yield (N * M) lines. Which do you want?

You can use a Merge stage in server jobs to perform joins between two text files.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:Cartesian join (your second post) is a different requirement from union all (your first post - "append the two files in a single file").

If file1 has N lines and file2 has M lines, the approach I suggested will yield (N + M) lines. A Cartesian join will yield (N * M) lines. Which do you want?

You can use a Merge stage in server jobs to perform joins between two text files.
But a merge stage requires a key to defined for the two files on the basis of which the merging will be done.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

ANY join, Cartesian or otherwise, requires a key upon which to base the join.

Generate a key (line number) if you must.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:ANY join, Cartesian or otherwise, requires a key upon which to base the join.

Generate a key (line number) if you must.
There isn't any matching column between the two tables to join on that basis. "Generate a key (line number) if you must" please explain these one.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

ray.wurlod wrote:ANY join, Cartesian or otherwise, requires a key upon which to base the join.

Generate a key (line number) if you must.
I am giving you the full requirement , suppose there are three columns in a file and another file has 3 columns . I want 1 columns from the first file and 1 column from another file. Select A from B and C from D. I hope the requirement is pretty clear now. The output file should contain the reqd. columns from both the files.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

pravin1581 wrote:
ray.wurlod wrote:Use the UNIX command cat file1 file2 > file3

You don't need DataStage for this one.
Basically the requirement is cartesian join between two files.
Make sure this is really what you want before you go further down the path - a cartesion join. Reread Ray's explaination of the difference as it is totally different from your original request to "merge records from two files and load in a single file". :?

If you just want to 'merge' the two files, choose Complete Set as the Join Type. Or just cat the dang things together before job and then process the cat'd file - that's a very common solution and preferrable to use of the Merge stage. IMHO.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

pravin1581 , for more clarification, give us a couple of records from both the files and how it will look after transformation.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

chulett wrote:
pravin1581 wrote:
ray.wurlod wrote:Use the UNIX command cat file1 file2 > file3

You don't need DataStage for this one.
Basically the requirement is cartesian join between two files.
Make sure this is really what you want before you go further down the path - a cartesion join. Reread Ray's explaination of the difference as it is totally different from your original request to "merge records from two files and load in a single file". :?

If you just want to 'merge' the two files, choose Complete Set as the Join Type. Or just cat the dang things together before job and then process the cat'd file - that's a very common solution and preferrable to use of the Merge stage. IMHO.
I am unable to use the merge stage as it requires a key to be defined for joining , which is not present in my case . I require the SQL select A from B and C from D to generate the output file.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

pravin1581 wrote:I am unable to use the merge stage as it requires a key to be defined for joining , which is not present in my case . I require the SQL select A from B and C from D to generate the output file.
This doesn't help at all to explain your requirements. :?

At a high level, let's say your two files have record counts like this:

FileA: 2 records
FileB: 10 records

How many records are you expecting in your output? 2? 10? 12? 200? All are valid answers and require different solutions to implement.
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

....and for this very reason, I requested the OP to provide sample data for clarity.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
pravin1581
Premium Member
Premium Member
Posts: 497
Joined: Sun Dec 17, 2006 11:52 pm
Location: Kolkata
Contact:

Post by pravin1581 »

DSguru2B wrote:....and for this very reason, I requested the OP to provide sample data for clarity.
The scenario is something like this :

File A File B
--------- --------
UPC ORG StORE DATE
01 ALB 001 01/24/2007
02 ALB 002 01/25/2007
03 ALB 003 01/26/2007
04 ALB 004 01/27/2007
05 ALB 005 01/28/2007

The output file contains the following fields and the corresponding values are :

File C
---------

UPC ORG StORE DATE
01 ALB 001 01/24/2007
02 ALB 002 01/25/2007
03 ALB 003 01/26/2007
04 ALB 004 01/27/2007
05 ALB 005 01/28/2007
01/29/2007
01/30/2007
01/31/2007

I hope it is clear now .Thanx in advance.
Post Reply