REG. DATASETS

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

milandesai82
Participant
Posts: 3
Joined: Mon Aug 27, 2007 11:44 pm

REG. DATASETS

Post by milandesai82 »

PROBLEM DESCRIPTION:

I am using datasets in my jobs instead of sequential files to save time on reads/writes.
But two identical files are being created under the following path, which is causing a space issue on the server: "/local/data1/IBM/InformationServer/Server/Datasets".
There are "10" kinds of people one who understands binary and one who doesn't........
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

Spend some time learning about configuration files, in which you specify the location of the Data Set data files.

The pathname you have given is not a file, as you state; it's a directory in which data files are written. It's the default location because it is guaranteed to exist when the DataStage server is installed.
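For reference, a minimal two-node configuration file looks something like this (the host name and the disk/scratchdisk paths below are placeholders, not your actual values; point "resource disk" at a filesystem with enough space and the data files will be created there instead of the default):

```
{
    node "node1" {
        fastname "yourserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
    node "node2" {
        fastname "yourserver"
        pools ""
        resource disk "/data/datasets" {pools ""}
        resource scratchdisk "/data/scratch" {pools ""}
    }
}
```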
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
milandesai82
Participant
Posts: 3
Joined: Mon Aug 27, 2007 11:44 pm

Post by milandesai82 »

ray.wurlod wrote:Welcome aboard.

Spend some time learning about configuration files, in which you specify the location of the Data Set data files.

The pathname you have given is not a file, as you state; it's a ...


Thanks for your response.

Let me clarify: the path I have given is where my dataset files are created. The problem is that DataStage is creating TWO IDENTICAL FILES where it should create only one.

One more thing: I am really happy to have my first reply from you. I worked at RELIANCE for 2+ years in DSS and have heard a lot about you.
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are they really identical, or merely the same size? On a two-node configuration file you would expect there to be two data files for a data set (at least for one over 128KB). On a four-node configuration you would expect there to be four data files for the Data Set, and so on. With a round robin partitioning algorithm, you would expect the files to be the same size (plus or minus a block or so).

If the files really are identical then somewhere in your job - perhaps through inappropriate choice of partitioning algorithm - you have generated two copies of your data. This could happen in a Lookup File Set with the partitioning algorithm set to Entire, for example.
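The distinction above can be sketched outside DataStage. This is an illustrative Python model (not DataStage code): round robin deals rows out one at a time, so partitions end up the same size but hold different data, while Entire copies the whole input to every node, producing genuinely identical files.

```python
# Illustration of two partitioning algorithms on a 2-node configuration.

def round_robin(rows, nodes):
    """Deal rows out one at a time; partitions differ by at most one row."""
    parts = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        parts[i % nodes].append(row)
    return parts

def entire(rows, nodes):
    """Copy the full input to every node (typical for lookup reference data)."""
    return [list(rows) for _ in range(nodes)]

rows = list(range(10))

rr = round_robin(rows, 2)
en = entire(rows, 2)

print([len(p) for p in rr])  # [5, 5] -> same size, different contents
print([len(p) for p in en])  # [10, 10] -> two identical copies
```

So "same size" alone proves nothing on a multi-node configuration; compare the contents before concluding the data has been duplicated.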
milandesai82
Participant
Posts: 3
Joined: Mon Aug 27, 2007 11:44 pm

Post by milandesai82 »

ray.wurlod wrote:Are they really identical, or merely the same size? On a two-node configuration file you would expect there to be two data files for a data set (at least for one over 128KB). On a four-node configur ...


:) Thanks. I am facing one more situation: when I run my job for the first time it creates fresh dataset files, but when I run the job again it should ideally overwrite the old files. Instead, DataStage is creating a new set of files. Kindly guide me on this.
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Are you using the same configuration file? Are you specifying Overwrite or Append? Does the job score include a composite operator that incorporates deletion of the Data Set?
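If stale data files are accumulating, the parallel engine's orchadmin utility can inspect a data set and remove it together with its segment files; deleting the descriptor or the data files directly with rm leaves orphans behind. A rough sketch, assuming orchadmin is on the PATH and the engine environment (dsenv, APT_CONFIG_FILE) is set up — "mydataset.ds" is a placeholder name:

```
orchadmin describe mydataset.ds   # show the data files behind the descriptor
orchadmin rm mydataset.ds         # remove the descriptor AND its data files
```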