Hi.
I've been working with QualityStage for some years, but
only a few days ago did I discover a big limitation on using a data file within a single QS project.
PROBLEM
The same data file can't be used, inside the same project, as both File A and File B in two different jobs;
for example, as File A in an undup stage and as File B in a geomatch stage.
MY SOLUTIONS
1] Create 2 distinct projects, one for the undup job and another for the geomatch job. Then I need to copy (or move) the common file into the Data directory of the second project.
2] Create a new data file, identical to the common one, and use it in the geomatch job, for example. Then I need to create, on the filesystem, a symbolic link to the real file, named after the new data file (the one used in the geomatch job).
Please, can someone suggest another way?
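For anyone who wants to script the link step in solution 2, here is a minimal sketch in Python. The names INPUT and INPUT_B and the helper link_alias are hypothetical, chosen only for illustration; it simply assumes the real file and its alias live in the same Data directory, and it says nothing about how QualityStage itself treats the link.

```python
import os
import tempfile


def link_alias(data_dir, real_name, alias_name):
    """Create alias_name in data_dir as a symbolic link to real_name.

    Hypothetical helper: file names and layout are assumptions.
    """
    alias_path = os.path.join(data_dir, alias_name)
    if os.path.islink(alias_path):
        # Replace a stale link left over from an earlier run.
        os.remove(alias_path)
    elif os.path.exists(alias_path):
        # Never delete a real data file by accident.
        raise FileExistsError(alias_path + " exists and is not a symbolic link")
    # A relative target is resolved from the link's own directory,
    # so the pair keeps working if the whole directory is moved.
    os.symlink(real_name, alias_path)
    return alias_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        with open(os.path.join(data_dir, "INPUT"), "w") as f:
            f.write("record 1\n")
        alias = link_alias(data_dir, "INPUT", "INPUT_B")
        with open(alias) as f:
            print(f.read(), end="")  # same content as the real file
```

Both names then point at the same bytes on disk, so there is only one physical copy to maintain.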
Thanks,
Andrea
QS: limitation using datafile
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard! :D
Are you running your QualityStage jobs independently or from DataStage? If the latter, why not have it split the source data into two streams for the separate QualityStage jobs? OK, it's two copies of the data, but they are in memory rather than in two physical files.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Hi Ray :D
ray.wurlod wrote:
Welcome aboard! :D
Are you running your QualityStage jobs independently or from DataStage. If the latter, why not have it split the source data into two streams for the separate QualityStage jobs? OK it's two copies of the data, but it's in memory rather than two physical files.
I'm running QS independently.
I'm not familiar with running QS from DS; I've only tried it a couple of times.
But I think the problem will persist with your method too!
The problem is with the deploy information, and that information does not depend on how you run the QS job, does it?
The problem is not the actual data used, but how the file is used: as "File A" rather than "File B".
Many thanks :D
ponzio wrote:
The problem is with the deploy informations
....
The problem is not the actual data used, but how to use it "File A" rather than "File B"....
To be precise, if the name of the data file is INPUT (for example), the deploy will create the file INPUT.DIC in the DIC directory under the project directory.
Consider 2 jobs that use that file: an undup job and a geomatch job.
INPUT is the reference file (File B) of the geomatch job, and it is the only input file (File A) of the undup job.
We have 2 jobs but only 1 file INPUT.DIC!!
So if we deploy the undup job first, the deploy of the geomatch job will overwrite the INPUT.DIC created for the undup job.
If the geomatch job is deployed first, the deploy of the undup job will overwrite the INPUT.DIC created for the geomatch job.
The difference between the 2 versions of the file is the line
FILE ${DATAA}
in the file created for the undup job, versus
FILE ${DATAB}
in the file created for the geomatch job.
This line indicates the role the file plays when it is used in a job.
Different problems will occur depending on the deploy order and the running order of the 2 jobs.
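To make the overwrite concrete, here is a toy model in Python. It is not QualityStage's actual deploy: the functions deploy and role_line are hypothetical, and the real INPUT.DIC contains more than the single FILE line modelled here. It only shows that two deploys writing to the same file name leave behind whichever role was deployed last.

```python
import os
import tempfile


def deploy(dic_dir, data_file, role):
    """Toy stand-in for a deploy: write <data_file>.DIC with the file's role.

    role is "A" (File A) or "B" (File B). Only the FILE line is modelled.
    """
    variable = {"A": "${DATAA}", "B": "${DATAB}"}[role]
    path = os.path.join(dic_dir, data_file + ".DIC")
    # Opening with "w" truncates any dictionary left by an earlier deploy.
    with open(path, "w") as f:
        f.write("FILE " + variable + "\n")
    return path


def role_line(dic_dir, data_file):
    """Return the FILE line currently stored in <data_file>.DIC."""
    with open(os.path.join(dic_dir, data_file + ".DIC")) as f:
        return f.readline().strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dic_dir:
        deploy(dic_dir, "INPUT", "A")        # undup job: INPUT is File A
        deploy(dic_dir, "INPUT", "B")        # geomatch job: INPUT is File B
        print(role_line(dic_dir, "INPUT"))   # prints: FILE ${DATAB}
        print(os.listdir(dic_dir))           # only one INPUT.DIC exists
```

Swapping the two deploy calls leaves FILE ${DATAA} instead, which matches the order dependence described above.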
(I don't have an immediate answer. I shall give it some thought.) Meanwhile, if you post on Developer Net you may get an earlier response.
After this discovery, I read this passage in the QS documentation file MatchConcepts.pdf:
Record Linkage Projects
You assign each record linkage application a project name. This
project name is used in all of the steps of the linkage except for data
dictionary creation
That passage confirms what we saw in the files.