Reading the records from the sorted dataset using Transformer

ravikiran2712
Participant
Posts: 38
Joined: Thu Nov 04, 2004 10:36 am

Reading the records from the sorted dataset using Transformer

Post by ravikiran2712 »

I am unable to work out the logic in the Transformer stage for retrieving the first 200 records from data sorted on two fields, say store and date. I need to pass through only the first 200 records for a particular store on a particular date.
Thanks in advance.
subhashinijeyaraman
Participant
Posts: 4
Joined: Thu Oct 14, 2004 12:54 am

Re: reading the records from the sorted dataset using transfor

Post by subhashinijeyaraman »

ravikiran2712 wrote: I am unable to work out the logic in the Transformer stage for retrieving the first 200 records from data sorted on two fields, say store and date. I need to pass through only the first 200 records for a particular store on a particular date.
Thanks in advance.
Hi

For this logic you can use either the Column Generator stage or the Surrogate Key Generator stage. It will generate a sequence number for each record. Then, in a Transformer stage, you can define a constraint like linkname.columnname <= 200. That will give you the first 200 records.
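
To make the behaviour of this approach concrete, here is a minimal Python sketch (not DataStage code) of numbering rows and filtering on the number. The function name `first_n_overall` and the sample rows are assumptions made up for illustration. Note that the sequence number runs across the whole input, so the filter keeps the first 200 rows overall, not 200 per store/date group.

```python
def first_n_overall(rows, limit=200, start=1):
    """Number rows from `start` and keep those whose number is <= limit."""
    kept = []
    for seq, row in enumerate(rows, start=start):
        # Plays the role of the constraint linkname.columnname <= 200
        if seq <= limit:
            kept.append(row)
    return kept


# Hypothetical sample data: 300 rows for store S1, then 300 rows for store S2.
rows = [("S1", "2004-11-01", i) for i in range(300)]
rows += [("S2", "2004-11-01", i) for i in range(300)]

result = first_n_overall(rows)
stores_kept = {row[0] for row in result}
```

With this sample, every kept row belongs to store S1 and store S2 contributes nothing. Starting the numbering at 0 instead of 1 would let 201 rows through the same constraint.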
subhashinijeyaraman
Participant
Posts: 4
Joined: Thu Oct 14, 2004 12:54 am

Re: reading the records from the sorted dataset using transfor

Post by subhashinijeyaraman »

Hi

If you are using the Column Generator stage, you have to edit the metadata in the output columns tab of that stage to set the initial value to 1. Otherwise it will start from 0 by default.
ravikiran2712
Participant
Posts: 38
Joined: Thu Nov 04, 2004 10:36 am

reading the records

Post by ravikiran2712 »

Hi subhashini,
Thank you for your reply. My situation is that I have to read a dataset that is sorted on two key fields, date and store, and take the first 200 records of each date/store group. The stages I have are:

Dataset (sorted on date/store) --------> Transformer

I want to embed logic in the Transformer that outputs the first 200 records of each date/store group in the dataset. The problem is that after reading 200 records of the first group, I have to move on to the next group rather than exit. I understand your logic for the first 200 records of a dataset, but I need the first 200 records of each group in the dataset.
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Re: reading the records

Post by T42 »

Do a search on here for counters, and be creative.
lakshmipriya
Participant
Posts: 31
Joined: Tue Jul 13, 2004 5:26 am
Location: chennai

Post by lakshmipriya »

You can use stage variables effectively for this. Counter logic can be incorporated in stage variables, and the counter can then be used in a constraint, which might give you a better solution.
Lakshmi
dsxdev
Participant
Posts: 92
Joined: Mon Sep 20, 2004 8:37 am

Post by dsxdev »

hi,
There is a simpler solution if you are ready to forgo a bit of performance and change the job design.

Your job looks like this.

Dataset ->Sort stage(sort keys:date and Store, options: create Key Change column=true; preserve partitioning=set)-> Transformer(run in sequential mode)->output

In the Transformer, define a stage variable:
sgKeyChange = If Input.clusterKeyChange = 1 Then 1 Else sgKeyChange + 1

On the output links put a constraint
sgKeyChange<=200

Then for each date/store combination only 200 records will be output.
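
The counter logic above can be sketched in Python to show why it yields at most 200 rows per group. This is an illustration only, not DataStage code: the name `first_n_per_group`, the `key` argument, and the sample data are assumptions, and the key-change flag is computed here by comparing with the previous key instead of being supplied by a Sort stage.

```python
def first_n_per_group(sorted_rows, key, limit=200):
    """Keep at most `limit` rows per key from input already sorted on that key."""
    kept = []
    counter = 0
    prev_key = object()  # sentinel that never equals a real key
    for row in sorted_rows:
        current_key = key(row)
        key_change = current_key != prev_key
        # Same shape as: If keyChange = 1 Then 1 Else counter + 1
        counter = 1 if key_change else counter + 1
        prev_key = current_key
        # Same role as the output link constraint: counter <= 200
        if counter <= limit:
            kept.append(row)
    return kept


# Hypothetical sample data, sorted on (date, store): groups of 250, 50 and 300 rows.
rows = [("2004-11-01", "S1", i) for i in range(250)]
rows += [("2004-11-01", "S2", i) for i in range(50)]
rows += [("2004-11-02", "S1", i) for i in range(300)]

result = first_n_per_group(rows, key=lambda r: (r[0], r[1]))
```

The counter restarts at 1 on the first row of every date/store group, so processing continues into the next group after the limit is reached instead of stopping. This only works if all rows of a group arrive together and in order, which is why the job above sorts first and runs the Transformer sequentially.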
Happy DataStaging
ravikiran2712
Participant
Posts: 38
Joined: Thu Nov 04, 2004 10:36 am

Thanks guys

Post by ravikiran2712 »

Thanks guys,
This is my first query, and I am very happy with the response. I will continue posting queries.
ravikiran2712
Participant
Posts: 38
Joined: Thu Nov 04, 2004 10:36 am

Post by ravikiran2712 »

Hey guys,
Do you have any ideas that do not use the Sort stage? The Sort stage takes nearly three quarters of the job's total run time.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Surely if, as your original post suggests, you have sorted data, you don't need to sort it again in DataStage.

You can still use stage variables for counters even though the data aren't sorted, but all you get are blocks of 200 with no relationship to the meaning in the data.
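
As a rough illustration of that point, here is a small Python sketch (the helper name `blocks` and the sample rows are made up for the example) that cuts unsorted input into blocks of 200 with a plain counter. Each block simply contains whatever rows happened to arrive next, so stores are mixed within a block.

```python
from itertools import islice


def blocks(rows, size=200):
    """Yield consecutive blocks of up to `size` rows, ignoring any keys."""
    it = iter(rows)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


# Hypothetical unsorted input: rows for stores S1 and S2 arrive interleaved.
rows = [("S1" if i % 2 == 0 else "S2", "2004-11-01", i) for i in range(500)]

block_list = list(blocks(rows))
block_sizes = [len(b) for b in block_list]
stores_in_first_block = {row[0] for row in block_list[0]}
```

Here the first block of 200 holds rows from both stores, so it says nothing about the first 200 rows of either one.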

What's the requirement for 200 anyway? At a push you could write to tape with a blocking factor of 200, and set up a pseudo tape device per partition that writes to a text file instead! :lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.