Loading metadata in stage using basic

Post questions here related to DataStage Server Edition, for such areas as Server job design, DS BASIC, Routines, Job Sequences, etc.


girishoak
Participant
Posts: 65
Joined: Wed Oct 29, 2003 3:54 am

Loading metadata in stage using basic

Post by girishoak »

Hi All,

I tried to search for something similar to what I am looking for.

Basically, I want to create a generic extraction job that can extract data from a given table. I want to use only five parameters: OracleDSN, OracleUser, OraclePassword, TableName (the table from which data is to be extracted) and TableName (the table into which data is to be loaded; I am using the Orabulk stage to create the ctl file).

Now, I want to load the column definitions into my Oracle stage, and to do that dynamically. I have all the table definitions I want to use in my metadata repository. The idea is that the job should dynamically pick up the metadata based on the table name.

Can anybody suggest a solution for this, or any workaround to get the same result?

Thanks

Girish Oak
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Quite simply, that's not how DataStage works. It can't be done with any tools that Ascential would regard as "legal", and doing it by other means is very difficult and may void your warranty.
Table definitions are link properties. If you change the properties in a job, you have to recompile the job so that the compiled program knows about the new properties.
Do you think you could write job control code that:
  • attaches a job,
  • determines the table name (names?),
  • extracts the table definition (presumably from your repository, otherwise your task is even more difficult),
  • detaches the job,
  • correctly updates certain records in the DS_JOBOBJECTS table,
  • compiles the job,
  • attaches the job again (what if it didn't compile?),
  • then runs it,
getting every little piece right?

Even if you could, changing the column definitions can destroy the various expressions that use the column names.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
girishoak
Participant
Posts: 65
Joined: Wed Oct 29, 2003 3:54 am

Post by girishoak »

Thanks Ray for your reply.

A few things I probably missed out are as follows:

1. Table definitions are stored in my repository.
2. Source and target tables have the same metadata.
3. I want to create a single job and run it as multiple instances.

I want to attach this metadata to the stage on the fly, but I don't know whether it is possible or not. If yes, how?

I hope this gives a clearer view of my exact requirement.
girishoak
Participant
Posts: 65
Joined: Wed Oct 29, 2003 3:54 am

Post by girishoak »

One more thing to add is "I don't need any expressions in between".
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Re: Loading metadata in stage using basic

Post by kcbland »

girishoak wrote: Can anybody suggest a solution for this, or any workaround to get the same result?
Visit http://www.asktom.oracle.com and search for his Pro*C program that is a bulk unloader. He has posted generic program code for mass-spooling the output of a SQL query to a file. It does not use the dreadful UTL_FILE package, so you are not limited to a local file system declared in the init.ora. It works across SQL*Net, so it's compatible with your current Oracle client and with earlier ones.

You can simply build a wrapper around this program to generate the appropriate query. Imagine: you run the wrapper script (perl, ksh, whatever) and give it an instance and a database object (view, table, whatever). It queries the catalog to get the columns, constructs an ordinal-based query (select cola, colb, colc, cold, cole from the passed object), and then runs the Pro*C program, giving it that query and the output file. This high-performance program will spool your output.

This is elementary stuff. Everyone needs to be doing bulk spooling as well as bulk loading, and you should have a generic wrapper script to manage both in your arsenal of tools. Kyte's code is really well written, and a must-have. I've just finished convincing a client of mine to switch from DataStage jobs that spool queries to files using OCI stages to BATCH/SEQUENCE jobs that run the command-line wrapper script for bulk unloading. Since a significant number of the queries are of the "select *" type, we've coded for an optional WHERE clause to be passed to the wrapper. For complicated joins, we will create views so that our spooler has an easy time. This will vastly improve spool time for Server-based jobs, and it also simplifies the whole mess of upgrading 8i jobs to 9i jobs, because then it doesn't matter.
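
A rough ksh sketch of the kind of wrapper Ken describes, just to make the idea concrete. The connect string, the optional WHERE clause and the unload_query command name are placeholders; Kyte's actual program and its arguments will differ.

Code: Select all

#!/bin/ksh
# Sketch of a generic unload wrapper: build an ordinal column list from the
# Oracle catalog, then hand the generated query to a bulk unloader.
CONNECT=$1          # e.g. scott/tiger@mydb
OBJECT=$2           # table or view to spool
WHERE=${3:-"1=1"}   # optional WHERE clause, as in Ken's wrapper

# Ask the catalog for the column names in positional order
COLS=$(sqlplus -s "$CONNECT" <<EOF | paste -s -d, -
set heading off feedback off pagesize 0
select column_name
from   user_tab_columns
where  table_name = upper('$OBJECT')
order  by column_id;
EOF
)

QUERY="select $COLS from $OBJECT where $WHERE"

# Hand the generated query and an output file to the unloader (hypothetical name)
unload_query "$CONNECT" "$QUERY" "${OBJECT}.dat"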
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
girishoak
Participant
Posts: 65
Joined: Wed Oct 29, 2003 3:54 am

Post by girishoak »

Hi Ken,

Can't we load the table definition on the fly in the Oracle stage? Also, I was looking for a "workaround in DataStage".

Looking forward to your response.
Girish Oak
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

girishoak wrote: Can't we load the table definition on the fly in the Oracle stage?
No, the product is about fixed metadata. Good or bad, that's the way it works.
girishoak wrote: Also, I was looking for a "workaround in DataStage".
Well, I just described how you could write your own "plugin"-type stage. Instead of an OCI stage, you have a Command stage execute a shell script layered around Kyte's Pro*C program. You won't have to have anything dynamic in the job, because the program takes care of it. You won't have to deal with a Transformer and a Sequential File stage with fixed metadata, as the program spools the data for you. And I believe Kyte even has extra logic to write the .ctl bulk-load file for you.
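
As a small illustration of that design, the Command stage would just call the wrapper with the job's own parameters; the wrapper script name below is made up for the example.

Code: Select all

# Command stage command line (sketch; unload_wrapper.ksh is a hypothetical name)
unload_wrapper.ksh '#OracleUser#/#OraclePassword#@#OracleDSN#' '#TableName#'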

So, someone out there has written exactly what you want, and it's free. Any DS solution is less than satisfactory. I'm sorry you feel the need to stay within the confines of the tool; however, I always take the approach that DS is one of the tools in my arsenal, not the only one. You're attempting to use a sledgehammer when a screwdriver is the more appropriate tool.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

girishoak wrote: Thanks Ray for your reply.

A few things I probably missed out are as follows:

1. Table definitions are stored in my repository.
2. Source and target tables have the same metadata.
3. I want to create a single job and run it as multiple instances.

I want to attach this metadata to the stage on the fly, but I don't know whether it is possible or not. If yes, how?

I hope this gives a clearer view of my exact requirement.
There is a non-metadata solution under these conditions, which indeed I have implemented on my current project (Red Brick, not Oracle, but it makes no difference).

The job design is as follows.

Code: Select all

HashedFile  --->  Transformer  --->  ODBC
The hashed file stage has to generate one row. I achieve this by setting the file name to DS_JOBS and the Selection to @ID = "\\NextNumber". The only column selected is @ID.
The only reason the row is needed is to fire the user-defined SQL in the ODBC stage.
The user-defined SQL in my job is conditional; sounds like yours isn't.

Code: Select all

INSERT INTO #TableName#_CURRENT
SELECT * FROM #TableName# WHERE CURRENT_IND = 'Y';
This job is multi-instance, and I choose to use the table name as the instance identifier.
The Transformer stage does nothing to the row being processed; including an explicit active stage is a habit I've gotten into; it means you can use the Monitor (no value in this case!).
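
To round out the picture, a multi-instance job like this is typically started once per table, with the table name doubling as the invocation id. A hypothetical dsjob loop (project and job names invented for the example):

Code: Select all

#!/bin/ksh
# Start one instance of the multi-instance job per table; the part after the
# dot is the invocation id. MyProject and CopyCurrentRows are made-up names.
for TBL in CUSTOMER PRODUCT ORDERS
do
    dsjob -run -mode NORMAL -param TableName=$TBL \
          -wait MyProject CopyCurrentRows.$TBL
done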
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
girishoak
Participant
Posts: 65
Joined: Wed Oct 29, 2003 3:54 am

Post by girishoak »

Hi Ray,

Thanks for your reply.

I have found a solution similar to your approach. I have created a shell script which takes parameters for the source table and the target table, and I am using a simple SQL command:
INSERT INTO tablename1 SELECT * FROM tablename2
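
For reference, a minimal sketch of such a script, assuming sqlplus is on the path; the connect string and table-name arguments are placeholders.

Code: Select all

#!/bin/ksh
# Copy all rows from a source table to a target table with the same metadata.
# Arguments: Oracle connect string, target table, source table.
CONNECT=$1
TARGET=$2
SOURCE=$3

sqlplus -s "$CONNECT" <<EOF
whenever sqlerror exit failure
insert into $TARGET select * from $SOURCE;
commit;
exit;
EOF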

I think the solution you have given is difficult to implement, I suppose, because unless you provide a data definition your job doesn't get compiled. Please correct me if I am wrong.

Thank you all for putting efforts in.

- Girish Oak
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Girish Oak

MetaRecon, or whatever it is called, does what you want to do, but it works by looking at metadata and generating a DSX. If you take a simple job which does a straight table copy and export it, you may be able to reverse engineer the DSX file. This would save a lot of work. It may not be as clean as you would like, because you end up with one job per table, but it would work.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

girishoak wrote: I think the solution you have given is difficult to implement, I suppose, because unless you provide a data definition your job doesn't get compiled. Please correct me if I am wrong.
- Girish Oak
My job design uses only one column. Its name and data type are largely irrelevant, though VarChar is safest. It works; we have it running here. The source stage extracts one row containing one column (from anywhere) which is used purely to trigger user-defined SQL in the target stage. That SQL does not make use of any parameter markers, so the incoming value is ignored by it.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.