Search found 3045 matches

by vmcburney
Wed Nov 23, 2005 8:36 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: error "Null value on accessor interfacing to field"
Replies: 4
Views: 3182

One place you can get into problems with null fields is within the Transformer. Find out what source field has nulls in it and then add null handling for that field (such as NullToValue) in the Transformer. Do not try to use that field in any derivation function until it has been null handled.
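
As a rough illustration of the idea only (plain Python, not DataStage Transformer syntax): the helper `null_to_value`, the derivation `derive_code` and the sample rows below are all hypothetical. They mimic replacing a null with a default before the field reaches any derivation function.

```python
from typing import Optional


def null_to_value(value: Optional[str], default: str) -> str:
    """Return `default` when the input is null (None), otherwise the input."""
    return default if value is None else value


def derive_code(raw_code: Optional[str]) -> str:
    """Hypothetical derivation: null-handle the field first, then use it."""
    safe_code = null_to_value(raw_code, "UNKNOWN")
    # Only the null-handled value is passed to string functions.
    return safe_code.strip().upper()


# Hypothetical source rows; the second one carries a null.
rows = [" ab12 ", None, "cd34"]
derived = [derive_code(r) for r in rows]
print(derived)
```

Calling `.strip()` directly on the raw field would fail on the null row, which is the same class of problem the post warns about.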
by vmcburney
Wed Nov 23, 2005 5:15 pm
Forum: IBM® Infosphere DataStage Server Edition
Topic: How to connect to DataStage for test automation
Replies: 11
Views: 5741

We are struggling to understand because the type of test automation you are trying to get is something most of us have never tried, nor would ever try. The cost benefit of automating the running of DataStage jobs through a test tool is dubious. This is not an end user GUI tool where test automation e...
by vmcburney
Wed Nov 23, 2005 10:54 am
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: How can we handle unstructured data?
Replies: 7
Views: 3471

There was a session on unstructured data at IILive2005. IBM have various projects going ahead using a unified unstructured information management architecture (UIMA). They have software research efforts looking at multi-media, taxonomy generation, translation, search, text analysis, applications and...
by vmcburney
Tue Nov 22, 2005 11:20 pm
Forum: IBM® Infosphere DataStage Server Edition
Topic: How to handle different DB schemas for Dev, Test and Prod
Replies: 6
Views: 1446

Avoid anything that requires a modification and recompile of your job in testing or production. Any job property that you need to change to get it to work in testing should be changed to a job parameter. This includes database details and file paths. You can change job parameters without recompiling ...
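
The same principle can be sketched outside DataStage in plain Python. Everything named here (`JobParams`, `ENVIRONMENTS`, the schema, server and path values, `build_extract_sql`) is hypothetical and only shows environment-specific details being passed in as parameters instead of being hard-coded in the job logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobParams:
    """Values that differ between environments (all hypothetical)."""
    db_schema: str
    db_server: str
    source_dir: str


# Hypothetical per-environment settings, supplied at run time.
ENVIRONMENTS = {
    "dev": JobParams("DW_DEV", "devdb01", "/data/dev/in"),
    "test": JobParams("DW_TEST", "testdb01", "/data/test/in"),
    "prod": JobParams("DW_PROD", "proddb01", "/data/prod/in"),
}


def build_extract_sql(params: JobParams, table: str) -> str:
    """The job logic never names a schema itself; it only reads the parameter."""
    return f"SELECT * FROM {params.db_schema}.{table}"


for env_name, params in ENVIRONMENTS.items():
    print(env_name, build_extract_sql(params, "CUSTOMER"))
```

Moving from one environment to the next then means choosing a different parameter set; the logic itself is untouched.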
by vmcburney
Tue Nov 22, 2005 6:05 pm
Forum: IBM® DataStage TX
Topic: Clearcase/Datastage
Replies: 6
Views: 3671

This manual import/export and check-in/check-out is why I recommend using ClearCase only for production-ready or UAT-ready components, and not for the more frequent versions that get submitted to functional or regression testing projects.
by vmcburney
Tue Nov 22, 2005 4:48 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: PX Aggregator Stage(Last & First Values)
Replies: 5
Views: 1705

I'm assuming the first and last functions are on the same field as the sum function. Since both the remove duplicates and aggregation stages need sorted data, it will be more efficient to have a sort stage followed by a copy, with one path going to remove duplicates and the other to aggregation. That way you ...
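
The sort-once-then-fork idea can be sketched in plain Python (an analogy, not DataStage stages). The column layout `(key, sequence, amount)`, the sample rows and both function names are hypothetical. The data is sorted a single time, and the same sorted list feeds both the first/last path and the sum path before they are joined on the key.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input rows: (key, sequence, amount)
rows = [("A", 3, 10), ("B", 1, 5), ("A", 1, 7), ("B", 2, 8), ("A", 2, 4)]

# Sort once on key and sequence; both paths reuse this result.
sorted_rows = sorted(rows, key=itemgetter(0, 1))


def first_and_last(data):
    """Path 1: first and last amount per key (relies on the sort order)."""
    out = {}
    for key, group in groupby(data, key=itemgetter(0)):
        group_rows = list(group)
        out[key] = (group_rows[0][2], group_rows[-1][2])
    return out


def sum_amounts(data):
    """Path 2: total amount per key."""
    return {key: sum(r[2] for r in group)
            for key, group in groupby(data, key=itemgetter(0))}


edges = first_and_last(sorted_rows)
totals = sum_amounts(sorted_rows)

# Join the two paths back together on the key: (first, last, total).
result = {key: (*edges[key], totals[key]) for key in totals}
print(result)
```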
by vmcburney
Tue Nov 22, 2005 4:41 pm
Forum: Data Integration
Topic: Consultant adamant in not using SDLC for data warehouse dev
Replies: 7
Views: 14487

chinek wrote: ps vince, he's from another consulting co.

Thought so. We are not allowed near a client site without waving our Proven Course Methodology or our MIKE DW methodology around. There are a very large number of DW projects out there that crashed and burnt because they didn't have a good methodology.
by vmcburney
Tue Nov 22, 2005 2:25 pm
Forum: Data Integration
Topic: Consultant adamant in not using SDLC for data warehouse dev
Replies: 7
Views: 14487

There is some discussion on the spiral or iterative approach to project management over on GanttHead. There seems to be some agreement that scope creep can be a problem when you are constantly going for customer feedback. It can be hard to keep to a project schedule when you don't know what the end...
by vmcburney
Tue Nov 22, 2005 1:52 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: Oracle Enterprise Stage
Replies: 1
Views: 616

You are using Enterprise Edition, so you don't want this SQL select to become a bottleneck, i.e. you don't want a million dollars' worth of ETL hardware twiddling its thumbs while your Oracle source spends hours trying to run a large and complex SQL statement. So one objective is to use SQL that comes back quic...
by vmcburney
Tue Nov 22, 2005 1:41 pm
Forum: IBM® DataStage TX
Topic: Clearcase/Datastage
Replies: 6
Views: 3671

Consider using ClearCase for production-ready releases and not for the traffic between development and testing. It can be useful to get all parts of a release into ClearCase, including the database changes, documentation and environment changes. If you didn't mind burning up disk space you could also...
by vmcburney
Mon Nov 21, 2005 11:46 pm
Forum: Data Integration
Topic: Consultant adamant in not using SDLC for data warehouse dev
Replies: 7
Views: 14487

Sounds like a perfect consulting project. He can spend as long as he likes doing whatever he wants! There is no scope or requirements so he doesn't need to produce anything. There is no documentation so you cannot check what he is doing or replace him. I think a prototype works best if you assume you...
by vmcburney
Mon Nov 21, 2005 10:48 pm
Forum:
Topic: Version Control and Metastage
Replies: 6
Views: 2272

The VERSION repository saves information into the job long description field so there is some metadata there. It also saves different job versions with different suffixes so if you import the VERSION metadata the job name will give some version information. Beyond that I don't know where the batch n...
by vmcburney
Mon Nov 21, 2005 5:03 pm
Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
Topic: PX Aggregator Stage(Last & First Values)
Replies: 5
Views: 1705

I was afraid you were going to say that. You could use good old min and max although these may give you different results. You could split the primary stream sending one set through remove duplicates for first and last and the other set through aggregation for sum, then join them together again. You...
by vmcburney
Mon Nov 21, 2005 4:48 pm
Forum: IBM® DataStage TX
Topic: Need for DS TX
Replies: 10
Views: 6354

DataStage EE and Server are mainly used to handle relational data. If they have a complex data source such as a complex flat file or XML they usually have to convert it to flat relational records in order to transform them. TX is better with complex data such as Electronic Data Interchange (EDI), He...
by vmcburney
Sat Nov 19, 2005 12:18 am
Forum: IBM® Infosphere DataStage Server Edition
Topic: How to connect to DataStage for test automation
Replies: 11
Views: 5741

Running jobs has always been the easy part of testing; verifying results has always been the hard part. Automation of jobs can be done via the API but all that time would probably give you very little benefit. It's having a good regression testing strategy with static test data and a tool to verify ...
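
One way to picture result verification against static test data is the plain-Python sketch below. The post is cut off before it names a tool, so this is not that tool: `diff_rows` and the sample rows are hypothetical. It compares the rows a job produced with a stored expected baseline and reports what is missing and what is unexpected.

```python
from collections import Counter


def diff_rows(expected, actual):
    """Compare two row sets as multisets, so duplicate rows are counted too."""
    exp, act = Counter(expected), Counter(actual)
    return {
        "missing": sorted((exp - act).elements()),      # in baseline, not in output
        "unexpected": sorted((act - exp).elements()),   # in output, not in baseline
    }


# Hypothetical baseline from static test data, and a job output to verify.
expected_rows = [("1", "Alice"), ("2", "Bob"), ("3", "Carol")]
actual_rows = [("1", "Alice"), ("2", "Bobby"), ("3", "Carol")]

report = diff_rows(expected_rows, actual_rows)
print(report)
```

An empty `missing` list together with an empty `unexpected` list would mean the run matches the baseline.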