Search found 3045 matches
- Wed Nov 23, 2005 8:36 pm
- Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: error "Null value on accessor interfacing to field"
- Replies: 4
- Views: 3182
- Wed Nov 23, 2005 5:15 pm
- Forum: IBM® Infosphere DataStage Server Edition
- Topic: How to connect to DataStage for test automation
- Replies: 11
- Views: 5741
We are struggling to understand because the type of test automation you are trying to get is something most of us have never tried, nor would ever try. The cost-benefit of automating the running of DataStage jobs through a test tool is dubious. This is not an end-user GUI tool where test automation e...
- Wed Nov 23, 2005 10:54 am
- Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: How can we handle unstructured data?
- Replies: 7
- Views: 3471
There was a session on unstructured data at IILive2005. IBM has various projects going ahead using a unified unstructured information management architecture (UIMA). They have software research efforts looking at multi-media, taxonomy generation, translation, search, text analysis, applications and...
- Tue Nov 22, 2005 11:20 pm
- Forum: IBM® Infosphere DataStage Server Edition
- Topic: How to handle different DB schemas for Dev, Test and Prod
- Replies: 6
- Views: 1446
Avoid anything that requires a modification and recompile of your job in testing or production. Any hard-coded value that you need to change to get the job to work in testing should be changed to a job parameter. This includes database details and file paths. You can change job parameters without recompiling ...
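The advice in this excerpt amounts to: externalise every environment-specific value as a parameter, so promoting a job never requires a recompile. A minimal Python sketch of the idea, with hypothetical parameter names (DataStage itself would use job parameters supplied at run time):

```python
import os

# Hypothetical sketch: environment-specific values (database details,
# file paths) come from parameters/environment variables rather than
# hardcoded literals, so the same job runs unchanged in Dev, Test, Prod.
def job_settings(env=os.environ):
    return {
        "db_name": env.get("JOB_DB_NAME", "DEVDB"),         # Dev default
        "file_path": env.get("JOB_FILE_PATH", "/data/dev"),
    }

# Promoting to Test means changing parameters, not recompiling:
test_params = {"JOB_DB_NAME": "TESTDB", "JOB_FILE_PATH": "/data/test"}
print(job_settings(test_params))
```

The same pattern applies whatever the mechanism: the job logic stays fixed and only the parameter set differs per environment.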
- Tue Nov 22, 2005 6:05 pm
- Forum: IBM® DataStage TX
- Topic: Clearcase/Datastage
- Replies: 6
- Views: 3671
- Tue Nov 22, 2005 4:48 pm
- Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: PX Aggregator Stage(Last & First Values)
- Replies: 5
- Views: 1705
I'm assuming the first and last functions are on the same field as the sum function. Since both the remove duplicates and aggregation stages need sorted data, it will be more efficient to have a sort stage followed by a copy stage with one path to remove duplicates and the other to aggregation. That way you ...
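The sort-once design described here (one sort stage, then a copy stage fanning out to remove-duplicates and aggregation) can be sketched in plain Python, with hypothetical data and the stage logic approximated by `itertools.groupby`:

```python
from itertools import groupby

rows = [("B", 7), ("A", 5), ("A", 3), ("B", 1)]  # hypothetical (key, value) rows

# One sort stage up front serves both downstream paths.
rows.sort(key=lambda r: r[0])

# Copy stage, path 1: remove duplicates (first row per key survives).
dedup = {k: next(g) for k, g in groupby(rows, key=lambda r: r[0])}

# Copy stage, path 2: aggregation (sum per key).
totals = {k: sum(v for _, v in g) for k, g in groupby(rows, key=lambda r: r[0])}

print(dedup)   # {'A': ('A', 5), 'B': ('B', 7)}
print(totals)  # {'A': 8, 'B': 8}
```

Sorting once and copying the sorted stream avoids paying for two separate sorts, which is the efficiency point the post is making.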
- Tue Nov 22, 2005 4:41 pm
- Forum: Data Integration
- Topic: Consultant adamant in not using SDLC for data warehouse dev
- Replies: 7
- Views: 14487
chinek wrote: ps vince, he's from another consulting co.
Thought so. We are not allowed near a client site without waving our Proven Course Methodology or our MIKE DW methodology around. There are a very large number of DW projects out there that crashed and burnt because they didn't have a good methodology.
- Tue Nov 22, 2005 2:25 pm
- Forum: Data Integration
- Topic: Consultant adamant in not using SDLC for data warehouse dev
- Replies: 7
- Views: 14487
There is some discussion on the spiral or iterative approach to project management over on GanttHead. There seems to be some agreement that scope creep can be a problem when you are constantly going for customer feedback. It can be hard to keep to a project schedule when you don't know what the end...
- Tue Nov 22, 2005 1:52 pm
- Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: Oracle Enterprise Stage
- Replies: 1
- Views: 616
You are using enterprise edition so you don't want this SQL select to become a bottleneck, i.e. you don't want a million dollars worth of ETL hardware twiddling its thumbs while your Oracle source spends hours trying to run a large and complex SQL. So one objective is to use SQL that comes back quic...
- Tue Nov 22, 2005 1:41 pm
- Forum: IBM® DataStage TX
- Topic: Clearcase/Datastage
- Replies: 6
- Views: 3671
Consider using ClearCase for production ready releases and not for the traffic between development and testing. It can be useful to get all parts of a release into ClearCase including the database changes, documentation and environment changes. If you didn't mind burning up disk space you could also...
- Mon Nov 21, 2005 11:46 pm
- Forum: Data Integration
- Topic: Consultant adamant in not using SDLC for data warehouse dev
- Replies: 7
- Views: 14487
Sounds like a perfect consulting project. He can spend as long as he likes doing whatever he wants! There is no scope or requirements so he doesn't need to produce anything. There is no documentation so you cannot check what he is doing or replace him. I think a prototype works best if you assume you...
- Mon Nov 21, 2005 10:48 pm
- Forum:
- Topic: Version Control and Metastage
- Replies: 6
- Views: 2272
The VERSION repository saves information into the job long description field so there is some metadata there. It also saves different job versions with different suffixes so if you import the VERSION metadata the job name will give some version information. Beyond that I don't know where the batch n...
- Mon Nov 21, 2005 5:03 pm
- Forum: IBM® DataStage Enterprise Edition (Formerly Parallel Extender/PX)
- Topic: PX Aggregator Stage(Last & First Values)
- Replies: 5
- Views: 1705
I was afraid you were going to say that. You could use good old min and max although these may give you different results. You could split the primary stream sending one set through remove duplicates for first and last and the other set through aggregation for sum, then join them together again. You...
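The caveat about min/max giving different results can be seen in a few lines of Python: first/last depend on row order, min/max do not, and the two only coincide on data sorted by that field (hypothetical values):

```python
rows = [4, 1, 9, 2]  # hypothetical field values in arrival order

first, last = rows[0], rows[-1]   # order-dependent: 4 and 2
lo, hi = min(rows), max(rows)     # order-independent: 1 and 9
assert (first, last) != (lo, hi)

# Only on data sorted by this field do the two approaches coincide.
srt = sorted(rows)
assert (srt[0], srt[-1]) == (lo, hi)
```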
- Mon Nov 21, 2005 4:48 pm
- Forum: IBM® DataStage TX
- Topic: Need for DS TX
- Replies: 10
- Views: 6354
DataStage EE and Server are mainly used to handle relational data. If they have a complex data source such as a complex flat file or XML they usually have to convert it to flat relational records in order to transform them. TX is better with complex data such as Electronic Data Interchange (EDI), He...
- Sat Nov 19, 2005 12:18 am
- Forum: IBM® Infosphere DataStage Server Edition
- Topic: How to connect to DataStage for test automation
- Replies: 11
- Views: 5741
Running jobs has always been the easy part of testing, verifying results has always been the hard part. Automation of jobs can be done via the API but all that time would probably give you very little benefit. It's having a good regression testing strategy with static test data and a tool to verify ...
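The "verifying results" half can be as simple as diffing a job's output against a baseline captured from static test data. A hypothetical sketch (the function name and CSV format are illustrative only):

```python
import csv
import io

def diff_csv(baseline_text, output_text):
    """Return row-level differences between a saved baseline and a fresh job run."""
    baseline = list(csv.reader(io.StringIO(baseline_text)))
    output = list(csv.reader(io.StringIO(output_text)))
    # Rows that differ at the same position, then any extra/missing rows.
    diffs = [(i, b, o) for i, (b, o) in enumerate(zip(baseline, output)) if b != o]
    diffs += [("extra", None, r) for r in output[len(baseline):]]
    diffs += [("missing", r, None) for r in baseline[len(output):]]
    return diffs

assert diff_csv("a,1\nb,2\n", "a,1\nb,2\n") == []          # regression passes
assert diff_csv("a,1\n", "a,9\n") == [(0, ["a", "1"], ["a", "9"])]
```

With static test inputs, an empty diff list means the change under test did not alter job behaviour, which is the regression-testing strategy the post argues for.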