I am currently reviewing some decade-old server jobs. Having worked with parallel jobs all this while, I'm running into a few conceptual roadblocks. Please help me understand the following:
1. There's a job that uses a UniVerse table as the reference for a lookup. Why would one do that, and why not use a hash file instead?
2. What exactly are UniVerse tables? I thought they were DataStage internal tables, but these jobs are creating tables in the uv database.
3. I see that there is no 'JOIN' stage in a server job. What, then, would be the ideal approach to joining two voluminous datasets?
4. DataSet is a parallel concept. What comes closest in nature to it in a server job? What are the advantages of using a hash file as an intermediate data store over a sequential file?