BigIntegrate Questions

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dsuser_cai
Premium Member
Posts: 151
Joined: Fri Feb 13, 2009 4:19 pm

Post by dsuser_cai »

Hi

We have BI 11.5 on Linux, on the Hortonworks Data Platform. We are building a large data lake in Hadoop. I have a few questions about how BI 11.5 works. I couldn't find answers to these on Google, so could you please shed some light?

1. Hive queries - If I run a Hive query in interactive mode through PuTTY, the default execution engine is Tez, but we can change it to the MapReduce engine. How does BigIntegrate work? For example, when I use a Hive stage, does it run on the Tez or the MR engine? What about when I use the File Connector stage to build a Hive table?
2. Dynamic configuration file - Can someone provide an IBM link with details on what the dynamic configuration file is and how it differs from the traditional DS configuration file?
3. Is there a way to start a Tez session and keep it open, run all my Hive queries, and then close the Tez session once they have all completed? I would like to do this in a BI job. For example, I want to open the Tez session using an Execute Command stage (shell script), run all my jobs that use Hive loads and extractions, and finally close the Tez session. I don't know whether BI uses Tez in the background, but when I run Hive SQL statements they run on Tez, so I'm assuming BI would also use Tez in the background. I may be completely wrong.

So any advice on these would be very helpful.
Thanks
Karthick
Timato
Participant
Posts: 24
Joined: Tue Sep 30, 2014 10:51 pm

Post by Timato »

1) AFAIK the Hive stage is merely a wrapper for the Hive JDBC connector. My recollection is that execution is deferred to the default engine for your environment (doesn't switching to MR require restarting Hive anyway?). If you're using the File Connector to create a Hive table, I don't imagine it would need to invoke an MR/Tez job at all; it should be restricted to the name node/metastore.
2) The DS Knowledge Center pages are OK, but I found this IBM document floating around on LinkedIn to be massively useful:
https://app.box.com/s/b0wonh8vv5bn8g8eaaj76cy7deui27cx (specifically page 30 onwards for your question)
(credit: https://www.linkedin.com/pulse/informat ... -malhotra/)
3) Sorry, no idea on this one. You may not have that level of granular control within DataStage/BI.
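
On point 1, one way to take the guesswork out of which engine a connection uses is to state it explicitly. Hive exposes the engine as the hive.execution.engine property (values mr, tez, spark), and a HiveServer2 JDBC URL accepts Hive configuration settings after a "?". Below is a minimal Python sketch that builds such a URL and the statements to set and then echo the engine in a session. The host name and both helper function names are hypothetical, and whether the DataStage Hive stage passes these URL settings through unchanged is an assumption you would need to verify in your environment.

```python
# Sketch only: helper names and the host are illustrative, not part of any IBM or Hive API.

VALID_ENGINES = ("mr", "tez", "spark")  # values documented for hive.execution.engine


def hive_jdbc_url(host, port=10000, database="default", hive_conf=None):
    """Build a HiveServer2 JDBC URL.

    Hive configuration settings are appended after '?', separated by ';'.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if hive_conf:
        conf = ";".join(f"{key}={value}" for key, value in hive_conf.items())
        url += "?" + conf
    return url


def engine_statements(engine):
    """Return statements that set the engine for a session, then echo it back."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown execution engine: {engine!r}")
    return [
        f"SET hive.execution.engine={engine}",  # session-level override
        "SET hive.execution.engine",            # reports the value now in effect
    ]


if __name__ == "__main__":
    # hiveserver.example.com is a placeholder host.
    print(hive_jdbc_url("hiveserver.example.com",
                        hive_conf={"hive.execution.engine": "tez"}))
    for statement in engine_statements("tez"):
        print(statement + ";")
```

Running the second statement on its own from Beeline, or as a before-SQL statement if your stage supports one, is a quick way to confirm which engine a given connection ends up with. Note that a cluster administrator can restrict which properties may be changed per session, so the override may be rejected.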