Real Time Data Extract from OSI PI

Posted: Mon Jun 16, 2014 12:58 am
by neeraj
Hello,

We have a requirement to extract the real-time data available in the OSI PI system and generate files that the business will use for analysis.

We need to extract data every 20 minutes from the source system, i.e. OSI PI.

Please let me know which DataStage option or stage we should use to process the real-time data.

Regards
Neeraj

Posted: Mon Jun 16, 2014 1:01 am
by ray.wurlod
Every 20 minutes <> real time.

Please be clear about exactly what you want to do: capture the data in true real time, in near real time, or in batches 20 minutes or so apart.

DataStage can do all of those things, but I don't really feel like writing a major essay at this time.

Posted: Mon Jun 16, 2014 1:05 am
by neeraj
Hi,

For the initial release, the plan is to get the data in near real time; in the next release it should be real time.

I hope that answers your question.

Regards
neeraj

Posted: Mon Jun 16, 2014 5:15 am
by eostic
True real time in the DataStage ETL space means something quite different: it involves real-time transport mechanisms and protocols such as message queues, web services, and the like.

If every 20 minutes, or even every 30 seconds, is your objective, first focus on how to get data out of this source. Mini-batches will probably be your solution for now, executed on a timed or event-driven basis.
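
To make the timed mini-batch idea concrete, here is a minimal Python sketch of how a scheduler might work out which 20-minute window each run should extract. It is only an illustration, not DataStage or OSI PI code: `window_for`, `run_batch`, and the `extract` callback are hypothetical names, and the 20-minute interval is simply the figure quoted in this thread.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 20 minutes, the interval mentioned in the thread.
BATCH_INTERVAL = timedelta(minutes=20)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_for(now, interval=BATCH_INTERVAL):
    """Return (start, end) of the most recently completed interval at `now`.

    Windows are aligned to the Unix epoch, so consecutive runs produce
    adjacent, non-overlapping windows even if a run starts a little late.
    """
    completed = (now - EPOCH) // interval  # whole intervals elapsed (int)
    end = EPOCH + completed * interval
    return end - interval, end


def run_batch(now, extract):
    """Call a hypothetical extract(start, end) for the window ending before `now`."""
    start, end = window_for(now)
    return extract(start, end)
```

Each scheduled run would pass the current time and whatever extraction routine reads from the source. Because the window is derived from the clock rather than from the previous run, a delayed start does not shift or overlap the ranges.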

Ernie

Posted: Mon Jun 16, 2014 5:04 pm
by ray.wurlod
As Ernie noted, true real time involves additional technologies such as message queues, which are written to at source and read from by "always on" DataStage jobs.
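
The "always on" pattern can be sketched in a few lines of Python. This is not how a DataStage job is built; it uses the standard library's in-process `queue.Queue` as a stand-in for a real message queue, and `always_on_consumer`, `demo`, and the sample readings are hypothetical names and values chosen for illustration.

```python
import queue
import threading

STOP = object()  # sentinel used only so the demo can shut the consumer down


def always_on_consumer(q, handle):
    """Block on the queue and handle each message as soon as it arrives."""
    while True:
        msg = q.get()  # waits until the source writes something
        if msg is STOP:
            break
        handle(msg)


def demo():
    q = queue.Queue()
    received = []
    worker = threading.Thread(target=always_on_consumer, args=(q, received.append))
    worker.start()
    # The "source" writes readings to the queue as they occur (made-up values).
    for reading in [("tag1", 1.0), ("tag1", 1.5)]:
        q.put(reading)
    q.put(STOP)
    worker.join()
    return received
```

The point of the sketch is that the consumer never polls on a schedule: it sits waiting and processes each message on arrival, which is what separates this approach from batches run every 20 minutes.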

Near real time, or mini-batches, might use changed data capture technologies, with changes being read from (for example) transaction logs and transmitted by some means to DataStage.
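
As a rough illustration of the changed data capture idea, the Python sketch below reads only the log entries that appeared since the last run, tracking a sequence-number watermark. The log layout (a list of dicts with a `seq` field) and the name `read_changes` are assumptions made for the example, not the format of any real transaction log.

```python
def read_changes(log, last_seq):
    """Return entries newer than `last_seq`, plus the new watermark.

    `log` is assumed to be an iterable of dicts carrying a monotonically
    increasing "seq" number (a hypothetical layout).
    """
    changes = [entry for entry in log if entry["seq"] > last_seq]
    new_last = max((entry["seq"] for entry in changes), default=last_seq)
    return changes, new_last
```

Each mini-batch would call this with the watermark saved by the previous run, pass the returned changes downstream, and store the new watermark for next time.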