pgp decryption / streaming data using SOA

Dedicated to DataStage and DataStage TX editions featuring IBM<sup>®</sup> Service-Oriented Architectures.

Moderators: chulett, rschirm

asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Ok, first, I have to explain the problem in somewhat general terms because of security constraints on what I'm allowed to post.

Here's the situation:

A PGP-encrypted file containing confidential data (several million records) is being sent to a server running a Cast Iron appliance (some sort of encryption / semi-ETL tool, I think!) that grabs the file and archives it, still encrypted, in a secure location on an internal server.

DataStage then needs to pick up the file from that server, do some minor transformations on the data (so data must be decrypted) and load it into an Oracle table (encrypted database) - without landing any of the data anywhere other than the target table. In other words, no staging of the file, just stream the data from the remote encrypted file to the remote Oracle table.

On top of that, I have to be able to say that "yes it can be done" and "here's how we are going to do it" practically overnight. There are some regulatory reasons the application has to be slammed into place over the next several days and we must let management know for certain that DataStage is the right tool for the job.

The reason that I'm posting in the SOA forum is that one of the technical staff believes that he can write a Cast Iron Orchestration (custom app?) that will sit out there as a SOA service that will decrypt the file and send us the data if we send it the appropriate info (path, password, etc.). I'm assuming the data would then hit DataStage in an XML format from the service.

I'm pretty sure from scanning documents today (including a quick pass at Ernie's blog), that a SOA-enabled DataStage job can be written to invoke that external service, receive the data, and stuff it straight into the Oracle table.

-However-

With half the staff already out for the holiday weekend - I have to make a statement to management about feasibility without the luxury of building a working test job first.

-So-

1) Anyone out there see any problem with this approach?
2) Are there any "gotchas" I should be aware of / research?
3) If it will work - any requirements I need to send to the guy doing the Cast Iron part of it?
4) Any alternatives that spring to mind in case Mr. Cast Iron can't get the SOA service to work? I was thinking of maybe using SSH to grab and decrypt the file, but again, no time to test!

Any input appreciated. Thanks!
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So... no pressure, eh Andy? :?

Are you saying that you have no shot at getting the original pgp file from the server before this "Cast Iron" thing puts it securely away somewhere you can't get at? Or is that the ssh approach?

Nothing is jumping out at me in the "it can't be done" category. If you can get that service to deliver the unencrypted file contents to you, the rest should be fairly straightforward... I would think. Without them decrypting it for you, however, I'm not sure how you'd do it on the fly without landing anything. And just as an FYI, SOA / web service does not automatically mean XML.

Gah, reading back over this it is... disjointed... and not really all that helpful. Sorry, long day. Hopefully Ernie will be along and provide some actually useful pointers on the SOA / WISD approach.
-craig

"You can never have too many knives" -- Logan Nine Fingers
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I probably don't have a chance of getting the file before the Cast Iron appliance grabs it and stores it. This was already an emergency project and our piece is really a last-minute solution to an unanticipated problem. Considering we need to have a working solution ready early next week, they are asking us to use what is already in place (i.e., the Cast Iron archive).

And yes, I was thinking of using SSH to maybe grab the pgp encrypted file from the source. However, I've always used SSH to grab the file and land it before, so that solution needs tweaking as well. I'm thinking maybe using the decode stage or the external stage might be a solution that could call an SSH command in some sort of a script to stream the data into the job without landing it.
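A minimal sketch of that streaming idea, untested against a real archive: a small program chains a fetch command into a decrypt command and copies the result to its output in chunks, so nothing is landed on disk. The function name `stream_pipeline`, the host, and the path are hypothetical; passphrase and key handling for `gpg` is deliberately left out and would need to be worked out with the security team.

```python
import io
import subprocess
import sys


def stream_pipeline(fetch_cmd, decrypt_cmd, sink, chunk_size=64 * 1024):
    """Run fetch_cmd | decrypt_cmd and copy the output to sink in chunks.

    Data moves only through OS pipes and one in-memory buffer of at most
    chunk_size bytes; nothing is written to disk. Returns bytes copied.
    """
    fetch = subprocess.Popen(fetch_cmd, stdout=subprocess.PIPE)
    decrypt = subprocess.Popen(
        decrypt_cmd, stdin=fetch.stdout, stdout=subprocess.PIPE
    )
    # Close our copy so fetch receives SIGPIPE if decrypt exits early.
    fetch.stdout.close()

    total = 0
    while True:
        chunk = decrypt.stdout.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    decrypt.stdout.close()

    decrypt_rc = decrypt.wait()
    fetch_rc = fetch.wait()
    if fetch_rc != 0 or decrypt_rc != 0:
        raise RuntimeError(
            f"pipeline failed: fetch={fetch_rc}, decrypt={decrypt_rc}"
        )
    return total


def demo():
    """Exercise the plumbing with harmless stand-in commands (no ssh, no gpg)."""
    fetch = [sys.executable, "-c",
             "import sys; sys.stdout.write('a,b\\n1,2\\n')"]
    decrypt = [sys.executable, "-c",
               "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    sink = io.BytesIO()
    stream_pipeline(fetch, decrypt, sink)
    return sink.getvalue()


if __name__ == "__main__":
    # Hypothetical real invocation; host, path and key setup are placeholders.
    # stream_pipeline(
    #     ["ssh", "user@archive-host", "cat", "/path/to/file.pgp"],
    #     ["gpg", "--batch", "--decrypt"],
    #     sys.stdout.buffer,
    # )
    print(demo())
```

A wrapper like this writes the decrypted rows to standard output, which is the kind of program a stage that reads from an external command could call. Whether that satisfies the "no landing" requirement still depends on the security review, since the cleartext does pass through memory on the DataStage server.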
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

asorrell wrote: I'm thinking maybe using the decode stage or the external stage might be a solution that could call an SSH command in some sort of a script to stream the data into the job without landing it.
Not exactly all that PX-savvy yet but that does sound feasible to me.
-craig

"You can never have too many knives" -- Logan Nine Fingers
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Well, thanks to some quick reviews of Mr. Ostic's blogs, I've been able to set up the job for the WSDL invocation and get it to work using some test services on the web. At this point I know I can do the DataStage piece of the project. I guess I have to assume the Cast Iron piece will work as advertised...
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

[crosses fingers]
-craig

"You can never have too many knives" -- Logan Nine Fingers
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Hi Andy

Sorry I wasn't able to chime in earlier. How did it go? I like your idea of the web/SOA approach, except I'd be concerned about volumes of data... how many records are in this encrypted packet? Several K? Several MB? Hundreds of MB?

Assuming the CastIron team can get the service working, be sure that they keep their service very simple...just rows and columns....you will also need to study how to do an "array" with WSPack (lots of entries in here and one at my blog). There's no ISD involved, so it should be straightforward. Have your SOAP testing tools ready.
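To illustrate what "just rows and columns" might look like, here is a rough sketch of a flat payload with one repeating element per record, consumed one row at a time. The element names (`rows`, `row`, `id`, `name`) and the function `iter_rows` are made up for the example; the real response shape depends on the WSDL the Cast Iron team publishes, and this is plain Python, not what WSPack does internally.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical payload: a flat, repeating "row" element with simple columns.
SAMPLE = b"""<rows>
  <row><id>1</id><name>alpha</name></row>
  <row><id>2</id><name>beta</name></row>
</rows>"""


def iter_rows(xml_stream, row_tag="row"):
    """Yield each repeating row element as a dict of column name -> text."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == row_tag:
            record = {child.tag: (child.text or "") for child in elem}
            yield record
            # Drop the parsed children so memory use stays roughly flat.
            elem.clear()


if __name__ == "__main__":
    for row in iter_rows(io.BytesIO(SAMPLE)):
        print(row)
```

The flatter the structure, the simpler the array mapping on the receiving side; nested or optional groups inside each row make it harder to map to columns.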

I like the External Source idea too, in case the Web Service doesn't turn out... assuming you have some external program that can do the retrieval and decryption on the fly.

Good luck and keep us posted!

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Just to close out this thread: it worked, but we are hamstrung by the fact that I can't get Cast Iron to break the file up into rows. The setup we are being told to use returns a single 500 MB chunk of text. Cast Iron can do it, but so far no luck with DataStage: it can handle 100 MB, but beyond that we get Java out-of-memory errors. Still working with customer service to see what, if anything, can be done.
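For reference, the general technique that avoids this kind of memory blow-up is to read the payload in fixed-size blocks and hand off complete records as they appear, instead of holding the whole text at once. The sketch below assumes newline-delimited records and uses a hypothetical helper name, `iter_records`; it does not change how DataStage itself buffers a web service response, so it would have to live in the service or in an intermediary program.

```python
import io


def iter_records(stream, block_size=1024 * 1024, delimiter=b"\n"):
    """Yield delimiter-separated records from a binary stream.

    Only one block plus any partial trailing record is held in memory.
    """
    pending = b""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        pending += block
        # Everything before the last delimiter is complete; keep the rest.
        *complete, pending = pending.split(delimiter)
        for record in complete:
            yield record
    if pending:
        # Final record with no trailing delimiter.
        yield pending


if __name__ == "__main__":
    data = io.BytesIO(b"a,1\nb,2\nc,3")
    for rec in iter_records(data, block_size=4):
        print(rec)
```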
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020