Shell script that can be used for HA set-up



kristof31
Participant
Posts: 3
Joined: Wed Oct 10, 2007 3:58 am


Post by kristof31 »

Hi All,

I would like to ask whether anybody here has been able to create a shell script that can be used for an HA implementation of DataStage.
The script should detect when Server 1 is down and then make Server 2 active, so that any DataStage processes that had not finished on Server 1 are continued on Server 2.

Hoping for some replies as soon as possible.

Thanks
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I think you're going to be pretty much on your own with this one. We did something like this many years ago for a Compaq Tru64 cluster with shared EMC storage, but even then there was no "continuing". Everything died and then was restarted on the failover node. And that was simplified by being Server only; I would imagine PX complicates it.

What kind of hardware setup are we talking about here? Isn't there an actual "HA" version of DataStage nowadays? I remember some discussions of it here but don't recall if it has materialized yet or not. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
kristof31
Participant
Posts: 3
Joined: Wed Oct 10, 2007 3:58 am

Post by kristof31 »

Actually, we were supposed to use Linux HA software to handle the HA functionality of the solution, but we were not able to get a Linux HA software resource/expert to implement it. That's why we are now looking at creating a shell script that can detect a failure and trigger the failover; in short, a script that can handle the 'HA functionality' of the solution.

We will be using 2 RHEL servers for this and we'll have a SAN shared disk for the storage. :)
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

First you have to determine what "down" means to you and your server.

It could be missing mounts, hung processes, SSH not working, or oversaturation of the server (too busy).
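To make that concrete, here is a rough sketch of how a node could report its own status for two of those conditions (a missing mount and an overloaded box), plus an SSH probe the other node could use. The function names, the status strings and the MOUNTS_FILE/LOADAVG_FILE overrides are all made up for illustration. Hung processes are not covered, since what counts as "hung" depends on your jobs.

```shell
# Sketch only: every name here is hypothetical, adjust to your environment.

# Return 0 if the given directory is currently a mount point.
# Field 2 of /proc/mounts is the mount point.
is_mounted() {
    awk -v m="$1" '$2 == m { found = 1 } END { exit !found }' \
        "${MOUNTS_FILE:-/proc/mounts}"
}

# Return 0 if the 1-minute load average is below the given limit.
load_ok() {
    load=$(cut -d' ' -f1 "${LOADAVG_FILE:-/proc/loadavg}")
    awk -v l="$load" -v max="$1" 'BEGIN { exit !(l + 0 < max + 0) }'
}

# Return 0 if the host accepts an SSH connection within 5 seconds.
# BatchMode=yes stops ssh from hanging on a password prompt.
ssh_ok() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Print one status word for the local node: ok, mount-missing or overloaded.
node_status() {
    mount_point=$1
    max_load=$2
    if ! is_mounted "$mount_point"; then
        echo "mount-missing"
    elif ! load_ok "$max_load"; then
        echo "overloaded"
    else
        echo "ok"
    fi
}
```

For example, `node_status /shared/projects 8` run on the active node prints `ok` only when the shared mount is present and the load is under 8. The standby could run `ssh_ok serverA` first and treat a failure there as "unreachable". The path, the limit and the host name are placeholders.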

Are you looking to load balance your Engine Tier, or just to fail over from your active server to your passive server?

You also have to keep in mind that the Engine Tier server name is incorporated into your project name as found in your metadata repository. So when you "fail over" from ServerA to ServerB, your project still exists in the repository under ServerA. You would "clone" the project and associate it with ServerB, but the metadata would not be shared between the two servers. That means if your sequencer failed on job 4 out of 5, the restart on ServerB would not know that and would restart from the top.

You can trick things out by setting up a virtual host name, say "Server", that points to the active server. So "Server" points to "ServerA" (the active one). If ServerB detects that ServerA is not communicating with it, it can initiate a failover: shut down processes on A, unmount shared mounts, mount them on B, start up the engine on B, etc. Since the metadata only knows "Server", you can start your process back up on B no prob. The project directory must be accessible by both servers (hopefully only one at a time) because that UniVerse database has a big role to play in that shared metadata.
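The failover sequence described here could be sketched roughly as below, run on ServerB. Treat everything in it as an assumption to check against your own install: the host, device and mount names are placeholders, the floating IP is just one way to make the virtual host name follow the active node, and the engine stop/start lines assume the usual `uv -admin` commands under $DSHOME. It defaults to a dry run that only prints what it would do.

```shell
# Sketch only: all values below are placeholders, not taken from a real setup.
ACTIVE_HOST=${ACTIVE_HOST:-serverA}
SHARED_DEV=${SHARED_DEV:-/dev/mapper/ds_shared}
SHARED_MOUNT=${SHARED_MOUNT:-/shared/datastage}
VIP_CIDR=${VIP_CIDR:-192.0.2.10/24}     # address behind the virtual host name
VIP_IFACE=${VIP_IFACE:-eth0}
DSHOME=${DSHOME:-/opt/IBM/InformationServer/Server/DSEngine}
ENGINE_STOP=${ENGINE_STOP:-". $DSHOME/dsenv && $DSHOME/bin/uv -admin -stop"}
ENGINE_START=${ENGINE_START:-". $DSHOME/dsenv && $DSHOME/bin/uv -admin -start"}
DRY_RUN=${DRY_RUN:-1}                   # 1 = print commands, do not execute

# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

failover() {
    # 1. Best effort on the old active node: stop the engine, release the
    #    disk and the address. If A is really dead this fails, and we go on.
    #    Note: if A is alive but unreachable, both nodes may end up using
    #    the disk, which is why cluster software adds fencing.
    run ssh -o BatchMode=yes -o ConnectTimeout=5 "$ACTIVE_HOST" \
        "$ENGINE_STOP; umount $SHARED_MOUNT; ip addr del $VIP_CIDR dev $VIP_IFACE" \
        || echo "warning: could not clean up $ACTIVE_HOST" >&2

    # 2. Take over the shared storage on this node.
    run mount "$SHARED_DEV" "$SHARED_MOUNT" || return 1

    # 3. Bring the virtual address up here so "Server" resolves to this node.
    run ip addr add "$VIP_CIDR" dev "$VIP_IFACE" || return 1

    # 4. Start the engine on this node.
    run sh -c "$ENGINE_START" || return 1
}
```

Calling `failover` with DRY_RUN=1 prints the four steps in order, which is a cheap way to review them before letting the script touch anything. Jobs are not resumed by this; they still have to be restarted once the engine is up on B.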

Off the shelf products exist for Clustering Linux boxes and handling failovers.

I don't want to drop names for that software but you can google it.

The DataStage setup we have has HA (active/passive) UDB, WebSphere and Engine Tier. We have a 4-tier setup. We also run a GRID.

So yes, your Linux DataStage environment can be HA in an active/passive setup. You can also set up multiple Engine Tiers, but know that a project is directly associated with ONE Engine Server. You can have two projects with two names and load them up with the same jobs, but each is a separate instance of those jobs, with separate metadata.

Network Attached Storage would be a good way to share the engine binaries; just make sure you unmount them from a server when it's the passive one. You do NOT want to have the active and passive servers both communicating with your repository and acting upon the same project. You'll be in a world of pain.
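One way to enforce that rule is a small guard that each node runs with its current role. This is only a sketch: the function name, the role strings and the DRY_RUN/MOUNTS_FILE variables are invented for illustration, and how a node learns its role is left out.

```shell
# Sketch only: names and defaults are hypothetical.
MOUNTS_FILE=${MOUNTS_FILE:-/proc/mounts}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the umount instead of running it

# Return 0 if the given directory is currently a mount point.
is_mounted() {
    awk -v m="$1" '$2 == m { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# role is "active" or "passive".
# A passive node must not hold the shared mount; an active node must.
enforce_mount_role() {
    role=$1
    mount_point=$2
    if [ "$role" = "passive" ] && is_mounted "$mount_point"; then
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run: umount $mount_point"
        else
            umount "$mount_point"
        fi
    elif [ "$role" = "active" ] && ! is_mounted "$mount_point"; then
        echo "error: active node is missing $mount_point" >&2
        return 1
    fi
}
```

For example, `enforce_mount_role passive /shared/engine` on the standby releases the mount if it is still attached, and `enforce_mount_role active /shared/engine` fails loudly instead of letting the engine start without its binaries. The path is a placeholder.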
kristof31
Participant
Posts: 3
Joined: Wed Oct 10, 2007 3:58 am

Post by kristof31 »

Load balancing is out of scope; we're just implementing HA so that server 2 takes over when server 1 fails.

'Down' means that the server has been shut down or has had a hardware problem, etc.

I'm already aware of the concept, but I'm not confident enough about how to create the shell script that will do that.
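As a possible starting point for the detection part, here is a minimal sketch of a watcher for the standby server. It treats the peer as down only after several probes fail in a row, so a single lost packet does not trigger a failover. The function names and the ping-based probe are assumptions (the `-W` timeout flag is the Linux ping one); any command that exits 0 while the peer is healthy can be used as the probe.

```shell
# Sketch only: names are hypothetical.

# Default probe: one ICMP echo with a 2 second timeout (Linux ping).
peer_alive() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Usage: wait_until_down MAX_FAILS INTERVAL_SECONDS PROBE [PROBE_ARGS...]
# Blocks until the probe has failed MAX_FAILS times in a row, then returns 0.
wait_until_down() {
    max_fails=$1
    interval=$2
    shift 2
    fails=0
    while [ "$fails" -lt "$max_fails" ]; do
        if "$@"; then
            fails=0                      # peer answered, reset the counter
        else
            fails=$((fails + 1))
            echo "probe failed ($fails/$max_fails)"
        fi
        if [ "$fails" -lt "$max_fails" ]; then
            sleep "$interval"
        fi
    done
    echo "peer considered down"
}
```

Something like `wait_until_down 3 10 peer_alive serverA` would return after roughly 30 seconds of silence from serverA, at which point the script can call whatever takeover routine is in place. Ping only shows that the host is off the network from the standby's point of view, so a network fault between the two servers looks the same as a shutdown.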