Creating config for accessing Teradata

bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

I am reposting this message (from viewtopic.php?t=92057) to see if I can stir up some more info:

We are trying to resolve the following warning message from DS:

Teradata_Enterprise_0: There will be some skew in the usage of the server because the number of players (3) is not evenly divided by the number of available nodes (8). [terareadop.C:439]

How is the number of players related to the number of nodes? We have a number of different config files with node definitions ranging between 8 and 12 nodes. Here's a sample:

Code:

{
    node "myserver" {
        fastname "myserver"
        pool ""
        resource disk "/mydir/DataStage_work/1" {}
        resource scratchdisk "/mydir/DataStage_scratch/1" {}
    }
    node "myserver-2" {
        fastname "myserver"
        pool ""
        resource disk "/mydir/DataStage_work/2" {}
        resource scratchdisk "/mydir/DataStage_scratch/2" {}
    }

    ... etc., etc.,
    node "myserver-8" {
        fastname "myserver"
        pool ""
        resource disk "/mydir/DataStage_work/3" {}
        resource scratchdisk "/mydir/DataStage_scratch/3" {}
    }
}

So we have more than 3 nodes defined, but only 3 players are being used. No matter which configuration I use, only 3 players are used and I get the same warning every time.

I've looked at the documentation (both the PJob and Mngr) before and again now, but I don't see what I am missing. And unfortunately, there isn't anything Teradata-specific in the config file docs.
Last edited by bcarlson on Fri Apr 15, 2005 10:05 am, edited 1 time in total.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

I assume that Teradata Enterprise, working in parallel with DataStage, wants one player interacting with one node. So it's trying to tell you that only 3 of your nodes will be used to read Teradata data. Have you tried it with a configuration file with just three nodes?

It's just a warning message. You can probably ignore it. You may be repartitioning your data anyway in a subsequent stage or using your remaining nodes for upstream processing.
nrevezzo
Participant
Posts: 15
Joined: Mon Sep 08, 2003 2:36 pm

Post by nrevezzo »

You need to look at the Orchestrate documentation to get info on this. You would code the sessionsperplayer and requestedsessions parameters on the connection string in the Teradata Enterprise stage, but you have to select the user-defined connection string option to do so.
Here's a snippet from the Orchestrate Operator's Reference manual.

Chapter 37, The Teradata Interface Library: The teraread Operator
Table 222: teraread Operator Options (continued)

Option: -dboptions
Use:
-dboptions
'{-user = username
-password = password
[-sessionsperplayer = nn]
[-requestedsessions = nn]}'

You must specify both the username and password with which you connect to Teradata.

The value of -sessionsperplayer determines the number of connections each player has to Teradata. Indirectly, it also determines the number of players. The number selected should be such that (sessionsperplayer * number of nodes * number of players per node) equals the total requested sessions. The default is 2.

Setting the value of -sessionsperplayer too low on a large system can result in so many players that the step fails due to insufficient resources. In that case, -sessionsperplayer should be increased.

The value of the optional -requestedsessions is a number between 1 and the number of vprocs in the database. The default is the maximum number of available sessions.
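
As a rough illustration of the relationship quoted above, the sketch below solves the manual's equation (sessionsperplayer * number of nodes * number of players per node = total requested sessions) for the players-per-node value. The function name and its arguments are hypothetical; this only restates the documented arithmetic and makes no claim about how the teraread operator is actually implemented.

```python
def players_per_node(requested_sessions, sessions_per_player, nodes):
    """Solve sessionsperplayer * nodes * players_per_node == requested_sessions.

    Returns the players-per-node value, or None when the three numbers
    cannot satisfy the manual's equation with a whole number of players.
    All names here are illustrative, not part of any DataStage API.
    """
    if requested_sessions < 1 or sessions_per_player < 1 or nodes < 1:
        raise ValueError("all arguments must be positive integers")
    sessions_per_player_slot = sessions_per_player * nodes
    if requested_sessions % sessions_per_player_slot != 0:
        return None  # no whole number of players per node fits
    return requested_sessions // sessions_per_player_slot


if __name__ == "__main__":
    print(players_per_node(16, 2, 8))  # 16 sessions, 2 per player, 8 nodes
    print(players_per_node(7, 2, 8))   # numbers that do not fit the equation
```

A None result suggests that -sessionsperplayer or -requestedsessions may need adjusting for the chosen node count.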
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

nrevezzo wrote: You need to look at the Orchestrate doc
and
nrevezzo wrote: This is from the Orchestrate Operator's Reference manual.
Are you referring to the old Torrent documentation (pre DataStage PX)? I looked at ours, and while I found a chapter entitled "Teradata Interface Library", it was not Chapter 37. Am I looking at the right document?

We are using DataStage 7.1r2, and I did not find any references to 'teraread' in the DS documentation.
richdhan
Premium Member
Posts: 364
Joined: Thu Feb 12, 2004 12:24 am

Post by richdhan »

Hi bcarlson,

I have both the Orchestrate 6.1 Operators Reference and the Orchestrate 7.0 Operators Reference. In both guides, this information is in the last chapter, named "The Teradata Interface Library".

Both PDFs have references to the teraread operator.

HTH
Rich
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Found the docs. Unfortunately, they do not come with the DataStage installation. We downloaded them from Ascential's website (http://www.ascentialsoftware.com/eservice/index.jsp, then click on Product Documentation on the left side of the page). They did help, thanks! We still need to update our config file, but the sessionsperplayer and requestedsessions options helped us get more read sessions to run: the test program ran with 7 partitions for the 7 amps on our database.
bcarlson
Premium Member
Posts: 772
Joined: Fri Oct 01, 2004 3:06 pm
Location: Minnesota

Post by bcarlson »

Thanks to all for your help. I also worked with some guys from my team that have more experience with the config files (they built the ones we use with DB2), and with Glenn Bryan from Ascential.

The fix is a mix of a config file change and an additional option for the Teradata Enterprise stage (undocumented in the DataStage docs, but documented in the Orchestrate 7.0 Operator Reference).

Here's a sample of the new config file:

Code:

{
    node "myserver" {
        fastname "myserver"
        pool ""
        resource disk "/u001/DataStage_work/1" {}
        resource scratchdisk "/u001/DataStage_scratch/1" {}
    }
    node "myserver-2" {
        fastname "myserver"
        pool ""
        resource disk "/u001/DataStage_work/2" {}
        resource scratchdisk "/u001/DataStage_scratch/2" {}
    }

    ... etc, through node "myserver-8"

    node "terapool1-1" {
        fastname "myserver"
        pool "terapool"
        resource disk "/u001/DataStage_work/1" {}
        resource scratchdisk "/u001/DataStage_scratch/1" {}
    }
    node "terapool1-2" {
        fastname "myserver"
        pool "terapool"
        resource disk "/u001/DataStage_work/2" {}
        resource scratchdisk "/u001/DataStage_scratch/2" {}
    }

    ... etc, through node "terapool1-7"

}
Recap: We had defined 8 nodes on the local server where DataStage runs, but we only had 7 amps in Teradata. DataStage tried to open 1 session per amp, but was limited to the default of 2 sessions per player. With 2 sessions per player, only 6 total sessions were created (3 players). With 8 DS nodes available for processing and only 3 players used, that is how we got the warning message.
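
The arithmetic in the recap can be sketched as a toy model. This is an inference from the numbers in this thread (7 amps, 2 sessions per player, 3 players, 6 sessions) and from the wording of the warning, not the operator's real logic; the function name, its arguments, and the rounding-down rule are all assumptions.

```python
def model_read(amps, sessions_per_player=2, nodes=8):
    """Toy model of the behavior described in the recap (an assumption).

    Returns (players, sessions, skew_warning).
    """
    if amps < 1 or sessions_per_player < 1 or nodes < 1:
        raise ValueError("all arguments must be positive integers")
    # One session per amp is requested, but only whole players are created,
    # so the player count is rounded down (assumed from 7 amps -> 3 players).
    players = amps // sessions_per_player
    sessions = players * sessions_per_player
    # The warning text says the number of players "is not evenly divided by"
    # the number of available nodes; modeled here as a simple remainder test.
    skew_warning = players % nodes != 0
    return players, sessions, skew_warning


if __name__ == "__main__":
    print(model_read(7))                                  # (3, 6, True)
    print(model_read(7, sessions_per_player=1, nodes=7))  # (7, 7, False)
```

Under this model, the default of 2 sessions per player reproduces the 3 players and 6 sessions seen in the job log, while 1 session per player against a 7-node pool gives one player per node and no skew.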

The solution: We defined a new pool (called terapool) and a new node group with 7 nodes, 1 DataStage node per Teradata amp (to keep data as free-flowing as possible). In my DataStage program, I then set the Teradata Enterprise stage to use the terapool pool. We also set the Additional Connection Options to sessionsperplayer=1 in the Teradata Enterprise options (Output/Properties/Connection/Additional Connection Options). By default, DataStage will attempt to create 1 session per Teradata amp. Setting sessionsperplayer=1 forces 1 player per node on the DataStage side. Works great, and gets rid of the warnings.
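
For anyone who wants to script a config file of this shape, here is a minimal sketch that prints default-pool nodes followed by a dedicated pool with one node per amp. The helper names, the default arguments, and especially the rule that each node's disk directory number matches its node number are assumptions made for illustration; the sample above elides the middle nodes, so adjust the paths to your own layout.

```python
def node_block(name, fastname, pool, index, base_dir):
    """Render one node entry in the style of the sample config."""
    return (
        f'    node "{name}" {{\n'
        f'        fastname "{fastname}"\n'
        f'        pool "{pool}"\n'
        f'        resource disk "{base_dir}/DataStage_work/{index}" {{}}\n'
        f'        resource scratchdisk "{base_dir}/DataStage_scratch/{index}" {{}}\n'
        f'    }}'
    )


def build_config(fastname="myserver", base_dir="/u001",
                 default_nodes=8, tera_nodes=7, tera_pool="terapool"):
    """Build config text: default-pool nodes, then one node per Teradata amp.

    Assumption: directory index == node number (not confirmed by the thread).
    """
    blocks = []
    for i in range(1, default_nodes + 1):
        # The first node carries the bare server name, as in the sample.
        name = fastname if i == 1 else f"{fastname}-{i}"
        blocks.append(node_block(name, fastname, "", i, base_dir))
    for i in range(1, tera_nodes + 1):
        blocks.append(
            node_block(f"{tera_pool}1-{i}", fastname, tera_pool, i, base_dir)
        )
    return "{\n" + "\n".join(blocks) + "\n}\n"


if __name__ == "__main__":
    print(build_config())
```

Setting tera_nodes to the number of amps keeps the pool size equal to the number of players when sessionsperplayer=1 is used.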

Thanks everyone!
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

And thanks to you and your associates for posting such a detailed answer, which means no-one else need ever be unable to solve this particular problem. :D
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vinaymanchinila
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

Do we need to control the number of sessions for the TD API stage?

Thanks,
Vinay
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yes, just as any Teradata client must. However, there are default values in DataStage; it is sufficient to rely on these unless you want to make use of more amps than the number of processing nodes in your DataStage configuration file.
vinaymanchinila
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

Hi Ray,

In the TD Enterprise stage, we can use DBOptions to set these values. Where do you think we do that for the TD API stage? In the config file? Where does it read them from?

Thanks,
Vinay
vinaymanchinila
Premium Member
Posts: 353
Joined: Wed Apr 06, 2005 8:45 am

Post by vinaymanchinila »

I was under the impression that the API stage does not invoke the MultiLoad or FastLoad process.
Thanks,
Vinay