Connection Error

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

FYI - fixtool doesn't think anything is wrong with UV_USERS, even at level 10. :?

Question - if I can get ahold of an SA tomorrow to walk them through the FIX option, do I absolutely need to get all DS users disconnected first?
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Technically no, but they may be thwarted from doing anything DataStage/SQL while the utilities have UV_USERS locked. Further, since it's a dynamic hashed file, the bad information might be loaded in memory, and overwrite our fix when the file is closed. Therefore it's better to have no DataStage clients connected or jobs running while the repair is in progress. This should only need a few minutes.
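A pre-repair check along these lines can be scripted. The following is a minimal sketch, assuming a Unix host where `ps -ef` is available and that client connections show up under the agent process names mentioned elsewhere in this thread (dsapi_server, dsapi_slave). The function names are hypothetical, and the check does not detect running jobs, so treat it as a first filter rather than proof that the system is idle.

```python
import subprocess

# Agent process names as mentioned in this thread; an assumption for your platform.
AGENT_NAMES = ("dsapi_server", "dsapi_slave")


def list_processes():
    """Return the lines of `ps -ef` output (Unix only)."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def connected_agents(ps_lines):
    """Return the process-listing lines that mention a DataStage agent process."""
    return [line for line in ps_lines if any(name in line for name in AGENT_NAMES)]


def safe_to_repair(ps_lines):
    """True when no agent process appears in the listing."""
    return not connected_agents(ps_lines)
```

Calling `safe_to_repair(list_processes())` just before the repair gives a quick yes/no; `connected_agents` shows which sessions still need to be disconnected.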

If fixtool with -fix doesn't fix things, we might consider using filepeek to reset the current modulus of the file to 1 (which the size of DATA.30 suggests it ought to be).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

To close this out, I did end up using filepeek to correct the problems with this and a handful of other tables. Not all by myself, mind you, but under the steady hand of a very nice Second Level support lady from Ascential named Elaine who walked me through all the 'poking' of corrected values into the proper places on the hashes we found that were horked up. Interesting stuff.

Happy to say all is now right with the world... well, at least with this world. Still got a messed up Version Control project as part of all this, but that's a bedtime story for another night. :wink:
-craig

ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Editing bit values in a structure is a very dangerous thing to do - don't even contemplate attempting it without expert assistance such as Craig had.

You can do a lot of damage if you don't know what you are doing.

You can do even more damage if you do know what you are doing! :twisted:

(BTW, I teach this stuff in the UniVerse world.)
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Called in yet another level of support today to resolve the horked up Version Control project I mentioned in the previous post. No idea how many of you have had the pleasure of working with Karen Powers, but she is the Big Gun they call in when the front line troops get stumped and they need it fixed... now. Been probably four years since last we spoke, but spent some time today with both Karen and Elaine working out the last of my corruption kinks.

I knew there were issues with the APM* hashed files that make a project a Version Control project, but we thought we had them all fixed on Monday. However, with only about 5% of what should show up actually showing up, we knew something was still wrong.

Turns out the header blocks in several of the system hashed files had been corrupted, apparently overwritten with information from different hashed files. :shock: We had repaired the header information with filepeek, but it turned out we had missed one step. After correcting the header block values for Current / Base Modulus and Next Split, we needed to RESIZE the file. Resizing the hashed file (using its current type of 30) basically rehashed the key information against the now-corrected header block, correcting the issues that still lurked there. Once all the APM hashed files were made whole and could be 'joined' again, so to speak, the Version Control software could properly display all of the versioned components in the project.
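To see why a rehash can still be needed after the header values are corrected, here is a minimal sketch of textbook linear hashing. It is not UniVerse's actual on-disk algorithm: the `Header` class, the 0-based numbering, and the function names are all hypothetical, chosen only to mirror the Base Modulus / Next Split / Current Modulus terms used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Hypothetical stand-in for the header fields named in the post."""
    base_modulus: int  # group count at the start of the current doubling round
    next_split: int    # 0-based index of the next group due to be split

    @property
    def current_modulus(self) -> int:
        return self.base_modulus + self.next_split


def group_for(key_hash: int, header: Header) -> int:
    """Textbook linear-hashing address calculation (0-based group number)."""
    group = key_hash % header.base_modulus
    if group < header.next_split:
        # This group was already split in the current round,
        # so the key is addressed with the doubled modulus instead.
        group = key_hash % (2 * header.base_modulus)
    return group


def misplaced_keys(key_hashes, written_with: Header, read_with: Header):
    """Return the hashes whose group differs between the two headers."""
    return [h for h in key_hashes
            if group_for(h, written_with) != group_for(h, read_with)]
```

In this toy model, any key filed while the header held one set of values sits in a group that a lookup under different values never visits. Rewriting every key under the corrected header, which is what a resize amounts to here, puts the records back where lookups expect them.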

As noted, not something you'd want to try on your own but something I thought I would share. It was quite interesting to see some of the Secrets of the Inner Sanctum. :lol:
-craig

ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

We spoke with Karen at the Las Vegas event last November, if you recall.

Overwriting of dynamic hashed file headers has been seen in the past when the T30FILE table got exactly full - without generating any errors - but I thought that had been fixed.
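Since the failure described here was silent, a monitor that warns before the table fills is one possible guard. This is a minimal sketch only: it assumes the configured T30FILE limit and the number of dynamic files currently open are obtained elsewhere and passed in, and the function name and the 90% warning threshold are hypothetical.

```python
def table_status(limit, in_use, warn_fraction=0.9):
    """Classify usage of a fixed-size table as 'ok', 'warning' or 'full'."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if in_use >= limit:
        return "full"
    if in_use >= warn_fraction * limit:
        return "warning"
    return "ok"
```

Alerting on "warning" rather than "full" leaves time to act before the table reaches the exactly-full condition.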

It could, in theory, also occur if some inadvertent redirection of file units occurred, such as between agent processes dsapi_server and dsapi_slave. Again, there are - as far as I am aware - safeguards in the source code nowadays to prevent that from happening. But they may have missed one - or introduced one - I guess.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Must have been a different we; I wasn't able to go last November. 2004 yes, 2005 no. :cry:

Don't know about the T30FILE table issue; I got the impression from Karen it was more a dsapi_server / slave issue. They're checking for me to see if it is 'officially' fixed in the 7.5.2 release, but for now there is a patch available for 7.5.1A systems that have exhibited the problem.
-craig

jamesm
Participant
Posts: 1
Joined: Mon Mar 25, 2002 5:04 pm
Location: Australia

Post by jamesm »

We had the same error message, with the same symptoms as described in the post, using DS 7.5A.

Support was able to get us back up and running by copying the UV_USERS file from another server over the top of the corrupted one. (There was no backed-up version.) We have been running for the last two days without issue.

There is also a patch coming for this version.

Cheers
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

We patched up just after this and have been running without issue since. Guess it's time to mark this as resolved. :wink:
-craig

ray.wurlod
Participant
Posts: 54595
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

jamesm wrote: (There was no backed up version). We have been running for the last two days without issue.

There is also a patch coming for this version.

Cheers


And the moral of THIS story is... ?

(I very much doubt that there will be a patch coming to cover the situation where you have no backup. Though the good folks at IBM will be more than happy to sell you some "high availability" hardware, it is still best practice to institute a backup regime and to have an auditable disaster recovery plan.)
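As one small piece of such a regime, here is a minimal sketch of a verified copy. It assumes the hashed file is a directory on disk (the thread mentions DATA.30 inside it) and that nothing has the file open while it is copied. The function names and the label-based directory layout are hypothetical, and this is no substitute for a proper backup tool.

```python
import hashlib
import shutil
from pathlib import Path


def file_digests(root: Path) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def backup_directory(source: Path, backup_root: Path, label: str) -> Path:
    """Copy `source` to backup_root/label/<source name> and verify the copy."""
    target = backup_root / label / source.name
    shutil.copytree(source, target)  # raises FileExistsError if target exists
    if file_digests(source) != file_digests(target):
        raise RuntimeError(f"backup of {source} failed verification")
    return target
```

Passing a date as `label` keeps each run in its own directory, and the checksum comparison gives at least some evidence for the "auditable" part.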