Delete specific records from hashfile / hashed-file

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Delete specific records from hashfile / hashed-file

Post by kausmone »

Hi there,

I have a flat-file A and a hashed file B. Records in A are a subset of records in B. I need to delete those records in the hashfile B that are present in flat-file A. Can anyone help me on how to go about it in a server job/routine?

Thanks,
Kaus
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Use a UV stage with user-defined SQL containing a DELETE statement such as

Code: Select all

DELETE FROM tablename WHERE key = ?
Provide the key value from your text file.

If the hashed file has been created in a directory you will need a VOC pointer to it. Search the forum for the SETFILE command.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Did you mean the UniVerse stage? I tried using the stage but am not sure what is to be filled in the Data Source Name etc. I tried filling in the name of the hashed-file (images_metadata_hashed) but it didn't work. I did a view data by putting in a SELECT SQL and got the following error:

Code: Select all

JB_SVTT_Delete_HashedFile_UV..UniVerse_1: DSD.BCIConnect connecting to images_metadata_hashed, call to SQLConnect failed.
SQLSTATE=IM002, DBMS.CODE=0
[DataStage][SQL Client][ODBC][DataDirect][ODBC lib] Data source name not found and no default driver specified
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

The data source name needs to be localuv which is pre-defined in your uvodbc.config file. User name and password will be disabled, because localuv is a pre-established connection to the DataStage project itself.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Thanks Ray. I tried with localuv after confirming that it is defined in the uvodbc.config file. I gave the hashed-file name in the Table Name field of the Outputs-> General tab. Now I'm getting error "table does not exist"

Code: Select all

JB_SVTT_Delete_HashedFile_UV..UniVerse_1.DSLink2: DSD.BCIOpenR call to SQLExecDirect failed.
Statement was:SELECT "ship_code", "doc_key", "piece_id", "event_key", "data_source_type_id", "coda_timestamp", "etl_timestamp", "pcdoc_sent_timestamp" FROM "images_metadata_hashed" 
SQLSTATE=S0002, DBMS.CODE=950390
[DataStage][SQL Client][UNIVERSE]DataStage/SQL: Table "images_metadata_hashed" does not exist.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Your hashed file must be pathed, i.e. created in a directory. In that case you'd need to create a VOC record pointer to it before you can use a UV stage over the hashed file.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Ah, ok, gotcha. I tried creating the original hashed file by using account name and when I accessed this using UV stage (with a SELECT query), it worked without any problems.

Thanks for your help Ray and Craig!

-kaus
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Another thought - why not logically delete the records? Rather than futz with all this, add a column and update it for the 'deleted' records. Check for the flag whenever you access the hashed file, treat it like a hashed miss when you find that it is set.

That will keep your account based hashed file out of your 'account' aka Project and put it back in the original directory without any extra shenanigans.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

I had to rule that out because the size of my hashed-file would keep on increasing with logical deletes. My process might encounter about a million records a day.

Now, when I try deleting, I am giving the query as below (user-defined) but it is not working. What is the correct way of specifying the input column? I am getting event_key and doc_key from my flat-file

Code: Select all

DELETE FROM "images_metadata_hashed" WHERE "event_key" = :event_key AND "doc_key" = :doc_key;
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

First generate sql to get the syntax right, then tweak from there. For example 'Replace existing rows completely' includes a delete. I'll let others chime in on the space issue but I don't believe deletes like this will actually get you any space back. :?

Not sure because I've never had a need to do this, I'll rebuild a hashed file before I'd ever try deleting anything from it.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Thanks for the hint :)

For the record, the query needed to be:

Code: Select all

DELETE FROM "images_metadata_hashed" WHERE ("doc_key" = ? AND "event_key" = ?);
Can someone throw more light on whether or not I will be able to reclaim space this way?
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

kausmone wrote: Can someone throw more light on whether or not I will be able to reclaim space this way?
Why dont you check it out yourself by running 'ANALYZE.FILE images_metadata_hashed' command on your hashed file before and after your delete and check out the size.
This command can be run from the administrator.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

Cool, thanks. Will do
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You do not reclaim space through DELETE operations. Free space is preserved in hashed files against the likelihood that it will be needed again in the future.

The only command that can reclaim space (apart from deleting and re-creating the hashed file) is RESIZE.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kausmone
Participant
Posts: 39
Joined: Fri Sep 21, 2007 1:47 am
Location: Prague

Post by kausmone »

If the freed space is going to be reclaimed by the hashed-file, it still serves the purpose, since my hashed-file is going to get appended to every hour or so. I think I should stick to this approach of deleting records 'physically' instead of logical deletes, where there will be no reclaimable space?
Post Reply