Deleting descriptor file

srini.dw
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

Hello,

The descriptor file is generated in path /is/DEV/datasets/PX_BEA.ds

The data file is generated in path is/node1/dataset

i.e. BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7

My questions:

1. Can we go ahead and delete the data file without deleting the original descriptor file?

2. What is a good method to remove the data file: the orchadmin rm or the orchadmin delete command?

Thanks,
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

You seem to be slightly confused about which files are which in a parallel dataset.

The dataset descriptor is the file which points to the dataset segments, describes the characteristics of the dataset and is typically named with the .ds extension. In your situation, this is the file /is/DEV/datasets/PX_BEA.ds

The dataset segments contain the actual data and have names such as PX_BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7
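
To illustrate the naming relationship, here is a minimal Python sketch that recovers the descriptor's base name from a segment file name. It assumes, based only on the example in this thread, that a segment name begins with the descriptor's base name (ending in .ds) followed by further dot-separated fields. The function name descriptor_basename and that pattern are hypothetical, not a documented format.

```python
from typing import Optional


def descriptor_basename(segment_name: str, suffix: str = ".ds") -> Optional[str]:
    """Return the leading '<name>.ds' part of a segment file name, or None.

    Assumption: the segment name starts with the descriptor's base name,
    followed by a dot and more fields, as in the example from this thread.
    """
    marker = suffix + "."
    idx = segment_name.find(marker)
    if idx <= 0:
        # No '<name>.ds.' prefix found, so this does not look like a segment.
        return None
    return segment_name[: idx + len(suffix)]


segment = "PX_BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7"
print(descriptor_basename(segment))  # PX_BEA.ds
```

This only tells you which dataset a segment probably belongs to. The descriptor itself remains the authoritative list of segments.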

Please read Chapter 11: Managing data sets of the Parallel Job Developer Guide.

To answer your corrected questions:

1) Delete the dataset segments without deleting the dataset descriptor? Technically you can, but for what reason? The dataset is no longer valid once you delete a part of it.

2) What is a good method to remove a dataset segment? Use the orchadmin command (rm and delete are synonymous). This will delete the entire dataset (segments and descriptor). Otherwise you must do the deletion manually.
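
As a rough sketch of the orchadmin route, the Python wrapper below builds the rm command for a descriptor and, by default, only prints it (dry run). It assumes orchadmin is on the PATH and that APT_CONFIG_FILE points at the configuration the dataset was created with. The helper names build_rm_command and remove_dataset are hypothetical.

```python
import os
import shutil
import subprocess
from typing import List


def build_rm_command(descriptor_path: str) -> List[str]:
    # 'rm' and 'delete' are synonymous according to the post above.
    return ["orchadmin", "rm", descriptor_path]


def remove_dataset(descriptor_path: str, dry_run: bool = True) -> List[str]:
    """Remove a whole dataset (segments and descriptor) via orchadmin."""
    cmd = build_rm_command(descriptor_path)
    if dry_run:
        # Show what would be executed without touching anything.
        print("would run:", " ".join(cmd))
        return cmd
    # Assumption: orchadmin needs APT_CONFIG_FILE to locate the segments.
    if "APT_CONFIG_FILE" not in os.environ:
        raise RuntimeError("APT_CONFIG_FILE is not set")
    if shutil.which("orchadmin") is None:
        raise RuntimeError("orchadmin not found on PATH")
    subprocess.run(cmd, check=True)
    return cmd


remove_dataset("/is/DEV/datasets/PX_BEA.ds")
```

Pass dry_run=False only once you are sure the dataset is no longer needed.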

A single question back to you: WHY?

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
srini.dw
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

Thanks for the reply.

In my case I have a requirement to delete dataset segments with names such as PX_BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7, located in is/node1/dataset.

Can I go ahead and do this without knowing whether the dataset descriptor file exists in the separate path?

Thanks,
zulfi123786
Premium Member
Posts: 730
Joined: Tue Nov 04, 2008 10:14 am
Location: Bangalore

Post by zulfi123786 »

If you are sure that the dataset PX_BEA.ds will never be referenced, you can go ahead and delete the segment files corresponding to PX_BEA.ds.

Having done that, also delete the descriptor file, as it will now be worthless.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

srini.dw wrote: In my case I have a requirement to delete dataset segments with names such as PX_BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7, located in is/node1/dataset.
A single question back to you: WHY?
-craig

"You can never have too many knives" -- Logan Nine Fingers
srini.dw
Premium Member
Posts: 186
Joined: Fri Aug 18, 2006 1:59 am
Location: Chennai

Post by srini.dw »

We have been facing space issues in the Dev environment, hence the requirement.

"if you are sure that the dataset PX_BEA.ds will never be referenced you can go ahead and delete the segment files corresponding to PX_BEA.ds"

One question: what happens if I delete BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7 and the PX_BEA.ds file still remains?

Would we get an error?

Thanks,
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Yes, the dataset will become unusable. Deleting individual segments makes no sense. Your requirement should read:

"if you are sure that the dataset PX_BEA.ds will never be referenced you can go ahead and delete the dataset and the corresponding PX_BEA.ds file"

The latter would be automatic if you use the included utilities for the deletion.
-craig

"You can never have too many knives" -- Logan Nine Fingers
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

If you are consistently running into space problems within the development environment, you should begin implementing a reasonable management policy within the development teams.

Suggestions to start off with:

- No long-term storage of parallel datasets. Once you have completed the assignment using the datasets, delete them.
- No production-level quantities/volumes for files/datasets. This is a development environment, not a performance testing environment. If you MUST use them, get rid of them as soon as reasonably possible.
- Delete datasets using the appropriate methods: orchadmin or the dataset administration tool in the GUIs (which uses orchadmin anyway). Manually delete segments and descriptor files only when necessary (corruption/missing files/etc.).
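
To decide which datasets are worth cleaning up first, something like the following Python sketch could total the bytes used per dataset in a segment directory. It assumes segment file names start with the descriptor's base name (ending in .ds), as in the example in this thread. The function names and the shortened demo file names are hypothetical, and the demo runs against a temporary directory rather than a real resource disk.

```python
import os
import tempfile
from collections import defaultdict
from typing import Dict


def bytes_per_dataset(segment_dir: str, suffix: str = ".ds") -> Dict[str, int]:
    """Sum the sizes of segment files, grouped by their '<name>.ds' prefix."""
    totals: Dict[str, int] = defaultdict(int)
    marker = suffix + "."
    for entry in os.scandir(segment_dir):
        if not entry.is_file():
            continue
        idx = entry.name.find(marker)
        if idx <= 0:
            continue  # Name does not match the assumed segment pattern.
        totals[entry.name[: idx + len(suffix)]] += entry.stat().st_size
    return dict(totals)


def report(segment_dir: str) -> None:
    # Largest datasets first.
    ranked = sorted(bytes_per_dataset(segment_dir).items(),
                    key=lambda kv: kv[1], reverse=True)
    for name, size in ranked:
        print(f"{size:>15,d}  {name}")


# Demo with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for fname, size in [("PX_BEA.ds.dsadmin.DEV.0000", 2048),
                        ("PX_BEA.ds.dsadmin.DEV.0001", 1024),
                        ("OTHER.ds.dsadmin.DEV.0000", 512)]:
        with open(os.path.join(demo_dir, fname), "wb") as fh:
            fh.write(b"\0" * size)
    report(demo_dir)
```

The report only identifies candidates. The actual deletion should still go through orchadmin or the dataset administration tool so that segments and descriptor are removed together.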

While you may not be in a position to implement and enforce such policies, you can at least recommend them to others and begin practicing them yourself.

The system admins may also need to consider increasing and/or moving dataset storage allocations.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

srini.dw wrote: In my case I have a requirement to delete dataset segments with names such as PX_BEA.ds.dsadmin.DEV.0000.0000.0000.65e7.cf771bfe.0000.b324c6a7, located in is/node1/dataset.
Resist stupid requirements!

Never delete part of a Data Set. Delete it all, using the Data Set management tool or the orchadmin command.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.