Hi all,
Is it possible to compress datasets in DataStage 7.x? Please also help me understand the advantages and disadvantages of compressing datasets.
My further questions are below.
Once a dataset is compressed, can we read it without uncompressing it?
Is it possible to overwrite a compressed dataset?
Is it possible to create the dataset in compressed mode, rather than creating it first and compressing it afterwards?
Will there be any space saving when we compress? Is there a typical compression ratio we can expect?
Will performance be affected when we compress a dataset?
Thanks in Advance
Dataset Compression
Regards Abee.
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
In our DWH system we have more than 300 raw files to load, plus many star models and interim datasets. We are running short of disk space, so we are trying to reduce the space occupied by the datasets we create. If, as in Oracle, we can compress the data and still access it without hurting performance, we would like to do that. We would also like to know the compression ratio, so that we can identify the optimal solution.
Regards Abee.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
You can compress a Data Set (you have to work out where all its data files are, of course) and doing so renders it unusable by DataStage. The gains are negligible because data are already in binary form within a Data Set.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
You can compress data using the "compress" stage without having to land the dataset to disk and then compress it. I suggest that the original poster create a couple of sample jobs, one with the compress stage and one without.
Run the jobs and find where each dataset persists its data on disk, as defined in the configuration file or in the dataset descriptor file, then compare the sizes. I have managed to get some reasonable space savings (we are not talking huge) using the compress stage, in particular with the gzip setting. Note: nothing is free. You will be sacrificing performance for some space savings. You will also need to decompress the data with the "expand" stage before using it again.
e.g. source stage -> compress stage -> dataset stage
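
For anyone who wants a rough feel for the trade-off before building the sample jobs, here is a minimal Python sketch. It is not DataStage code and does not read Data Set files; it only illustrates, with Python's standard gzip module, how the compression ratio and the time cost depend on the data. The sample records and the names `total_size` and `gzip_trial` are made up for illustration. The `total_size` helper could be pointed at the data files of the two sample datasets once you have located them.

```python
import gzip
import os
import time


def total_size(paths):
    """Return the combined size in bytes of the given files."""
    return sum(os.path.getsize(p) for p in paths)


def gzip_trial(data, level=6):
    """Compress `data` in memory and return (ratio, seconds taken)."""
    start = time.perf_counter()
    packed = gzip.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    return len(data) / len(packed), elapsed


if __name__ == "__main__":
    # Hypothetical samples: repetitive text-like records compress well,
    # while random bytes (a stand-in for already-dense data) barely shrink.
    text_like = b"2005-01-17,UK,ORDER,0000123.45\n" * 50_000
    dense = os.urandom(len(text_like))
    for name, sample in (("text-like", text_like), ("dense", dense)):
        ratio, secs = gzip_trial(sample)
        print(f"{name}: ratio {ratio:.2f}x in {secs:.3f}s")
```

The ratio you see on real data will sit somewhere between these two extremes, which is why comparing the actual on-disk sizes of the two sample jobs is the only reliable measurement.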