Loop Through Columns to change type and length

jackson.eyton · Post by **jackson.eyton** » Fri Nov 17, 2017 5:48 pm

Hi guys,
Is it possible to loop through the columns of a source and change the type and length of all columns of a particular type? Say for example I wanted to change all Character type columns to VarChar(256), is there an automated way to do that? Currently I have been doing this in a transform stage, going through column by column, I just had a job with over 600 columns corrupt on me and the backup won't load in either so I have to re-create that job. I'll be opening a case with IBM on that particular nugget.

jackson.eyton · Post by **jackson.eyton** » Wed Nov 22, 2017 9:40 am

Is this not possible?

chulett · Post by **chulett** » Wed Nov 22, 2017 10:56 am

So... you are looking for a way to change the metadata of columns in some other fashion other than one-by-one by hand? If so, have you tried exporting the job to a dsx file, editing the transformer entries (search/replace via your favorite editor) and then importing the job back into the repository? I would suggest you try that first with a test job with a small number to change, i.e. something you can afford to lose.

While there's a way to change / assign derivations "en masse" I don't recall a way to do that to the data type. Could be wrong though.

jackson.eyton · Post by **jackson.eyton** » Wed Nov 22, 2017 1:53 pm

Thanks chulett, I believe I looked at doing that once with the XML export format, I will check again with the default dsx.

chulett · Post by **chulett** » Wed Nov 22, 2017 2:47 pm

Way easier to work with IMHO.

jackson.eyton · Post by **jackson.eyton** » Wed Nov 22, 2017 3:35 pm

I did look into this further and the .dsx file is easier to read myself but I am unsure how have it parsed as of yet. I have some powershell experience so I took another crack at the XML file and I am able to create an XML object from the file, from here its broken down into an array of objects. I am stuck on getting the column details specifically and I will need to create a library of the SQL type definitions based on what DataStage designates but this is progress nonetheless. If/When I do end up getting a utility made that can parse your export and enact column type changes based on some parameters you pass it I will absolutely share it here from my GitHub. If anyone has more ideas I am all ears, otherwise expect an update here soonish.

jackson.eyton · Post by **jackson.eyton** » Tue Nov 28, 2017 5:01 pm

Great News! I was able to get this working for my use case. There are some limitations, the powershell functions that I have will ONLY change the output columns for a transform stage within the exported job. This was by design. It is completely possible to edit any part of the job and loop through edits with conditions. In my case the functions are designed to have the user point to the source XML file (exported job or set of jobs), enter a path to save the modified job, enter what column type the user wants to modify in bulk, set the type they want to change that column to, set the length and scale of the column as well. In my case I have the functions set to ONLY change the scale of a column IF the column already has a scale set, otherwise leave it at 0. Here are the links to the two function versions, one for single job export XML files, and another for multijob. I will make these unique functions later so as to prevent confusion, but for now it works fine, and I can't use the import-module cmdlet in my company anyway so I have to manually paste the functions in every time. Overall its still 99% faster than going through column by column.

Multi Job:
https://github.com/jacksoneyton/DataSta ... tiJob.psm1

Single Job:
https://github.com/jacksoneyton/DataSta ... ifier.psm1

How to use:
1. Export your datastage job to an XML file
2. Import the functions for your job export type
3. Run the Set-DSColumns cmdlet
4. Enter the full path to the XML file from Step 1
5. Enter the full file name and path where you want to save the modified XML
6. Enter the source column type ID number (definitions provided)
7. Enter the column type ID you want to change to (use the same ID as step 6 if you don't want to change it).
8. Enter the length you want
9. Enter the scale you want or 0 for no scale (will not affect anything that does not already have scale, not designed to mass assign scale could be modified to so if anyone wants)
10. Answer if you want to repeat steps 6-9 for any other columns
11. If no the file will be saved to the path specified in step 5.
12. Import that XML back into datastage and recompile the job.

Technically it should be safe to overwrite your jobs with the modified job import since you still have the original XML export untouched. HOWEVER, I have seen issues with both the XML and the dsx exports where they simply seem to be broken.... REALLY wish DataStage incorporated job versioning with rollback options.

chulett · Post by **chulett** » Tue Nov 28, 2017 6:41 pm

PowerShell Script Modules? Pretty sure those are part of the Forbidden List on my trusty ol' government issued laptop.

jackson.eyton · Post by **jackson.eyton** » Wed Nov 29, 2017 8:20 am

Technically they are for me as well, being at a bank, so I have to copy the code for the functions and paste it into a powershell console, have to do one at a time, after which I can use those functions until I close the console window. So I can't auto import any modules or anything.

jackson.eyton · Post by **jackson.eyton** » Wed Nov 29, 2017 1:28 pm

*FACE PALM*
https://www.ibm.com/support/knowledgece ... indow.html

This functionality was the entire purpose of what I scripted..... I need to work on my communication, I must not have explained what I wanted in a way that made any sense.

eostic · Post by **eostic** » Wed Nov 29, 2017 1:44 pm

Still...what you did is VERY cool, and you will probably find lots of future things where your research into .dsx and .dsx/xml will be a huge benefit! bravo!

Ernie

chulett · Post by **chulett** » Wed Nov 29, 2017 2:46 pm

chulett wrote:While there's a way to change / assign derivations "en masse" I don't recall a way to do that to the data type. Could be wrong though.

And frequently are, it would seem. I have no DataStage access but I'm thinking what you found is what was tickling the back of my mind...

jackson.eyton · Post by **jackson.eyton** » Wed Nov 29, 2017 3:35 pm

It was nice to be able to export 10 jobs to a single xml file today, run my powershell functions to change the outgoing column types and lengths against all jobs in one run and import them back in. So from a multijob standpoint it might have been worth it. Going forward I can't see needing to use it much now that I know about the column propagation window.