Useful Perl Scripts I have written

alanwoo · Post by **alanwoo** » Tue Aug 17, 2010 11:52 pm

Any ideas why DSX_Cutter.pl would be inserting a newline or some special character before the .dsx extension, e.g. test[].dsx?

ray.wurlod · Post by **ray.wurlod** » Wed Aug 18, 2010 12:07 am

Only one possible reason - the Perl script specifies to do so.

OK, there is a second possibility, that the special characters are already in the DSX file being processed.

alanwoo · Post by **alanwoo** » Wed Aug 18, 2010 12:09 am

No worries, will keep playing around with it. Ray, long time no see, how's Canberra?

ray.wurlod · Post by **ray.wurlod** » Wed Aug 18, 2010 1:22 am

Cold mornings but not bad otherwise.

vdr123 · Post by **vdr123** » Wed Aug 18, 2010 6:04 pm

The script might have been written for old DSX versions. I'm not sure whether something changed in the newer ones or whether the version of DS may affect it. (8.0 was buggy without patches.)

alanwoo · Post by **alanwoo** » Wed Aug 18, 2010 6:54 pm

Forgot to dos2unix.
Works perfectly now.
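
For anyone hitting the same symptom: a likely explanation is that the DSX was exported on Windows, so every line ends in CRLF. On Unix, `chomp` removes only the trailing newline, and the leftover carriage return ends up in the Identifier and therefore in the file name. A minimal sketch of the effect, assuming a made-up sample line and a hypothetical helper `strip_eol` (neither is part of DSX_Cutter.pl):

```perl
use strict;
use warnings;

# Hypothetical helper: strip either LF or CRLF from the end of a line.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# A made-up line as it would arrive on Unix from a DSX exported on Windows.
my $raw = qq{   Identifier "test"\r\n};

my $chomped = $raw;
chomp $chomped;                 # removes "\n" only; the "\r" is still there
my $clean = strip_eol($raw);    # removes the whole "\r\n"

printf "chomp only ends in CR: %s\n", ($chomped =~ /\r\z/ ? "yes" : "no");
printf "strip_eol ends in CR:  %s\n", ($clean   =~ /\r\z/ ? "yes" : "no");
```

Running dos2unix on the export before cutting it has the same effect as stripping the carriage return while reading.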

cgi_bi · Post by **cgi_bi** » Thu Dec 15, 2011 12:29 pm

Here is a version of DSX_Cutter.pl that adds a flag to leave the HEADER out of the target DSX.

It is useful for us when preparing the code before committing it to our version control system.

#!/usr/local/bin/perl -w 
#
################################################################################################################## 
# 
# Documentation Header 
# 
################################################################################################################### 
# Script Name: DSX_Cutter.pl 
# Author     : John Miceli 
# Create Date: 04/26/2007 
# 
# Purpose: This script is meant to easily take a single .dsx file as it was exported from DataStage and split it 
#       into its component parts.  There will be one job, routine, transform, etc., per file.  The file 
#       will be named after the Identifier for each DSRECORD within a section or each DSJOB with all 
#       appropriate subrecords attached to it. 
# 
# Test results: 
# 
# Operating command: perl DSX_Cutter.pl [-nh] filename.dsx 
#
#       -nh (optional): when this flag is given, the target DSX files are produced without the HEADER section
#        
# Output     : Individual files with the appropriate header for importing into DataStage for each component. 
# 
# Caveats    : I don't claim to be a Perl maven.  I'm sure someone smarter than me could have done this with 
#       fewer lines of code.  However, it works quickly and efficiently.  It is not complicated at all, 
#       hence the resulting performance.  Part of keeping it simple is that I may be missing some of the 
#       fluff found in lots of Perl code that probably should be there depending on your level of 
#       paranoia ;-) 
# 
#   Any and all input/improvements to the script are welcome.  Either email me at 'jmiceli@wrberkley.com' or 
#   message 'jdmiceli' on DSXchange.com and I will get back to you if I can. 
# 
# 
################################################################################################################### 
# 
# Disclaimer: This script is provided AS IS and I take no responsibility for its use or misuse.  Its use is completely 
#       the responsibility of YOU the user and is at your own risk.  I do not claim perfection, nor do I warrant or 
#       guarantee the operation of the script.  It appears to work and I am using it and that's all the further I 
#       am willing to go!! 
#           
###################################################################################################################
#
#			Hugo Poissant 2011-12-15 Add the "-nh" flag
#
###################################################################################################################
# 
# Code Block 
# 
################################################################################################################### 
use strict; 
use warnings; 


#  define variables 
my $dsx;        # name of the file to be processed 
my $value;      # holds the Identifier value used to build each output file name 
my $line;            # value for line for comparison 
my @work;      # array to hold working rows 
my @header;      # array to hold the header rows 
my @jobs;      # array to hold the job rows 
my @transforms;      # array to hold the transform rows 
my @routines;      # array to hold the routines rows 
my @tabledefs;      # array to hold the tabledefs rows 
my @stagetypes;      # array to hold the stagetypes rows 
my @datatypes;      # array to hold the datatypes rows 
my @containers;      # array to hold the shared container rows 
my @section;      # array to hold information for sectional parts of the dsx (transforms, routines, etc.) 
my $element;      # element of the array being processed 
my $tag;      # tag holder 
my $on;         # off/on flag 
my $dsname;      # name to be assigned to the new file 
my $cnt;      # generic count for checking things 
my $rowchk;      # counter for catching begin and end tags right next to each other 
my $in_section;      # flag showing a routine file is being worked on 
my $no_header;      # flag: when set, the HEADER section is left out of the target DSX files 


# 
# initialize some things if needed 
# 
$on = 0; 
$in_section = 0; 

# 
# collect the filename from the input argument and create the working filename 
# 
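if (!@ARGV)
{
	die "Usage: perl DSX_Cutter.pl [-nh] filename.dsx\n";
}

if ($ARGV[0] eq "-nh")
{
	$no_header = 1;
	$dsx = $ARGV[1];
	print "Target DSX produced WITHOUT header\n";
}
else
{
	$no_header = 0;
	$dsx = $ARGV[0];
	print "Target DSX produced WITH header\n";
}

# "-nh" on its own leaves no file name to work with
die "No input file given. Usage: perl DSX_Cutter.pl [-nh] filename.dsx\n" unless defined $dsx;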


open (OLD, "<", $dsx) || die "Unable to open $dsx for reading: $!\n"; 

while ($line = <OLD>) 
{ 
   chomp $line; 
   $line =~ s/\r$//;   # drop the carriage return left behind by DOS (CRLF) line endings 

   push(@work, "$line\n"); 
} 

close OLD; 

# now that it is all in memory, parse out the sections into their own arrays 
# for processing 
foreach $element (@work) 
{ 
   chomp $element; 

   # determine which section we are in and flag it accordingly 
   if ($element =~ /BEGIN HEADER/ && $no_header == 0) 
   { 
      $on = 1;  ## flag for HEADER records 
   }    
   elsif ($element =~ /BEGIN DSJOB/) 
   { 
      $on = 2;  ## flag for DSJOB records 
   } 
   elsif ($element =~ /BEGIN DSROUTINES/) 
   { 
      $on = 3;  ## flag for DSROUTINES records 
   } 
   elsif ($element =~ /BEGIN DSTRANSFORMS/) 
   { 
      $on = 4;  ## flag for DSTRANSFORMS records 
   } 
   elsif ($element =~ /BEGIN DSTABLEDEFS/) 
   { 
      $on = 5;  ## flag for DSTABLEDEFS records 
   } 
   elsif ($element =~ /BEGIN DSSTAGETYPES/) 
   { 
      $on = 6;  ## flag for DSSTAGETYPES records 
   } 
   elsif ($element =~ /BEGIN DSDATATYPES/) 
   { 
      $on = 7;  ## flag for DSDATATYPES records 
   } 
   elsif ($element =~ /BEGIN DSSHAREDCONTAINER/) 
   { 
      $on = 8;  ## flag for DSSHAREDCONTAINER records 
   } 

   # separate out each section to a name array for it 
   if ($on == 1) 
   { 
      push(@header, "$element\n"); 
      if ($element =~ /END HEADER/) 
      { 
         $on = 0; 
      } 
   } 

   if ($on == 2) 
   { 
      push(@jobs, "$element\n"); 
      if ($element =~ /END DSJOB/) 
      { 
         $on = 0; 
      } 
   } 
    
   if ($on == 3) 
   { 
      push(@routines, "$element\n"); 
      if ($element =~ /END DSROUTINES/) 
      { 
         $on = 0; 
      } 
   } 
    
   if ($on == 4) 
   { 
      push(@transforms, "$element\n"); 
      if ($element =~ /END DSTRANSFORMS/) 
      { 
         $on = 0; 
      } 
   } 
    
   if ($on == 5) 
   { 
      push(@tabledefs, "$element\n"); 
      if ($element =~ /END DSTABLEDEFS/) 
      { 
         $on = 0; 
      } 
   } 
    
   if ($on == 6) 
   { 
      push(@stagetypes, "$element\n"); 
      if ($element =~ /END DSSTAGETYPES/) 
      { 
         $on = 0; 
      } 
   } 
    
   if ($on == 7) 
   { 
      push(@datatypes, "$element\n"); 
      if ($element =~ /END DSDATATYPES/) 
      { 
         $on = 0; 
      } 
   } 
    
   if ($on == 8) 
   { 
      push(@containers, "$element\n"); 
      if ($element =~ /END DSSHAREDCONTAINER/) 
      { 
         $on = 0; 
      } 
   } 
    
} 

############################################################################################################## 
# process the jobs into their own individual files 
############################################################################################################## 

# DSJOB section 
foreach $element(@jobs) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSJOB/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSJOB/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            $value = $section[1]; 
            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSJOB_" . "$value.dsx"; 
            print "Creating job file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSJOB\n"; 
            foreach $line(@section) 
            { 
               next if (($line =~ /BEGIN DSJOB/) or ($line =~ /END DSJOB/)); 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSJOB\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
} 

# DSTRANSFORMS section 
$cnt = 0; 

# get rid of dupes and empty pairs 
@work= (); 
foreach $element(@transforms) 
{ 
   chomp $element; 
   next if (($element =~ /BEGIN DSTRANSFORMS/) or ($element =~ /END DSTRANSFORMS/)); 
   push(@work, "$element\n"); 
} 
@transforms = (); 
@transforms = @work; 
@work = (); 

foreach $element(@transforms) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSRECORD/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            # the identifier could be at position 1 or 2 depending on 
            # where we are in the process 
             
            # finding the first identifier - this is needed if there is a truncation of 
            # some sort. 
            # 
            # NEED TO MAKE THIS MORE ROBUST!!!! 
            if ($section[1] =~ /Identifier/) 
            { 
               $value = $section[1]; 
            } 
            elsif ($section[2] =~ /Identifier/) 
            { 
               $value = $section[2]; 
            } 
            elsif ($section[3] =~ /Identifier/) 
            { 
               $value = $section[3]; 
            } 
            elsif ($section[4] =~ /Identifier/) 
            { 
               $value = $section[4]; 
            } 
            elsif ($section[5] =~ /Identifier/) 
            { 
               $value = $section[5]; 
            } 

            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSTRANSFORMS_" . "$value.dsx"; 
            print "Creating transform file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSTRANSFORMS\n"; 
            foreach $line(@section) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSTRANSFORMS\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
} 

# DSROUTINES section 
# 
# get rid of dupes and empty pairs 
@work= (); 
foreach $element(@routines) 
{ 
   chomp $element; 
   next if (($element =~ /BEGIN DSROUTINES/) or ($element =~ /END DSROUTINES/)); 
   push(@work, "$element\n"); 
} 
@routines = (); 
@routines = @work; 
@work = (); 

foreach $element(@routines) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSUBINARY/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            # the identifier could be at position 1 or 2 depending on 
            # where we are in the process 
             
            # finding the first identifier - this is needed if there is a truncation of 
            # some sort. 
            # 
            # NEED TO MAKE THIS MORE ROBUST!!!! 
            if ($section[1] =~ /Identifier/) 
            { 
               $value = $section[1]; 
            } 
            elsif ($section[2] =~ /Identifier/) 
            { 
               $value = $section[2]; 
            } 
            elsif ($section[3] =~ /Identifier/) 
            { 
               $value = $section[3]; 
            } 
            elsif ($section[4] =~ /Identifier/) 
            { 
               $value = $section[4]; 
            } 
            elsif ($section[5] =~ /Identifier/) 
            { 
               $value = $section[5]; 
            } 

            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSROUTINES_" . "$value.dsx"; 
            print "Creating routine file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSROUTINES\n"; 
            foreach $line(@section) 
            { 
               # print "$line\n" if (($line =~ /BEGIN DSROUTINES/) or ($line =~ /END DSROUTINES/)); 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSROUTINES\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
} 

# DSTABLEDEFS section 
# 
# get rid of dupes and empty pairs 
@work= (); 
foreach $element(@tabledefs) 
{ 
   chomp $element; 
   next if (($element =~ /BEGIN DSTABLEDEFS/) or ($element =~ /END DSTABLEDEFS/)); 
   push(@work, "$element\n"); 
} 
@tabledefs = (); 
@tabledefs = @work; 
@work = (); 

foreach $element(@tabledefs) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSRECORD/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            # the identifier could be at position 1 or 2 depending on 
            # where we are in the process 
             
            # finding the first identifier - this is needed if there is a truncation of 
            # some sort. 
            # 
            # NEED TO MAKE THIS MORE ROBUST!!!! 
            if ($section[1] =~ /Identifier/) 
            { 
               $value = $section[1]; 
            } 
            elsif ($section[2] =~ /Identifier/) 
            { 
               $value = $section[2]; 
            } 
            elsif ($section[3] =~ /Identifier/) 
            { 
               $value = $section[3]; 
            } 
            elsif ($section[4] =~ /Identifier/) 
            { 
               $value = $section[4]; 
            } 
            elsif ($section[5] =~ /Identifier/) 
            { 
               $value = $section[5]; 
            } 

            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSTABLEDEFS_" . "$value.dsx"; 
            print "Creating tabledef file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSTABLEDEFS\n"; 
            foreach $line(@section) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSTABLEDEFS\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
} 

# DSSTAGETYPES section 
# 
# get rid of dupes and empty pairs 
@work= (); 
foreach $element(@stagetypes) 
{ 
   chomp $element; 
   next if (($element =~ /BEGIN DSSTAGETYPES/) or ($element =~ /END DSSTAGETYPES/)); 
   push(@work, "$element\n"); 
} 
@stagetypes = (); 
@stagetypes = @work; 
@work = (); 

foreach $element(@stagetypes) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSRECORD/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            # the identifier could be at position 1 or 2 depending on 
            # where we are in the process 
             
            # finding the first identifier - this is needed if there is a truncation of 
            # some sort. 
            # 
            # NEED TO MAKE THIS MORE ROBUST!!!! 
            if ($section[1] =~ /Identifier/) 
            { 
               $value = $section[1]; 
            } 
            elsif ($section[2] =~ /Identifier/) 
            { 
               $value = $section[2]; 
            } 
            elsif ($section[3] =~ /Identifier/) 
            { 
               $value = $section[3]; 
            } 
            elsif ($section[4] =~ /Identifier/) 
            { 
               $value = $section[4]; 
            } 
            elsif ($section[5] =~ /Identifier/) 
            { 
               $value = $section[5]; 
            } 

            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSSTAGETYPES_" . "$value.dsx"; 
            print "Creating stagetypes file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSSTAGETYPES\n"; 
            foreach $line(@section) 
            { 
               next if (($line =~ /BEGIN DSSTAGETYPES/) or ($line =~ /END DSSTAGETYPES/)); 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSSTAGETYPES\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
} 

# DSDATATYPES section 
# 
# get rid of dupes and empty pairs 
@work= (); 
foreach $element(@datatypes) 
{ 
   chomp $element; 
   next if (($element =~ /BEGIN DSDATATYPES/) or ($element =~ /END DSDATATYPES/)); 
   push(@work, "$element\n"); 
} 
@datatypes = (); 
@datatypes = @work; 
@work = (); 

foreach $element(@datatypes) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSRECORD/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            # the identifier could be at position 1 or 2 depending on 
            # where we are in the process 
             
            # finding the first identifier - this is needed if there is a truncation of 
            # some sort. 
            # 
            # NEED TO MAKE THIS MORE ROBUST!!!! 
            if ($section[1] =~ /Identifier/) 
            { 
               $value = $section[1]; 
            } 
            elsif ($section[2] =~ /Identifier/) 
            { 
               $value = $section[2]; 
            } 
            elsif ($section[3] =~ /Identifier/) 
            { 
               $value = $section[3]; 
            } 
            elsif ($section[4] =~ /Identifier/) 
            { 
               $value = $section[4]; 
            } 
            elsif ($section[5] =~ /Identifier/) 
            { 
               $value = $section[5]; 
            } 

            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSDATATYPES_" . "$value.dsx"; 
            print "Creating datatypes file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSDATATYPES\n"; 
            foreach $line(@section) 
            { 
               next if (($line =~ /BEGIN DSDATATYPES/) or ($line =~ /END DSDATATYPES/)); 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSDATATYPES\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
} 

# DSSHAREDCONTAINER section 
# 
# get rid of dupes and empty pairs 
@work= (); 
foreach $element(@containers) 
{ 
   chomp $element; 
   next if (($element =~ /BEGIN DSSHAREDCONTAINER/) or ($element =~ /END DSSHAREDCONTAINER/)); 
   push(@work, "$element\n"); 
} 
@containers = (); 
@containers = @work; 
@work = (); 

foreach $element(@containers) 
{ 
   chomp $element; 
   push(@section,"$element\n"); 
    
   if (($element =~ /BEGIN DSSHAREDCONTAINER/) and ($in_section == 0)) 
   { 
      # flag the jobs section as being active 
      $in_section = 1; 
   } 
   elsif ($in_section == 1) 
   { 
      # check to see if we are at the end of the job 
      if ($element =~ /END DSSHAREDCONTAINER/) 
      { 
         $cnt = scalar(@section); 
         if ($cnt > 2) 
         { 
            # reset the in_section counter 
            $in_section = 0; 
             
            # extract the name of the file using the Identifier 
            # the identifier could be at position 1 or 2 depending on 
            # where we are in the process 
             
            # finding the first identifier - this is needed if there is a truncation of 
            # some sort. 
            # 
            # NEED TO MAKE THIS MORE ROBUST!!!! 
            if ($section[1] =~ /Identifier/) 
            { 
               $value = $section[1]; 
            } 
            elsif ($section[2] =~ /Identifier/) 
            { 
               $value = $section[2]; 
            } 
            elsif ($section[3] =~ /Identifier/) 
            { 
               $value = $section[3]; 
            } 
            elsif ($section[4] =~ /Identifier/) 
            { 
               $value = $section[4]; 
            } 
            elsif ($section[5] =~ /Identifier/) 
            { 
               $value = $section[5]; 
            } 

            chomp $value;   # remove the end of line stuff 
            $value =~ s/Identifier//g; 
            $value =~ s/"//g;          # strip quotes 
            $value =~ s/\s//g;         # strip spaces and any stray carriage return (CRLF input) 
            $value =~ s{[./\\:]}{_}g;  # replace periods, slashes, backslashes and colons 

            # name the file 
            $dsname = "DSSHAREDCONTAINER_" . "$value.dsx"; 
            print "Creating shared container file for $dsname\n"; 

            open (NEW, "> $dsname")|| die "Unable to create $dsname file!"; 

            # output the header to the file 
            foreach $line(@header) 
            { 
               chomp $line; 
               print NEW "$line\n"; 
            } 
             
            # output the lines of this particular job 
            print NEW "BEGIN DSSHAREDCONTAINER\n"; 
            foreach $line(@section) 
            { 
               next if (($line =~ /BEGIN DSSHAREDCONTAINER/) or ($line =~ /END DSSHAREDCONTAINER/)); 
               chomp $line; 
               print NEW "$line\n"; 
            } 
            print NEW "END DSSHAREDCONTAINER\n"; 
             
            # when done pushing the lines out, close the file 
            close NEW; 
         } 

         # if the count was 2 or less or has been processed, then ditch what is in the array 
         @section = (); 
         $value = ''; 
      } 
   } 
}
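
The "NEED TO MAKE THIS MORE ROBUST" lookup and the repeated clean-up of `$value` could perhaps be folded into two small helpers. This is only a sketch, not part of the script above: `find_identifier` and `safe_name` are hypothetical names, the sample record is made up, and the pattern assumes that Identifier lines look like `Identifier "name"`.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of the first Identifier line in a
# record, or undef if the record has none (assumes: Identifier "name").
sub find_identifier {
    my (@lines) = @_;
    foreach my $line (@lines) {
        my ($id) = $line =~ /^\s*Identifier\s+"([^"]*)"/;
        return $id if defined $id;
    }
    return;
}

# Hypothetical helper: turn an identifier into a safe file name fragment.
sub safe_name {
    my ($name) = @_;
    $name =~ s/\s//g;           # spaces, tabs and stray carriage returns
    $name =~ s{[./\\:]}{_}g;    # periods, slashes, backslashes, colons
    return $name;
}

# Made-up record; the second line carries a DOS line ending on purpose.
my @record = (
    qq{BEGIN DSRECORD\n},
    qq{   Identifier "Sales\\Load.Daily"\r\n},
    qq{   Description "made-up sample record"\n},
    qq{END DSRECORD\n},
);

my $id = find_identifier(@record);
die "No Identifier found in record\n" unless defined $id;
print "DSROUTINES_" . safe_name($id) . ".dsx\n";
```

Searching the whole record instead of positions 1 to 5 avoids "uninitialized value" warnings on short records, and returning undef lets the caller decide what to do when no Identifier is present.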

ray.wurlod · Post by **ray.wurlod** » Thu Dec 15, 2011 2:39 pm

Delightful, cleanly written and formatted code. Well done and thank you.

Dsnew · Post by **Dsnew** » Thu Dec 15, 2011 4:58 pm

After running DSX_Cutter.pl, does anybody else get a special character [?] before the .dsx?

 Example DSJOB_JOB_NAME?.dsx

Or is it just me?

Any suggestions, folks?
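
One way to see what the [?] actually is would be to print the generated file names with every non-printable character spelled out. A rough sketch, assuming the files were written to the current directory; `show_hidden` is a hypothetical helper, not something DSX_Cutter.pl provides:

```perl
use strict;
use warnings;

# Hypothetical helper: make non-printable characters in a name visible,
# e.g. a carriage return is shown as \x0D.
sub show_hidden {
    my ($name) = @_;
    $name =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord($1))/ge;
    return $name;
}

# List the .dsx files in the current directory with hidden characters exposed.
opendir(my $dh, '.') or die "Cannot read current directory: $!\n";
my @files = readdir $dh;
closedir $dh;

foreach my $file (sort @files) {
    next unless $file =~ /\.dsx\z/;
    print show_hidden($file), "\n";
}
```

If the name comes out with `\x0D` just before `.dsx`, the input file most likely has DOS line endings.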

jdmiceli · Post by **jdmiceli** » Thu May 23, 2019 1:20 am

Realizing this response is coming 7 1/2 years late: yes, I have seen this happen. I am not sure why, but it appeared in the last year or two, depending on which version of Windows you run it on. I have revised the script to fix the problem, and I have also added functionality that was not part of Server Edition 8.1 when I originally wrote the script. The script now handles parameter sets and includes some other minor fixes, including the one for the filename. Hopefully, people are still finding these scripts useful.

Bestest!
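For anyone adapting the script below, the Identifier lookup and the name clean-up that each section repeats could be pulled into two small helpers. This is only a sketch: `find_identifier` and `clean_name` are hypothetical names that do not appear in the script, and it assumes the first Identifier line in a record is the one that names it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of the first Identifier line in a record.
sub find_identifier {
    my @lines = @_;
    for my $l (@lines) {
        return $1 if $l =~ /^\s*Identifier "([^"]*)"/;
    }
    return;   # no Identifier line found
}

# Hypothetical helper: turn an identifier into a safe file name stem.
sub clean_name {
    my ($name) = @_;
    $name =~ s/\s//g;          # drop spaces and any stray CR/LF
    $name =~ s/[.\\\/:]/_/g;   # periods, slashes and colons become underscores
    return $name;
}

# Example record, with a backslash and a period in the identifier and a CRLF ending.
my @record = (
    "BEGIN DSRECORD\n",
    qq{   Identifier "Folder\\My.Routine"\r\n},
    "END DSRECORD\n",
);

my $id = find_identifier(@record);
print clean_name($id), ".dsx\n" if defined $id;
```

Scanning every line instead of testing positions 1 through 5 is a design choice of this sketch, not the script's behaviour; it avoids warnings when a record has fewer than six lines.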

Code: Select all

#!/usr/local/bin/perl -w

###################################################################################################################
#
# Documentation Header
#
###################################################################################################################
# Script Name: DSX_Cutter.pl
# Author     : John Miceli
# Create Date: 04/26/2007
#
# Purpose: This script is meant to easily take a single .dsx file as it was exported from DataStage and split it
# 		into its component parts.  There will be one job, routine, transform, etc., per file.  The file
# 		will be named after the Identifier for each DSRECORD within a section or each DSJOB with all
# 		appropriate subrecords attached to it.
#
# Test results:
#
# Operating command: perl DSX_Cutter.pl filename.dsx
#
# Output     : Individual files with the appropriate header for importing into DataStage for each component.
#
# Caveats    : I don't claim to be a Perl maven.  I'm sure someone smarter than me could have done this with
# 		fewer lines of code.  However, it works quickly and efficiently.  It is not complicated at all,
# 		hence the resulting performance.  Part of keeping it simple is that I may be missing some of the
# 		fluff found in lots of Perl code that probably should be there depending on your level of
# 		paranoia ;-)
#
#	Any and all input/improvements to the script are welcome.  Either email me at 'jmiceli@wrberkley.com' or
#	message 'jdmiceli' on DSXchange.com and I will get back to you if I can.
#
#  Date          Ref.       Coder     Notes
#  04/26/2007    ----       jdm       Initial script creation
#  03/30/2009    2009-1     jdm       Added capability to have script cut files into directories as described in the
#                                     category.  This was requested by a user on DSXchange.com. I am making it a
#                                     parameter so that it is essentially configurable.
#  11/14/2018               jdm       Added capability to handle parameter sets. Functionality did not exist before.
#  11/14/2018               jdm       Tried to fix extra character(s) showing up in file name. Not sure why it happens.
#                                     Seems to be dependent on which version of Windows and/or which Perl engine is used.
###################################################################################################################
#
# Disclaimer: This script is provided AS IS and I take no responsibility for its use or misuse.  Its use is completely
#       the responsibility of YOU the user and is at your own risk.  I do not claim perfection, nor do I warrant or
#       guarantee the operation of the script.  It appears to work and I am using it and that's all the further I
#       am willing to go!!
#
###################################################################################################################
#
# Code Block
#
###################################################################################################################
use strict;
use warnings;
use File::Path;  # 2009-1 - used to create the directory structure quickly
#perl2exe_include File::Path

#  define scalars
my $dsx;  		# name of the file to be processed
my $value;    		# holds the Identifier value used to name each output file
my $line;      		# value for line for comparison
my @dirs;       	# array to hold category listings from jobs  # 2009-1
my $dir;        	# scalar to hold the Category for naming the files  # 2009-1
my @work;		# array to hold working rows
my @header;		# array to hold the header rows
my @jobs;		# array to hold the job rows
my @transforms;		# array to hold the transform rows
my @routines;		# array to hold the routines rows
my @tabledefs;		# array to hold the tabledefs rows
my @stagetypes;		# array to hold the stagetypes rows
my @datatypes;		# array to hold the datatypes rows
my @containers;		# array to hold the shared container rows
my @section;		# array to hold information for sectional parts of the dsx (transforms, routines, etc.)
my @parametersets;      # array to hold the parameterset rows
my $element;		# element of the array being processed
my $tag;		# tag holder
my $on;			# section flag: 0 = outside any section, 1-9 = section being read
my $dsname;		# name to be assigned to the new file
my $cnt;		# generic count for checking things
my $rowchk;		# counter for catching begin and end tags right next to each other
my $in_section;		# flag showing a job or record is currently being collected


#
# initialize some things if needed
#
$on = 0;
$in_section = 0;

#
# collect the filename from the input argument and create the working filename
#
die "Usage: perl DSX_Cutter.pl filename.dsx\n" unless @ARGV;
$dsx = $ARGV[0];

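 open (OLD, "< $dsx")||die "Unable to open $dsx for reading: $!\n";

 while ($line = <OLD>)
 {
	# strip the line ending, including the carriage return left behind when a
	# DSX file with Windows (CRLF) line endings is read on Unix; that stray
	# character is the likely source of the odd character in the file names
	$line =~ s/[\r\n]+\z//;

	push(@work, "$line\n");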

#    #### 2009-1 #####
#    # capture any lines that are categories
#    if ($line =~ /Category \"/)
#    {
#		 $line =~ s/([\\\\])/\\/g; # handle double back slashes if they exist
#		 $line =~ s/([\/\/])/\//g; # handle double forward slashes if they exist
#         push (@dirs, "$line\n");
#    }
}

close OLD;


##### 2009-1 #####
## process the values in the @dirs directory to keep only unique rows
#my %seen = ();
#my @uniq = ();
#foreach $line(@dirs) {
#    unless ($seen{$line}) {
#        # if we get here, we have not seen it before
#        $seen{$line} = 1;
#        push(@uniq, $line);
#    }
#}
#undef @dirs;
#
## Create the directory structure found in the @uniq array
## Reference:  mkpath(['/foo/bar/baz', 'blurfl/quux'], 1, 0711);
#foreach $line(@uniq)
#{
#   chomp $line;
#
#   # this will default to the directory you are running this program from
#   mkpath(["'$line'"], 1, 0777);
#}
##### 2009-1 #####


# now that it is all in memory, parse out the sections into their own arrays
# for processing
foreach $element (@work)
{
	chomp $element;

	# determine which section we are in and flag it accordingly
	if ($element =~ /BEGIN HEADER/)
	{
		$on = 1;  ## flag for HEADER records
	}
	elsif ($element =~ /BEGIN DSJOB/)
	{
		$on = 2;  ## flag for DSJOB records
	}
	elsif ($element =~ /BEGIN DSROUTINES/)
	{
		$on = 3;  ## flag for DSROUTINES records
	}
	elsif ($element =~ /BEGIN DSTRANSFORMS/)
	{
		$on = 4;  ## flag for DSTRANSFORMS records
	}
	elsif ($element =~ /BEGIN DSTABLEDEFS/)
	{
		$on = 5;  ## flag for DSTABLEDEFS records
	}
	elsif ($element =~ /BEGIN DSSTAGETYPES/)
	{
		$on = 6;  ## flag for DSSTAGETYPES records
	}
	elsif ($element =~ /BEGIN DSDATATYPES/)
	{
		$on = 7;  ## flag for DSDATATYPES records
	}
	elsif ($element =~ /BEGIN DSSHAREDCONTAINER/)
	{
		$on = 8;  ## flag for DSSHAREDCONTAINER records
	}
	elsif ($element =~ /BEGIN DSPARAMETERSETS/)
	{
		$on = 9;  ## flag for DSPARAMETERSETS records
	}

	# separate out each section to a name array for it
	if ($on == 1)
	{
		push(@header, "$element\n");
		if ($element =~ /END HEADER/)
		{
			$on = 0;
		}
	}

	if ($on == 2)
	{
		push(@jobs, "$element\n");
		if ($element =~ /END DSJOB/)
		{
			$on = 0;
		}
	}

	if ($on == 3)
	{
		push(@routines, "$element\n");
		if ($element =~ /END DSROUTINES/)
		{
			$on = 0;
		}
	}

	if ($on == 4)
	{
		push(@transforms, "$element\n");
		if ($element =~ /END DSTRANSFORMS/)
		{
			$on = 0;
		}
	}

	if ($on == 5)
	{
		push(@tabledefs, "$element\n");
		if ($element =~ /END DSTABLEDEFS/)
		{
			$on = 0;
		}
	}

	if ($on == 6)
	{
		push(@stagetypes, "$element\n");
		if ($element =~ /END DSSTAGETYPES/)
		{
			$on = 0;
		}
	}

	if ($on == 7)
	{
		push(@datatypes, "$element\n");
		if ($element =~ /END DSDATATYPES/)
		{
			$on = 0;
		}
	}

	if ($on == 8)
	{
		push(@containers, "$element\n");
		if ($element =~ /END DSSHAREDCONTAINER/)
		{
			$on = 0;
		}
	}

	if ($on == 9)
	{
		push(@parametersets, "$element\n");
		if ($element =~ /END DSPARAMETERSETS/)
		{
			$on = 0;
		}
	}

}

##############################################################################################################
# process the jobs into their own individual files
##############################################################################################################

# DSJOB section
foreach $element(@jobs)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSJOB/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSJOB/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				$value = $section[1];
				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSJOB_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating job file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSJOB\n";
				foreach $line(@section)
				{
					next if (($line =~ /BEGIN DSJOB/) or ($line =~ /END DSJOB/));
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSJOB\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}

# DSTRANSFORMS section
$cnt = 0;

# get rid of dupes and empty pairs
@work= ();
foreach $element(@transforms)
{
	chomp $element;
	next if (($element =~ /BEGIN DSTRANSFORMS/) or ($element =~ /END DSTRANSFORMS/));
	push(@work, "$element\n");
}
@transforms = ();
@transforms = @work;
@work = ();

foreach $element(@transforms)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSRECORD/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				# the identifier could be at position 1 or 2 depending on
				# where we are in the process

				# finding the first identifier - this is needed if there is a truncation of
				# some sort.
				#
				# NEED TO MAKE THIS MORE ROBUST!!!!
				if ($section[1] =~ /  Identifier "/)
				{
					$value = $section[1];
				}
				elsif ($section[2] =~ /  Identifier "/)
				{
					$value = $section[2];
				}
				elsif ($section[3] =~ /  Identifier "/)
				{
					$value = $section[3];
				}
				elsif ($section[4] =~ /  Identifier "/)
				{
					$value = $section[4];
				}
				elsif ($section[5] =~ /  Identifier "/)
				{
					$value = $section[5];
				}

				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSTRANSFORMS_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating transform file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSTRANSFORMS\n";
				foreach $line(@section)
				{
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSTRANSFORMS\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}

# DSROUTINES section
#
# get rid of dupes and empty pairs
@work= ();
foreach $element(@routines)
{
	chomp $element;
	next if (($element =~ /BEGIN DSROUTINES/) or ($element =~ /END DSROUTINES/));
	push(@work, "$element\n");
}
@routines = ();
@routines = @work;
@work = ();

foreach $element(@routines)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSUBINARY/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				# the identifier could be at position 1 or 2 depending on
				# where we are in the process

				# finding the first identifier - this is needed if there is a truncation of
				# some sort.
				#
				# NEED TO MAKE THIS MORE ROBUST!!!!
				if ($section[1] =~ /  Identifier "/)
				{
					$value = $section[1];
				}
				elsif ($section[2] =~ /  Identifier "/)
				{
					$value = $section[2];
				}
				elsif ($section[3] =~ /  Identifier "/)
				{
					$value = $section[3];
				}
				elsif ($section[4] =~ /  Identifier "/)
				{
					$value = $section[4];
				}
				elsif ($section[5] =~ /  Identifier "/)
				{
					$value = $section[5];
				}

				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSROUTINES_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating routine file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSROUTINES\n";
				foreach $line(@section)
				{
					# print "$line\n" if (($line =~ /BEGIN DSROUTINES/) or ($line =~ /END DSROUTINES/));
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSROUTINES\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}

# DSTABLEDEFS section
#
# get rid of dupes and empty pairs
@work= ();
foreach $element(@tabledefs)
{
	chomp $element;
	next if (($element =~ /BEGIN DSTABLEDEFS/) or ($element =~ /END DSTABLEDEFS/));
	push(@work, "$element\n");
}
@tabledefs = ();
@tabledefs = @work;
@work = ();

foreach $element(@tabledefs)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSRECORD/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				# the identifier could be at position 1 or 2 depending on
				# where we are in the process

				# finding the first identifier - this is needed if there is a truncation of
				# some sort.
				#
				# NEED TO MAKE THIS MORE ROBUST!!!!
				if ($section[1] =~ /  Identifier "/)
				{
					$value = $section[1];
				}
				elsif ($section[2] =~ /  Identifier "/)
				{
					$value = $section[2];
				}
				elsif ($section[3] =~ /  Identifier "/)
				{
					$value = $section[3];
				}
				elsif ($section[4] =~ /  Identifier "/)
				{
					$value = $section[4];
				}
				elsif ($section[5] =~ /  Identifier "/)
				{
					$value = $section[5];
				}

				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSTABLEDEFS_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating tabledef file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSTABLEDEFS\n";
				foreach $line(@section)
				{
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSTABLEDEFS\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}

# DSSTAGETYPES section
#
# get rid of dupes and empty pairs
@work= ();
foreach $element(@stagetypes)
{
	chomp $element;
	next if (($element =~ /BEGIN DSSTAGETYPES/) or ($element =~ /END DSSTAGETYPES/));
	push(@work, "$element\n");
}
@stagetypes = ();
@stagetypes = @work;
@work = ();

foreach $element(@stagetypes)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSRECORD/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				# the identifier could be at position 1 or 2 depending on
				# where we are in the process

				# finding the first identifier - this is needed if there is a truncation of
				# some sort.
				#
				# NEED TO MAKE THIS MORE ROBUST!!!!
				if ($section[1] =~ /  Identifier "/)
				{
					$value = $section[1];
				}
				elsif ($section[2] =~ /  Identifier "/)
				{
					$value = $section[2];
				}
				elsif ($section[3] =~ /  Identifier "/)
				{
					$value = $section[3];
				}
				elsif ($section[4] =~ /  Identifier "/)
				{
					$value = $section[4];
				}
				elsif ($section[5] =~ /  Identifier "/)
				{
					$value = $section[5];
				}

				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSSTAGETYPES_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating stagetypes file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSSTAGETYPES\n";
				foreach $line(@section)
				{
					next if (($line =~ /BEGIN DSSTAGETYPES/) or ($line =~ /END DSSTAGETYPES/));
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSSTAGETYPES\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}

# DSDATATYPES section
#
# get rid of dupes and empty pairs
@work= ();
foreach $element(@datatypes)
{
	chomp $element;
	next if (($element =~ /BEGIN DSDATATYPES/) or ($element =~ /END DSDATATYPES/));
	push(@work, "$element\n");
}
@datatypes = ();
@datatypes = @work;
@work = ();

foreach $element(@datatypes)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSRECORD/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSRECORD/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				# the identifier could be at position 1 or 2 depending on
				# where we are in the process

				# finding the first identifier - this is needed if there is a truncation of
				# some sort.
				#
				# NEED TO MAKE THIS MORE ROBUST!!!!
				if ($section[1] =~ /  Identifier "/)
				{
					$value = $section[1];
				}
				elsif ($section[2] =~ /  Identifier "/)
				{
					$value = $section[2];
				}
				elsif ($section[3] =~ /  Identifier "/)
				{
					$value = $section[3];
				}
				elsif ($section[4] =~ /  Identifier "/)
				{
					$value = $section[4];
				}
				elsif ($section[5] =~ /  Identifier "/)
				{
					$value = $section[5];
				}

				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSDATATYPES_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating datatypes file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSDATATYPES\n";
				foreach $line(@section)
				{
					next if (($line =~ /BEGIN DSDATATYPES/) or ($line =~ /END DSDATATYPES/));
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSDATATYPES\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}

# DSSHAREDCONTAINER section
#
# get rid of dupes and empty pairs
@work= ();
foreach $element(@containers)
{
	chomp $element;
	next if (($element =~ /BEGIN DSSHAREDCONTAINER/) or ($element =~ /END DSSHAREDCONTAINER/));
	push(@work, "$element\n");
}
@containers = ();
@containers = @work;
@work = ();

foreach $element(@containers)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSSHAREDCONTAINER/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSSHAREDCONTAINER/)
		{
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;

				# extract the name of the file using the Identifier
				# the identifier could be at position 1 or 2 depending on
				# where we are in the process

				# finding the first identifier - this is needed if there is a truncation of
				# some sort.
				#
				# NEED TO MAKE THIS MORE ROBUST!!!!
				if ($section[1] =~ /  Identifier "/)
				{
					$value = $section[1];
				}
				elsif ($section[2] =~ /  Identifier "/)
				{
					$value = $section[2];
				}
				elsif ($section[3] =~ /  Identifier "/)
				{
					$value = $section[3];
				}
				elsif ($section[4] =~ /  Identifier "/)
				{
					$value = $section[4];
				}
				elsif ($section[5] =~ /  Identifier "/)
				{
					$value = $section[5];
				}

				chomp $value;   # remove the end of line stuff
				$value =~ s/  Identifier "//g;
				$value =~ s/"//g;   # handle quotes
				$value =~ s/\s//g;  # handle spaces and any stray carriage return
				$value =~ s/[.\\\/:]/_/g; # replace periods, back slashes, forward slashes and colons

				# name the file
				#chop $value; # clean the end of the name again as there is a mysterious character showing up
				# $dsname = "DSSHAREDCONTAINER_" . "$value.dsx";
				$dsname = "$value.dsx";
				print "Creating shared container file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSSHAREDCONTAINER\n";
				foreach $line(@section)
				{
					next if (($line =~ /BEGIN DSSHAREDCONTAINER/) or ($line =~ /END DSSHAREDCONTAINER/));
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSSHAREDCONTAINER\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}


# DSPARAMETERSETS section
#
# get rid of dupes and empty pairs
@work= ();
foreach $element(@parametersets)
{
	chomp $element;
	###jdm### next if (($element =~ /BEGIN DSPARAMETERSETS/) or ($element =~ /END DSPARAMETERSETS/));
	push(@work, "$element\n");
}
@parametersets = ();
@parametersets = @work;
@work = ();

# print "length of parameterset array = " . scalar(@parametersets) . ".\n";

foreach $element(@parametersets)
{
	chomp $element;
	push(@section,"$element\n");

	if (($element =~ /BEGIN DSPARAMETERSETS/) and ($in_section == 0))
	{
		# flag the jobs section as being active
		# print "currently in the parametersets section\n";
		$in_section = 1;
	}
	elsif ($in_section == 1)
	{
		# check to see if we are at the end of the job
		if ($element =~ /END DSPARAMETERSETS/)
		{
			# print "currently leaving the parametersets section\n";
			$cnt = scalar(@section);
			if ($cnt > 2)
			{
				# reset the in_section counter
				$in_section = 0;
###jdm###
###jdm###				# extract the name of the file using the Identifier
###jdm###				# the identifier could be at position 1 or 2 depending on
###jdm###				# where we are in the process
###jdm###
###jdm###				# finding the first identifier - this is needed if there is a truncation of
###jdm###				# some sort.
###jdm###				#
###jdm###				# NEED TO MAKE THIS MORE ROBUST!!!!
###jdm###				if ($section[1] =~ /  Identifier "/)
###jdm###				{
###jdm###					$value = $section[1];
###jdm###				}
###jdm###				elsif ($section[2] =~ /  Identifier "/)
###jdm###				{
###jdm###					$value = $section[2];
###jdm###				}
###jdm###				elsif ($section[3] =~ /  Identifier "/)
###jdm###				{
###jdm###					$value = $section[3];
###jdm###				}
###jdm###				elsif ($section[4] =~ /  Identifier "/)
###jdm###				{
###jdm###					$value = $section[4];
###jdm###				}
###jdm###				elsif ($section[5] =~ /  Identifier "/)
###jdm###				{
###jdm###					$value = $section[5];
###jdm###				}
###jdm###
###jdm###				chomp $value;   # remove the end of line stuff
###jdm###				$value =~ s/  Identifier "//g;
###jdm###				$value =~ s/"//g;   # handle quotes
###jdm###				$value =~ s/ //g;   # handle spaces
###jdm###				$value =~ s/([\.])/_/g; # handle periods if they exist
###jdm###				$value =~ s/([\\\\])/_/g; # handle double back slashes if they exist
###jdm###				$value =~ s/([\/\/])/_/g; # handle double forward slashes if they exist
###jdm###				$value =~ s/([\\])/_/g; # handle back slashes if they exist
###jdm###				$value =~ s/([\/])/_/g; # handle forward slashes if they exist
###jdm###				$value =~ s/([:])/_/g;  # handle colons if they exist

###jdm###				# name the file
###jdm###				#chop $value; # clean the end of the name again as there is a mysterious character showing up
###jdm###				# $dsname = "DSPARAMETERSETS_" . "$value.dsx";
				$dsname = "PARAMETERSETS.dsx";
				# print "parametersets filename = $dsname\n";
				print "Creating parameter set file for $dsname\n";

				open (NEW, "> $dsname")|| die "Unable to create $dsname file!";

				# output the header to the file
				foreach $line(@header)
				{
					chomp $line;
					print NEW "$line\n";
				}

				# output the lines of this particular job
				print NEW "BEGIN DSPARAMETERSETS\n";
				foreach $line(@section)
				{
					next if (($line =~ /BEGIN DSPARAMETERSETS/) or ($line =~ /END DSPARAMETERSETS/));
					chomp $line;
					print NEW "$line\n";
				}
				print NEW "END DSPARAMETERSETS\n";

				# when done pushing the lines out, close the file
				close NEW;
			}

			# if the count was 2 or less or has been processed, then ditch what is in the array
			@section = ();
			$value = '';
		}
	}
}