Parallel routine appears to have a memory leak

PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Post by PhilHibbs »

That's nice, but I'd still like to be able to implement the pxStrFilter routine that is in the third post of this thread.
Phil Hibbs | Capgemini
Technical Consultant
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Why not use a char array, which can be passed in and returned as a pointer and will not leave a dangling pointer? You just need to make sure the array is defined long enough.

Did you try C++'s new instead of C's malloc()? I think it will free the memory once it goes out of scope, but I'm not sure.
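
The fixed-size array idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the routine name pxStripChars, the capacity kBufSize and the choice of a per-thread static buffer are all hypothetical and do not come from this thread, and thread_local needs a C++11 compiler. Nothing is allocated per call, so nothing leaks, but input longer than the buffer is silently truncated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical capacity: the suggestion only says "long enough".
static const std::size_t kBufSize = 4096;

// Hypothetical routine (not from the thread): returns a copy of str with
// every character that appears in drop removed.
char* pxStripChars(const char* str, const char* drop)
{
    // One buffer per thread, reused on every call and never freed.
    static thread_local char buf[kBufSize];

    if (!str)  { str = ""; }
    if (!drop) { drop = ""; }

    std::size_t out = 0;
    for (const char* p = str; *p != '\0' && out < kBufSize - 1; ++p) {
        if (std::strchr(drop, *p) == nullptr) {
            buf[out++] = *p;   // keep characters that are not in drop
        }
    }
    assert(out < kBufSize);    // room for the terminator is always reserved
    buf[out] = '\0';
    return buf;                // valid only until the next call on this thread
}
```

The returned pointer is overwritten by the next call on the same thread, so this is only safe if the caller copies the value before calling the routine again. Whether DataStage does that is an assumption that would need testing.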
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
PhilHibbs
Premium Member
Posts: 1044
Joined: Wed Sep 29, 2004 3:30 am
Location: Nottingham, UK

Post by PhilHibbs »

The problem is, you can't pass in a char*. DataStage can pass in a char* but I, as a DataStage user, can only pass in a Stage Variable or a string literal. DataStage will somehow construct a char array and pass a pointer to that, but as far as I know it is unsafe for the function to write to that array, as DataStage may well have plans for it. For instance, IBM actually suggested that I should pass in a Stage Variable that would then be updated by the function instead of returning the value. The implication there is that if the function writes to the char* that is passed in, then that write goes directly into the Stage Variable storage. I'm uneasy about that, and I would actually be quite surprised if it worked reliably. IBM support have already demonstrated that they don't know what they are talking about in regard to char* routines.
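
For reference, the "write into the argument" style that IBM suggested would look something like the sketch below. The name pxFilterInPlace is hypothetical, and the sketch assumes the buffer behind the char* is writable, which is exactly the point in doubt here. It only ever shortens the string, so it never writes past the original terminator. A routine that could lengthen the string has no way of knowing how much room the array really has.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical routine (not from the thread): removes from buf, in place,
// every character that appears in drop. Returns the new length.
// ASSUMPTION: buf points at writable storage owned by the caller.
int pxFilterInPlace(char* buf, const char* drop)
{
    if (!buf || !drop) { return 0; }

    char* out = buf;
    for (const char* in = buf; *in != '\0'; ++in) {
        assert(out <= in);              // the write position never overtakes the read position
        if (std::strchr(drop, *in) == nullptr) {
            *out++ = *in;               // keep this character
        }
    }
    *out = '\0';                        // at or before the original terminator
    return static_cast<int>(out - buf);
}
```

Called from plain C++ with a local array this behaves as expected. It says nothing about whether a write made inside DataStage reaches the Stage Variable or is safe.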

As for new, it is no different from malloc in this respect. Nothing frees the memory pointed to by a char* unless the code does so explicitly, however that memory was allocated. And even if something did, freeing the memory as soon as the function returns, and therefore before DataStage has used the pointer to read the return value, would be extremely dangerous.
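
A small sketch makes the lifetime point concrete. The names dupWithNew, releaseBlock and g_liveBlocks are hypothetical and exist only for this illustration. The counter stands in for what a leak checker would report: the block allocated with new[] stays live after the function returns, until something calls delete[] on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical counter of blocks that were allocated and not yet released.
static int g_liveBlocks = 0;

// Returns a heap copy of s made with new[]. Only the local pointer variable
// goes out of scope on return; the allocated block itself does not.
char* dupWithNew(const char* s)
{
    assert(s != nullptr);
    std::size_t n = std::strlen(s) + 1;
    char* p = new char[n];
    std::memcpy(p, s, n);
    ++g_liveBlocks;
    return p;
}

// The explicit release that has to happen somewhere, or the block leaks.
void releaseBlock(char* p)
{
    if (p) {
        delete[] p;
        --g_liveBlocks;
    }
}
```

If the caller never calls releaseBlock, g_liveBlocks grows by one per call. That is the same pattern as a routine that returns malloc'd memory to a caller that never frees it.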
Phil Hibbs | Capgemini
Technical Consultant
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

Added comments to compile on Sun.

/****************************************************************************** 
* pxEreplace - DataStage parallel routine 
* 
* Published on DSXchange.com by user DSguru2B 
* http://www.dsxchange.com/viewtopic.php?t=106358 
* http://www.dsxchange.com/viewtopic.php?t=147367&highlight=ereplace
* 
* Bugs (malloc, realloc, count) fixed by Philip Hibbs, Capgemini 
* 
* INSTRUCTIONS 
* 
* 1. Copy the source file pxEreplace.cpp into a directory on the server 
* 2. Run the following command: 
* 
*         g++ -O -fPIC -Wno-deprecated -c pxEreplace.cpp 
*    Sun CC syntax
*    /opt/SUNWspro/bin/CC -m64 -O -PIC -c pxEreplace_4.C -o pxEreplace_4.o
* 
* (check Administrator->Properties->Environment->Parallel->Compiler settings) 
* 
* 3. Copy the output into the DataStage library directory: 
* 
*         cp pxEreplace.o `cat /.dshome`/../PXEngine/lib/pxEreplace.o 
*         cp pxEreplace_4.o pxEreplace.o
* 
* 4. Create the Parallel Routine with the following properties: 
* 
* Routine Name             : pxEreplace 
* External subroutine name : pxEreplace 
* Type                     : External function 
* Object type              : Object 
* Return type              : char* 
* Library path             : /software/opt/IBM/InformationServer/Server/PXEngine/lib/pxEreplace.o 
* Arguments: 
*     str     I  char* 
*     subStr  I  char* 
*     rep     I  char* 
*     num     I  int 
*     beg     I  int 
* 
* Save & Close 
* 
* Any time that anything changes, you must recompile all jobs that use the routine. 
* 
******************************************************************************/ 

#include <string.h> 
#include <stdlib.h> 

char* pxEreplace(char *str, char *subStr, char *rep, int num, int beg) 
{ 
  char empty[1]=""; 

  if (!str) {str = empty;} 
  if (!subStr) {subStr = empty;} 
  if (!rep) {rep = empty;} 

  int buflen = strlen(str)+1; 
  char *result = (char *)malloc( buflen ); 

  if (!result) {return 0;} 
  if (buflen==1) {result[0]='\0'; return result;} 

  int oldlen = strlen(subStr); 
  int newlen = strlen(rep); 

  int i, x, count = 0; 

  if (oldlen==0) 
  { // special case - insert rep once at the start of the string and return 
    if (newlen>0) 
    { 
      buflen = buflen + newlen; 
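      char *grown = (char *)realloc( result, buflen ); 
      if (!grown) {free(result); return 0;} // realloc failed: release the old block 
      result = grown; 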
    } 
    strcpy(result, rep); 
    strcpy(result+newlen, str); 
    return result; 
  } 

  //If beginning is less than or equal to 1 then default it to 1 
  if (beg <= 1) 
  {beg = 1;} 

  //replace all instances if value of num less than or equal to 0 
  if (num <= 0) 
  {num = buflen;} 

  //Get the character position in i for substring instance to start from 
  for (i = 0; str[i] != '\0' ; i++) 
  { 
    if (strncmp(&str[i], subStr, oldlen) == 0) 
    { 
      count++; 
      if (count == beg) { break; } 
      i += oldlen - 1; 
    } 
  } 

  //Get everything before position i before replacement begins 

  x = 0; 
  while (i != x) 
  {  result[x++] = *str++; } 

  //Start replacement 
  while (*str) //for the complete input string 
  { 

    if (num != 0 ) // until no more occurrences need to be changed 
    { 
      if (strncmp(str, subStr, oldlen) == 0) 
      { 
        if (newlen > oldlen) 
        { 
          buflen = buflen + (newlen - oldlen); 
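          char *grown = (char *)realloc( result, buflen ); 
          if (!grown) {free(result); return 0;} // realloc failed: release the old block 
          result = grown; 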
        } 
        strcpy(&result[x], rep); 
        x += newlen; 
        str += oldlen; 
        num--; 
      } 
      else // if no match is found 
      { 
        result[x++] = *str++; 
      } 
    } 
    else 
    { 
      result[x++] = *str++; 
    } 
  } 

  result[x] = '\0'; //Terminate the string 
  return result; //Return the replaced string 
} 
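
To sanity-check the routine outside DataStage, its output can be compared against a simpler reference written with std::string. The sketch below is one reading of the num and beg semantics in the code above (start at the beg-th occurrence, replace at most num occurrences, num <= 0 meaning all). The names ereplaceRef and ereplaceRefSelfCheck are hypothetical and not part of the published routine.

```cpp
#include <cassert>
#include <string>

// Hypothetical reference implementation, intended for testing only.
std::string ereplaceRef(const std::string& str, const std::string& sub,
                        const std::string& rep, int num, int beg)
{
    if (str.empty()) { return std::string(); }
    if (sub.empty()) { return rep + str; }      // special case: insert rep at the start
    if (beg < 1) { beg = 1; }

    // Find the beg-th non-overlapping occurrence of sub.
    std::string::size_type pos = 0;
    for (int seen = 0; ; ) {
        pos = str.find(sub, pos);
        if (pos == std::string::npos) { return str; }  // fewer than beg occurrences
        if (++seen == beg) { break; }
        pos += sub.size();
    }

    std::string out = str.substr(0, pos);       // everything before that occurrence
    int left = (num <= 0) ? -1 : num;           // -1 means "no limit"
    while (pos < str.size()) {
        if (left != 0 && str.compare(pos, sub.size(), sub) == 0) {
            out += rep;
            pos += sub.size();
            if (left > 0) { --left; }
        } else {
            out += str[pos++];
        }
    }
    return out;
}

// A few example checks; call this from a test program.
void ereplaceRefSelfCheck()
{
    assert(ereplaceRef("aXbXcX", "X", "YY", 0, 1) == "aYYbYYcYY");
    assert(ereplaceRef("aXbXcX", "X", "YY", 1, 2) == "aXbYYcX");
    assert(ereplaceRef("abc", "", "Z", 0, 1) == "Zabc");
}
```

In a test program, each result of pxEreplace can be compared with ereplaceRef for the same arguments and then passed to free(), since the routine returns malloc'd memory.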
Mamu Kim