How to compile a custom operator for RHEL 5


Post by stephan.zahariev »

Hi,

I need to build a custom operator that should run under DataStage 9.1 on Red Hat Enterprise Linux 5 (64-bit). There is an introduction on the IBM site on how to build custom operators for the Windows platform here: http://www.ibm.com/developerworks/data/ ... 0702chard/

I was trying to find something similar that describes the compilation process for Linux, but no luck so far. The g++ compiler is set up correctly (Transformer stages in jobs compile fine), but when I try to build the example from the link above, the compiler reports hundreds of errors in the .h files referenced by the sample code.

Is there an example that describes the compilation process of a custom operator for Linux? Or maybe somebody can share personal experience on the subject.


Thanks

Post by ray.wurlod »

Custom operators are compiled within the stage, irrespective of platform. The compiler is specified through environment variables such as APT_COMPILER, and the Build stage allows you to specify additional flags if necessary.
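Ray's point can be checked from the command line: the parallel engine reads its compiler settings from the project's DSParams file as APT_* variables. A small sketch follows; it greps a sample copy it creates itself, since the real per-project path (e.g. /opt/IBM/InformationServer/Server/Projects/&lt;project&gt;/DSParams) varies by install, and the values shown are only plausible examples:

```shell
#!/bin/sh
# Demo only: create a sample DSParams-style file, then pull out the
# compiler-related variables the way you would on a real project file.
cat > DSParams.sample <<'EOF'
APT_COMPILER=g++
APT_COMPILEOPT=-c -O -fPIC -Wno-deprecated -m64
APT_LINKER=g++
APT_LINKOPT=-shared -m64
EOF
# On a real system, point this at the project's DSParams file instead.
grep -E '^APT_(COMPILER|COMPILEOPT|LINKER|LINKOPT)=' DSParams.sample
```

Matching these settings when compiling outside DataStage avoids ABI mismatches between your library and the engine.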
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by ArndW »

Hello Stephan,

When I started working on building custom operators I thought I'd start with the few examples that IBM has posted on the web, but that turned out to be a mistake: the published examples don't work, or work only partially and only on one platform (Windows).

There are three types of stages available for you to create: "build", "custom" and "wrapped".

The "build" type lets you define a rigid input and output schema and write C++ code within the DataStage Designer GUI to manipulate the data. When you compile/build this type of stage, DataStage includes the requisite header files, so your design effort is a bit simpler. On the other hand, it splits the final program into "definitions", "pre-loop", "per-record" and "post-loop" sections. The compilation is done under the covers by DataStage and uses the settings defined in the DSParams file. If what you want to do fits into this structure, then I would suggest you use this method.


The next type is "custom", and this is somewhat more involved. In the DataStage Designer you define the entry point and the command-line options for this stage (using the "properties" tab). You can open existing stages and see how their "properties" are defined to see how easy this part is.
The coding of your custom operator in this case is done outside of DataStage: you need to ensure that you include the requisite classes and headers as well as other libraries, and write a standard C++ program built as a library. I've found that I use almost the same compiler settings as found in the DSParams file; on Windows one needs to make some changes when using the Design Studio, but on UNIX that hasn't been necessary.
Once you have a library, it needs to be placed in a directory on the library path environment variable and also registered with DataStage; it will be dynamically linked at runtime. While I don't go into great depth, you can download the most recent document at Hist-Op Documentation and check out chapter 8 for some installation information.

The third option is the "wrapped" type and is one I haven't actually used, so I will let someone else comment on that approach if that is what you need.

Post by stephan.zahariev »

Hello and thank you for the responses. I'm happy to say that I have made progress on this...

For future reference, if somebody wants to compile the code from the IBM article mentioned in the first post, here is what needs to be done:

1) The following line needs to be removed from src/myhelloworld.c:

```c
#define __NUTC__
```

because it causes the wrong header to be included later on, which raises this compilation error:

```
/opt/IBM/InformationServer/Server/PXEngine/include/unicode/umachine.h:47:30: error: unicode/pnutc.h: No such file or directory
```
2) The source code compilation can be done like this (the flags match APT_COMPILEOPT, as Arnd said above):

```shell
g++ -c -O -fPIC -Wno-deprecated -m64 -mtune=generic -mcmodel=small -I/opt/IBM/InformationServer/Server/PXEngine/include src/myhelloworld.c
```
3) The link step is (note the explicit -o; without it g++ writes a.out instead of libhello.so):

```shell
g++ -shared -m64 -L/opt/IBM/InformationServer/Server/PXEngine/lib -lorchgeneralx86_64 -lorchx86_64 -lorchmonitorx86_64 -lorchcorex86_64 -lorchsortx86_64 -o libhello.so myhelloworld.o
```
Now, about integrating this into the Designer and running a job with the custom operator: it was only a partial success.

I was following Arnd's PDF and created an entry in /opt/IBM/InformationServer/Server/PXEngine/etc/operator.apt:

```
hello libhello 1
```
Then I created a new "Parallel Stage Type (Custom)", entered "hello" as the operator, and moved the libhello.so library to /opt/IBM/InformationServer/Server/DSComponents/bin.

Unfortunately, at runtime the job aborts with:

```
main_program: PATH search failure:
main_program: Could not locate operator definition, wrapper, or Unix command for "hello"; please check that all needed libraries are preloaded, and check the PATH for the wrappers
```


But the PATH does include the deployment directory:

```
PATH=/opt/IBM/InformationServer/Server/Projects/dstage1/wrapped:/opt/IBM/InformationServer/Server/Projects/dstage1/buildop:/opt/IBM/InformationServer/Server/Projects/dstage1/RT_BP8.O:/opt/IBM/InformationServer/Server/DSComponents/lib:/opt/IBM/InformationServer/Server/DSComponents/bin:/opt/IBM/InformationServer/Server/DSParallel:/opt/IBM/InformationServer/Server/PXEngine/user_osh_wrappers:/opt/IBM/InformationServer/Server/PXEngine/osh_wrappers:/opt/IBM/InformationServer/Server/PXEngine/bin:/opt/IBM/InformationServer/Server/PXEngine/grid:/opt/IBM/InformationServer/ASBNode/apps/jre/bin:/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/nz/bin64:/home/loadl/bin:/usr/kerberos/bin:/usr/local/bin:/usr/X11R6/bin:.
```


I have also tried copying libhello.so into a folder referenced by LD_LIBRARY_PATH. No luck.

The libhello.so has these permissions:

```
-rwxr-xr-x 1 root root 144264 Sep 15 22:01 libhello.so
```

So what could be the problem?

The only way I was able to run the job was to introduce a wrapper script for the operator like this (note the else near the bottom: on bad arguments the script reports usage, otherwise it emits the operator definition for osh):

```shell
#!/bin/sh
props=''
numtimes=''
uppercase=0
usage="hello [-u] [-n times] < input > output"
status=0
error () {
  echo "hello: $1" 1>&2
  if [ $# -eq 1 ] ; then status=1 ; else status=$2 ; fi
}
# Parse argument list
while [ $# -gt 0 ] ; do
  case "$1" in
  -n) # number of times
    if [ $# -lt 2 ] ; then
      error "no value specified for -n argument"
      break
    else
      numtimes="$2"
      shift; shift
    fi
    ;;
  -u) # print uppercase
    uppercase=1
    shift
    ;;
  *)  # otherwise
    error "Unrecognized argument, $1"
    shift # skip to next argument
    ;;
  esac
done
# Build the initialization properties
if [ ${status} -eq 0 ] ; then
  if [ $uppercase -eq 1 ] ; then
     props="${props:+$props,}uppercase"
  fi
  if [ -n "$numtimes" ] ; then
     props="${props:+$props,}numtimes=$numtimes"
  fi
fi

if [ ${status} -ne 0 ] ; then
  # bad arguments: report usage only
  echo "{
  usage=\"$usage\"
}"
else
  # good arguments: emit the operator definition for osh
  echo "{
  class=HelloWorldOp,
  initialization={${props}},
  usage=\"$usage\",
  library=\"hello\"
}"
fi
exit $status
```
Having this script in one of the PATH directories makes the job run without problems. Can we avoid running this script?

Post by stephan.zahariev »

OK, I have managed to resolve it. The issue described in the last post was due to an incorrect invocation of the APT_DEFINE_OSH_NAME macro.
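For anyone hitting the same wall, the registration pattern from IBM's custom operator documentation looks roughly like the sketch below. Treat it as pseudocode: it will not compile outside the PX SDK (the apt_framework headers are proprietary), the args-description string here is a placeholder, and the exact macro signatures should be verified against your SDK headers. When this registration is present and correct, the engine resolves the osh name from operator.apt without needing a wrapper script:

```cpp
// Sketch only -- verify against the Custom Operator Reference for your release.
// "{}" is a placeholder argument-description string (no options).
#define HELLO_ARGS_DESC "{}"

// The osh name ("hello") must match the first field of the operator.apt entry,
// and the class name must match the operator class in your source.
APT_DEFINE_OSH_NAME(HelloWorldOp, hello, APT_UString(HELLO_ARGS_DESC));
APT_IMPLEMENT_RTTI_ONEBASE(HelloWorldOp, APT_Operator);
APT_IMPLEMENT_PERSISTENT(HelloWorldOp);
```

If the macro is missing or its osh name disagrees with operator.apt, you get exactly the "Could not locate operator definition" failure described above.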

If anyone is interested I have summarized the experience here: http://szahariev.blogspot.com/2013/09/C ... Howto.html