Parallel routine C++ regex compilation error
Posted: Wed Oct 20, 2010 11:24 pm
Hi
I am trying to write a parallel routine to perform regular expression validation in DataStage.
First I successfully created a routine in DataStage that uses the standard C library regex.h header file, similar to what is implemented in this thread: viewtopic.php?t=107882
However, I found this function did not work as desired (it doesn't seem to recognise the interval notation), and turned to the Boost C++ library implementation of regular expressions.
I have successfully compiled and run a test program (shown below) in Windows, using the command:
cl /EHsc /I C:\Progra~1\boost\boost_1_44 C:\temp\testboostregex.cpp /link /LIBPATH:C:\Progra~1\boost\boost_1_44\lib
Code:
#include <iostream>
#include <string>
#include <boost/regex.hpp> // Boost.Regex lib
using namespace std;

// Usage: <program> <string> <pattern>
// Note: returns 1 on a match and 0 otherwise
int main(int argc, char* argv[])
{
    if (argc < 3)
    {
        cout << "Expected two arguments: <string> <pattern>" << endl;
        return(0);
    }
    boost::regex re;
    try
    {
        // Compile the pattern as a case-insensitive regular expression
        re.assign(argv[2], boost::regex_constants::icase);
    }
    catch (const boost::regex_error& e)
    {
        cout << argv[2] << " is not a valid regular expression: \""
             << e.what() << "\"" << endl;
        return(0);
    }
    // regex_match requires the whole string to match the pattern
    if (!boost::regex_match(argv[1], re))
    {
        cout << re << " does not match " << argv[1] << endl;
        return(0);
    }
    cout << re << " matches " << argv[1] << endl;
    return(1);
}
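As a side note on the interval notation mentioned above: in the POSIX regex.h API, the `{m,n}` form is only treated as a repetition count when the pattern is compiled with the REG_EXTENDED flag. In the default basic syntax the braces are literal characters and the interval is written `\{m,n\}`. Whether that explains the behaviour of the earlier regex.h routine is not confirmed here. The following is a minimal sketch, assuming a POSIX regex.h is available; `posixMatch` is a hypothetical helper name, not the routine from the linked thread.

```cpp
#include <cstddef>   // NULL
#include <regex.h>   // POSIX regex API: regcomp, regexec, regfree

// Hypothetical helper: returns true if `text` matches `pattern`
// when the pattern is compiled with the given regcomp flags.
bool posixMatch(const char* text, const char* pattern, int flags)
{
    regex_t re;
    // REG_NOSUB: only a yes/no answer is needed, no sub-match positions
    if (regcomp(&re, pattern, flags | REG_NOSUB) != 0)
        return false;               // the pattern failed to compile
    const bool matched = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

With this helper, `posixMatch("123", "^[0-9]{3}$", REG_EXTENDED)` should report a match, while the same pattern compiled with flags `0` (basic syntax) should not match "123". The basic-syntax spelling `"^[0-9]\\{3\\}$"` should match again.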
However, when I modify this for use in DataStage I get compile errors.
Code:
#include <string>
#include <iostream>
#include <boost/regex.hpp> // Boost.Regex lib

// Returns 1 if inString matches inPattern (case-insensitive), 0 otherwise
int matchRegexp(char *inString, char *inPattern)
{
    boost::regex re;
    try
    {
        // Compile the pattern as a case-insensitive regular expression
        re.assign(inPattern, boost::regex_constants::icase);
    }
    catch (const boost::regex_error& e)
    {
        // An invalid pattern is treated as "no match"
        return(0);
    }
    if (!boost::regex_match(inString, re))
    {
        return(0);
    }
    return(1);
}
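To check the matching logic of the routine separately from the DataStage link step, the same function shape can be exercised outside the engine. The following is a minimal sketch under stated assumptions: it uses `std::regex` from the C++11 standard library as a stand-in for `boost::regex` so that it builds without Boost, and the name `matchRegexpStd` is hypothetical. The compiler used by DataStage 8.1 may not provide `std::regex`, and its default ECMAScript syntax is not guaranteed to behave identically to Boost's default syntax for every pattern.

```cpp
#include <regex>    // std::regex, std::regex_match, std::regex_error (C++11)

// Hypothetical stand-in for matchRegexp, built on std::regex instead of Boost.
// Returns 1 if the whole of inString matches inPattern (case-insensitive), else 0.
int matchRegexpStd(const char* inString, const char* inPattern)
{
    if (inString == nullptr || inPattern == nullptr)
        return 0;                       // treat missing input as "no match"
    try
    {
        const std::regex re(inPattern,
                            std::regex::ECMAScript | std::regex::icase);
        return std::regex_match(inString, re) ? 1 : 0;
    }
    catch (const std::regex_error&)
    {
        return 0;                       // invalid pattern
    }
}
```

For example, `matchRegexpStd("123", "[0-9]{3}")` is expected to return 1 and `matchRegexpStd("1234", "[0-9]{3}")` to return 0, because `regex_match` requires the whole input to match.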
$APT_COMPILER: cxx
$APT_COMPILEOPT: -W/TP -W/EHa -DAPT_USE_ANSI_IOSTREAMS -c -W/Zc:wchar_t-
$APT_LINKER: cxx
$APT_LINKOPT: W/base:0x50000000 -W/Zc:wchar_t- -LC:/Progra~1/boost/boost_1_44/lib
Output from transformer:
Code:
Output from transformer compilation follows:
##I IIS-DSEE-TFCN-00001 15:41:07(000) <main_program>
IBM WebSphere DataStage Enterprise Edition 8.1.0.5447
Copyright (c) 2001, 2005-2008 IBM Corporation. All rights reserved
##I IIS-DSEE-TFCN-00006 15:41:07(001) <main_program> conductor uname: -s=Windows_NT; -r=2; -v=5; -n=IBM1; -m=Pentium
##I IIS-DSEE-TOSH-00002 15:41:07(002) <main_program> orchgeneral: loaded
##I IIS-DSEE-TOSH-00002 15:41:07(003) <main_program> orchsort: loaded
##I IIS-DSEE-TOSH-00002 15:41:07(004) <main_program> orchstats: loaded
##W IIS-DSEE-TOSH-00049 15:41:07(007) <main_program> Parameter specified but not used in flow: DSPXWorkingDir
##E IIS-DSEE-TBLD-00076 15:41:13(000) <main_program> Error when checking composite operator: Subprocess command failed with exit status 24,576.
##E IIS-DSEE-TFSR-00019 15:41:13(001) <main_program> Could not check all operators because of previous error(s)
##W IIS-DSEE-TFTM-00012 15:41:13(002) <transform> Error when checking composite operator: The number of reject datasets "0" is less than the number of input datasets "1".
##I IIS-DSEE-TBLD-00000 15:41:13(003) <main_program> Error when checking composite operator: Output from subprocess: C:\IBM\InformationServer\Server\PXEngine\include\apt_util/keylookup.h(1151) : warning C4251: 'APT_KeyLookupRange::rangeOptions_' : class 'std::vector<_Ty>' needs to have dll-interface to be used by clients of class 'APT_KeyLookupRange'
with
[
_Ty=APT_KeyLookupRange::rangeOption
]
C:\IBM\InformationServer\Server\PXEngine\include\apt_util/lookupops.h(541) : warning C4251: 'APT_LUTCreateOp::parentNodeMap_' : class 'std::vector<_Ty>' needs to have dll-interface to be use
##I IIS-DSEE-TBLD-00000 15:41:13(004) <main_program> Error when checking composite operator: Output from subprocess: d by clients of class 'APT_LUTCreateOp'
with
[
_Ty=APT_Node
]
C:\IBM\InformationServer\Server\PXEngine\include\apt_util/lookupops.h(546) : warning C4251: 'APT_LUTCreateOp::inputDisabled_' : class 'std::vector<_Ty,_Ax>' needs to have dll-interface to be used by clients of class 'APT_LUTCreateOp'
with
[
_Ty=bool,
_Ax=std::allocator<bool>
]
C:\IBM\InformationServer\Server\PXEngine\include\apt_util/lookupops.h(548) :
##I IIS-DSEE-TBLD-00000 15:41:13(005) <main_program> Error when checking composite operator: Output from subprocess: warning C4251: 'APT_LUTCreateOp::sharedTableNodes_' : class 'std::vector<_Ty>' needs to have dll-interface to be used by clients of class 'APT_LUTCreateOp'
with
[
_Ty=APT_UString
]
C:\IBM\InformationServer\Server\Projects\health_dev\RT_BP292.O\V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.C(179) : warning C4101: 'output' : unreferenced local variable
C:\IBM\InformationServer\Server\Projects\health_dev\RT_BP292.O\V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.C(17
##I IIS-DSEE-TBLD-00000 15:41:13(006) <main_program> Error when checking composite operator: Output from subprocess: 4) : warning C4101: 'input' : unreferenced local variable
##I IIS-DSEE-TBLD-00079 15:41:13(007) <transform> Error when checking composite operator: cxx -LC:/IBM/InformationServer/Server/Projects/health_dev/RT_BP292.O/ -LC:/IBM/InformationServer/Server/PXEngine/lib -LC:/IBM/InformationServer/Server/PXEngine/user_lib -s -W/dll -W/base:0x50000000 -W/Zc:wchar_t- -LC:/progra~1/boost/boost_1_44/lib -lliborchnt -lliborchcorent -lliborchbuildopnt C:/IBM/InformationServer/Server/PXEngine/lib/regexp.o C:/IBM/InformationServer/Server/Projects/health_dev/RT_BP292.O/V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.tmp.o -o C:/IBM/InformationServer/Server/Projects/health_dev/RT_BP292.O/V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.dll.
##I IIS-DSEE-TBLD-00000 15:41:13(008) <main_program> Error when checking composite operator: Output from subprocess: cp80.lib(nutlcp80.dll) : warning LNK4006: "public: __thiscall std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >::~basic_string<char,struct std::char_traits<char>,class std::allocator<char> >(void)" (??1?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@XZ) already defined in regexp.o; second definition ignored
cp80.lib(nutlcp80.dll) : warning LNK4006: "public: __thiscall std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >::basic_st
##I IIS-DSEE-TBLD-00000 15:41:13(009) <main_program> Error when checking composite operator: Output from subprocess: ring<char,struct std::char_traits<char>,class std::allocator<char> >(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)" (??0?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@ABV01@@Z) already defined in regexp.o; second definition ignored
cp80.lib(nutlcp80.dll) : warning LNK4006: "public: __thiscall std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >::basic_string<char,struct std::char_traits<char>,class std::alloca
##I IIS-DSEE-TBLD-00000 15:41:13(010) <main_program> Error when checking composite operator: Output from subprocess: tor<char> >(char const *)" (??0?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@PBD@Z) already defined in regexp.o; second definition ignored
##I IIS-DSEE-TBLD-00000 15:41:13(011) <main_program> Error when checking composite operator: Output from subprocess: LIBCMT.lib(typinfo.obj) : warning LNK4006: "public: virtual __thiscall type_info::~type_info(void)" (??1type_info@@UAE@XZ) already defined in c.lib(typinfo.obj); second definition ignored
LIBCMT.lib(typinfo.obj) : warning LNK4006: "public: int __thiscall type_info::before(class type_info const &)const " (?before@type_info@@QBEHABV1@@Z) already defined in c.lib(typinfo.obj); second definition ignored
LIBCMT.lib(typinfo.obj) : warning LNK4006: "public: char const * __thiscall type_info::raw_name(void)const
##I IIS-DSEE-TBLD-00000 15:41:13(012) <main_program> Error when checking composite operator: Output from subprocess: " (?raw_name@type_info@@QBEPBDXZ) already defined in c.lib(typinfo.obj); second definition ignored
LIBCMT.lib(typinfo.obj) : warning LNK4006: "private: __thiscall type_info::type_info(class type_info const &)" (??0type_info@@AAE@ABV0@@Z) already defined in c.lib(typinfo.obj); second definition ignored
LIBCMT.lib(typinfo.obj) : warning LNK4006: "private: class type_info & __thiscall type_info::operator=(class type_info const &)" (??4type_info@@AAEAAV0@ABV0@@Z) already defined in c.lib(typinfo.obj); second
##I IIS-DSEE-TBLD-00000 15:41:13(013) <main_program> Error when checking composite operator: Output from subprocess: definition ignored
LIBCMT.lib(invarg.obj) : warning LNK4006: __invalid_parameter_noinfo already defined in c.lib(nutlibc4.dll); second definition ignored
##I IIS-DSEE-TBLD-00000 15:41:13(014) <main_program> Error when checking composite operator: Output from subprocess: LIBCMT.lib(hooks.obj) : warning LNK4006: "void __cdecl terminate(void)" (?terminate@@YAXXZ) already defined in c.lib(nutlibc4.dll); second definition ignored
##I IIS-DSEE-TBLD-00000 15:41:13(015) <main_program> Error when checking composite operator: Output from subprocess: LIBCMT.lib(crt0.obj) : error LNK2019: unresolved external symbol _main referenced in function ___tmainCRTStartup
C:\IBM\InformationServer\Server\Projects\health_dev\RT_BP292.O\V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.dll : fatal error LNK1120: 1 unresolved externals
##E IIS-DSEE-TCOS-00029 15:41:13(016) <main_program> Creation of a step finished with status = FAILED. (CopyOfCopyOftestjob.Copy_of_Transformer_37)
*** Internal Generated Transformer Code follows:
0001: //
0002: // Generated file to implement the V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37 transform operator.
0003: //
0004:
0005: // define external functions used
0006: extern int32 matchRegexp(string inString,string inPattern);
0007:
0008: // define our input/output link names
0009: inputname 0 DSLink31;
0010: outputname 0 DSLink39;
0011:
0012: initialize {
0013: // define our row rejected variable
0014: int8 RowRejected0;
0015:
0016: // define our null set variable
0017: int8 NullSetVar0;
0018:
0019: // declare our intermediate variables for this section
0020: string InterVar0_0;
0021:
0022: // initialise constant values which require conversion
0023: InterVar0_0 = "123";
0024: }
0025:
0026: mainloop {
0027: // initialise our row rejected variable
0028: RowRejected0 = 1;
0029:
0030: // evaluate columns (no constraints) for link: DSLink39
0031: DSLink39.testregex = matchRegexp(InterVar0_0 , InterVar0_0);
0032: writerecord 0;
0033: RowRejected0 = 0;
0034: }
0035:
0036: finish {
0037: }
0038:
*** End of Internal Generated Transformer Code
It looks like it is the linking of the transformer object code to the routine object that fails.
From the transformer error, it looks like it is running the command:
cxx -LC:/IBM/InformationServer/Server/PXEngine/lib/ -LC:/IBM/InformationServer/Server/PXEngine/lib/ -LC:/IBM/InformationServer/Server/PXEngine/user_lib -s -W/dll -W/base:0x50000000 -W/Zc:wchar_t- -lliborchnt -lliborchcorent -lliborchbuildopnt C:/IBM/InformationServer/Server/PXEngine/lib/regexp.o C:/temp/V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.tmp.o -o C:/temp/V4S0_CopyOfCopyOftestjob_Copy_of_Transformer_37.dll
If I run the same command at the command prompt, it successfully builds the DLL. I can then put the DLL in the DataStage job directory and the job runs correctly, so it looks like some kind of weird DataStage compile error?
Any help would be much appreciated.
Thanks