Error in Join stage

Posted: Tue Jan 02, 2018 3:29 pm
by subashri
Hi,

I am getting the following error when trying to join two tables, each with 40 million records.
Source 1 --> Checksum --> Join (primary link)
Source 2 --> Join (reference link)

No other complex transformations are involved. Partitioning is set to Auto on the Join stage. The job runs on 4 nodes in a Big Integ environment.

APT_CombinedOperatorController,0: APT_IOPort: read failed on [fd 5: L16.34.70.10:40212, R16.34.70.10:11028], errno 104 (Connection reset by peer)

APT_CombinedOperatorController,0: Virtual data set.; input to "inserted tsort operator {key={value=C1, subArgs={asc, nulls={value=first}, cs}}, key={value=C2, subArgs={asc, nulls={value=first}, cs}}, key={value=C3, subArgs={asc, nulls={value=first}, cs}}, key={value=C4, subArgs={asc, nulls={value=first}, cs}}, key={value=C5, subArgs={asc, nulls={value=first}, cs}}}(1)": getRecord DM read error; returning false.

Please help.

Thanks,

Posted: Tue Jan 02, 2018 3:58 pm
by chulett
Well... the first bit of advice in situations like this is always going to be to add $APT_DISABLE_COMBINATION to the job and set it to TRUE. That will let you see which operator the error actually originates from.

And this probably isn't important but I'm not really sure what a "Big Integ environment" is. :?

Posted: Wed Jan 03, 2018 6:10 am
by subashri
Hi Craig,
The issue is resolved by adding the environment variable. I think this will impact job performance, since operator combining is disabled.
I am eager to know whether there is any other workaround that does not impact job performance.

By the way, I was trying to shorten the term Big Integrate in my original post :)

Thanks for your help

Posted: Wed Jan 03, 2018 9:07 am
by chulett
Just so you know, that variable was not meant to solve anything. It simply helps with debugging by stopping the operator combining that the underlying framework does to make things more efficient. The end goal was to move from an error message attributed to "APT_CombinedOperatorController" to one that names the actual operator with the issue.
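
For anyone triaging a long job log, here is a minimal Python sketch of that idea: it groups messages by the operator that reported them and flags whether any still come from the combined controller. The `<operator>,<partition>: <message>` shape is an assumption inferred from the two log lines quoted earlier in this thread, the function names are hypothetical, and the "Join_3" line is an invented sample, not real output.

```python
import re
from collections import Counter

# Assumed message shape, inferred from the log lines quoted in this thread:
#   <operator name>,<partition number>: <message text>
LOG_LINE = re.compile(r"^(?P<operator>[^,:]+),(?P<partition>\d+): (?P<message>.*)$")

COMBINED = "APT_CombinedOperatorController"


def parse_line(line):
    """Return (operator, partition, message), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return match["operator"], int(match["partition"]), match["message"]


def summarize(lines):
    """Count messages per reporting operator.

    Also returns True if any message is attributed to the combined
    controller, meaning the real operator is still hidden.
    """
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts, counts[COMBINED] > 0


if __name__ == "__main__":
    sample = [
        # Copied from the original post.
        "APT_CombinedOperatorController,0: APT_IOPort: read failed on "
        "[fd 5: L16.34.70.10:40212, R16.34.70.10:11028], errno 104 "
        "(Connection reset by peer)",
        # Hypothetical line, only to show a message from a named operator.
        "Join_3,2: hypothetical message from a named operator",
    ]
    counts, hidden = summarize(sample)
    for operator, count in counts.items():
        print(f"{operator}: {count} message(s)")
    if hidden:
        print("Some messages come from the combined controller; "
              "rerun with $APT_DISABLE_COMBINATION set to see the real operator.")
```

If the second run of the job no longer shows any combined-controller lines, the remaining operator names are the ones worth quoting in a support case.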

Others who have been in the same situation can chime in with advice on what to do next. I'd suggest opening a support case and letting them know what's going on to see what they suggest. Are you current on your fix packs for your version?

Posted: Wed Jan 03, 2018 10:52 am
by qt_ky
I would second Craig on that. Your error is likely intermittent and will happen again. It would help to prepare and open a Support case about it.