Weired Join Stage problem.
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 161
- Joined: Wed Aug 02, 2006 4:30 am
Weired Join Stage problem.
I found a very weired things in Join stage, I couldn't understand why DS works like that .
Sequencial file stage 1
Varchar(3)
001
002
003
Sequencial file stage 2
varchar(5), varchar(5)
0010, apple
001, pear
Inner join 1 and 2 will get
001, apple
001, pear
However, if I increase sequencial file1 from varchar(3) to varchar(5), DS will give me right answer..
Anybody can help me explain why this can happen ? or I need to set some environment variables to get rid of such things.
thanks
Sequencial file stage 1
Varchar(3)
001
002
003
Sequencial file stage 2
varchar(5), varchar(5)
0010, apple
001, pear
Inner join 1 and 2 will get
001, apple
001, pear
However, if I increase sequencial file1 from varchar(3) to varchar(5), DS will give me right answer..
Anybody can help me explain why this can happen ? or I need to set some environment variables to get rid of such things.
thanks
As I understand it, datatype matters in a join, as does data format. For example, '003 ' is different from '003'.
I know this is the case in a filter. If the field is 5 characters and has a 'A', then the filter clause needs to be 'A '. I would assume the same rule applies in a join stage.
... course that may be specific to 'char' and not varchar. We don't use many varchars, especially for fields that are keys or often used in where clauses, so to be honest we have rarely run into this.
Maybe the bigger issue is matching on datatype...
Brad.
I know this is the case in a filter. If the field is 5 characters and has a 'A', then the filter clause needs to be 'A '. I would assume the same rule applies in a join stage.
... course that may be specific to 'char' and not varchar. We don't use many varchars, especially for fields that are keys or often used in where clauses, so to be honest we have rarely run into this.
Maybe the bigger issue is matching on datatype...
Brad.
Join stage will compare the field names as well as the meta data. Since the lengths are different (3 and 5) the field with length 5 is truncated to 3. hence the output. Out of curiosity what happens with RCP off in this case with different lengths? Can u try that out. should not make a difference, but can u post the result here.
-
- Participant
- Posts: 161
- Joined: Wed Aug 02, 2006 4:30 am
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
Is there a chance that there is something wrong with your job design? I just ran the above example and it gives me the right answer.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
Right!
I agree to Josh. Even I could not really recreate the problem. Whereas I got the following warning which may put some light in understanding why its happening.
When checking operator: When binding output schema variable "leftRec": When binding output interface field "id1" to field "Id1": Implicit conversion from source type "string[max=5]" to result type "string[max=3]": Possible truncation of variable length string.
I feel that the VARCHAR(5) column is getting truncated to VarChar(3). And hence the wierd result.
having said that I accept the fact that I have failed to generate the wrong output![Smile :)](./images/smilies/icon_smile.gif)
I agree to Josh. Even I could not really recreate the problem. Whereas I got the following warning which may put some light in understanding why its happening.
When checking operator: When binding output schema variable "leftRec": When binding output interface field "id1" to field "Id1": Implicit conversion from source type "string[max=5]" to result type "string[max=3]": Possible truncation of variable length string.
I feel that the VARCHAR(5) column is getting truncated to VarChar(3). And hence the wierd result.
having said that I accept the fact that I have failed to generate the wrong output
![Smile :)](./images/smilies/icon_smile.gif)
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
If you change the link odering then below error also will disappear.
This is because your "leftRec" of Inner join is with VarChar(3) and you have mapped that to the output result column, which is VarChar(5).
This is because your "leftRec" of Inner join is with VarChar(3) and you have mapped that to the output result column, which is VarChar(5).
Rubu wrote: Whereas I got the following warning which may put some light in understanding why its happening.
When checking operator: When binding output schema variable "leftRec": When binding output interface field "id1" to field "Id1": Implicit conversion from source type "string[max=5]" to result type "string[max=3]": Possible truncation of variable length string.
I feel that the VARCHAR(5) column is getting truncated to VarChar(3). And hence the wierd result.
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
-
- Participant
- Posts: 161
- Joined: Wed Aug 02, 2006 4:30 am
-
- Participant
- Posts: 612
- Joined: Thu May 03, 2007 4:59 am
- Location: Melbourne
You are the OP , pls. test and see for yourself. We have already posted our findings to help you on this regard.
xinhuang66 wrote:Hi Josh,JoshGeorge wrote:Is there a chance that there is something wrong with your job design? I just ran the above example and it gives me the right answer.
I guess that is possible because of the input link sequences, Can you just swap your input link to see what will happen?
Joshy George
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
<a href="http://www.linkedin.com/in/joshygeorge1" ><img src="http://www.linkedin.com/img/webpromo/bt ... _80x15.gif" width="80" height="15" border="0"></a>
We had a similar kind of problem , but our join keys were Char(8) and Varchar(8) . THe result as has laready been mentioned is weird.
So we made both the join keys data type same and have followed the same principle in every job that has a JOIN Stage , to keep join keys data type same ( even lenght)
So we made both the join keys data type same and have followed the same principle in every job that has a JOIN Stage , to keep join keys data type same ( even lenght)