Join Fails
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
here is the osh of the join stage.
Code:
## General options
[ident('oc_sap_corr_rcpi_cat_code'); jobmon_ident('oc_sap_corr_rcpi_cat_code')]
## Outputs
0> [] 'oc_sap_corr_rcpi_cat_code:ln_read_oracle_sap_corr_usage_codex.v'
;
#### STAGE: Join_785.DSLink723_Part
## Operator
hash -key KEY_FIELD -cs
## General options
[ident('Join_785.DSLink723_Part')]
## Inputs
0< [] 'Copy_788:DSLink723.v'
## Outputs
0> [] 'Copy_788:DSLink723_Part.v'
;
#### STAGE: Join_785.DSLink723_Sort
## Operator
tsort -key KEY_FIELD -cs -asc -nonStable
## General options
[ident('Join_785.DSLink723_Sort')]
## Inputs
0< [] 'Copy_788:DSLink723_Part.v'
## Outputs
0> [] 'Copy_788:DSLink723_Part_Sort.v'
;
#### STAGE: Join_785.DSLink721_Part
## Operator
hash -key KEY_FIELD -cs
## General options
[ident('Join_785.DSLink721_Part')]
## Inputs
0< [] 'Copy_791:DSLink721.v'
## Outputs
0> [] 'Copy_791:DSLink721_Part.v'
;
#### STAGE: Join_785.DSLink721_Sort
## Operator
tsort -key KEY_FIELD -cs -asc -nonStable
## General options
[ident('Join_785.DSLink721_Sort')]
## Inputs
0< [] 'Copy_791:DSLink721_Part.v'
## Outputs
0> [] 'Copy_791:DSLink721_Part_Sort.v'
;
#################################################################
#### STAGE: Join_785
## Operator
innerjoin
## Operator options
-key 'KEY_FIELD'
## General options
[ident('Join_785'); jobmon_ident('Join_785')]
## Inputs
0< [] 'Copy_788:DSLink723_Part_Sort.v'
1< [] 'Copy_791:DSLink721_Part_Sort.v'
## Outputs
0> [modify (
keep
*********** ALL COLUMNS**********
)] 'Join_785:ln_data_1.v'
My guess is this has nothing to do with the stage itself; it relates to the data.
You haven't confirmed whether the source is a file, a database, or something else.
For test purposes, write just a couple of rows of data into dataset files and then use those files.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
This is happening due to the Oracle Connector stage that I am using. When I join data from the Oracle Connector with a dataset, it fails for most of the records, as I described earlier. When I do the same with datasets on both input links, it works fine. So I'm not sure what is wrong with the Oracle Connector stage, although I have taken out the spaces using trim.
Could this be a resource issue?
When you use a lookup instead of a join and are matching all rows, are you using datasets only or the Oracle connector? What is the key type you are using in the lookup? Are you using a sparse or normal lookup?
When you use datasets only, where did the data in the datasets come from? Is it the source data that was loaded to Oracle, or is it data that was extracted from Oracle? What about key nullability? If your dataset key is non-nullable, but the Oracle key column is nullable, you may need to handle nulls when reading from Oracle, even if you specify non-nullable in the column metadata.
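To illustrate the nullability point in plain Python (a hypothetical sketch, not DataStage code): a NULL key never equals a real value, so rows carrying NULL keys silently drop out of an inner join unless you handle them first.

```python
# Hypothetical illustration: a NULL key never equals a real value,
# so rows with NULL keys silently drop out of an inner join.

oracle_keys = ["A1", None, "B2"]   # nullable key column read from Oracle
dataset_keys = ["A1", "B2", "C3"]  # non-nullable key from the dataset

# Handle nulls before joining, e.g. by filtering them out explicitly
# (or substituting a sentinel value, depending on your requirements):
cleaned = [k for k in oracle_keys if k is not None]
matches = sorted(set(cleaned) & set(dataset_keys))
print(matches)  # ['A1', 'B2']
```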
A join stage requires that the key values are identical (byte-for-byte) and in the same partition in order for one row to match another. If you are trimming keys, are you doing so BEFORE you partition and sort the rows?
You may find that you will need to manually compare the keys from two rows you believe should have matched and didn't to discover what the difference is.
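A minimal Python sketch (illustrative only, not DataStage code) of why two keys that "look the same" can fail the byte-for-byte comparison, and why the trim has to happen before partitioning and sorting:

```python
# Hypothetical illustration: why keys that "look the same" can fail
# the Join stage's byte-for-byte comparison.

def keys_match(a: bytes, b: bytes) -> bool:
    # The Join stage compares key values exactly, byte for byte.
    return a == b

db_key = b"ABC123 "   # e.g. a CHAR column padded with a trailing space
ds_key = b"ABC123"    # the same value stored without padding

print(keys_match(db_key, ds_key))          # False: padding breaks the match
print(keys_match(db_key.strip(), ds_key))  # True once the key is trimmed

# The trim must happen BEFORE partitioning and sorting: both are driven
# by the key bytes, so an untrimmed key can land in a different partition
# and a different sort position than its trimmed counterpart.
```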
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
Well, when I run in a single-node config it still produces the same results. I have verified the data byte by byte and it matches. One more issue I can see: when I read a dataset from within the same job it reads only around 4 million records, but when I remove the other stages and just try to read it, it reads around 4.9 million records. Does this have anything to do with memory? I can see memory usage peaking at 99.6%, and it is joining only 25K records.
If you are certain that all of the rows which should match are in both inputs to the join, then you have one of the following situations:
1) Data is out of order
2) Data is incorrectly partitioned (apparently moot as you have the same results in a single-node configuration)
3) Data does not match exactly (all data, whitespace included)
As #2 is apparently eliminated, #1 and #3 are left to continue investigating. The questions I asked in my earlier posts are meant to help both identify and eliminate potential causes.
Join will only read an input when there is the possibility of a match to another input (i.e. key value on input 1 is < key value on input 2, therefore read input 1 until key1>=key2). Outer joins will read all rows from the Left (left outer), Right (right outer) or all (full outer) inputs, but in the case of left or right outer, may not read all of the other inputs (depending on key values). An inner join may quit reading when one of the inputs hits EOF as there is no longer a chance of a match to the other inputs.
Running out of memory can cause your jobs to run slower as processes are paged in and out of swap, as well as potentially causing failures due to unavailable resources.
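The reading behavior described above can be sketched as a merge-style inner join in Python (a rough sketch only, not the actual Join stage implementation), which shows why rows left on one input are never read once the other input hits EOF:

```python
# Rough sketch of how a sorted (merge-style) inner join reads its inputs.
# Illustrative only -- not the actual Join stage implementation.

def inner_join(left, right, key):
    """Both inputs must be sorted on `key`; returns matching pairs."""
    li, ri = 0, 0
    out = []
    while li < len(left) and ri < len(right):
        lk, rk = key(left[li]), key(right[ri])
        if lk < rk:
            li += 1          # advance the side with the smaller key
        elif lk > rk:
            ri += 1
        else:
            out.append((left[li], right[ri]))
            li += 1
            ri += 1
    # As soon as one input hits EOF the loop exits: remaining rows on
    # the other input are never read, exactly as described above.
    return out

left = [("A", 1), ("B", 2), ("C", 3)]
right = [("B", 20), ("C", 30), ("D", 40)]
print(inner_join(left, right, key=lambda r: r[0]))
# [(('B', 2), ('B', 20)), (('C', 3), ('C', 30))]
```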
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
Yes, James, I am aware of all three possibilities. Strangely, when I replace the join with a lookup it works fine. The second thing in that environment: when I try to read the dataset from one job it reads fewer records than the dataset actually holds, while another job reads the right number of records. I'm not sure why it stops reading and joining records when it should be doing so.
Could it be that the source data is read very slowly, at a rate of 100 rows/sec for around 5 million records, causing the join to fail?
ahmedwaseem2000 wrote: when I try to read the dataset from one job it reads fewer records than the dataset actually holding. and with another job it reads the right number of records. not sure why it stops reading the records and joining when it should be doing it.
Go to Dataset Management (or use the admin command) to check how many records the dataset actually has.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
-
- Premium Member
- Posts: 145
- Joined: Fri Sep 15, 2006 12:23 pm
If it were a resource issue, it would likely lead to a different error; that is what I faced in my environment.
A wrong-partitioning kind of issue will lead you to wrong results (counts).
To test the join, use the same table as both source and reference and run the job. Let's see what happens.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.