Hi,
I have questions regarding an ODBC stage with 2 input links, both inserting into the same table.
I can enter certain properties per input link, e.g. update action or transaction size.
What happens if I define different transaction sizes for both input links (5000 and 1). Does the greater one overrule the less one? Or vice versa? Do I have one or two connections?
What happens if I define "Clear the table, then insert rows" on both links? Can I be sure that no link clears the data already inserted by the other one?
Regards,
Stefan
ODBC Stage: Insert on 2 links; Transaction Size
Moderators: chulett, rschirm, roy
Re: ODBC Stage: Insert on 2 links; Transaction Size
Welcome aboard!
One stage = one database connection. Two links = two queries or DML running via that one connection. There is no 'overriding' here, as link one has no idea that it is working on the same table as link two, so each would work using the values associated with their own link. Probably not a 'good idea' to have them different like that, but it can certainly be done.
Not positive with ODBC.
If you really wanted to know you'd need to build an explicit test case. For the record, the 'Clear' or 'Truncate' action done by a link seems to happen when the stage 'starts', as can be seen in the log. This is generally when the first row goes down it, so you stand a chance of a 'tardy' link removing work already done by the other link. It might even be the kind of thing that works most of the time.
In that situation, I for one would do something to clear or truncate the table as the job starts - say in a Before Job via a script. Safer and more reliable that way. IMHO.
SBublitz wrote:I have questions regarding an ODBC stage with 2 input links, both inserting into the same table.
I can enter certain properties per input link, e.g. update action or transaction size.
What happens if I define different transaction sizes for both input links (5000 and 1). Does the greater one overrule the less one? Or vice versa? Do I have one or two connections?
One stage = one database connection. Two links = two queries or DML running via that one connection. There is no 'overriding' here, as link one has no idea that it is working on the same table as link two, so each would work using the values associated with their own link. Probably not a 'good idea' to have them different like that, but it can certainly be done.
What happens if I define "Clear the table, then insert rows" on both links? Can I be sure that no link clears the data already inserted by the other one?
Not positive with ODBC.
In that situation, I for one would do something to clear or truncate the table as the job starts - say in a Before Job via a script. Safer and more reliable that way. IMHO.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Re: ODBC Stage: Insert on 2 links; Transaction Size
Thank you for the quick reply.
To be sure that I am right: I have 1 connection, 2 transactions (one per link), possibly 2 threads (if links start from differents transformers).
Imagine one link is "update" and the other one "insert" (e.g. Slowly Changing Dimensions 2 szenario): Then I have to use transaction size = 1 on both links. Otherwise with transaction sizes like e.g. 5000 the "update" link could fail with "record does not exist" (because the "insert" transaction is not complete) and the insert link with "duplicate key". :?
Is this true?
To be sure that I am right: I have 1 connection, 2 transactions (one per link), possibly 2 threads (if links start from differents transformers).
Imagine one link is "update" and the other one "insert" (e.g. Slowly Changing Dimensions 2 szenario): Then I have to use transaction size = 1 on both links. Otherwise with transaction sizes like e.g. 5000 the "update" link could fail with "record does not exist" (because the "insert" transaction is not complete) and the insert link with "duplicate key". :?
Is this true?
Not sure with ODBC, being more familiar with OCI. In the OCI stage, one connection equals one transaction.
I personally would never use a Transaction Size of 1 (given a choice), unless you also build in a mechanism to 'pick up where you left off' in the event of a problem. Better, in my opinion, to leave both at zero for that all-or-nothing approach - if you can afford to.
Perhaps someone more intimate with the gory details of ODBC could help clarify things? This could also be dependant on the target DB I would think, I've been assuming Oracle but just noticed you didn't actually say.
I personally would never use a Transaction Size of 1 (given a choice), unless you also build in a mechanism to 'pick up where you left off' in the event of a problem. Better, in my opinion, to leave both at zero for that all-or-nothing approach - if you can afford to.
Perhaps someone more intimate with the gory details of ODBC could help clarify things? This could also be dependant on the target DB I would think, I've been assuming Oracle but just noticed you didn't actually say.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Provide you've not enabled transaction groupings, the SQL statements executed on the two input links to an ODBC stage are independent of each other. You can prove this quite easily by enabling ODBC tracing then poring over the results, that very thick book the ODBC API manual in hand.
Of course, if they're accessing the same table, then you may have locking issues in the database, but that's a design issue rather than a tool issue.
If you use transaction grouping, then they're no longer independent; a row is sent along each link and only then is a COMMIT sent to the database. (Technically, to the ODBC driver.)
(I believe that the ODBC 3.0 API book was actually split into three books and a CD in a slip case. It's the ODBC 2.0 API book that I referred to as very thick.)
Of course, if they're accessing the same table, then you may have locking issues in the database, but that's a design issue rather than a tool issue.
If you use transaction grouping, then they're no longer independent; a row is sent along each link and only then is a COMMIT sent to the database. (Technically, to the ODBC driver.)
(I believe that the ODBC 3.0 API book was actually split into three books and a CD in a slip case. It's the ODBC 2.0 API book that I referred to as very thick.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thank you, Ray, for your hints.
We have tried it very thoroughly and we found out the following:
If you have 2 links writing to one ODBC stage, especially to one table, and if you are using different transaction sizes per link, then they are independent except for the transaction:
One ODBC stage means one connection means one transaction at a time (we checked this for MS SQL Server): If one link has reached the number of rows as given in its transaction size then the transaction is committed (for both links!) and a new one is opened.
We have tried it very thoroughly and we found out the following:
If you have 2 links writing to one ODBC stage, especially to one table, and if you are using different transaction sizes per link, then they are independent except for the transaction:
One ODBC stage means one connection means one transaction at a time (we checked this for MS SQL Server): If one link has reached the number of rows as given in its transaction size then the transaction is committed (for both links!) and a new one is opened.
-
ray.wurlod
- Participant
- Posts: 54595
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
