                       [db2 reference data]  (DB2 Connector)
                                 |
                               rlink
                                 |
[SQL Server]------plink------[Join Stg]------------[Postgres db]
(ODBC connector)                                   (ODBC connector)
Source data: 100 million
Reference data: 3 million
Join: Left Outer Join
I sort the data in the database query on the key (varchar) column and use hash partitioning on both input links (plink and rlink).
The job runs on an 8-node configuration file. I have set Array size and Row count to 200000 in both the source and target connectors.
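To make the partitioning setup concrete, here is a minimal Python sketch of what hash partitioning both links on the join key is meant to achieve: rows with the same key land in the same partition, so each of the 8 nodes can join its own slice independently. This is only an illustration, not how DataStage is implemented: the CRC32 hash, the function names, and the sample rows are all assumptions, and a dictionary lookup stands in for the Join stage's sort-merge logic.

```python
import zlib

NUM_PARTITIONS = 8  # mirrors the 8-node configuration file


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a varchar key to a partition using a deterministic hash.

    CRC32 is an assumption for illustration; DataStage uses its own hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def hash_partition(rows, num_partitions: int = NUM_PARTITIONS):
    """Split (key, value) rows into per-partition lists, keeping input order."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[partition_for(key, num_partitions)].append((key, value))
    return parts


def left_outer_join(primary, reference):
    """Left outer join within one partition; unmatched rows get None."""
    lookup = dict(reference)  # stand-in for the sort-merge join
    return [(key, value, lookup.get(key)) for key, value in primary]


# Hypothetical sample rows for the two input links.
plink = [("A1", "src-1"), ("B2", "src-2"), ("C3", "src-3"), ("D4", "src-4")]
rlink = [("A1", "ref-1"), ("C3", "ref-3")]

p_parts = hash_partition(plink)
r_parts = hash_partition(rlink)

# Each partition pair is joined on its own, as each node would do.
joined = [
    row
    for p_rows, r_rows in zip(p_parts, r_parts)
    for row in left_outer_join(p_rows, r_rows)
]
```

Because both links use the same hash on the same key, every source row is kept and every matching reference row is found, even though no partition sees the full data set.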
Even with this configuration, the DataStage job takes more than 48 hours.
Can someone help me optimize this job? Please suggest any possible solution other than bulk load.
Note: The target database is on AWS cloud, and the target table has only a primary key index.
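For scale, here is a rough back-of-the-envelope calculation of the throughput these numbers imply. It assumes the left outer join emits one output row per source row (that is, the reference key is unique), treats the runtime as exactly 48 hours (so the real rate is somewhat lower), and assumes rows are spread evenly across the 8 nodes.

```python
SOURCE_ROWS = 100_000_000  # source data volume from the post
ELAPSED_HOURS = 48         # assumed exact; the job actually takes longer
NODES = 8                  # 8-node configuration file

elapsed_seconds = ELAPSED_HOURS * 3600

# Overall rate at which rows reach the target table.
rows_per_second = SOURCE_ROWS / elapsed_seconds

# Per-node rate, assuming an even spread across partitions.
rows_per_second_per_node = rows_per_second / NODES

print(round(rows_per_second, 1))           # 578.7
print(round(rows_per_second_per_node, 1))  # 72.3
```

So the job as a whole is moving fewer than 600 rows per second end to end.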