Update field in key when there is a match else then Insert
Posted: Wed Jun 21, 2017 2:47 pm
I have a requirement to read in a mainframe flat file, use two fields from the source record to see if a record with that key exists on the target greenplum table where the column END_EFF_DT = '9999-12-31'. If it exists, I need to update the target record and set the END_EFF_DT field to the value in INMOL-END-EFF-DT from the source file and then insert a new record into the target which includes all fields from the source setting END_EFF_DT = '9999-12-31' on the new record. If there is not a match, then I need to insert a new record into the target which includes all fields from the source setting END_EFF_DT = '9999-12-31' on the new record.
Example of Update statement:
UPDATE TARGET_TABLE
SET END_EFF_DT = :INMOL-END-EFF-DT (SOURCE FIELD)
WHERE MOL_PRFX_POL_NO = :INMOL-MOL-PRFX-POL-NO (SOURCE FIELD)
AND MOL_POL_NO = :INMOL-MOL-POL-NO (SOURCE FIELD)
AND END_EFF_DT = '9999-12-31'
To do this, I am doing the following:
1) created a job to read the source file and load it into a staging table
that will be truncated prior to each load.
2) Once loaded, I have an after SQL script that joins the staging table with
the target replicating the update statement listed above.
3) once the job from step 2 is completed, created a job to read the staging
table and do an insert for each record into the target table.
*****The target table is in a POSTGRESQL database(Greenplum) and we are using the native greenplum connector and utilizing the after SQL option.
*****The source file will have only approx. 100 records on a heavy day and the target table has approx. 2 million records.
My concern with taking this approach is that the number of records updated by the after SQL script will not be captured and only the inserts will be.
Any suggestions on an alternate/better approach?
I'm fairly new at datastage and would appreciate any and all feedback.
Example of Update statement:
UPDATE TARGET_TABLE
SET END_EFF_DT = :INMOL-END-EFF-DT (SOURCE FIELD)
WHERE MOL_PRFX_POL_NO = :INMOL-MOL-PRFX-POL-NO (SOURCE FIELD)
AND MOL_POL_NO = :INMOL-MOL-POL-NO (SOURCE FIELD)
AND END_EFF_DT = '9999-12-31'
To do this, I am doing the following:
1) created a job to read the source file and load it into a staging table
that will be truncated prior to each load.
2) Once loaded, I have an after SQL script that joins the staging table with
the target replicating the update statement listed above.
3) once the job from step 2 is completed, created a job to read the staging
table and do an insert for each record into the target table.
*****The target table is in a POSTGRESQL database(Greenplum) and we are using the native greenplum connector and utilizing the after SQL option.
*****The source file will have only approx. 100 records on a heavy day and the target table has approx. 2 million records.
My concern with taking this approach is that the number of records updated by the after SQL script will not be captured and only the inserts will be.
Any suggestions on an alternate/better approach?
I'm fairly new at datastage and would appreciate any and all feedback.