I have a project requires ETL from database to database.
But those source tables don’t have timestamp design, so I don’t know which datas were changed from the last time execute ETL job?
I thougth perhaps use ETL log tables can capture those changed data.
In ETL log table I can get when execute ETL job the last time, and the source table id and target table id relations.
Who can tell me how can I develop increment ETL job for this case?
Yesterday, I have thought two way to settle this problem.
The first way is using datebase trigger.
I can create three column on source table: createtime,updatetime, validsign . Then create trigger to update those three column when insert, update,delete. But this way can raise performance problem.
The second way is parse database log to get source table changed information. But this way may irresponsible because log can be overwrite before parse log next time.