Hi,
How do I remove duplicate records without specifying all the fields? My record metadata has 2000 fields.
Here is a subset of my input data, sorted by REFERENCE (primary key):
"REFERENCE","NAME","NO"
"000000010271 ","WFB ","1"
"000000010271 ","WFB ","1"
"000000010272 ","ABC ","1"
"000000010272 ","ABC ","2"
I want the output to look like this:
"REFERENCE","NAME","NO"
"000000010271 ","WFB ","1" (removed the duplicate)
"000000010272 ","ABC ","1"
"000000010272 ","ABC ","2"
I know I can use DEDUP and set dedupKey="REFERENCE;NAME;NO" to get this output, but if my input data has 2000 fields, I do not want to list all 2000 fields in dedupKey. Moreover, can dedupKey even be set to such a long string? So, is there a way to tell CloverETL to remove duplicate records when there are 2000 fields to match?
I would think DEDUP would just need a flag, say remove_only_if_all_fields_matches, set to true, and could then take the list of fields from the FMT. If the values of all respective fields match, the record is a duplicate and gets removed. That way dedupKey would not need to be set to a long list of field names, right?
Just to make sure: DEDUP does not sort the records, right?
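To illustrate the behaviour I have in mind, here is a minimal Python sketch, outside CloverETL, that compares whole records without naming any field. It assumes the input is already sorted so that duplicates are adjacent; the function name drop_adjacent_duplicates is made up, and this is not how DEDUP itself is implemented.

```python
import csv
import io


def drop_adjacent_duplicates(rows):
    """Yield each row unless it equals the row right before it.

    Compares all fields at once, so no field list is needed.
    Assumes duplicates are adjacent (input sorted beforehand).
    """
    previous = None
    for row in rows:
        if row != previous:
            yield row
        previous = row


# Sample data from this post.
sample = '''"REFERENCE","NAME","NO"
"000000010271 ","WFB ","1"
"000000010271 ","WFB ","1"
"000000010272 ","ABC ","1"
"000000010272 ","ABC ","2"
'''

rows = list(csv.reader(io.StringIO(sample)))
header, records = rows[0], rows[1:]
unique = list(drop_adjacent_duplicates(records))

for record in unique:
    print(record)
```

On the sample data this keeps three records: one for 000000010271 and both for 000000010272, since those two differ in NO.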
Thanks,
al