Hi,
I want to reject records that have the same field values and send them to port #1…
I have a Partition node whose PartitionClass attribute is set to the Java class below:
public class DuplicateRowPartitioner implements PartitionFunction {

    private static final Logger logger = Logger.getLogger(DuplicateRowPartitioner.class);

    // copy of the last record that was sent to output port 0
    DataRecord previous;

    /**
     * Unique records are sent to output port 0. Exact duplicate records are
     * sent to output port 1.
     */
    public int getOutputPort(DataRecord record) {
        if (previous == null) {
            logger.info(new StringBuffer("Comparing record : '").append(
                    record.toString()).append("' to previous record : NULL…").toString());
        }
        else {
            logger.info(new StringBuffer("Comparing record : '").append(
                    record.toString()).append("' to previous record : '").append(
                    previous.toString()).append("'…").toString());
        }

        // send exact duplicate records to a different output port
        if (previous != null && record.equals(previous)) {
            logger.info(new StringBuffer("Found duplicate record…").toString());
            return 1;
        }

        logger.info(new StringBuffer("Found different record…").toString());
        previous = record.duplicate();
        return 0;
    } // end of getOutputPort()

    public void init(int numPartitions, RecordKey partitionKey)
            throws ComponentNotReadyException {
    } // end init()
}
The logger output below shows a record whose field values are exactly the same as those of the previous record, but Clover still treats them as different records:
INFO [PARTITION_0] - Comparing record : '#0|REFERENCE|S->10273
#1|POSITION|S->5
#2|AMOUNT_1|N->0.0
' to previous record : '#0|REFERENCE|S->10273
#1|POSITION|S->5
#2|AMOUNT_1|N->0.0
'…
INFO [PARTITION_0] - Found different record…
Does this mean that I cannot use record.equals(previous) (see the Java class above) to check whether two records have exactly the same field values? What should I do instead?
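To make the behaviour I am after concrete, here is a rough sketch that compares field by field instead of relying on the record's own equals(). It is only an illustration: the FieldwiseDuplicateCheck class and the plain java.util.List records are placeholders I made up so the example runs on its own, not Clover API, and I have not confirmed that this is the right approach with DataRecord.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for the partitioner: records are plain lists of
// field values here, NOT Clover DataRecord objects.
class FieldwiseDuplicateCheck {

    // copy of the last record that was sent to port 0
    private List<Object> previous;

    // Returns 1 when every field equals the corresponding field of the
    // previous unique record; otherwise remembers the record and returns 0.
    int getOutputPort(List<Object> record) {
        if (previous != null && sameFields(previous, record)) {
            return 1;
        }
        previous = new ArrayList<>(record); // keep a copy, as duplicate() does above
        return 0;
    }

    // Compares the two records one field at a time.
    static boolean sameFields(List<Object> a, List<Object> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!Objects.equals(a.get(i), b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        FieldwiseDuplicateCheck check = new FieldwiseDuplicateCheck();
        // same values as in the log output above
        System.out.println(check.getOutputPort(Arrays.<Object>asList("10273", "5", 0.0))); // 0
        System.out.println(check.getOutputPort(Arrays.<Object>asList("10273", "5", 0.0))); // 1
        System.out.println(check.getOutputPort(Arrays.<Object>asList("10274", "5", 0.0))); // 0
    }
}
```

With the two identical records from the log, the second one goes to port 1, which is the result I expected from record.equals(previous).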
Thanks,
al