StringDataField and empty string

Hi,

I’ve recently run into a problem with the DEDUP component not removing duplicate records where the dedupKey field was an empty string.

After a bit of investigation it seems StringDataField treats an empty string field as null (?), but the comparison functions (equals() and compareTo()) return false/-1 when comparing two string fields that both contain the empty string (i.e. null) - so DEDUP treats the records as being different.
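To illustrate what I think is happening, here is a minimal self-contained sketch. SketchField and its methods are hypothetical and only mimic the behaviour I observed; this is not the Clover source, and the handling of a single null operand is an assumption.

```java
// Hypothetical stand-in for a string field; NOT the actual Clover class.
// It only mimics the behaviour described in this post.
public class EmptyStringFieldSketch {

    static final class SketchField {
        private String value;
        private boolean isNull;

        void setValue(String s) {
            // Assumption: an empty string is stored as null.
            if (s == null || s.isEmpty()) {
                value = null;
                isNull = true;
            } else {
                value = s;
                isNull = false;
            }
        }

        // Mimics the observed equals(): a null field never equals anything,
        // not even another null field.
        boolean sameAs(SketchField other) {
            if (isNull || other.isNull) {
                return false;
            }
            return value.equals(other.value);
        }

        // Mimics the observed compareTo(): -1 whenever a null is involved.
        int compareWith(SketchField other) {
            if (isNull || other.isNull) {
                return -1;
            }
            return value.compareTo(other.value);
        }
    }

    // Convenience factory for the sketch.
    static SketchField field(String s) {
        SketchField f = new SketchField();
        f.setValue(s);
        return f;
    }

    public static void main(String[] args) {
        SketchField a = field("");
        SketchField b = field("");
        System.out.println(a.sameAs(b));      // false
        System.out.println(a.compareWith(b)); // -1
    }
}
```

With a comparison like this, two records whose key field is empty can never be recognised as duplicates.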

Is this the expected behaviour for equals() / compareTo()?

If so, is there an easy workaround for removing duplicate records where the field is an empty string?

Thanks,
Peter

I had this problem too, it almost cost me my life :)…
This is the quick fix we used:

public Object getValue() {
    return (isNull ? "" : value);
}

from StringDataField class
I think that the same problem will occur with Numbers too…

P.S. While trying to find a better fix, I also tried to use a default value within the metadata definition: if a string is empty, shouldn't the default be used ???
BUT, no success. I also tried to set nullable="no" and dataPolicy="lenient", but it seems the method populateFieldFailure from the BadDataFormatExceptionHandler class is not doing the job it's supposed to.
Has default worked before ???

Hi there,

Just wondering whether you’d had any more thoughts on this?

I guess there's a bigger question over whether you want to treat empty strings and nulls as the same thing - but it would still be very helpful (and DEDUP would work correctly for me) if comparing two empty string fields gave the same result as "".compareTo("") and "".equals("") do for plain strings, i.e. 0 and true (currently the fields return -1/false).

Regards,
Peter

Hi Peter,
I agree with you. The behaviour does seem unpredictable. I think an empty string is something different from a null field ;). David is not available for Clover this week, but we will solve this problem next week.

OtaSanek

Hello Peter !

We have added (specially for Dedup) an option to treat two nulls as equal.
This is only a temporary solution, but we can't immediately change the behaviour of StringDataField to accept empty strings as a valid value. Too many other things depend on it.
Nonetheless, it will be re-examined in future releases, where the whole agenda around NULL may change.
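The idea behind the option can be sketched roughly as follows. This is only an illustration: the class, the method names and the equalNull flag are hypothetical and are not the actual Dedup source, and plain Java null stands in for a null field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not the Dedup source.
public class EqualNullDedupSketch {

    // Compares two key values, where null stands for a null field.
    // With equalNull == true, two nulls count as equal (result 0).
    static int compareKeys(String a, String b, boolean equalNull) {
        if (a == null || b == null) {
            if (a == null && b == null && equalNull) {
                return 0;
            }
            return -1; // assumption: mirrors the behaviour reported in this thread
        }
        return a.compareTo(b);
    }

    // Keeps a key only if it differs from the previously kept key
    // (the input is assumed to be sorted on the key).
    static List<String> dedup(List<String> sortedKeys, boolean equalNull) {
        List<String> out = new ArrayList<String>();
        for (String key : sortedKeys) {
            if (out.isEmpty()
                    || compareKeys(out.get(out.size() - 1), key, equalNull) != 0) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(null, null, "a", "a");
        System.out.println(dedup(keys, false)); // [null, null, a]
        System.out.println(dedup(keys, true));  // [null, a]
    }
}
```

Without the flag both null keys survive; with it they collapse into one record, while non-null keys are handled the same either way.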

David

Hello !

In general, if a field is set to NULL, then getValue() returns a null reference.

Default values combined with the lenient data policy take effect only in the data parsers.

Other than that, if you want to assign the default value, you have to call setToDefaultValue() explicitly, either at the DataRecord level or at the DataField level.

David.