Multi-valued fields

I’m dealing with XML input and I’m trying to figure out the best way to handle the case when a field will have multiple values. For input like:


<foo>
    <bar>some data</bar>
    <baz>some baz data</baz>
    <bar>some different data</bar>
</foo>

The field “bar” on foo has two distinct values. The default behavior of the XML reader is that only one of these is read in. I found that if I specify a mapping like this:

<Mapping xpath="string-join(bar/text(), '@!@')" cloverField="bar"/>

And then tokenize the input on the join sequence, in this case @!@, I can get back the multiple input values for the field bar.

Is there a better way to handle this situation?

You can use: if you want to write the values to one Clover field. Both possibilities are right.

I was hoping for a solution that didn’t require me to do a split on the input…


String aDataList = aRecord.getField(aFieldName).getValue().toString();
for (String aData : aDataList.split("@!@")) {
  // do something here
}

I’d rather have code like


List aList = (List) aRecord.getField(aFieldName).getValue();
for (Object aObj : aList) {
  String aData = (String) aObj;
  // do something;
}

Being able to get the data out as a list, rather than having to rely on splitting on a token, is a lot safer. While it’s unlikely that @!@ would ever be in the input, it *is* possible. It’s just a hack that I’d rather not have to do.
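To make the concern concrete, here is a minimal sketch in plain Java (no Clover classes) of the split step wrapped in a helper that returns a list. The class name JoinedFieldSplitter and the method splitJoined are hypothetical names used only for illustration, and the delimiter is assumed to be the same @!@ used in the string-join mapping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class JoinedFieldSplitter {
    // Assumed delimiter; it must match the one used in the string-join() mapping.
    static final String DELIMITER = "@!@";

    // Hypothetical helper: turns the joined field value back into a list of values.
    static List<String> splitJoined(String joined) {
        // Pattern.quote makes the delimiter literal; limit -1 keeps trailing empty values.
        return Arrays.asList(joined.split(Pattern.quote(DELIMITER), -1));
    }

    public static void main(String[] args) {
        // Two values joined by the delimiter come back as two list entries.
        System.out.println(splitJoined("some data@!@some different data"));

        // If one of two original values was "a@!@b", the joined string is
        // indistinguishable from three values, which is the risk described above.
        System.out.println(splitJoined("a@!@b@!@c").size());
    }
}
```

The second call shows why this remains a workaround: the helper cannot tell a delimiter that came from the join apart from one that was already present in the data.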

I thought about implementing a ListDataField extending from DataField, but I noticed there’s a lot of serialization stuff in DataField that I was not aware of. I’ll have to dig into the code more to understand what’s going on there. There’s also the issue of not being able to add a new field type to the DataFieldFactory, which would make it difficult to use a ListDataField even if I did manage to get it implemented.

What about sending these ‘bar’ to another output port? Each record would contain just one ‘bar’ and a key to ‘foo’.
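As a rough illustration of that shape of output, the sketch below uses the JDK’s DOM parser rather than the Clover XML reader, so it only shows the resulting data, not how the graph would be configured. The names BarNormalizer, BarRecord, extractBars and fooKey are hypothetical; the key is simply passed in by the caller.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class BarNormalizer {
    /** One output record per <bar>: the key of the parent foo plus a single value. */
    public static final class BarRecord {
        public final int fooKey;
        public final String bar;
        public BarRecord(int fooKey, String bar) { this.fooKey = fooKey; this.bar = bar; }
    }

    // Hypothetical helper: collects every direct <bar> child of the root <foo>.
    public static List<BarRecord> extractBars(String xml, int fooKey) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element foo = doc.getDocumentElement();
        List<BarRecord> out = new ArrayList<>();
        NodeList children = foo.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE && "bar".equals(n.getNodeName())) {
                out.add(new BarRecord(fooKey, n.getTextContent()));
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<foo><bar>some data</bar><baz>some baz data</baz>"
                + "<bar>some different data</bar></foo>";
        for (BarRecord r : extractBars(xml, 1)) {
            System.out.println(r.fooKey + " -> " + r.bar);
        }
    }
}
```

In this sketch every ‘bar’ becomes its own record carrying the same key, so the values never need to be joined and split again.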

I’m not sure. So what if foo looks like:


  <foo>
    <bar>0</bar>
    <bar>1</bar>
    <bar>2</bar>
    <baz>fake</baz>
    <biz>data</biz>
  </foo>

If I’m routing the record to n different ports, where n is the number of times a field is repeated, do the other fields, baz and biz in this case, get copied to each port? Also, would you have to know ahead of time what n is so you can route things in your graph appropriately?