JSONReader vs. JSONExtract mappings

codemonkee · November 7, 2014, 12:00am

Hello,

I just installed the CloverETL Designer (Version 4.0.0.030M2) demo and am testing these components with a file containing numerous JSON objects.

JSONReader
I’ve configured a JSONReader component for Implicit Mapping and run into 2 problems:

1. Even after setting my heap size to -Xmx1500m, I get a Java heap error if I try to read a file of 50,000 objects (25,000 objects works OK). Is there a way to get past this heap issue?
2. In cases where the element value is an array, only the first data value is returned to the output record. How do I configure this to return all the array values?

JSONExtract
I’ve configured the Mapping as:
<![CDATA[

/Mapping>

]]>

1. No heap problems.
2. Each input row produces multiple output rows, one row per value (e.g. {"group_id":"1","city":"Paris","city_codes":["FR","MZ"]} produces 4 output rows: the first with only group_id populated, the second with only city populated, etc.). Can I get these to appear in a single row without using additional components (e.g. Combine)?

Thanks!

imriskal · November 10, 2014, 12:57pm

Hello,

Let me answer step by step:

JSONReader

1. This is unfortunately expected. JSONReader uses a DOM tree to parse the input, and this tree can grow quite fast with 50,000 records and any complex structure in the data. It is therefore recommended to use JSONExtract wherever possible.
2. All array values can be returned, for example, this way (you need two output ports):

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Context xpath="/root/object" outPort="0">
  <Mapping cloverField="group_id" xpath="group_x005fid"/>
  <Mapping cloverField="city" xpath="city"/>
  <Context xpath="city_x005fcodes" outPort="1">
    <Mapping cloverField="city_codes" xpath="."/>
  </Context>
</Context>
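
To make the effect of a two-port mapping like this more concrete, here is a rough Python sketch of the record split I would expect for the sample object from the question. It is only an illustration and not CloverETL code: `split_record` is a hypothetical helper, and it assumes the nested Context emits one record on port 1 per array element.

```python
import json


def split_record(obj):
    """Hypothetical helper: mimic the two-port split for one input object.

    Port 0 receives the scalar fields; port 1 receives one record per
    element of the city_codes array (an assumption about the mapping).
    """
    port0 = {"group_id": obj.get("group_id"), "city": obj.get("city")}
    codes = obj.get("city_codes", [])
    if not isinstance(codes, list):
        codes = [codes]  # treat a single value like a one-element array
    port1 = [{"city_codes": code} for code in codes]
    return port0, port1


sample = json.loads('{"group_id":"1","city":"Paris","city_codes":["FR","MZ"]}')
port0, port1 = split_record(sample)
print(port0)  # {'group_id': '1', 'city': 'Paris'}
print(port1)  # [{'city_codes': 'FR'}, {'city_codes': 'MZ'}]
```

Note that in this sketch the port 1 records carry no key linking them back to their parent record; if you need to join them later, you would have to map such a key as well.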

JSONExtract

1. This is because JSONExtract uses SAX-style parsing instead of DOM, so memory requirements are much lower.
2. I think this is not possible, or even desirable. An array can contain any number of items, so you cannot prepare metadata in advance. You could read all of them into one string, but this is better done afterwards using, for example, Denormalizer, where you can define delimiters, quotation marks and so on for the values. There is, however, a plan to support direct extraction of arrays and maps. They should be extracted into a list or map variable in one field (“Container type” property set to “List” or “Map” on the metadata field). For more details, see https://bug.javlin.eu/browse/CLO-2054
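
Both points can be illustrated outside CloverETL with a minimal Python sketch. It is an analogy only, not how JSONExtract or Denormalizer are implemented: `iter_json_objects` parses a stream of concatenated JSON objects one at a time, so memory is bounded by the largest single object rather than the whole file, and `flatten_row` joins each array into one delimited string so that every input object yields a single row. Both function names, the `;` delimiter, and the assumption that every top-level value is a JSON object are mine.

```python
import io
import json


def iter_json_objects(stream, chunk_size=65536):
    """Yield top-level JSON objects from a text stream, one at a time.

    Assumes the stream holds concatenated JSON objects ({...}{...}),
    optionally separated by whitespace. Only the unparsed tail is buffered.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        buffer = (buffer + chunk).lstrip()
        while buffer:
            try:
                obj, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # object is incomplete so far; read more input
            yield obj
            buffer = buffer[end:].lstrip()
        if not chunk:  # end of stream
            if buffer:
                raise ValueError("incomplete or malformed JSON at end of input")
            return


def flatten_row(obj, delimiter=";"):
    """Return one flat row per object, joining array values into a string."""
    return {
        key: delimiter.join(str(v) for v in value) if isinstance(value, list) else value
        for key, value in obj.items()
    }


data = io.StringIO(
    '{"group_id":"1","city":"Paris","city_codes":["FR","MZ"]}\n'
    '{"group_id":"2","city":"Rome","city_codes":["IT"]}\n'
)
for record in iter_json_objects(data):
    print(flatten_row(record))
```

The delimited-string approach sidesteps the metadata problem mentioned above, because the output always has one field per array regardless of how many items the array holds.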

I hope this helps.
