How to use Json Extractor when records don't have all the same attributes

Support/help with CloverETL (4.9) and CloverDX (5.0 or newer) implementation problems

slaurenc
Posts: 5
Joined: Mon Jun 03, 2013 4:03 pm

How to use Json Extractor when records don't have all the same attributes

Postby slaurenc » Mon Mar 21, 2016 5:14 pm

I have a json file that has 9 possible combinations of columns (based on web site actions). I'm trying to use Json Extractor to load this file into a sql database. I was able to get it working so that it reads all the records based on the generated xsd schema, but that is based on the first row in the file which doesn't contain all the possible combinations. It is processing all the rows correctly for this smaller set, and I've gotten it also to map subtrees of the json data. But there are other pieces of data that are falling off.

I looked at the generated xsd schema and it has 61 rows total. I tried to edit the xsd to contain all the columns I wanted (has 115 rows now), but when I try to change the JsonExtractor to use this xsd, I get an error message in Clover: Error occured during schema reading. Reason: com.cloveretl.gui.dT:Error while initializing tree.
Clicking on details gives: com.cloveretl.gui.dT: Error while initializing tree
Caused by: org.apache.xmlbeans.XmlException: sandbox://Guruse/meta/stats14_json_serena.xsd:0: error: cvc-complex-type.2.4a: Expected elements '[email protected]://www.w3.org/2001/XMLSchema [email protected]://www.w3.org/2001/XMLSchema [email protected]://www.w3.org/2001/XMLSchema' instead of '[email protected]://www.w3.org/2001/XMLSchema' here in element [email protected]://www.w3.org/2001/XMLSchema

And then when I click on OK it brings up the same blank window where I need to generate tree structure before I can get started.

What can I do? How can I get Clover to recognize that each row in the file has the potential to need more or less mappings?

cholastal
Posts: 135
Joined: Tue Sep 01, 2015 1:22 pm

Re: How to use Json Extractor when records don't have all the same attributes

Postby cholastal » Tue Mar 29, 2016 12:18 pm

Hi,

I'm afraid that I don't fully understand what you want to achieve. I understand that the JSON files may have different structure. If the files contain some arrays that may have different number of fields, the solution will be quite easy. You just need to set the corresponding metadata container type to list and map it as an usual element (in the JSONExtract mapping property). For more detailed information see our blog, please.

If the files have different count and names of elements, they cannot be parsed by a single JSONExtract. You will need to use multiple JSONExtract components, one for each file structure. You need to create a deciding structure capable of sending the file to the correct JSONExtract. You can do this by:


If you want me to create a sample graph for you, send me the following, please.

    1. Some example input JSON files (2 or 3), so I can see what exactly is different.
    2. Your graph with its externalized dependencies (metadata, etc.), so I can see how you proceeded. Feel free to remove any sensitive information.
    3. The edited *.xsd file (with 115 rows), so I can see what went wrong.

Best regards,

---
Lukas Cholasta
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com

mcomsto
Posts: 1
Joined: Tue Jun 21, 2016 3:05 pm

Re: How to use Json Extractor when records don't have all the same attributes

Postby mcomsto » Tue Jun 21, 2016 3:20 pm

I have similar issue, I think.

Here is my sample data.

{
"total_results_count": 2,
"code": "OK",
"offset": 0,
"results_count": 2,
"results": [
{
"sem3_id": "2IFH5VS98y0gG22kIQi0i2",
"name": "Sea Gull Lighting 8753-34 Single-Light Outdoor Wall Lantern with Smooth White Globe, Black",
"model": "8753 34",
"brand": "Sea Gull Lighting",
"category": "Porch & Patio Lights",
"color": "Smooth White",
"created_at": 1347510294,
"ean": "0785652875335",
"features": {
"Recommended Light Bulb Shape": "A19",
"Number of Bulbs Required": "1",
"Bulbs Included": "No",
"Manufacturer Color/Finish": "Black",
"Style": "Casual/Transitional",
"Weather Resistant/Weatherproof": "Weatherproof",
"Glass Color": "White",
"Damp Rated": "Yes",
"Hardware Included": "Yes",
"Glass Style": "Flat",
"Material": "Polycarbonate",
"Dark Sky": "No",
"Package Quantity": "1",
"Bulb Type": "Incandescent",
"Fixture Height (Inches)": "7.25",
"Light Bulb Base Type": "Medium base (E-26)",
"UL Safety Listing": "Yes",
"Collection Name": "N/A",
"ENERGY STAR Qualified": "No",
"Maximum Bulb Wattage": "60",
"Color/Finish Family": "Black",
"CSA Safety Listing": "Yes",
"Fixture Width (Inches)": "6",
"Power Source": "Hardwired",
"Motion Activated": "No",
"Fixture Depth (Inches)": "7",
"Size Classification": "Small (width of product 6-in below)",
"ETL Safety Listing": "No"
},
"gtins": [
"00785652875335"
],
"geo": [
"usa"
],
"height": "184.15",
"images_total": 0,
"length": "177.80",
"manufacturer": "Sea Gull Lighting",
"mpn": "8753-34",
"upc": "785652875335",
"updated_at": 1466435233,
"width": "152.40"
},
{
"sem3_id": "0HukkZv3LTEGuMg6UQSaGY",
"name": "Sea Gull Lighting 8753-34 One Light Outdoor Wall Fixture - Black Finish",
"brand": "Sea Gull Lighting",
"category": "Outdoor Lighting",
"created_at": 1407272933,
"features": {
"Origin of Components": "USA and/or Imported",
"Shipping Weight (in pounds)": "2.18"
},
"geo": [
"usa"
],
"images_total": 0,
"manufacturer": "Sea Gull Lighting",
"updated_at": 1464091362
}
]
}

He is my jsonExtractor source mapping. In Bold, is the mapping I manually created, since I could not get the visual mapper to recognize the "features" data. I am trying to transform the features into a set of name/value pairs.

For example,

name:"Recommended Light Bulb Shape"/value:A19

Is this possible?

<Mappings>
<Mapping element="json_object">
<Mapping element="results" implicit="false" outPort="0"
xmlFields="{}brand;{}category;{}color;{}created_at;{}ean;{}geo;{}gtins;{}height;{}images_total;{}length;{}manufacturer;{}model;{}mpn;{}name;{}sem3_id;{}upc;{}updated_at;{}width"
cloverFields="brand;category;color;created_at;ean;geo;gtins;height;images_total;length;manufacturer;model;mpn;name;sem3_id;upc;updated_at;width">
<Mapping element="features" outPort="3" parentKey="sem3_id" generatedKey="sem3_id"
xmlFields="."
cloverFields="featuresValue">
</Mapping>

<Mapping element="gtins" implicit="false" outPort="2" parentKey="sem3_id" generatedKey="sem3_id"
xmlFields="."
cloverFields="gtinsValue">
</Mapping>
<Mapping element="geo" outPort="1" parentKey="sem3_id" generatedKey="sem3_id"
xmlFields="."
cloverFields="geoValue">
</Mapping>
</Mapping>
</Mapping>
</Mappings>

cholastal
Posts: 135
Joined: Tue Sep 01, 2015 1:22 pm

Re: How to use Json Extractor when records don't have all the same attributes

Postby cholastal » Tue Jun 28, 2016 1:19 pm

Hi mcomsto,

You can parse the 'features' by a component called Normalizer and dynamic field access CTL functions. I attached a simple example graph which utilizes this approach.

You mentioned that you cannot see the 'features' in the visual mapper of the JSONExtract component. I just copied your sample data into a file (with no editing) and was able to map them (see them in the mapping editor). I don't know what version of CloverETL you are using but you can try the newest one (4.2.0). You should be able to map everything via the mapping editor there.

Also, the example graph should be viewed in the 4.2.0 Designer as it was created in that version. It uses sample data from your post.

json_read.grf
(7.15 KiB) Downloaded 233 times

Best regards,

---
Lukas Cholasta
CloverCARE Support
CloverDX

Visit us online at http://www.cloverdx.com


cron