Pivot one table into another

I am evaluating Clover ETL and can’t seem to find whether or not the following operation is possible. I don’t really care if this is a “custom” style operation, as long as the tool can actually do this:

Let’s say I have two tables in my source database. The first is a “regular” table, say “Customers”, with a primary key of “CustomerId”:

Source Table 1: Customers (
int customerId;
varchar customer_name;
etc.)

Table 1 Data:
CustomerId, Customer_name
1, Dell
2, IBM

Table two is the hard part: it’s a “properties”-style table, that has a link to the Customers table via it’s primary key (CustomerId), but each ROW contains a value that needs to go into a COLUMN on the target:

Source Table 2: GenericProperties (
int recordId; ← joins to CustomerId
varchar property_name;
varchar property_value;)

Table 2 Data:
recordId, property_name, property_value
1, Address, 1234 Main St.
1, Phone, 888-123-567
2, Address, 9911 Some Ave.
2, Phone, 877-456-7890

The Target table customer table:

Target Table: TargetCustomer (
int customerId;
varchar customer_name;
varchar address;
varchar phone;)

Desired result for Target table:

customerId, customer_name, address, phone
1, Dell, 1234 Main St, 888-123-567
2, IBM, 9911 Some Ave, 877-456-7890

Please note that the target table has as many rows as the source Customer table, which is twice as small as the Source GenericProperties table.

Is this possible with cloverETL?

Hello,
if you don’t know what properties can be contained in GenericProperties table, it can be very hard. But if properties are known I would do it in following way:

Hello,
if you don’t know what properties can be contained in GenericProperties table, it can be very hard. But if properties are known I would do it in following way:

“avackova”

Fantastic! I kept on searching the manuals for “pivot”, since I thought that was the standard term for turning row-based data into column-based data, but I guess it’s called “denormalization” in CloverETL.

And yes, the properties are known (or unknowns can be ignored), so this should work.

Thanks!

OK - wait a minute - something’s not right - I don’t see the “denormalizer” in the Transformers “Palette” in the CloverETL Designer. I downloaded the Community Edition. Is this not included?

Hello,
in CloverETL CE it requires a little bit more work and some more time and memory for processing, but is also possible:

Hello,
in CloverETL CE it requires a little bit more work and some more time and memory for processing, but is also possible:

“avackova”

Excellent. Thank you for that second solution.

I also noticed that I am allowed to use the denormalizer node in the “open source” command-line driven version. I got it to work by editing the .frm file and use the XML syntax. Obviously, I can’t use the GUI anymore after that on this file, but for my evaluation that is OK, since we’ll be able to justify the “commercial” version once we know this works.

I have one more technical question on these two approaches you suggested: 1) denormalizer + exthashjoin or 2) exthashjoin + dedup. In general, does CloverETL need to load the entire resultset in memory for the exthashjoin to work, or is it smart enough to only load as many rows from the “slave” table into RAM as it needs for a matching “key” in the master and then free up that memory for the next master key? My final dataset will be big and probably won’t fit in RAM if the entire dataset must be loaded into RAM.

ExtHashJoin loads all slave data into memory. ExtMergeJoin should be used for merging large data sets. It requires sorted data, but merge it “on the fly” and consume the memory for slaves with the same key only. ExtMergeJoin is available in commercial gui version and in open-source engine, but not in CE version.