| ID | Seq_ID | Sequence |
-------------------------------------
| 10 | 7156582 | 1. R, 2. O |
| 11 | 3584395 | 1. R |
| 12 | 3392298 | 1. L, 2. H |
But I need to break it out like this:
| ID | Seq_ID | Sequence |
-----------------------------------
| 10 | 7156582 | R |
| 11 | 7156582 | O |
| 12 | 3584395 | R |
| 13 | 3392298 | L |
| 14 | 3392298 | H |
I’ve been told that I need to use the Denormalizer to accomplish this, and that its transformations can only be written in CTL. I don’t know CTL, and the documentation isn’t clear enough for a beginner to use.
Hello Jcatoe,
As you rightly indicated, such a transformation requires somewhat more complex CTL coding. However, the component that would ‘do the trick’ is called Normalizer, because the input data shown in your post is, in fact, denormalized. Attached is an example graph demonstrating the usage of the Normalizer component based on your case. The main idea is to split the Sequence field (using a comma as the delimiter) in the count() function and to return the number of delimited sub-fields, which then becomes the number of times the transform() function is called for that record. I would suggest reviewing our documentation to get a deeper insight into how Normalizer is designed to work.
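For readers who do not know CTL yet, the count()/transform() idea can be sketched in plain Python. This is only an illustration of the logic, not CTL and not the code from the attached graph. The function names mirror the Normalizer hooks, while normalize() and the dictionary-based records are assumptions made for the example.

```python
# Python sketch of the Normalizer logic; an illustration, not CTL.

def count(record):
    """Return how many output records this input record expands to."""
    return len(record["Sequence"].split(","))

def transform(record, idx):
    """Build the idx-th output record (idx runs from 0 to count - 1)."""
    part = record["Sequence"].split(",")[idx].strip()
    return {"Seq_ID": record["Seq_ID"], "Sequence": part}

def normalize(records):
    """Hypothetical driver: count() once per record, transform() once per sub-field."""
    out = []
    for record in records:
        for idx in range(count(record)):
            out.append(transform(record, idx))
    return out

# Sample input taken from the question.
rows = [
    {"ID": 10, "Seq_ID": 7156582, "Sequence": "1. R, 2. O"},
    {"ID": 11, "Seq_ID": 3584395, "Sequence": "1. R"},
    {"ID": 12, "Seq_ID": 3392298, "Sequence": "1. L, 2. H"},
]
normalized = normalize(rows)
```

At this point each output record still carries its "1. " or "2. " prefix and has no new ID; those are the cosmetic steps handled separately.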
On a side note, you could perform the entire transformation within the Normalizer component if you wanted to. However, I pulled the minor cosmetic transformations out into a separate Reformat component in order to keep the data-normalizing code as transparent as possible. In the Reformat, you can then observe the usage of CloverETL sequences and of the right() function.
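The cosmetic step can be pictured with a similar Python sketch. It assumes that the sequence simply hands out consecutive IDs starting at 10 (as in the desired output) and that the right function keeps only the last character of each sub-field. Both are assumptions for illustration; the helper names right() and reformat() are hypothetical stand-ins, not CloverETL code.

```python
import itertools

def right(text, length):
    """Return the last `length` characters of text (stand-in for a right-substring function)."""
    return text[-length:] if length > 0 else ""

def reformat(records, start_id=10):
    """Assign consecutive IDs and strip the '1. ' style prefix from each sub-field."""
    ids = itertools.count(start_id)  # plays the role of a sequence (assumed to start at 10)
    return [
        {"ID": next(ids), "Seq_ID": r["Seq_ID"], "Sequence": right(r["Sequence"], 1)}
        for r in records
    ]

# Records as they would look after the normalizing step.
split_rows = [
    {"Seq_ID": 7156582, "Sequence": "1. R"},
    {"Seq_ID": 7156582, "Sequence": "2. O"},
    {"Seq_ID": 3584395, "Sequence": "1. R"},
    {"Seq_ID": 3392298, "Sequence": "1. L"},
    {"Seq_ID": 3392298, "Sequence": "2. H"},
]
result = reformat(split_rows)
```

Keeping only the last character works here because every value in the sample is a single letter; longer values would need the prefix removed in a different way.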
Kind regards,