What’s the easiest way to using BETWEEN logic in a join. I’m looking to grab data from two sources using a comparison on date fields.
Table 1 (actually a clover file dataset)-
Event_description
Event_date
Table 2 (actually a clover file dataset)-
event_description
event_date
I’m looking to merge the tables where Table2.event_date is between Table1.event_date -1 day AND Table1.event_date.
I can probably get there by filtering twice for each side and joining on the ID where there is a match from each side, but curious if there is an easier way to do this since the relational join only allows for one condition.
one more thing to add here. It’s actually two conditions I’m trying to join on. I want an ID=ID and then to add in the Event_Date logic where the date is in between a date from the other table and that date minus one day.
Seems like a lot of copying of data and hoops to jump through if we just use the joiners provided? It doesn’t look like it’s easy to use multiple conditions in a joiner unless they are all of the equal join type?
Is the easiest way to just write the data into a SQL DB and then use the DB joiner - all of the join logic is really easily accomplished in 1 or 2 lines of SQL…
This line would be all I need in SQL: INNER JOIN ON( T1.ID = T2.ID AND T1.date BETWEEN T2.date and T2.Date -1)
I think I’d have to use multiple joins to accomplish this in clover unless I’m pulling from a database and using the DB Joiner?
Well, the approach with DB is probably “simplest”, but the DB itself would need a series of joins and filters to perform that - which would be apparent from an execution plan.
So yes, you can do it in Clover by first joining, then filtering. Perhaps some optimisation could be done - but that would depend on the data.
Clover solution would be to join on T1.ID = T2.ID and then filter out those rows, where T1.date is not BETWEEN T2.date and T2.Date -1 → 2 components task, 1 joiner (merge or hash) and 1 filter.
Thanks for the response! I ran with the approach of the join, then filtering and it worked. The important part really was the ability to use multiple conditions in the filter component which gives you some additional flexibility. It would be nice to have multiple conditions in the joiner (similar to SQL) but at the end of the day you can accomplish it with the two components, so it can be done. Just makes the graph a little messy as I’m doing this on five different data sources ;-).
If anyone knows of any other approaches let me know. thanks!
With CloverETL version 4.0, you may use a subgraph to compound those two components into a new one - your version of join which does the joining and filtering, thus reducing the “visible” number of components in your transformation.