DBJoin

I am trying to use DBJoin instead of an incremental file in one of my graphs. I select max(event_id) from the fact table (where I store the source event_id) in a DBInput component and pass it through the metadata to a DBJoin component. The DBJoin then selects rows from the source database using the condition > ?. This takes very long and uses a lot of memory, and the graph fails because it runs so long. This is my SQL in the DBJoin:
select
coalesce(lower(trim("public"."ev_quicklook"."page")), lower(trim("public"."ev_quicklook"."page")), 'no value') as page,
'no value',
'no value',
coalesce(trim(upper("public"."ev_quicklook"."style_group")), trim(upper("public"."ev_quicklook"."style_group")), 'no value') as style_group,
'quicklook',
"public"."evbase"."created",
null,
"public"."ev_quicklook"."event_id",
trim("public"."evbase"."session_id") as session_id
from
"public"."ev_quicklook",
"public"."evbase"
where
"public"."ev_quicklook"."event_id" > ? and
"public"."ev_quicklook"."event_id" = "public"."evbase"."event_id"
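The DBInput query that feeds the ? parameter just reads the maximum event_id already loaded into the fact table (fact_table below is a placeholder for the actual table name):

select max(event_id)
from fact_table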

What am I doing wrong?

Thanks for your help.
Kasturi

There are several potential reasons why your graph could be slow, and a few questions that would help narrow it down:

  • Missing or incorrect indexes on event_id on one of the tables (see the example query after this list).

  • Which part of the graph is slow: the reader, the joiner, or the output?

  • What are the data volumes you are processing?

  • DBJoin should not be memory intensive. What other operations are you doing in the graph?

  • If the graph fails, what is the reason for failure? Can you send the log file?

  • Can you post the graph so that we can see what you are actually doing?
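As a first experiment, and assuming the source is a database that supports secondary indexes (e.g. PostgreSQL) with event_id as the join key on both tables, I would check that event_id is indexed on both sides and write the join explicitly, so it is easier to see in the execution plan whether the index is used for the event_id > ? predicate. The index names below are only examples, and the select list is abbreviated; the join logic is unchanged:

-- example index definitions; adjust names to your schema
create index ix_ev_quicklook_event_id on "public"."ev_quicklook" ("event_id");
create index ix_evbase_event_id on "public"."evbase" ("event_id");

-- the same join written explicitly, with the incremental filter unchanged
select
q."page",
q."style_group",
q."event_id",
b."created",
trim(b."session_id") as session_id
from "public"."ev_quicklook" q
join "public"."evbase" b on b."event_id" = q."event_id"
where q."event_id" > ?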

Thank you for your reply. Here is the graph.

For the initial load, there were about 3 million rows.

It is not bad for the incremental load: I take max(event_id) from the fact table and fetch the records where event_id > max(event_id).
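Logically, the incremental selection boils down to something like this (fact_table again stands for my fact table name), except that in the graph the max(event_id) is read first by DBInput and then passed to DBJoin as the ? parameter:

select "event_id"
from "public"."ev_quicklook"
where "event_id" > (select max(event_id) from fact_table)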

Thanks,
Kasturi