I am new to clover and I’m trying to use it with PostgreSql. I am trying to insert rows into a table with an integer id field. The id should be generated from a sequence function in Postgres. Within Postgres I would use this statement in my Insert statement to get the next value from the sequence generator:
select nextval( ‘dwh.uuid_seq’ )
But in DbOutputTable, if I try to use that in the Insert statement I get an error message. I don’t want to get a sequence number internally generated from clover as it won’t match the numbers the database generates and will cause data integrity issues.
Thank you.
Hello, slaurenc,
You can use DBExecute component instead of DBOutputTable in this case. For more details, see http://doc.cloveretl.com/documentation/ … ecute.html
Best regards,
Hello, slaurenc,
Optionally you can send us:
* error message DbOutputTable produces
* SQL query you use
* DB type and version
and we will try to analyse cause.
I’ve simplified it for the purposes of discussion. I am trying to build the community dimension based on which communities exist in my data record. I am reading in a data file, looking at the community id column and eliminating all rows that are missing community id. I then reformat the data to my output file format. Then I deduplicate so I have a unique list of communities. I want to add a sequence id number by calling a db function and add that id number to my metadata and then write it out to my database table. I am stuck with the DBExecute function. I can get it to call my function and return a sequence number, but I am struggling with inputs and outputs. My metadata I am sending in has 11 columns. The first is empty on input, as that is where the sequence number I am generating will be stored. I can get that to be filled in on output, but how do I pass along the data in the other 11 columns from input to output? They are not doing anything in this step. Or is there something else I need to be doing to join up this database generated number with my other data?
Thanks!
Here is the graph:
<?xml version="1.0" encoding="UTF-8"?><Graph author="Serena" created="Tue May 21 09:19:56 EDT 2013" guiVersion="3.4.0.P" id="1369151214077" licenseType="Evaluation" modified="Tue Jun 04 11:34:40 EDT 2013" modifiedBy="Serena" name="GetUsers" revision="1.91" showComponentDetails="true">
<Global>
<Metadata fileURL="${META_DIR}/reformat_stats.fmt" id="Metadata10"/>
<Metadata id="Metadata8" previewAttachmentCharset="ISO-8859-1">
<Record fieldDelimiter=";" name="db_dim_community" previewAttachmentCharset="ISO-8859-1" recordDelimiter="\n" recordSize="-1" type="delimited">
<Field eofAsDelimiter="false" name="dim_community_id" nullable="true" shift="0" size="0" type="integer"/>
<Field eofAsDelimiter="false" name="src_community_id" nullable="true" shift="0" size="0" type="integer"/>
<Field eofAsDelimiter="false" name="name" nullable="true" shift="0" size="0" type="string"/>
<Field eofAsDelimiter="false" name="alias" nullable="true" shift="0" size="0" type="string"/>
<Field eofAsDelimiter="false" name="description" nullable="true" shift="0" size="0" type="string"/>
<Field eofAsDelimiter="false" format="yyyy-MM-dd" name="effective_date" nullable="true" shift="0" size="0" type="date"/>
<Field eofAsDelimiter="false" format="yyyy-MM-dd" name="expiry_date" nullable="true" shift="0" size="0" type="date"/>
<Field eofAsDelimiter="false" name="number_times_published" nullable="true" shift="0" size="0" type="integer"/>
<Field eofAsDelimiter="false" name="is_include_on_rebuild" nullable="true" shift="0" size="0" type="boolean"/>
<Field eofAsDelimiter="false" name="base_object_id" nullable="true" shift="0" size="0" type="integer"/>
<Field eofAsDelimiter="false" format="yyyy-MM-dd" name="last_updated" nullable="true" shift="0" size="0" type="date"/>
</Record>
</Metadata>
<Metadata id="Metadata11">
<Record fieldDelimiter="|" name="seq_no" recordDelimiter="\r\n" type="delimited">
<Field name="seq" type="integer"/>
</Record>
</Metadata>
<Metadata id="Metadata1" previewAttachment="${DATAIN_DIR}/stats_web2_20130402.csv" previewAttachmentCharset="ISO-8859-1">
<Record fieldDelimiter="|" label="stats.csv" name="stats_csv" previewAttachment="${DATAIN_DIR}/stats_web2_20130402.csv" previewAttachmentCharset="ISO-8859-1" quoteChar="both" quotedStrings="false" recordDelimiter="\r\n" skipSourceRows="1" type="delimited">
<Field name="timestamp" type="string"/>
<Field name="action" type="string"/>
<Field name="session_id" type="string"/>
<Field name="user_id" type="integer"/>
<Field name="community_id" type="integer"/>
<Field name="origin" type="string"/>
<Field name="referer" type="string"/>
<Field name="referrer" type="string"/>
<Field name="username" type="string"/>
<Field name="user_agent" type="string"/>
<Field name="ip" type="string"/>
<Field name="service_id" type="integer"/>
<Field name="creative_id" type="integer"/>
<Field name="nav_id" type="integer"/>
<Field name="ssio_id" type="integer"/>
<Field name="resolution" type="string"/>
<Field name="adSize" type="string"/>
<Field name="browserInfo" type="string"/>
<Field name="data" type="string"/>
<Field eofAsDelimiter="false" name="prize" type="string"/>
</Record>
</Metadata>
<Connection dbConfig="${CONN_DIR}/guruse.cfg" id="JDBC0" type="JDBC"/>
<Property fileURL="workspace.prm" id="GraphParameter0"/>
<Sequence cached="8" id="Sequence0" name="ID_next" start="2" step="1" type="PRIMITIVE_SEQUENCE"/>
<LookupTable dbConnection="JDBC0" id="LookupTable0" maxCached="0" metadata="Metadata8" name="db_dim_community" storeNulls="true" type="dbLookup">
<attr name="sqlQuery"><![CDATA[select id, src_community_id from dwh.dim_community where src_community_id = ?]]></attr>
</LookupTable>
<Dictionary/>
</Global>
<Phase number="0">
<Node dbConnection="JDBC0" enabled="enabled" guiName="all_data" guiX="15" guiY="128" id="ALL_DATA" type="DB_INPUT_TABLE">
<attr name="sqlQuery"><![CDATA[select * from dwh.all_data]]></attr>
</Node>
<Node enabled="enabled" guiName="community" guiX="546" guiY="122" id="COMMUNITY" type="REFORMAT">
<attr name="transform"><![CDATA[//#CTL2
// Transforms input record into output record.
function integer transform() {
$out.0.last_updated = today();
$out.0.expiry_date = 2020-12-31;
$out.0.number_times_published = 1;
$out.0.is_include_on_rebuild = false;
$out.0.effective_date = today();
$out.0.name = 'community_name';
$out.0.alias = 'alias';
$out.0.description = 'desc';
$out.0.base_object_id = 1;
$out.0.src_community_id = $in.0.community_id;
return ALL;
}
// Called during component initialization.
// function boolean init() {}
// Called during each graph run before the transform is executed. May be used to allocate and initialize resources
// required by the transform. All resources allocated within this method should be released
// by the postExecute() method.
// function void preExecute() {}
// Called only if transform() throws an exception.
// function integer transformOnError(string errorMessage, string stackTrace) {}
// Called during each graph run after the entire transform was executed. Should be used to free any resources
// allocated within the preExecute() method.
// function void postExecute() {}
// Called to return a user-defined error message when an error occurs.
// function string getMessage() {}
]]></attr>
</Node>
<Node dedupKey="src_community_id(a)" enabled="enabled" guiName="community_id" guiX="707" guiY="128" id="COMMUNITY_ID" type="DEDUP"/>
<Node enabled="enabled" guiName="Community_Notnull" guiX="356" guiY="128" id="COMMUNITY_NOTNULL" type="EXT_FILTER">
<attr name="filterExpression"><![CDATA[//#CTL2
$in.0.community_id <> null]]></attr>
</Node>
<Node callStatement="true" dbConnection="JDBC0" enabled="enabled" guiName="DBExecute" guiX="876" guiY="135" id="DBEXECUTE1" outputFields="dim_community_id" type="DB_EXECUTE">
<attr name="sqlQuery"><![CDATA[{call core.get_next_id()}]]></attr>
</Node>
<Node dbConnection="JDBC0" enabled="enabled" guiName="dim_community" guiX="1050" guiY="135" id="DIM_COMMUNITY2" type="DB_OUTPUT_TABLE">
<attr name="sqlQuery"><![CDATA[INSERT INTO dwh.dim_community (id, name, alias, description, effective_date, expiry_date, number_times_published, is_include_on_rebuild, base_object_id, last_updated, src_community_id)
VALUES ($dim_community_id, $name, $alias, $description, $effective_date, $expiry_date, $number_times_published, $is_include_on_rebuild, $base_object_id, $last_updated, $src_community_id)]]></attr>
</Node>
<Node enabled="enabled" guiName="ExtSort" guiX="184" guiY="122" id="EXT_SORT1" sortKey="community_id(a)" type="EXT_SORT"/>
<Edge fromNode="ALL_DATA:0" guiBendpoints="" guiRouter="Manhattan" id="Edge2" inPort="Port 0 (in)" metadata="Metadata10" outPort="Port 0 (out)" toNode="EXT_SORT1:0"/>
<Edge fromNode="COMMUNITY:0" guiBendpoints="" guiRouter="Manhattan" id="Edge4" inPort="Port 0 (in)" metadata="Metadata8" outPort="Port 0 (out)" toNode="COMMUNITY_ID:0"/>
<Edge debugMode="true" fromNode="COMMUNITY_ID:0" guiBendpoints="" guiRouter="Manhattan" id="Edge3" inPort="Port 0 (input parameters)" metadata="Metadata8" outPort="Port 0 (unique)" toNode="DBEXECUTE1:0"/>
<Edge fromNode="COMMUNITY_NOTNULL:0" guiBendpoints="" guiRouter="Manhattan" id="Edge21" inPort="Port 0 (in)" metadata="Metadata10" outPort="Port 0 (accepted)" toNode="COMMUNITY:0"/>
<Edge debugMode="true" fromNode="DBEXECUTE1:0" guiBendpoints="" guiRouter="Manhattan" id="Edge28" inPort="Port 0 (in)" metadata="Metadata8" outPort="Port 0 (procedure output)" toNode="DIM_COMMUNITY2:0"/>
<Edge fromNode="EXT_SORT1:0" guiBendpoints="" guiRouter="Manhattan" id="Edge5" inPort="Port 0 (in)" metadata="Metadata10" outPort="Port 0 (out)" toNode="COMMUNITY_NOTNULL:0"/>
</Phase>
</Graph>
Hi, Serena,
First of all, the condition !isnull($in.0.community_id) I suggested in your second forum question makes more sense in ExtFilter than your condition $in.0.community_id <> null.
But regarding your question, there is a few possible options.
Option nr.1: Use SimpleCopy, generate your IDs from DB with DBExecute on one of the streams and combine the two streams again using Combine component. Then save it using DBOutputTable.
Option nr.2: Nextval can be a part of insert query. For sequence “s”, table “t” and integer fields “a0”,“a1”,“a2”,“a3” we can do something like: INSERT INTO a (a0, a1, a2, a3) VALUES (nextval(‘s’), nextval(‘s’), nextval(‘s’), nextval(‘s’));
Option nr.3: http://stackoverflow.com/questions/7877 … oincrement
Best regards,