I’m new to ETL. I need to execute a select statement that uses incoming metadata in its where clause and then passes on the results to the output port. The select statement is looking for overlapping regions on a number line so there is no joinKey.
The input metadata is: location_id, start_position, and end_position.
The query is:
SELECT feature_id
FROM Feature f
WHERE $start_position between f.start_position AND f.end_position
OR $end_position between f.start_position AND f.end_position
The output data is: location_id, feature_id
The query can return 0 to n results. I’d prefer not to use a stored procedure because we have multiple DBMS vendors.
I have not been able to find anything in the components reference that can handle the above. I’d be grateful if someone can point me to an appropriate strategy.
Thanks,
Bill
Hello Bill,
you need to use DBJoin component. Please see attached graph for detailed settings.
Thanks, that worked great!
There is one little issue remaining with the graph though. I’m querying a SYBASE database. The feature_id field in the Feature table is Numeric(12). With other DB components I map numeric fields to long in the meta data without issue, but the DB_JOIN component says it cannot convert a decimal to long (even though it is not a decimal, it’s numeric). So, I change the the meta data to decimal instead of long and it works until the index gets too large with the error:
‘Too many digits before decimal dot, must be 6 at most (1501129704); specified precision [8,2]’
Thanks again for your help,
Bill
I was able to create a workaround with a Java transformer. I don’t know if this is the best way but it does work.
@Override
public int transform(DataRecord[] source, DataRecord[] target)
throws TransformException
{
target[0].getField("map_element_id").setValue(source[0].getField("map_element_id"));
// get the long value from the DecimalDataField input data
long ddf = ((DecimalDataField)source[1].getField("locus_id")).getLong();
//Create a new LongDataField with the long value
LongDataField ldf = new LongDataField(target[0].getField("locus_id").getMetadata(),ddf);
//transfer the LongDataField to the output data
target[0].getField("locus_id").setValue(ldf);
return RecordTransform.ALL;
}
Thanks again for your help.
Bill
Hello Bill,
it should be enough just to change length and scale properties for this field:
That is definitely easier. I just noticed that there is a"DB Metadata" property for the DBJOIN component. Can I set the feature_id to long there as another alternative?
“DB Metadata” attribute defines metadata returned by query (SELECT feature_id FROM Feature f), so only with one field. From above I would expect, that this field should be decimal with length=12 and scale=0.
It is defined as numeric(12) in our sybase database. For other DB components I’ve used (DB_INPUT_TABLE, and DB_OUTPUT_TABLE) we use long in the metadata for the id columns with no problem. You don’t think this is a bug with the DB_JOIN component?
Hello Biil,
I investigated the issue a bit deeper and realized that your graph throws “cannot convert from ‘decimal’ to ‘long’” exception, when database metadata is not defined, but work properly if I the field is of the type long in database metadata. The reason is, that by default NUMERIC field in a database is converted to DECIMAL Clover type and in CTL2 you can’t assign decimal number to a LONG field. It would work if you use implicit conversion:
$0.feature_id = decimal2long($1.feature_id);
However when you set data base metadata, the type compatibility is proper for CTL2, but transfer from database can be unsuccessful in the runtime. In your case it should work fine, but if you, e.g. set the type to INTEGER you could get error like "Numeric overflow in conversion of value 21,057,340,052 to type INTEGER.