VARCHAR: Extract metadata from iSeries / DB400

Hi

When extracting metadata from an iSeries file, Clover maps VARCHAR(11) to BYTE(11). When running the graph, the field is filled, but the text is not useable/readable.

We are using the jt400.jar driver (AS400JDBC from IBM).

I tried to change the fieldtype of the receiving field to other formats (string, etc.), but this results in runtime errors.

Any suggestions how to receive the proper content of the file ?

Thanks
Otto Brunner

Hi Otto,

Generally speaking, varchar should be always mapped to string. It is quite unusual that the default in your case is byte. I would like to know a few more details, please:

  1. What version of CloverETL do you have?
  2. Send me the graph run log containing the mentioned runtime errors.
  3. Send me your jdbc driver. I downloaded my own but I would like to have the same version as you.
  4. Externalize your DB connection and send it to me, without any sensitive information like password.
  5. If the error message contains any identifier of the record that caused the error, send me the value from the varchar field.
  6. Send me the create statement of the DB table you are using.

Thanks.

Hi Lubos

Sorry, It took some time to prepare testcases.

Meanwhile we think our problems are not caused by the VARCHAR, but by a strange “character code set identifier (CCSID)“ on field level. We created 2 test files as extracts from the production files (using SELECT INTO) and a graph to download the file content (see ZIP-file). In both cases we get “unusable” data in Clover.

In our test case 3 we can fix the problem when modifying the CCSID for the whole file to 37 (CHGPF command), but with test case 2 this does not work.

We use CloverETL 4.3.0.15.

If you need further info, do not hesitate to contact me.

Best regards
Otto

Otto,

no need for apologies, thanks for the detailed description. It is an interesting situation.

Is there any chance that you could try some other driver (a newer version for instance)?

What is the version of the DB?

And what encoding is configured in the DB itself?

Thank you.

Hi Lubos

  1. I tested also jt400-9.1.jar from the jtopen-package. It made no difference
  2. The iSeries Release is V7R1M0
  3. The iSeries-“DB” is not really a database, only a collection of files with a SQL-Frontend on top of it. so every file has its own encoding (CCSID). The system value for CCSID on this machine is 65535, but the files have normally 37

I looked through some more documentation and websites. From the JDBC FAQ:

Why is the Toolbox JDBC returning EBCDIC characters to my Java program?
Normally, the Toolbox JDBC driver will translate EBCDIC characters to Unicode Strings automatically. If it appears to be returning untranslated EBCDIC characters instead, then it is likely that the field in the IBM i database is tagged with CCSID 65535. The Toolbox JDBC driver recognizes this CCSID as a field that should not be translated. To avoid this behavior, tag fields on the IBM i that you want to be translated, with a valid CCSID. Alternately, you can set the “translate binary” connection property to “true”, which instructs the JDBC driver to translate all fields, including those tagged with CCSID 65535. The easiest way to do this is to add
“translate binary=true”
to the end of the URL used when connecting to the database.

I added this property to the advanced parameters of the connection and my problems are solved ! :smiley:

Thanks anyway
Otto

Thank you for the update, Otto. This is very interesting. Hopefully, some other users will benefit from it too.