How to read a delimited text file containing binary data?
I have written binary data out to a delimited text file, a.txt. When I use DelimitedDataReader to read a.txt, I get this exception:
Node DELIMITED_DATA_READER_0 finished with status: ERROR caused by: java.io.IOException:Field too long or can not find delimiter [;]
How can I read this file correctly?
I can’t tell what the cause of the problem is. Could you please send me an example data file and graph file (or at least the metadata) to milan.zila at javlinconsulting.cz?
Milan
Your problem is probably caused by the data in the blob field being too large. Try increasing the following values in the file:
cloveretl.engine.jar\org\jetel\data\defaultProperties
Record.MAX_RECORD_SIZE = 8192
DataParser.FIELD_BUFFER_LENGTH = 512
DataFormatter.FIELD_BUFFER_LENGTH = 512
Default values are optimized for speed.
Milan
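Before raising these limits blindly, it can help to measure how long the longest field in the data file actually is, so you know what to set FIELD_BUFFER_LENGTH to. Below is a minimal standalone sketch, not part of CloverETL; the file name a.txt and the ‘;’ delimiter are assumptions taken from the error message above.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

/**
 * Hypothetical helper (not part of CloverETL): scans a delimited file
 * and reports the longest field, so you know how large
 * FIELD_BUFFER_LENGTH and Record.MAX_RECORD_SIZE actually need to be.
 */
public class MaxFieldLength {
    public static void main(String[] args) throws IOException {
        String file = args.length > 0 ? args[0] : "a.txt"; // data file from the question
        char delimiter = ';';                              // delimiter from the error message
        int maxLen = 0, current = 0, ch;
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            while ((ch = in.read()) != -1) {
                if (ch == delimiter || ch == '\n' || ch == '\r') {
                    maxLen = Math.max(maxLen, current); // field ended: record its length
                    current = 0;
                } else {
                    current++;
                }
            }
        }
        maxLen = Math.max(maxLen, current); // the last field may not end with a delimiter
        System.out.println("Longest field: " + maxLen + " characters");
    }
}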
I have tried for two days without success. I modified defaultProperties as follows:
+++++++++++++++++++++++++++++++++++++++
Record.MAX_RECORD_SIZE = 2192000
DataParser.FIELD_BUFFER_LENGTH = 2192000
DataFormatter.FIELD_BUFFER_LENGTH = 2192000
++++++++++++++++++++++++++++++++++++++++++
I run blobfile.grf first, then blobfiletoblobfile.grf and fileblob.grf. Running blobfile.grf produces blobfile.txt, and running blobfiletoblobfile.grf produces test_blobfile.txt. But after running fileblob.grf, no records end up in the test1_1 table.
Please help me.
The source table t_blob and target table test1_1 are structured as follows:
db2 => describe table t_blob
Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
F1                             SYSIBM    DECIMAL                  12     0 Yes
F2                             SYSIBM    DECIMAL                  10     2 Yes
F3                             SYSIBM    VARCHAR                  50     0 Yes
F4                             SYSIBM    BLOB                8388608     0 Yes
F5                             SYSIBM    DATE                      4     0 Yes
  5 record(s) selected.
db2 => describe table test1_1
Column                         Type      Type
name                           schema    name               Length   Scale Nulls
------------------------------ --------- ------------------ -------- ----- ------
FIELD0                         SYSIBM    DECIMAL                  12     0 Yes
FIELD1                         SYSIBM    DECIMAL                  12     2 Yes
FIELD2                         SYSIBM    VARCHAR                  50     0 Yes
FIELD3                         SYSIBM    BLOB                8388608     0 Yes
FIELD4                         SYSIBM    DATE                      4     0 Yes
  5 record(s) selected.
db2 => select f1,f2,f3,length(f4) from t_blob
F1             F2           F3                                                 F4
-------------- ------------ -------------------------------------------------- -----------
            2.        88.67 wzhy                                                    515569
            3.       987.40 def                                                    1666156
            1.       100.23 abc                                                     726486
  3 record(s) selected.
I have sent the details to your mailbox.
I’m experiencing a similar problem with CloverETL right now. I’m parsing a huge CSV file, and one of the columns, called “options” (the options on a car), is an incredibly long string (1060 chars) with options separated by ‘;’; the field delimiter is ‘,’. I did as you suggested, hwhwhw, and changed my defaultProperties and repacked the jar. Unfortunately I still get the same problem. Is this column just too large to deal with? Here is the error I am currently getting with a simple DelimitedDataReader → Broadcast → Trash:
INFO [WatchDog] - Sucessfully started all nodes in phase!
FATAL [WatchDog] - !!! Fatal Error !!! - graph execution is aborting
ERROR [WatchDog] - Node DELIMITED_DATA_READER0 finished with status: ERROR caused by: java.io.IOException:Field too long or can not find delimiter [,]
when parsing record #2 field options
DEBUG [WatchDog] - Node DELIMITED_DATA_READER0 error details:
java.lang.RuntimeException: java.io.IOException:Field too long or can not find delimiter [,]
when parsing record #2 field options
at org.jetel.data.parser.DelimitedDataParser.parseNext(DelimitedDataParser.java:437)
at org.jetel.data.parser.DelimitedDataParser.getNext0(DelimitedDataParser.java:170)
at org.jetel.data.parser.DelimitedDataParser.getNext(DelimitedDataParser.java:166)
at org.jetel.util.MultiFileReader.getNext(MultiFileReader.java:229)
at org.jetel.component.DelimitedDataReader.execute(DelimitedDataReader.java:148)
at org.jetel.graph.Node.run(Node.java:366)
at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Field too long or can not find delimiter [,]
at org.jetel.data.parser.DelimitedDataParser.parseNext(DelimitedDataParser.java:411)
... 6 more
I’d appreciate any help anyone could give me, as I’m new to CloverETL. Thanks.
If you are reading very large strings, the problem can be solved by modifying the following parameters:
+++++++++++++++++++++++++++++++
Record.MAX_RECORD_SIZE = 65536
DataParser.FIELD_BUFFER_LENGTH = 32768
DataFormatter.FIELD_BUFFER_LENGTH = 32768
DEFAULT_INTERNAL_IO_BUFFER_SIZE = 131072
+++++++++++++++++++++++++++++++
This should be OK for record sizes up to 64 KB (65536 bytes).
Looking at these values, the parameters seem to follow a set of sizing rules relative to one another.
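For what it’s worth, the ratios among the suggested values are consistent with the field buffers being counted in 16-bit Java chars and the record and I/O buffers in bytes. This is just my reading of the numbers above, not documented behavior:

/** Sizing relationships inferred from the suggested values (an assumption,
 *  not documented behavior): field buffers count 16-bit Java chars,
 *  record and I/O buffers count bytes. */
public class BufferSizing {
    public static void main(String[] args) {
        int fieldBufferChars = 32768;                // DataParser.FIELD_BUFFER_LENGTH
        int maxRecordBytes   = fieldBufferChars * 2; // 65536  = Record.MAX_RECORD_SIZE (2 bytes/char)
        int ioBufferBytes    = maxRecordBytes * 2;   // 131072 = DEFAULT_INTERNAL_IO_BUFFER_SIZE
        System.out.printf("field=%d chars, record=%d bytes, io=%d bytes%n",
                fieldBufferChars, maxRecordBytes, ioBufferBytes);
    }
}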
Thanks for the advice. I tried the values you suggested and still got the error, so I tried some absurdly high values. Unfortunately I’m still having no luck. Here are the values I’m currently using:
Record.MAX_RECORD_SIZE = 10240000
DataParser.FIELD_BUFFER_LENGTH = 10485760
DataFormatter.FIELD_BUFFER_LENGTH = 10485760
DEFAULT_INTERNAL_IO_BUFFER_SIZE = 25600000
10485760 bytes is 10240 KB
25600000 bytes is 25000 KB (roughly 2.5× MAX_RECORD_SIZE)
10240000 bytes is 10000 KB
I read in the comments in the defaultProperties file that “(java stores strings in unicode - 16bits per character)”
Since my graph breaks on a field named options, which is a string looking like {Option One; Option Two; Option Another; And So Forth; I think you get the idea,} I know that the end delimiter (‘,’) is at the end of the field for each record, so it must be choking on the size of the string. But the string ranges from 1600 to 2000 chars long, which should only be 3200 to 4000 bytes.
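To double-check that arithmetic, here is a quick snippet (the 2000-char string is just a stand-in for the options field) showing that Java’s UTF-16 representation is indeed about 2 bytes per character:

import java.nio.charset.StandardCharsets;

/** Quick sanity check of the arithmetic above: a 2000-char string
 *  occupies about 2 bytes per character in Java's internal UTF-16. */
public class StringBytes {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) sb.append('x'); // simulate a 2000-char "options" field
        String options = sb.toString();
        // UTF-16 encoding prepends a 2-byte byte-order mark, hence the extra 2 bytes
        int bytes = options.getBytes(StandardCharsets.UTF_16).length;
        System.out.println(options.length() + " chars -> " + bytes + " bytes"); // 2000 chars -> 4002 bytes
    }
}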
Thanks for your help.
If the problem still exists, you could post the graph file and fmt file so that everyone can analyze them.
I’m still having the same issue, so I threw the files in question up on my webserver. If anyone could take a look and let me know, I’d greatly appreciate it.
From what I can see, vehicleImport.fmt.xml defines 86 fields, but record 17 of vehicleImport.csv has fewer than 86 comma-separated fields. Because the parser never finds enough field values, it does not stop at the end of record 17; it keeps reading into record 18. Specifically, the Vehicle_ID value of record 18 gets written into the Window_Sticker_Last_Published field of record 17, and in the same way the parser stuffs the following fields of record 18 into all the remaining fields of record 17.
So I think the problem is in the data in vehicleImport.csv.
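If it helps, here is a small standalone checker, not part of CloverETL, that flags such records. The file name and the expected field count of 86 come from this thread, and it assumes the fields contain no quoted or embedded commas:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

/** Hypothetical diagnostic (not part of CloverETL): flags every record in
 *  the CSV whose comma-separated field count differs from the metadata.
 *  Assumes no quoted or embedded commas, as in the file described above. */
public class FieldCountCheck {
    public static void main(String[] args) throws IOException {
        String file = "vehicleImport.csv"; // file from this thread
        int expected = 86;                 // field count from vehicleImport.fmt.xml
        int recordNo = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = in.readLine()) != null) {
                recordNo++;
                int fields = line.split(",", -1).length; // -1 keeps trailing empty fields
                if (fields != expected) {
                    System.out.println("record " + recordNo + ": " + fields
                            + " fields (expected " + expected + ")");
                }
            }
        }
    }
}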