Exception when parsing an XLS file

Hi All,

I need to parse an XLS file, and I used the following code:

XLSXDataParser parser = new XLSXDataParser();

try {
    parser.init(metadata);
} catch (ComponentNotReadyException e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}

parser.setDataSource(file); // file is an Excel file as an input stream; the exception is thrown here

DataRecord record = new DataRecord(metadata);
record.init();
while ((record = parser.getNext(record)) != null) {
    // do something
}

This throws the following exception:

Exception in thread "ThreadPoolTaskExecutor-2" java.lang.NoClassDefFoundError: org/apache/xmlbeans/XmlException
at org.jetel.data.parser.XLSXDataParser.setDataSource(XLSXDataParser.java:82)
.
.
.
Caused by: java.lang.ClassNotFoundException: org.apache.xmlbeans.XmlException
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
.
.

Can you help me, please?

Thanks for your support.

Hi,
You need to add xmlbeans-2.3.0.jar to your classpath. This JAR is distributed with the CloverETL Engine; you can find it in the “lib” directory. In general, you should add all JARs from the “lib” directory to your classpath.
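
To confirm that the class is actually visible at runtime, something like the following can help. This is only a minimal sketch: the class name ClasspathCheck and its locate method are made up for illustration and are not part of CloverETL. It tries to load a class by name and reports which JAR or directory it came from, which is also useful when several versions of the same library end up on the classpath.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns the location a class was loaded from, or null if the class
     * cannot be loaded with this class's class loader.
     */
    static String locate(String className) {
        try {
            // 'false' = do not run static initializers, we only want to find the class
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself usually have no code source
            return src == null ? "(JDK / bootstrap)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; pass other names as arguments
        String[] names = args.length > 0 ? args : new String[] {"org.apache.xmlbeans.XmlException"};
        for (String name : names) {
            String where = locate(name);
            System.out.println(name + " -> " + (where == null ? "NOT FOUND" : where));
        }
    }
}
```

Run it with the same classpath as your application. If it prints NOT FOUND, the JAR is still missing from that classpath.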

Jaro

Thanks, jurban.
After I added the required JARs, I got the following exception:

The XLSX workbook has invalid format!
at org.jetel.data.parser.XLSXDataParser.setDataSource(XLSXDataParser.java:86)
at testDataParsing.main(testDataParsing.java:117)
Caused by: org.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
at org.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:158)
at org.openxml4j.opc.Package.getParts(Package.java:598)
at org.openxml4j.opc.Package.open(Package.java:227)
at org.jetel.data.parser.XLSXDataParser.setDataSource(XLSXDataParser.java:82)
... 1 more

Why is it invalid? It works when the file is a CSV (and the parser is DataParser). Is there any change I should make to the metadata?

Thanks,

It seems the XLS file is invalid. How did you create the file?
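
One way to check what the file really contains is to look at its first bytes. The stack trace shows the file being opened through ZipPackage, so the parser appears to expect a zip-based package. Legacy .xls files are OLE2 compound documents that start with the bytes D0 CF 11 E0 A1 B1 1A E1, while .xlsx files are zip archives that start with "PK" (50 4B 03 04). The sketch below is not part of CloverETL, and the class name SpreadsheetSignature and its methods are made up for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class SpreadsheetSignature {

    // Magic bytes of an OLE2 compound document (used by legacy .xls)
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Magic bytes of a zip local file header (used by .xlsx)
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    /** Classifies a file by its leading bytes only; it does not validate the content. */
    static String detect(byte[] head) {
        if (startsWith(head, OLE2)) {
            return "OLE2 (legacy .xls)";
        }
        if (startsWith(head, ZIP)) {
            return "ZIP (possibly .xlsx)";
        }
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        byte[] head = new byte[8];
        int off = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (off < head.length) {
                int r = in.read(head, off, head.length - off);
                if (r < 0) {
                    break;
                }
                off += r;
            }
        }
        System.out.println(args[0] + ": " + detect(Arrays.copyOf(head, off)));
    }
}
```

Pass the path of the spreadsheet as the only argument. A zip signature only says the container is a zip archive, not that it is a valid workbook.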

Jaro

I use MS Office.
However, I noticed that when I try to parse an XLS file (2003), I get this exception:

The XLSX workbook has invalid format!
at org.jetel.data.parser.XLSXDataParser.setDataSource(XLSXDataParser.java:86)
at testDataParsing.main(testDataParsing.java:117)
Caused by: org.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
at org.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:158)
at org.openxml4j.opc.Package.getParts(Package.java:598)
at org.openxml4j.opc.Package.open(Package.java:227)
at org.jetel.data.parser.XLSXDataParser.setDataSource(XLSXDataParser.java:82)
... 1 more

And when I try to parse an XLSX file (2007), I get the following exception:

Exception in thread "ThreadPoolTaskExecutor-2" java.lang.NoSuchMethodError: org.apache.poi.util.IOUtils.copy(Ljava/io/InputStream;Ljava/io/OutputStream;)V
at org.apache.poi.util.PackageHelper.clone(PackageHelper.java:72)
at org.apache.poi.util.PackageHelper.clone(PackageHelper.java:44)
at org.apache.poi.POIXMLDocument.ensureWriteAccess(POIXMLDocument.java:182)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:150)
at org.jetel.data.parser.XLSXDataParser.setDataSource(XLSXDataParser.java:81)

Note that I added all the required JARs.

Can you help me?

Thanks,