OutOfMemoryError with XMLWriter

Hi,
I want to transform data and write it to an XML file. The XMLWriter does exactly what I need, but I have to work with huge amounts of data, and when I run the XMLWriter on that data I get “java.lang.OutOfMemoryError: Java heap space”. I know the heap can be enlarged; I have already assigned 1024 MB. I tried the StructuredDataWriter and can run simple examples with huge amounts of data, but I also have complex calculations where I need the mapping capabilities of the XMLWriter. I also tried using the XMLWriter only for the mapping transformation and sending the result through its output port to another writer, but that didn’t work. Does anyone know a solution for this problem?

thanks a lot!

Hello,
XMLWriter puts all data into memory and only then formats it as an XML file, so the only way to use XMLWriter with big files is to increase the memory. StructuredDataWriter writes data to the file sequentially, so it has no memory problem even with big files. A hack that could solve your problem is to use a joiner instead of XMLWriter to produce XML records (one string field holding a part of the XML file) and then use StructuredDataWriter with a static header and footer.
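
To illustrate the header / records / footer idea outside of Clover, here is a minimal Java sketch. It is not how StructuredDataWriter is implemented; the class name `StreamingXmlSketch`, the method `writeDocument` and the `<root>` wrapper element are all made up for the example. The point is only that each pre-built XML fragment is written and then forgotten, so memory use does not grow with the number of records.

```java
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical helper, not part of CloverETL.
class StreamingXmlSketch {

    // Writes a static header, then every fragment, then a static footer.
    // Fragments are pulled one at a time from the iterator, so only the
    // current fragment has to be held in memory.
    static void writeDocument(PrintWriter out, String header,
                              Iterator<String> fragments, String footer) {
        out.print(header);
        out.print('\n');
        while (fragments.hasNext()) {
            out.print(fragments.next());
            out.print('\n');
        }
        out.print(footer);
        out.print('\n');
        out.flush();
    }

    public static void main(String[] args) {
        // In a real run the iterator would read from the joiner output
        // instead of a fixed in-memory list.
        Iterator<String> fragments = Arrays.asList(
                "<master key=\"k1\" data=\"a\"/>",
                "<master key=\"k2\" data=\"b\"/>").iterator();
        PrintWriter out = new PrintWriter(System.out);
        writeDocument(out,
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root>",
                fragments,
                "</root>");
    }
}
```

For a large file, the `PrintWriter` would wrap a buffered file writer rather than `System.out`; the loop itself stays the same.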

Thanks for your reply!

Hi,
I tried to solve the problem with a Joiner, but the Joiner output needs a hierarchical structure in order to produce a hierarchical XML document similar to the output of the XMLWriter, and I couldn’t find a way to write a transformation that creates such a structure. I can only assign to the fields of a record, not to a hierarchical XML structure. What can I do?

Hello,
I was thinking of something like this:

<?xml version="1.0" encoding="UTF-8"?>
<Graph author="avackova" created="Thu Jun 11 13:46:15 CEST 2009" guiVersion="0.0.0.devel" id="1244721135669" licenseType="Evaluation license." modified="Thu Jun 11 14:16:26 CEST 2009" modifiedBy="avackova" name="xmlJoin" revision="1.24">
<Global>
<Metadata id="Metadata0">
<Record fieldDelimiter="|" name="rec" recordDelimiter="\n" type="delimited">
<Field name="key" type="string"/>
<Field name="data" type="string"/>
</Record>
</Metadata>
<Metadata id="Metadata1">
<Record fieldDelimiter="|" name="xml" recordDelimiter="\n" type="delimited">
<Field name="field1" type="string"/>
</Record>
</Metadata>
<Property fileURL="workspace.prm" id="GraphParameter0"/>
</Global>
<Phase number="0">
<Node enabled="enabled" generate="//#TL&#10;&#10;// Generates output record.&#10;function generate() {&#10;&#9;$0.key := 'key';&#10;&#9;$0.data := 'master data';&#10;}&#10;&#10;// Called during component initialization.&#10;// function init() {}&#10;&#10;// Called after the component finishes.&#10;// function finished() {}&#10;" guiHeight="0" guiName="DataGenerator" guiWidth="0" guiX="73" guiY="55" id="DATA_GENERATOR0" recordsNumber="1" type="DATA_GENERATOR"/>
<Node enabled="enabled" generate="//#TL&#10;&#10;// Generates output record.&#10;function generate() {&#10;&#9;$0.key := 'key';&#10;&#9;$0.data := 'slave data: '+random_string(3,5);&#10;}&#10;&#10;// Called during component initialization.&#10;// function init() {}&#10;&#10;// Called after the component finishes.&#10;// function finished() {}&#10;" guiHeight="0" guiName="DataGenerator" guiWidth="0" guiX="74" guiY="159" id="DATA_GENERATOR1" recordsNumber="2" type="DATA_GENERATOR"/>
<Node enabled="enabled" generate="function generate() {&#10;&#9;$0.key := 'key';&#10;&#9;$0.data := 'slave data: '+random_string(3,5);&#10;}&#10;" guiHeight="0" guiName="DataGenerator" guiWidth="0" guiX="69" guiY="270" id="DATA_GENERATOR2" recordsNumber="3" type="DATA_GENERATOR"/>
<Node enabled="enabled" guiHeight="0" guiName="ExtHashJoin" guiWidth="0" guiX="351" guiY="52" id="EXT_HASH_JOIN0" joinKey="$key=$key;#$key=$key;#" slaveDuplicates="true" type="EXT_HASH_JOIN">
<attr name="transform"><![CDATA[//#TL

// Transforms input record into output record.
function transform() {
	$0.field1 := '<master key=\"'+$0.key + '\" data=\"'+$0.data+'\">'+"\\n"+'\t<slave1 data=\"'+$1.data+'\"/>'+"\\n"+'\t<slave2 data=\"'+$2.data + '\"/>'+"\\n"+'</master>';
}

// Called during component initialization.
// function init() {}

// Called after the component finishes.
// function finished() {}
]]></attr>
</Node>
<Node enabled="enabled" guiHeight="0" guiName="Trash" guiWidth="0" guiX="576" guiY="47" id="TRASH0" type="TRASH"/>
<Edge fromNode="DATA_GENERATOR0:0" guiBendpoints="" id="Edge0" inPort="Port 0 (driver)" metadata="Metadata0" outPort="Port 0 (out)" toNode="EXT_HASH_JOIN0:0"/>
<Edge fromNode="DATA_GENERATOR1:0" guiBendpoints="" id="Edge1" inPort="Port 1 (slave)" metadata="Metadata0" outPort="Port 0 (out)" toNode="EXT_HASH_JOIN0:1"/>
<Edge fromNode="DATA_GENERATOR2:0" guiBendpoints="" id="Edge2" inPort="Port 2 (slave)" metadata="Metadata0" outPort="Port 0 (out)" toNode="EXT_HASH_JOIN0:2"/>
<Edge debugMode="true" fromNode="EXT_HASH_JOIN0:0" guiBendpoints="" id="Edge3" inPort="Port 0 (in)" metadata="Metadata1" outPort="Port 0 (out)" toNode="TRASH0:0"/>
</Phase>
</Graph>
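
One thing to watch when building XML by string concatenation like this: field values that contain `<`, `&` or quotes will break the output unless they are escaped first. The following is a small Java sketch of that escaping, not Clover code. The names `XmlAttrEscapeSketch`, `escapeAttr` and `masterFragment` are invented for illustration, and it assumes the slave values go into a `data` attribute (a `<slave1="...">` element without an attribute name would not be well-formed XML).

```java
// Hypothetical helper, not part of CloverETL.
final class XmlAttrEscapeSketch {

    // Replaces the characters that are unsafe inside a quoted attribute
    // value with the predefined XML entities.
    static String escapeAttr(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds one <master> fragment with two nested slave elements,
    // following the shape of the string produced by the join transform.
    static String masterFragment(String key, String data,
                                 String slave1, String slave2) {
        return "<master key=\"" + escapeAttr(key)
                + "\" data=\"" + escapeAttr(data) + "\">\n"
                + "\t<slave1 data=\"" + escapeAttr(slave1) + "\"/>\n"
                + "\t<slave2 data=\"" + escapeAttr(slave2) + "\"/>\n"
                + "</master>";
    }

    public static void main(String[] args) {
        System.out.println(masterFragment("key", "master data", "a<b", "x & y"));
    }
}
```

In the graph itself the same replacement would have to be done inside the transform before the values are concatenated.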