Problem reading XML file from remote location

Hi,

I am running a transformation from XML to a database, using the XMLExtract, ExtMergeJoin, and DBOutputTable components. The transformation runs fine when I give a local file system path, but when I specify a path on a remote file system it does not work.

Below are the graph logs for both cases.

Graph log for local file system:


INFO  [WatchDog] - Sucessfully started all nodes in phase!
INFO  [WatchDog] - ---------------------** Start of tracking Log for phase [0] **-------------------
INFO  [WatchDog] - Time: 17/09/09 11:37:04
INFO  [WatchDog] - Node                   Status     Port      #Records         #KB  Rec/s    KB/s
INFO  [WatchDog] - ---------------------------------------------------------------------------------
INFO  [WatchDog] - DB_OUTPUT_TABLE0       RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0         1560          89      0       0
INFO  [WatchDog] - EXT_MERGE_JOIN0        RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0         1562         194      0       0
INFO  [WatchDog] -                                    In:1         1562         194      0       0
INFO  [WatchDog] -                                    In:2         1562         101      0       0
INFO  [WatchDog] -                                   Out:0         1560          89      0       0
INFO  [WatchDog] - XML_EXTRACT0           RUNNING        
INFO  [WatchDog] -  %cpu:0.81                        Out:0         6048         194      0       0
INFO  [WatchDog] -                                   Out:1         6048         194      0       0
INFO  [WatchDog] -                                   Out:2        20877         101      0       0
INFO  [WatchDog] - ---------------------------------** End of Log **--------------------------------

Graph log for remote file system:


INFO  [WatchDog] - Sucessfully started all nodes in phase!
INFO  [WatchDog] - ---------------------** Start of tracking Log for phase [0] **-------------------
INFO  [WatchDog] - Time: 17/09/09 11:31:30
INFO  [WatchDog] - Node                   Status     Port      #Records         #KB  Rec/s    KB/s
INFO  [WatchDog] - ---------------------------------------------------------------------------------
INFO  [WatchDog] - DB_OUTPUT_TABLE0       RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           0      0       0
INFO  [WatchDog] - EXT_MERGE_JOIN0        RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           0      0       0
INFO  [WatchDog] -                                    In:1            0           0      0       0
INFO  [WatchDog] -                                    In:2            0           0      0       0
INFO  [WatchDog] -                                   Out:0            0           0      0       0
INFO  [WatchDog] - XML_EXTRACT0           RUNNING        
INFO  [WatchDog] -  %cpu:0.01                        Out:0           22           0      0       0
INFO  [WatchDog] -                                   Out:1           22           0      0       0
INFO  [WatchDog] -                                   Out:2           73           0      0       0
INFO  [WatchDog] - ---------------------------------** End of Log **--------------------------------
INFO  [WatchDog] - ---------------------** Start of tracking Log for phase [0] **-------------------
INFO  [WatchDog] - Time: 17/09/09 11:31:35
INFO  [WatchDog] - Node                   Status     Port      #Records         #KB  Rec/s    KB/s
INFO  [WatchDog] - ---------------------------------------------------------------------------------
INFO  [WatchDog] - DB_OUTPUT_TABLE0       RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           0      0       0
INFO  [WatchDog] - EXT_MERGE_JOIN0        RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           2      0       0
INFO  [WatchDog] -                                    In:1            0           2      0     347
INFO  [WatchDog] -                                    In:2            0           1      0     181
INFO  [WatchDog] -                                   Out:0            0           0      0       0
INFO  [WatchDog] - XML_EXTRACT0           RUNNING        
INFO  [WatchDog] -  %cpu:0.01                        Out:0           75           2     10       0
INFO  [WatchDog] -                                   Out:1           75           2     10       0
INFO  [WatchDog] -                                   Out:2          256           1     36       0
INFO  [WatchDog] - ---------------------------------** End of Log **--------------------------------
INFO  [WatchDog] - ---------------------** Start of tracking Log for phase [0] **-------------------
INFO  [WatchDog] - Time: 17/09/09 11:31:40
INFO  [WatchDog] - Node                   Status     Port      #Records         #KB  Rec/s    KB/s
INFO  [WatchDog] - ---------------------------------------------------------------------------------
INFO  [WatchDog] - DB_OUTPUT_TABLE0       RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           0      0       0
INFO  [WatchDog] - EXT_MERGE_JOIN0        RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           3      0       0
INFO  [WatchDog] -                                    In:1            0           3      0     247
INFO  [WatchDog] -                                    In:2            0           1      0     131
INFO  [WatchDog] -                                   Out:0            0           0      0       0
INFO  [WatchDog] - XML_EXTRACT0           RUNNING        
INFO  [WatchDog] -  %cpu:0.01                        Out:0          113           3      7       0
INFO  [WatchDog] -                                   Out:1          113           3      7       0
INFO  [WatchDog] -                                   Out:2          389           1     26       0
INFO  [WatchDog] - ---------------------------------** End of Log **--------------------------------
INFO  [WatchDog] - ---------------------** Start of tracking Log for phase [0] **-------------------
INFO  [WatchDog] - Time: 17/09/09 11:31:46
INFO  [WatchDog] - Node                   Status     Port      #Records         #KB  Rec/s    KB/s
INFO  [WatchDog] - ---------------------------------------------------------------------------------
INFO  [WatchDog] - DB_OUTPUT_TABLE0       RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           0      0       0
INFO  [WatchDog] - EXT_MERGE_JOIN0        RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           5      0       0
INFO  [WatchDog] -                                    In:1            0           5      0     334
INFO  [WatchDog] -                                    In:2            0           2      0     164
INFO  [WatchDog] -                                   Out:0            0           0      0       0
INFO  [WatchDog] - XML_EXTRACT0           RUNNING        
INFO  [WatchDog] -  %cpu:..                          Out:0          164           5     10       0
INFO  [WatchDog] -                                   Out:1          164           5     10       0
INFO  [WatchDog] -                                   Out:2          555           2     32       0
INFO  [WatchDog] - ---------------------------------** End of Log **--------------------------------
INFO  [WatchDog] - ---------------------** Start of tracking Log for phase [0] **-------------------
INFO  [WatchDog] - Time: 17/09/09 11:31:51
INFO  [WatchDog] - Node                   Status     Port      #Records         #KB  Rec/s    KB/s
INFO  [WatchDog] - ---------------------------------------------------------------------------------
INFO  [WatchDog] - DB_OUTPUT_TABLE0       RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           0      0       0
INFO  [WatchDog] - EXT_MERGE_JOIN0        RUNNING        
INFO  [WatchDog] -  %cpu:..                           In:0            0           6      0       0
INFO  [WatchDog] -                                    In:1            0           6      0     254
INFO  [WatchDog] -                                    In:2            0           3      0     123
INFO  [WatchDog] -                                   Out:0            0           0      0       0
INFO  [WatchDog] - XML_EXTRACT0           RUNNING        
INFO  [WatchDog] -  %cpu:..                          Out:0          203           6      7       0
INFO  [WatchDog] -                                   Out:1          203           6      7       0
INFO  [WatchDog] -                                   Out:2          680           3     24       0
INFO  [WatchDog] - ---------------------------------** End of Log **---------------------

I think there is a synchronization issue somewhere in the Merge Join component.

Please suggest a solution for reading the file from a remote location.

Thanks
Pushpendra

Hi,

Here is the graph for the issue above, where reading XML from a remote location does not work.


<?xml version="1.0" encoding="UTF-8"?>
<Graph author="user" created="IST 2009" guiVersion="2.2.2" id="1251871535775" licenseType="Evaluation license." modified="Thu Sep 17 11:36:56 IST 2009" modifiedBy="user" name="TestGraph" revision="1.198">
<Global>
<Metadata id="Metadata0" previewAttachmentCharset="ISO-8859-1">
<Record fieldDelimiter="|" name="recordName1" previewAttachmentCharset="ISO-8859-1" recordDelimiter="\r\n" type="delimited">
<Field name="field1" type="string"/>
<Field name="count" type="integer"/>
</Record>
</Metadata>
<Metadata id="Metadata1" previewAttachmentCharset="ISO-8859-1">
<Record fieldDelimiter="|" name="recordName2" previewAttachmentCharset="ISO-8859-1" recordDelimiter="\r\n" type="delimited">
<Field name="field1" type="string"/>
<Field name="count" type="integer"/>
</Record>
</Metadata>
<Metadata id="Metadata2" previewAttachmentCharset="ISO-8859-1">
<Record fieldDelimiter="|" name="recordName3" previewAttachmentCharset="ISO-8859-1" recordDelimiter="\r\n" type="delimited">
<Field name="field1" type="string"/>
<Field name="count" type="integer"/>
</Record>
</Metadata>
<Metadata id="Metadata3" previewAttachmentCharset="ISO-8859-1">
<Record fieldDelimiter="|" name="recordName4" previewAttachmentCharset="ISO-8859-1" recordDelimiter="\r\n" type="delimited">
<Field name="field1" type="string"/>
<Field name="field2" type="string"/>
<Field name="field3" type="string"/>
</Record>
</Metadata>
<Connection database="POSTGRE" dbURL="jdbc:postgresql://IP/DB" id="Connection0" jdbcSpecific="POSTGRE" jndiName="" name="NewConnection" password="superuserfor9aisle" type="JDBC" user="postgres"/>
<Sequence cached="100" fileURL="C:/misc/seq.dat" id="Sequence0" name="Sequence0" start="0" step="1" type="SIMPLE_SEQUENCE"/>
</Global>
<Phase number="0">
<Node cloverFields="field1;field2;field3" dbConnection="Connection0" enabled="enabled" fieldMap="$field1:=name;$field2:=sku;$field3:=brand;" guiHeight="0" guiName="DBOutputTable" guiWidth="0" guiX="512" guiY="132" id="DB_OUTPUT_TABLE0" passThroughInputPort="0" sqlQuery="insert into product (sku,name,brand) values(?,?,?)" type="DB_OUTPUT_TABLE"/>
<Node ascendingInputs="true" charset="UTF-8" enabled="enabled" guiHeight="0" guiName="ExtMergeJoin" guiWidth="0" guiX="325" guiY="126" id="EXT_MERGE_JOIN0" joinKey="$count;#$count;#$count;#" passThroughOutputPort="0" type="EXT_MERGE_JOIN">
<attr name="transform"><![CDATA[//#TL

// Transforms input record into output record.
function transform() {
	$0.field1 := $0.field1;
	$0.field2 := $1.field1;
	$0.field3 := $2.field1;
}

// Called during component initialization.
// function init() {}

// Called after the component finishes.
// function finished() {}
]]></attr>
</Node>
<Node charset="UTF-8" enabled="enabled" guiHeight="0" guiName="XMLExtract" guiWidth="0" guiX="87" guiY="127" id="XML_EXTRACT0" passThroughInputPort="0" passThroughOutputPort="0" sourceUri="http://IP:Port/SmallSync.xml" type="XML_EXTRACT">
<attr name="mapping"><![CDATA[<Mappings> 
	<Mapping element="us:SyncItemMaster">
		<Mapping element="us:DataArea">
			<Mapping element="us:ItemMaster">
				<Mapping element="us:GlobalItem">
					<Mapping cloverFields="field1" element="us:GTINBox" outPort="0" sequenceField="count" sequenceID="Sequence0" xmlFields="us:GTINBox"/>
					<Mapping cloverFields="field1" element="us:GTINCarton" outPort="1" sequenceField="count" sequenceID="Sequence0" xmlFields="us:GTINCarton"/>
				</Mapping>
				<Mapping element="us:ItemMasterHeader">
					<Mapping element="oa:ItemID">
						<Mapping cloverFields="field1" element="oa:ID" outPort="2" sequenceField="count" sequenceID="Sequence0" xmlFields="field1"/>
					</Mapping>	
				</Mapping>
			</Mapping>
		</Mapping>
	</Mapping>
</Mappings>]]></attr>
</Node>
<Edge fromNode="EXT_MERGE_JOIN0:0" guiBendpoints="" id="Edge4" inPort="Port 0 (in)" metadata="Metadata3" outPort="Port 0 (out)" toNode="DB_OUTPUT_TABLE0:0"/>
<Edge fromNode="XML_EXTRACT0:0" guiBendpoints="" id="Edge1" inPort="Port 0 (driver)" metadata="Metadata0" outPort="Port 0 (out)" toNode="EXT_MERGE_JOIN0:0"/>
<Edge fromNode="XML_EXTRACT0:1" guiBendpoints="" id="Edge2" inPort="Port 1 (slave)" metadata="Metadata0" outPort="Port 1 (out)" toNode="EXT_MERGE_JOIN0:1"/>
<Edge fromNode="XML_EXTRACT0:2" guiBendpoints="" id="Edge3" inPort="Port 2 (slave)" metadata="Metadata0" outPort="Port 2 (out)" toNode="EXT_MERGE_JOIN0:2"/>
</Phase>
</Graph>

The XML file is around 500 MB, and I am reading it from a remote location over HTTP. I think this graph will help in debugging the issue with the XMLExtract and Merge Join components.

Thanks
Pushpendra

Hello Pushpendra,
have you tried waiting longer to see whether the graph finishes?
I suspect that your connection is so slow that the edge buffer never fills up, so no data is sent on to the next component. To check this, you can set edgeType="directFastPropagate" on the XMLExtract output edges. The data is then sent to the next component immediately.
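
As a rough sketch of that change, the three XMLExtract output edges from the graph posted above might look like this. Everything is copied from that graph except the added edgeType attribute; its placement on the Edge element is an assumption based on the suggestion above and should be checked against the graph format of your Clover version.

```xml
<!-- Sketch only: the edges from the posted graph, with edgeType added (assumed attribute placement). -->
<Edge edgeType="directFastPropagate" fromNode="XML_EXTRACT0:0" guiBendpoints="" id="Edge1" inPort="Port 0 (driver)" metadata="Metadata0" outPort="Port 0 (out)" toNode="EXT_MERGE_JOIN0:0"/>
<Edge edgeType="directFastPropagate" fromNode="XML_EXTRACT0:1" guiBendpoints="" id="Edge2" inPort="Port 1 (slave)" metadata="Metadata0" outPort="Port 1 (out)" toNode="EXT_MERGE_JOIN0:1"/>
<Edge edgeType="directFastPropagate" fromNode="XML_EXTRACT0:2" guiBendpoints="" id="Edge3" inPort="Port 2 (slave)" metadata="Metadata0" outPort="Port 2 (out)" toNode="EXT_MERGE_JOIN0:2"/>
```

If the join input counters in the tracking log start to rise after this change, that would suggest the slow connection and edge buffering are the cause, rather than a problem in the join itself.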