A dreadful error for users, and for admins too. It’s usually easily fixed (in most cases there’s a problem with load or licensing), but not in the particular case I encountered.
Well, one of the first things to do is to check the event viewer on the Web Interface server. I noticed a whole lot of 30102 errors coming from Citrix Web Interface:
What’s this error about?
A lot of things can cause this error. What matters is which XML server is queried and which function is reported to cause the error. In this case the latter is [com.citrix.xml.NFuseProtocol.RequestLaunchRef].
The first thing I tried was restarting the XML service on the reported server (usually this is the top server you entered in the server list of the Citrix farm (Web Interface > Server Farms), unless you use the list to load balance the XML requests), but that didn’t help.
I tried various other things, but eventually I fired up Wireshark, my tried and trusted networking tool, on the Web Interface server. I captured all traffic, then filtered it down to the traffic coming from and going to the XML server (filter: ip.src == <ipaddress> || ip.dst == <ipaddress>, or the shorthand ip.addr == <ipaddress>), and looked at the packets that were flowing when the 30102 errors occurred in the event viewer. I found the following exchange:
POST /scripts/wpnbr.dll HTTP/1.1
Content-Type: text/xml
Host: xxxxxxx:9090
Content-Length: 561
Connection: Keep-Alive

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE NFuseProtocol SYSTEM "NFuse.dtd">
<NFuseProtocol version="5.4">
  <RequestLaunchRef>
    <LaunchRefType>ICA</LaunchRefType>
    <TicketTag>IMAHostId:15274</TicketTag>
    <DeviceId>WI_QZBNv5ztEDpLu0THQ</DeviceId>
    <SessionSharingKey>-peML5lFv4meHDULuC8Y14e</SessionSharingKey>
    <Credentials>
      <UserName>xxxxxxxx</UserName>
      <Domain type="NT">xxxxxxxx</Domain>
    </Credentials>
    <TimetoLive>200</TimetoLive>
  </RequestLaunchRef>
</NFuseProtocol>

HTTP/1.1 100 Continue
Server: Citrix Web PN Server
Date: Fri, 25 May 2012 09:22:12 GMT

HTTP/1.1 200 OK
Server: Citrix Web PN Server
Date: Fri, 25 May 2012 09:22:12 GMT
Content-type: text/xml
Content-length: 217

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE NFuseProtocol SYSTEM "NFuse.dtd">
<NFuseProtocol version="5.2">
  <ResponseLaunchRef>
    <ErrorId>unspecified</ErrorId>
  </ResponseLaunchRef>
</NFuseProtocol>
Eventually, I found out that every time this happened, the request carried the same TicketTag (in this case IMAHostId:15274). When the TicketTag was different, the response contained a valid ticket:
HTTP/1.1 200 OK
Server: Citrix Web PN Server
Date: Fri, 25 May 2012 08:06:29 GMT
Content-type: text/xml
Content-length: 228

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE NFuseProtocol SYSTEM "NFuse.dtd">
<NFuseProtocol version="5.2">
  <ResponseTicket>
    <Ticket>11CE036B24379D427366DA81E873C0</Ticket>
  </ResponseTicket>
</NFuseProtocol>
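If you have more than a handful of these exchanges to go through, spotting the pattern by eye gets tedious. Here is a minimal Python sketch of how the request/response pairs could be tallied by TicketTag. It assumes you have already pulled the XML bodies out of the capture yourself (for example via Follow TCP Stream); the function names are my own, the sample bodies are abridged from the capture above, and the tag IMAHostId:15275 is a made-up example of a healthy server.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Abridged request body from the capture above (the failing TicketTag).
FAILED_REQUEST = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE NFuseProtocol SYSTEM "NFuse.dtd">
<NFuseProtocol version="5.4">
  <RequestLaunchRef>
    <LaunchRefType>ICA</LaunchRefType>
    <TicketTag>IMAHostId:15274</TicketTag>
  </RequestLaunchRef>
</NFuseProtocol>"""

# Same request with a hypothetical TicketTag of a server that works.
OK_REQUEST = FAILED_REQUEST.replace("IMAHostId:15274", "IMAHostId:15275")

FAILED_RESPONSE = """<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE NFuseProtocol SYSTEM "NFuse.dtd">
<NFuseProtocol version="5.2">
  <ResponseLaunchRef><ErrorId>unspecified</ErrorId></ResponseLaunchRef>
</NFuseProtocol>"""

OK_RESPONSE = """<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE NFuseProtocol SYSTEM "NFuse.dtd">
<NFuseProtocol version="5.2">
  <ResponseTicket><Ticket>11CE036B24379D427366DA81E873C0</Ticket></ResponseTicket>
</NFuseProtocol>"""


def _parse(body: str) -> ET.Element:
    # Strip whitespace so the XML declaration sits at the very start,
    # and pass bytes so the encoding declaration is honoured.
    return ET.fromstring(body.strip().encode("utf-8"))


def ticket_tag(request_xml: str) -> str:
    """Return the TicketTag text of a request body ('' if absent)."""
    return _parse(request_xml).findtext(".//TicketTag", default="")


def launch_failed(response_xml: str) -> bool:
    """True when the response carries an ErrorId element."""
    return _parse(response_xml).find(".//ErrorId") is not None


def failures_by_tag(pairs) -> Counter:
    """Count failed launches per TicketTag for (request, response) pairs."""
    counts = Counter()
    for request_xml, response_xml in pairs:
        if launch_failed(response_xml):
            counts[ticket_tag(request_xml)] += 1
    return counts


if __name__ == "__main__":
    pairs = [(FAILED_REQUEST, FAILED_RESPONSE), (OK_REQUEST, OK_RESPONSE)]
    print(failures_by_tag(pairs))
```

If one or two tags collect all the failures, as they did for me, you know the problem sits with specific servers rather than with the XML service as a whole.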
So it must have something to do with this particular TicketTag. Now, this IMAHostId corresponds to a specific server. When you look on a XenApp server at HKLM\SOFTWARE\Wow6432Node\Citrix\IMA\RUNTIME\HostId, you find the specific IMAHostId of that server (in decimal).
So putting all this together, it seemed that getting the launch reference from a specific server was failing. When I found the server by its IMAHostId (I just made a quick & dirty script to echo the HostIds of all the servers in the farm), I restarted its IMA service, et voilà, launch requests to the server worked again!
(Eventually I found out it was not one, but two servers which were showing this behaviour, so I had to restart both IMA services. Anyway, I’m glad that this issue was solved!)
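For what it’s worth, my quick & dirty script isn’t shown here, but the lookup can be sketched roughly like this in Python. Treat it as an illustration under assumptions, not the script I used: it assumes HostId is a value under the RUNTIME key, that the Remote Registry service is reachable on each server, and the server names (XA01, XA02) and the host ID 15301 are made up.

```python
def host_id_from_tag(ticket_tag: str) -> int:
    """Extract the decimal host ID from a tag such as 'IMAHostId:15274'."""
    prefix, _, value = ticket_tag.partition(":")
    if prefix != "IMAHostId" or not value.isdigit():
        raise ValueError(f"unexpected TicketTag: {ticket_tag!r}")
    return int(value)


def read_host_id(server: str) -> int:
    """Read HostId from a server's registry (Windows only).

    Assumption: HostId is a numeric value under the RUNTIME key and the
    Remote Registry service is running on the target server.
    """
    import winreg  # imported here so the other functions load on any OS

    with winreg.ConnectRegistry(rf"\\{server}", winreg.HKEY_LOCAL_MACHINE) as hive:
        with winreg.OpenKey(hive, r"SOFTWARE\Wow6432Node\Citrix\IMA\RUNTIME") as key:
            value, _value_type = winreg.QueryValueEx(key, "HostId")
    return int(value)


def servers_for_tags(ticket_tags, host_ids):
    """Map each failing TicketTag to the server that owns that HostId.

    host_ids is a dict of server name -> decimal HostId; tags with no
    matching server map to None.
    """
    by_id = {host_id: server for server, host_id in host_ids.items()}
    return {tag: by_id.get(host_id_from_tag(tag)) for tag in ticket_tags}


if __name__ == "__main__":
    # Hypothetical farm; on a real farm you would build this dict with
    # {name: read_host_id(name) for name in your_server_list}.
    farm = {"XA01": 15274, "XA02": 15301}
    print(servers_for_tags(["IMAHostId:15274"], farm))
```

Because servers_for_tags takes a list of tags, it handles the case of more than one misbehaving server in a single pass.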
Happy troubleshooting! 🙂
UPDATE: On a recent project I encountered this issue again, though this time the IMA service wasn’t the problem: the XML broker didn’t trust the XML requests sent to it. Enabling the “Trust XML requests” Citrix computer policy on the XML brokers fixed it.