Tuesday, December 8, 2009

WTRN0078E: An attempt by the transaction manager to call start on a transactional resource has resulted in error

Watch out when you r working with WPS and MQ ( If MQ is the messaging provider ) moreover its MQ Z/0S and your MQ adminstrator is bugging you to have MQDISCT ( MQ disconnect ) on the server connection channel.

MQDISCT - allows a channel instance to timeout if there is not activity on the channel for a specific time interval. This setting seems to be good from MQ admin point of view , as he/she would not want the channel instances to grow in production ( specially if its a shared queue manager for several projects )

But be aware this configuration can really affect your WAS/WPS environment - as this would cause your listenerPort to be restarted every time the MDB session tries to pick message off a queue - since the server part of the handle would have already been disconnected by MQ ( timeout ) your listenerPort will be restarted whenever you have message on the queue after the MQDISCT.

-------------------------------------------------------------------------------------
We had a a J2EE application comprising of Stateless session beans and MDB's, We have deployed these on Websphere Process Server 6.0.2.5

The messaging for WPS is on Websphere MQ on Z/OS.We sometimes face some XA errors with respect to MQ.

-------------------------------------------------------------------------------------

There is a strong reasoning on this XA errors with respect to MQ - it would cause your listenerPort to get restarted with these errors :-


[7/14/09 9:42:39:186 PDT] 0000005a XATransaction E J2CA0027E: An exception occ
JMSManagedConnection@194476453, within transaction ID {XidImpl: formatId(5741534
9ec0588062b4c152a8f24de408b033938e658132000001227a24f8ae0000000100000d5a9ec05880
.transaction.xa.XAException: XA operation failed, see errorCode
at com.ibm.mq.MQXAResource.start(MQXAResource.java:447

[7/14/09 9:42:39:222 PDT] 0000005a RegisteredRes E WTRN0078E: An attempt by th
e transaction manager to call start on a transactional resource has resulted in
an error. The error code was XAER_RMERR. The exception stack trace follows: java
x.transaction.xa.XAException: XA operation failed, see errorCode
at com.ibm.mq.MQXAResource.start(MQXAResource.java:447)
at com.ibm.ejs.jms.JMSManagedSession$JMSXAResource.start(JMSManagedSessi
on.java:1318)
at com.ibm.ejs.j2c.XATransactionWrapper.start(XATransactionWrapper.java(

at com.ibm.ws.Transaction.JTA.JTAResourceBase.start(JTAResourceBase.java(Compiled Code))
... 21 more
while trying to enlist resources from datasource JMS$QCF$JMSManagedConnection@194476453 with the Transaction Manager for the current transaction, and threw
a ResourceException.
[7/14/09 9:42:39:281 PDT] 0000005a ConnectionEve A J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adaptor for resour
ce JMS$QCF$JMSManagedConnection@194476453. The exception which was received is javax.jms.JMSException: Enlist failed
[7/14/09 9:42:39:284 PDT] 0000005a ConnectionEve A J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adaptor for resour
ce jms/QCF. The exception which was received is javax.jms.JMSException: Enlist failed
[7/14/09 9:42:39:316 PDT] 0000005a JMSExceptionL E WMSG0018E: Error on JMSConnection for MDB ImportMDB , JMSDestination jms/ImportingQ : javax.jms.JM
SException: Enlist failed
at com.ibm.ejs.jms.JMSCMUtils.mapToJMSException(JMSCMUtils.java:135)
at com.ibm.ejs.jms.JMSManagedSession.enlist(JMSManagedSession.java:931)
at com.ibm.ejs.jms.JMSSessionHandle.enlist(JMSSessionHandle.java:1002)
at com.ibm.ejs.jms.listener.ServerSession.onMessage(ServerSession.java:611)
at com.ibm.ejs.jms.listener.ServerSession.dispatch(ServerSession.java:594)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)



With WAS v6.0.2 using a ListenerPort to drive an MDB from a WebSphere MQ
queue, the application server makes a connection to the Queue Manager.
The default behaviour is that when a message is found on the WMQ queue,
a new connection is made to the Queue Manager to retrieve and
subsequently process and pass the message to the MDB. When the MDB
processing has completed, this new MQ Connection is put into a WAS
controlled pool, ready to be called upon to get the next message off the
WMQ queue there will be no activity on this MQ connection unless a
message requires processing. The Listener Port MQ connection is
separate, and is always active, and does not get disconnected. The
important point to note here is that this connection will continue to
exist for the lifetime of the ListenerPort, it will not timeout or be
discarded unless:
(a) The ListenerPort is restarted
(b) The connection to the Queue Manager is found to be disconnected.

When a DISCINT value is used on the Queue Manager, the socket is
closed on the Queue Manager after the DISCINT timeout value is reached
if no activity has been observed on the socket. However, when this
happens, there is nothing to let the WAS pooled WMQ connection know that
the socket has been closed. As a consequence, the next time the
connection is attempted to be used it received a 2009 return code
(MQRC_CONNECTION_BROKEN) and the ListenerPort stops.

A simplified timeline of events to illustrate the
problem is as follows:

(1) ListenerPort thread: Message detected on WMQ Queue
(2) ListenerPort thread: Request pooled MQ Connection retrieve message
and process in MDB
(3) MQConnection_1: Start XA transaction, get message, process in MDB.
End XA transaction.
(4) ... 30 minute delay (say) before another message is required to be
processed by MQConnection_1
(5) WMQ Queue Manager closes socket as DISCINT time has been reached for
connection
(6) ListenerPort thread: Message detected on WMQ Queue
(7) ListenerPort thread: Request pooled MQConnection_1 retrieve message
and process in MDB
(8) MQConnection_1: Request XA transaction start boundary.
(9) MQConnection_1: Connection to Queue Manager broken detected (RC
2009), throwing XA error messages seen in logs.
(10) Stop ListenerPort.
(11) 60 second wait
(12) Start ListenerPort
(13) ListenerPort thread: Message detected on WMQ Queue
(14) ListenerPort thread: Request pooled MQ Connection retrieve message
and process in MDB
(15) Pool is empty, so new MQConnection made to Queue Manager
(16) MQConnection_2: Start XA transaction, get message, process in MDB.
End XA transaction.

So the overall effect of using the DISCINT value on the Queue Manager
in this scenario is to cause the ListenerPort to restart (with a 60
second delay, which is a configurable value, see
http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/c
om.ibm.websphere.base.doc/info/aes/ae/umb_prolscp.html), with an
associated number of error messages being printed out the logs.

Note also that the amount of time specified by the DISCONNECT value
could be larger than the amount of time between messages appearing on
the WMQ Queue, and the issue could still be seen. Suppose two
WMQConnections had been created and stored in the WAS pool (from
previously processing two messages on the WMQ queue simultaneously). If
there was a 15 minute wait before another message appeared, which was
processed by MQConnection_1, then another 15 minute wait before the next
message appeared, MQConnection_2 would be disconnected by the Queue
Manager even though a message has been put on the queue every 15
minutes.

The simplest solution is to discontinue the use of the DISCINT value on the Queue
Manager. Setting it to 0 prevents the Queue Manager from closing the
sockets, and thus preventing the problem. Is there some reason why the
customer is using this variable?

Further information about DISCINT can be found in the following article
in the Information Center:

http://publib.boulder.ibm.com/infocenter/wmqv6/v6r0/topic/com.ibm.mq.csq
zae.doc/ic11690_.htm?resultof=%22%44%49%53%43%49%4e%54%22%20%22%64%69%73
%63%69%6e%74%22%20

The WMQ heartbeat mechanism does not come into play in this scenario, as
the MQGET calls are made with short timeout intervals, so the HBINT
interval is not reached.

----------------------------------------------------------------------

Sunday, September 20, 2009

Enable/Disable tracing using a specified trace string

#!/bin/sh############################################################ #This shell script is used to enable and disable tracing of Websphere Process Server with a #chosen trace string passed as parameter to the script ###########################################################
WPS_HOME=/opt/wps6/ProcServer
WPS_USER=
WPS_PWD=
printUsage() {
echo -e "\t ######################################"
echo -e "\t ./wpsTracing.sh enable/disable "
echo -e "\t enable -> enable the trace on the server"
echo -e "\t disable -> disable the trace on the server"
echo -e "\t -> Enter tracestring as the second parameter" echo -e "\t #####################################################################"}
if [ $# -eq 0 -o $# -eq 1 ]; then printUsage; exit 1;fi
#echo "#####################################################################"#echo "Creating dynamic jacl script to "$1 "tracing on WPSNodes with trace string "$2#echo "#####################################################################"if [ $1 == "enable" ]
then
trace=true
truetraceString=$2
elsetrace=false
traceString=$2
fi
### Creating a jacl script dynamically
cat > setTrace.jacl <#### setTrace.jacl
## Automatically enable/disable tracing on the WPS Servers using a specific trace String##################################################################################################set cellName [\$AdminControl getCell]
set Node1 {WPHWPS01Node01} #Set Node1 to your nodeName
set Node2 {WPHWPS02Node01} #Set Node2 to your nodeName
set WPS01 [\$AdminConfig getid /Cell:\$cellName/Node:\$Node1/]
set WPS02 [\$AdminConfig getid /Cell:\$cellName/Node:\$Node2/]
####### FOR Node1 #################
set traceServices1 [\$AdminConfig list TraceService \$WPS01]
foreach traceService1 \$traceServices1
{\$AdminConfig modify \$traceService1 {{startupTraceSpecification $traceString}}\$AdminConfig modify \$traceService1 {{enable $trace}}
break
}

####### FOR Node2 NOW #################
set traceServices2 [\$AdminConfig list TraceService \$WPS02]
foreach traceService2 \$traceServices2
{\$AdminConfig modify \$traceService2 {{startupTraceSpecification $traceString}}\$AdminConfig modify \$traceService2 {{enable $trace}}
break
}
\$AdminConfig saveputs "Saved"
EOF
./wsadmin.sh -username $WPS_USER -password $WPS_PWD -f setTrace.jacl
echo "#####################################################################"echo "App Server Trace String Updated : "echo "#####################################################################"
# Removing intermediate JACL script after the work
rm -r setTrace.jacl

Wednesday, August 12, 2009

Jacl script to enable/disable tracing on Websphere Process Server

# Jacl Script to enable/disable server tracing with a definite trace string
# usage Windows : wsadmin.bat -username -password
# Unix :./wsadmin.sh -username -password
# Version 1.0
# Author : Vineet Chauhan
################################################################################

set cellName [$AdminControl getCell]
set nodeName [$AdminControl getNode]
set server [$AdminConfig getid /Cell:$cellName/Node:$nodeName/Server:server1/]
set traceService [$AdminConfig list TraceService $server]
puts "########## Properties of trace Service #############"
set traceServiceBefore [$AdminConfig show $traceService]
puts $tracServiceBefore
puts "##############################################"
$AdminConfig modify $traceService {{enable false}}
$AdminConfig save
puts "Saved**"
puts "+++++++++Trace Properteies after change +++++++++++"
set traceServiceAfter [$AdminConfig show $traceService]
puts $traceService
puts "+++++++++++++++++++++++++++++++++++++++++++"

# P.S - I will be posting a more refined version very soon..!!

ProcessTemplateNotFoundException: CWWBA0008E:

If you encounter these errors, the following steps would help you to resolve Business ProcessTemplate not found errors.

- Log onto BPC explorer and look for the ProcessTemplate under "My Process Template"
- That process template should be running.
-If the template is not found from BPC explorer there could be several reasons :

a. There could be no record in the PROCESS_TEMPLATE_B_T table under the BPC schema , check if you find a row for the business process.

b. If you don't find any record in the PROCESS_TEMPLATE_B_T table then it could deployment issues, BPEL code may not be deployed properly , check for errors encontered while deployment and try to redeploy the code.

c. If the problem does not get resolved even after redeployment and you don't see the Template in the BPC explorer , the problem is very odd and it occured to me just once , I dont know if it was some kind of a bug.

d. There must be a record in the PROCESS_TEMPLATE_B_T table but still you wont see the template as sometime the STATUS column in the table marks its state = 2 , it happened to me , all you need to do is to mark its status = 1 ...and refresh BPC explorer you would see the Process Template now....