Some of the common problems that occur in this environment are:
1. MQ channel stopped
2. Queue full
3. Messages in the dead-letter queue
4. Messages in a queue and no open processes
5. Isolating MQ problems between IBM z/OS and distributed systems
MQ channel stopped
An MQ “Channel Stopped” error is easy to detect either by activating the supplied situation (alert) based on an MQ event or by querying the channel status attribute. The “Channel Stopped” event alerts only when the return code indicates that the channel has stopped in error.
This event provides an easy, low-overhead way to alert you to a stopped channel. The alert that uses the channel status attribute can be refined further: first check the depth of the XmitQ associated with the channel, and check the channel status only when there are messages in the XmitQ. This reduces the need to check every channel; you check only the channels that are truly problematic.
So, for channel-down problems you can have two basic alerts. One is based on the MQ channel-stopped event with a return code that indicates an error. The other watches for messages in the XmitQ while the associated channel is not running.
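The second alert can be sketched in a few lines of Python. This is only an illustration of the logic, not how OMEGAMON XE for Messaging implements its situations: the `Channel` class, the `channels_needing_attention` function, the status strings, and the sample channel and queue names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Channel:
    """Hypothetical snapshot of one sender channel."""
    name: str
    xmitq: str   # transmission queue served by this channel
    status: str  # assumed values such as "RUNNING" or "STOPPED"


def channels_needing_attention(channels: List[Channel],
                               queue_depths: Dict[str, int]) -> List[str]:
    """Return channels whose XmitQ holds messages while the channel is not running."""
    flagged = []
    for ch in channels:
        depth = queue_depths.get(ch.xmitq, 0)
        if depth == 0:
            continue  # nothing is waiting, so the channel status is not checked
        if ch.status != "RUNNING":
            flagged.append(ch.name)
    return flagged


# Sample data, invented for the example.
channels = [
    Channel("QM1.TO.QM2", "QM2.XMITQ", "STOPPED"),
    Channel("QM1.TO.QM3", "QM3.XMITQ", "RUNNING"),
    Channel("QM1.TO.QM4", "QM4.XMITQ", "STOPPED"),
]
depths = {"QM2.XMITQ": 42, "QM3.XMITQ": 7, "QM4.XMITQ": 0}

print(channels_needing_attention(channels, depths))
```

In this sample, QM1.TO.QM4 is stopped but its XmitQ is empty, so it is not flagged; only the channel with messages waiting behind it raises an alert.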
Queue full
As with stopped channels, there are two ways to alert on a “Queue Full” condition. The first is to use the Queue Full MQ event, which is a relatively easy way to detect the problem. However, to truly track the state of the queue, you must monitor its actual depth. MQ events alert you when a queue depth crosses the high-depth threshold and when the queue is full, but these events are not resolved until the queue depth drops below the queue low threshold. So, a queue could fill up and then, as the problem is being rectified (maybe by starting the process that reads from that queue), settle into a steady state of half full, with messages going on and off the queue, without any event being generated to tell you that the queue is no longer full. You should therefore set up another situation that monitors the actual depth of the queue. Then, when the first message comes off the queue, that alert is closed automatically, indicating that the queue is no longer full.
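The difference between the two approaches is easier to see in a small simulation. The following Python sketch is a simplified model, not MQ or OMEGAMON code: the function names, the sampled depths, and the threshold values are assumptions chosen for illustration.

```python
from typing import List


def event_alert_states(depths: List[int], max_depth: int,
                       low_threshold: int) -> List[bool]:
    """Model an event-driven alert: it opens when the queue is full and
    closes only once the depth drops below the low threshold."""
    alert_open = False
    states = []
    for depth in depths:
        if depth >= max_depth:
            alert_open = True
        elif depth < low_threshold:
            alert_open = False
        states.append(alert_open)
    return states


def depth_alert_states(depths: List[int], max_depth: int) -> List[bool]:
    """Model a depth-driven alert: it is open only while the queue is actually full."""
    return [depth >= max_depth for depth in depths]


# Invented samples: the queue fills, then settles around half full, then drains.
samples = [80, 100, 100, 55, 50, 45, 10]

print(event_alert_states(samples, max_depth=100, low_threshold=20))
print(depth_alert_states(samples, max_depth=100))
```

With these samples, the event-driven alert stays open through the half-full period and closes only at the last sample, while the depth-driven alert closes as soon as the depth falls below the maximum.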
Messages in the dead-letter queue
Most, if not all, queue managers have a designated dead-letter queue (DLQ). The DLQ prevents MQ from bringing a channel down because of an undeliverable message. However, you must have a situation in place to alert you when any message arrives on the DLQ. One of the many enhanced metrics in OMEGAMON XE for Messaging is the translation of the reason code that explains why the message was put on the DLQ. This identifies the root cause and reduces the mean time to repair (MTTR) of the problem. Once the problem is resolved, use OMEGAMON XE for Messaging to delete the message or to forward it to its original destination or to some other queue.
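To show what translating a reason code means in practice, here is a minimal Python sketch. The numeric codes and symbolic names are standard MQ reason codes, but the table is only a small subset, and the `describe_dlq_reason` function and its hint texts are assumptions written for this example; they are not the product's own output.

```python
# A small subset of MQ reason codes that commonly appear in dead-letter headers.
MQRC_NAMES = {
    2030: "MQRC_MSG_TOO_BIG_FOR_Q",
    2035: "MQRC_NOT_AUTHORIZED",
    2051: "MQRC_PUT_INHIBITED",
    2053: "MQRC_Q_FULL",
    2085: "MQRC_UNKNOWN_OBJECT_NAME",
    2087: "MQRC_UNKNOWN_REMOTE_Q_MGR",
}

# Illustrative hints only; wording is invented for this sketch.
HINTS = {
    2030: "message exceeds the maximum message length of the target queue",
    2035: "the putting user is not authorized for the target queue",
    2051: "puts are inhibited on the target queue",
    2053: "the target queue is full",
    2085: "the target queue does not exist on this queue manager",
    2087: "the remote queue manager name cannot be resolved",
}


def describe_dlq_reason(code: int) -> str:
    """Turn a numeric DLQ reason code into a readable description."""
    name = MQRC_NAMES.get(code)
    if name is None:
        return f"{code}: unrecognized reason code"
    return f"{code} {name}: {HINTS[code]}"


print(describe_dlq_reason(2053))
print(describe_dlq_reason(9999))
```

A readable reason such as "the target queue is full" points directly at the fix, which is what shortens the time to repair.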
Messages in a queue and no open processes
If there are messages in a queue and no processes have the queue open for input or output, the condition is worth at least a warning alert. It can indicate a problem unless the queue is expected to hold messages for some future processing window. It is simple enough to set up alerts in OMEGAMON XE for Messaging based on these requirements. The alert can be refined further so that it does not trigger during time frames when you expect the queue not to be opened.
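The condition and its refinement can be expressed as a short Python sketch. The `should_warn` function, its parameter names, and the sample quiet window are assumptions made for illustration, and the sketch assumes that quiet windows do not cross midnight.

```python
from datetime import time
from typing import List, Tuple


def should_warn(depth: int, open_input: int, open_output: int,
                now: time, quiet_windows: List[Tuple[time, time]]) -> bool:
    """Warn when messages are waiting and nothing has the queue open,
    unless the current time falls inside an expected quiet window."""
    if depth == 0:
        return False  # no messages waiting
    if open_input > 0 or open_output > 0:
        return False  # some process has the queue open
    for start, end in quiet_windows:
        # Assumes start < end, that is, the window does not cross midnight.
        if start <= now < end:
            return False  # the queue is expected to sit unopened right now
    return True


# Hypothetical window: messages are held overnight for a batch run at 05:00.
windows = [(time(1, 0), time(5, 0))]

print(should_warn(10, 0, 0, time(14, 0), windows))
print(should_warn(10, 0, 0, time(2, 30), windows))
```

The first call falls outside the quiet window and raises the warning; the second falls inside it and is suppressed.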
Isolating MQ problems between IBM z/OS and distributed systems
One of the key reasons that most companies make the decision to use IBM MQ is the ability to easily connect applications across disparate systems with a common application programming interface (API). However, these same companies continue to manage MQ as separate entities in the distributed and z/OS environments. To correctly diagnose MQ issues, you must be able to manage MQ with a common set of tools across all architectures. OMEGAMON XE for Messaging normalizes MQ metrics across platforms so that one person or team can easily manage MQ from a single vantage point. You can manage channels, queues or other objects in the same way, wherever they exist, which results in a more effective level-one analysis. It no longer requires two or more distinct teams to perform level-one diagnostics and management of the MQ environment. A platform specialist is required only for underlying platform-specific issues.