Admin Alert: A Client Access Mystery Solved. . . with No-Prizes!!!
July 16, 2008 Joe Hertvik
In April, I introduced Rick, an admin with a Client Access problem. His 9406-720 AS/400 would not accept more than 70 5250 client sessions. Try as I might, I was unable to solve Rick’s issue. In desperation, I submitted the problem to my legions of readers (at least 50), offering a variation on a Marvel Comics’ No-Prize to anyone who could solve it. Here’s what happened.
Part One: The Deluge of Answers
Never in the history of Admin Alert (serving the Power i, iSeries, and AS/400 community since 2002) had I received so large a response to a single column. There were so many responses (50 in all) that it took Rick a full two months to get through all of them and discover his solution. Rick and I both want to thank everyone who wrote in with tricks, techniques, or comments. It was a great display of camaraderie and much appreciated.
Part Two: The Problem in Detail
Rick’s problem was that he had seemingly run into a limit on the number of IBM Client Access sessions that his OS/400 V4R4 machine could accept. After hitting the limit, users would open the Client Access software, start a 5250 session, and find they were unable to connect. The obvious suspect was already ruled out: Rick’s Autoconfigure virtual devices system value (QAUTOVRT) was set to *NOMAX, which allowed his machine to create an unlimited number of QPADEVxxxx devices for his 5250 session users. To make matters worse, PC5250 wasn’t even providing an informative error message to Rick’s users. It just displayed a blank connection screen without a logon box, along with a simple Line 23 message that no host connection could be made.
Times were bleak in that shop.
A Clue, and We Don’t Mean the Board Game
But there were hints. Rick started noticing that the problem would appear after about 65 to 70 users had logged on to the system. Doing some basic detective work, he started counting the number of open Telnet sessions that were on the machine each time the AS/400 stopped accepting connections. According to Rick, the number varied somewhat. Non-connecting users would sometimes appear when 62 Telnet users were on the machine; sometimes they appeared when there were 68 users.
“Clue number one was that if the number varied, then it couldn’t be a hard limit,” wrote Rick. “So I began to look for any type of program limit, maybe some program was not allowing additional jobs to start or maybe it was even changing system values. But I did notice that the problem seemed to be most prevalent around lunch time.”
Hungry for Answers…and a Bagel
Sensing that whatever was happening was specifically designed to keep him from eating his lunch, Rick took drastic action. He set the limit device session system value (QLMTDEVSSN) to one, so that each user was limited to one and only one device session. With fewer sessions on the system, the problem receded, at least for a while.
“That caused such an outcry from the users that I was almost tarred and feathered by an angry lynch mob,” Rick said. “And finally I was able to eat my lunch in peace, but only twice before [the problem] came up again.”
Hungry and driven to frustration, Rick was examining a job log associated with a hung session when he asked the session’s user how it came to be hung. The user replied that his PC had locked up and he had to reboot it. Rick asked whether this happened often, and felt a chill at the answer.
“Yea, just about every day, especially when I get back from lunch.”
Looking around, Rick found several other examples of users who washed down their lunch with a nice warm system reboot. A pattern was emerging.
Ding, Dong, the Switch is Dead
So Rick went for broke. He removed the one-device-per-user limit (by changing the QLMTDEVSSN value back to zero) and instructed the users to open as many sessions as they wanted. He counted 64 open sessions and waited for someone’s PC to freeze up. But it wasn’t just one PC that went cold. Three other PCs locked up as well.
And the culprit wasn’t his AS/400.
“It was a faulty switch losing network connectivity that froze the machines,” exclaimed Rick. “After replacing the switch and ending the users’ still active sessions, I looked at the QPADEVxxxx devices [on the machine] and found several that were varied off. BINGO.”
Rick discovered that after deleting the non-active device descriptions, his users could log on again. It turned out that the AS/400 was having a problem cleaning up older, disconnected QPADEVxxxx device descriptions. Many of the QPADEVxxxx descriptions were varied off, which prevented users from getting a sign-on screen when the device was assigned to an incoming Telnet session. When the devices were varied on, the users started getting sign-on screens again, although the users would sometimes receive pop-up messages about damaged message queues associated with the device.
“A faulty switch exaggerated the issue, but I still have users who do not log off the system before they close the Client Access software on their PCs,” said a happy, well-fed Rick. “I wrote a program to vary off unused sessions and delete the QPADEVxxxx device descriptions and message queues every night. And if need be, I can run the program during the day to clean up offending devices if I get a call about not being able to get a logon screen.”
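Rick did not share his cleanup program, so what follows is only a rough sketch of the selection logic he describes, written in Python rather than CL for illustration. It assumes you have already gathered a list of device records by some other means. The Device class, the status labels, and the cleanup_commands function are all hypothetical names invented for this example. The generated strings use the standard VRYCFG and DLTDEVD CL commands. Deleting the associated message queues is left out, because the column does not say where those queues live.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical record describing one virtual display device."""
    name: str       # for example "QPADEV0001"
    status: str     # assumed labels: "ACTIVE", "VARIED_ON", "VARIED_OFF"
    has_job: bool   # True if an interactive job is still attached


def cleanup_commands(devices):
    """Build CL command strings for unused QPADEVxxxx devices.

    Devices with an attached job, or not named QPADEV*, are left alone.
    """
    commands = []
    for dev in devices:
        if not dev.name.startswith("QPADEV"):
            continue  # only auto-created virtual devices are candidates
        if dev.has_job or dev.status == "ACTIVE":
            continue  # someone is still using this session
        if dev.status != "VARIED_OFF":
            # A device must be varied off before it can be deleted.
            commands.append(
                f"VRYCFG CFGOBJ({dev.name}) CFGTYPE(*DEV) STATUS(*OFF)"
            )
        commands.append(f"DLTDEVD DEVD({dev.name})")
    return commands


if __name__ == "__main__":
    sample = [
        Device("QPADEV0001", "VARIED_OFF", False),
        Device("QPADEV0002", "ACTIVE", True),
        Device("QPADEV0003", "VARIED_ON", False),
    ]
    for command in cleanup_commands(sample):
        print(command)
```

Because QAUTOVRT lets the system recreate virtual devices as new sessions arrive, deleting an idle description is a low-risk step in a sketch like this. Review the generated commands before running anything against a production machine.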
Like many problems, Rick’s issue had multiple causes. Over the years, his AS/400 had accumulated a number of damaged and varied-off QPADEVxxxx devices that were limiting the number of users who could sign on. In addition, his faulty network switch exacerbated the problem by knocking more user sessions off each day during a high-traffic period, when users logged on again after lunch. Once these two problems were corrected, life went back to normal, and Rick’s AS/400 was once again able to break the artificial limit of 60 to 70 users. The last time he wrote me, Rick had counted a new high of 93 5250 sessions started on his machine.
And the No-Prizes Go To. . .
Although no one suggested that a faulty piece of network equipment might be exacerbating the issue, three readers did suggest that Rick examine his QPADEVxxxx devices and vary on all varied-off devices. In alphabetical order, these stalwart Client Access detective readers were:
All three of you, please stand up and take a bow. And be sure to look for your No-Prize in next week’s email. It will be sent from my secured Swiss No-Account directly to the world’s best No-ISP, who will quickly deliver it to the No-Email address that belongs to you. To distinguish it from spam, the subject line will read either “Oil falls below $100 a barrel,” “Weight Loss,” or “The best way to meet singles in your city.”
In fact, since I’m feeling generous, I’m going to send Good Try No-Prizes to everyone who entered the contest, including the following people and people-ish names that I derived from the email addresses of the contestants:
Thanks, everyone. We’ll have to do this again sometime.