
Wednesday, January 5, 2005

Entropic Storm: Palm/Handspring Treo "Radio Reset" and HotSync BSOD mitigation

The traditional end-of-year entropic storm was somewhat later than usual arriving at Fourmilab, but seems to be building up to its customary gale force intensity.

For the last few years, I've used Handspring (now acquired by Palm) Treo PalmOS organisers integrated with mobile phones--packing this gizmo lets me make and receive voice calls, send and receive SMS text messages, browse the Web, read and answer E-mail, run Palm applications, and read books from my pocket library wherever I happen to be.

These devices are superbly designed from the user interface standpoint, but the hardware doesn't always meet the same standard. My original Treo 180 died a month or so after it went out of warranty due to an open circuit in the flex cable which connects the speaker in the flip cover to the main circuit board. The first Treo 270 I bought to replace it was dead on arrival--not even the charge LED illuminated when it was connected to the mains adaptor and plugged in. The replacement I received upon sending that one back was also apparently dead when received but at least the charge light worked and I discovered that a hard reset (not a problem, since I hadn't the opportunity to load any data onto it) sufficed to resuscitate it.

Everything's gone more or less OK since then, until yesterday when the Treo became "sticky"--switching between screens, which usually takes a fraction of a second, took up to a minute, and attempting to turn wireless mode off and back on resulted in a hang in the "Network Search..." mode before the search dialogue box displayed.

As it turns out, this is a known problem, and there's a Palm application to "fix" it, Radio Reset. Apparently, the GSM radio subsystem in the Treo can hang, and needs to be "reset" in order to put things back on track. This reset is accomplished by the elegant expedient of completely discharging the battery until everything dies (which causes loss of all PDA data just like a hard reset), then recharging and powering back on, after which one hopes the radio will have recovered from its funk. All the Radio Reset application does is circumvent the automatic power-off in order to run down the battery, which takes about two hours if it's fully charged to begin with.

Now, allowing the battery to run down to the last gasping electron means you're going to lose all the memory on the Treo, but that's nothing to worry about if you use BackupBuddy, which will restore all your applications and settings to the status quo ante the last HotSync. Before running Radio Reset, I wanted to HotSync to make sure I'd backed up any changes since the last time, whereupon I discovered that HotSync had ceased to work--HotSync Manager was running and configured as usual, but the HotSync never connected and timed out. After the usual flailing around (killing and restarting HotSync Manager, rebooting Windows XP, re-installing Palm Desktop, etc., etc.), just when I was about to conclude the HotSync failure was related to the radio hang-up in some bizarre fashion, I happened to plug a backup HotSync cable (from the dead Treo 180) I was experimenting with into a four port USB hub instead of the direct USB connection I usually use, and when I pressed the HotSync button, hey presto, Windows XP "detected new hardware", went through the usual rigamarole installing the Palm USB driver, whereupon HotSync worked fine. Naturally, I then tried moving the USB cable back to the original USB port--no go--beats me; the hub port continued to work correctly.

With the PDA files backed up, I then went ahead and launched Radio Reset to run down the battery and went off to do other things while that thrilling process was underway. Once the PDA was well and truly black-screen dead, I put it into the cradle and allowed it to fully recharge, which took another couple of hours. Then, after going through the full resurrection reset process (calibrate the touch screen, choose the language, find the "Z" key on the keyboard, etc.), I turned on the GSM radio, which promptly found the Swisscom SWISS GSM network I use and connected, indicating GPRS service available.

Now it was simply a matter of restoring all the data on the PDA. Booting back into Windows XP (since I'd been running Linux in the interest of doing productive work while the battery discharged and recharged), and with the HotSync cable still plugged into the USB hub, I pressed the button, and HotSync began to restore all the data onto the handheld . . . for about a minute . . . after which Windows XP crashed with a Blue Screen of Death (BSOD) fingering the Palm USB driver as the culprit and requiring a full power down (the "three finger salute", CTRL-ALT-DEL, was ineffective, and judging from the fan on the Dell Inspiron 9100, the CPU was in a compute loop).

After rebooting, I tried the HotSync again--BSOD--reboot--and again: same thing. The crash didn't always happen at the same point in the HotSync, but it never got more than five minutes into the recovery before going blooie. I tried disabling BackupBuddy (which shouldn't be in the loop at this point in a HotSync recovery, but you never know), and the crash persisted.

At this point I moseyed over to Google in search of lore regarding HotSync BSODs, and eventually came across this flabbergasting PalmOne support document which reveals that HotSync Manager is incompatible with Intel CPUs which employ Hyper-Threading technology and with symmetric multi-processor systems in general! The only way to reliably run HotSync on such systems is to launch HotSync Manager, then pop up the Windows Task Manager (appropriately, with CTRL-ALT-DEL!), select the "Processes" tab, right click the HOTSYNC.EXE item and select "Set Affinity" whereupon a dialogue box appears which allows you to restrict the process to a subset of CPUs. Check just one CPU (I chose logical CPU 0), which will force HotSync Manager to run only on that CPU, after which the HotSync will run to completion, sans BSOD.

Setting the Processor Affinity ("CPU dedication" to EXEC-8 old-timers) does not persist across multiple program executions--it affects only the running process. If you kill and restart HotSync Manager or reboot, you'll need to reset the Processor Affinity every time. Note that this is terribly inconvenient if you've configured HotSync Manager to run only when Palm Desktop is running, since each time you launch the desktop application, you need to reset the processor affinity for HOTSYNC.EXE. I am unaware of any way to set the processor affinity for an executable file so it always runs on a given CPU, or to specify affinity on the command line, which would permit invoking HotSync Manager with a shortcut with such a specification.
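One avenue I haven't tried: the Win32 API has a SetProcessAffinityMask call, so a small launcher program could, in principle, start HotSync Manager and immediately pin it to one CPU. What follows is only a sketch under assumptions, not a tested fix: it assumes Python with ctypes is available on the Windows machine, the function names (affinity_mask, pin_process, launch_pinned) are made up for illustration, and the install path in the comment is hypothetical. The affinity mask is just a bit field with bit n set for logical CPU n, so "CPU 0 only" is the value 1.

```python
import ctypes
import subprocess
import sys

# Win32 process access rights needed by SetProcessAffinityMask.
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_SET_INFORMATION = 0x0200


def affinity_mask(cpus):
    """Return a bitmask with bit n set for each logical CPU n in cpus."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return mask


def pin_process(pid, cpus):
    """Restrict an already-running process to the given CPUs (Windows only)."""
    if sys.platform != "win32":
        raise OSError("SetProcessAffinityMask is a Win32 call")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    kernel32.OpenProcess.restype = ctypes.c_void_p
    kernel32.SetProcessAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = ctypes.c_int
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    access = PROCESS_QUERY_INFORMATION | PROCESS_SET_INFORMATION
    handle = kernel32.OpenProcess(access, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        if not kernel32.SetProcessAffinityMask(handle, affinity_mask(cpus)):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        kernel32.CloseHandle(handle)


def launch_pinned(exe_path, cpus=(0,)):
    """Start a program, then pin it to the given CPUs (default: CPU 0 only)."""
    proc = subprocess.Popen([exe_path])
    pin_process(proc.pid, cpus)
    return proc


# Hypothetical usage; the path is an assumption, adjust it to your install:
# launch_pinned(r"C:\Program Files\Palm\HOTSYNC.EXE")
```

Note that the process runs unpinned for a brief moment between being started and being pinned. That should not matter here, since the crash only occurs during an actual HotSync, but a more careful launcher would create the process suspended, set the affinity, and then resume it.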

As somebody who's been writing multi-threaded programs for multiprocessor systems for more than thirty years, I gotta tell you that screwing up something as simple-minded as a USB kernel driver so it doesn't work on multiple CPU systems is something of an accomplishment. Further, here is Palm, pioneer and erstwhile leader in the handheld computing market, shipping a product to all of their customers which crashes the operating system with the largest global market share, when performing a routine operation on top-end Intel microprocessor (Pentium 4 with Hyper-Threading) systems. And rather than rushing out a patch for this horrific problem, they bury the information in a support library document, offering only the lamest, most user-hostile, and most inconvenient of work-arounds.

If you have a Palm and have been HotSyncing it to a Windows XP system with Hyper-Threading or multiple CPUs without crashes, you've been lucky so far. Many routine HotSyncs complete quickly enough to dodge the bullet, but a full backup or restore is almost certain to provoke the problem. (I'd suffered this crash a couple of times before, but since it didn't repeat, I assumed it was "just one of those things".)

And HotSync still only works with the cable plugged into the USB hub.

Posted at January 5, 2005 00:34