Thermal runaway/process overload

K9DJN
Posts: 38
Joined: Mon Dec 24, 2018 6:25 pm

Thermal runaway/process overload

Post by K9DJN »

Hello,

I have a total of four hotspots, each one dedicated to a single digital mode. The one for YSF showed a huge processor load today after a power outage at my home. I mean the load numbers were in the 1.0 to 1.4 range! It uses a Chinese JumboSpot clone that has worked well until now, but clearly that level of processor usage created a thermal overload. I set the startup host to none, and the load came right down. But once I tried to connect to a room, it shot right back up again. I do have heatsinks on all the chips, and the board is on an open-frame tower, so there is plenty of air circulating around it. When I shut it down, I felt the processor heatsink and it was quite warm. All of my hotspots are running the latest Pi-Star, 4.1.1, and none of the others are doing this.

Any idea what would be eating so many processor cycles?

Doug
K9DJN
KN2TOD
Posts: 264
Joined: Sun Nov 11, 2018 6:36 pm

Re: Thermal runaway/process overload

Post by KN2TOD »

Don't know if it's related, but: I have two 24/7 hotspots, both Pi 3B+ boards with ZUMspots, DMR only, one running 4.1.1, the other 4.1.0. After 4 days, the 4.1.1 hotspot was getting sluggish, running roughly 12x the one-minute processor load of the other (2.24/1.84/1.84 versus 0.19/0.5/0.49, per the dashboards), but temps were within a degree of each other. A reboot seems to have returned it to normal, at least for the moment.
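
For anyone comparing dashboard load numbers like the ones above, it may help to scale them by the number of CPU cores. The following is only a rough sketch: the function names are made up for illustration, and the four-core default assumes a Pi 3 class board.

```python
import os


def parse_dashboard_load(text):
    """Turn a dashboard-style string such as '2.24/1.84/1.84' into three floats."""
    return tuple(float(part) for part in text.split("/"))


def load_per_core(load_averages, cores=4):
    """Divide the 1/5/15-minute load averages by the core count.

    cores=4 is an assumption (Pi 3 B / 3 B+); pass os.cpu_count() on the
    hotspot itself to use the real value.
    """
    return tuple(round(value / cores, 2) for value in load_averages)


busy = load_per_core(parse_dashboard_load("2.24/1.84/1.84"))
idle = load_per_core(parse_dashboard_load("0.19/0.5/0.49"))
print(busy, idle)
```

On the hotspot, `os.getloadavg()` returns the same three numbers directly. On a four-core board, a load near 1.0 is roughly what a single process pinned at 100% of one core would produce, although Linux load averages also count tasks waiting on I/O, so treat this as a hint rather than proof.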
AF6VN
Posts: 821
Joined: Fri Jul 20, 2018 1:15 am

Re: Thermal runaway/process overload

Post by AF6VN »

Pi-Star normally runs with the SD card mounted read-only (R/O), but in my experience it often gets stuck in read-write (R/W) after an overnight update.

Combine that with a power outage (an uncontrolled shutdown) and you could have a corrupt SD card.
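
One way to check whether the root filesystem is currently read-only or read-write is to look at `/proc/mounts`. This is a minimal sketch, not anything Pi-Star ships; the function names are hypothetical and the sample line in the comment is only illustrative.

```python
def mount_mode(mounts_text, mount_point="/"):
    """Return 'ro' or 'rw' for mount_point, or None if it is not listed.

    mounts_text is the content of /proc/mounts, one mount per line:
    device, mount point, filesystem type, options, dump, pass.
    Example line (illustrative): /dev/root / ext4 ro,noatime 0 0
    """
    mode = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = fields[3].split(",")
        if "ro" in options:
            mode = "ro"
        elif "rw" in options:
            mode = "rw"
    # The last matching line wins, since a later mount shadows an earlier one.
    return mode


def root_mount_mode(path="/proc/mounts"):
    """Read the live mount table and report the mode of the root filesystem."""
    with open(path) as handle:
        return mount_mode(handle.read())
```

Calling `root_mount_mode()` on the hotspot should return `'ro'` when the card is in its normal read-only state. A result of `'rw'` long after an update has finished would fit the stuck-in-R/W situation described above.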

--
AF6VN
Dennis L Bieber
K9DJN
Posts: 38
Joined: Mon Dec 24, 2018 6:25 pm

Re: Thermal runaway/process overload

Post by K9DJN »

I did use the top command and watched it for a while. No obvious hogs. So I just shut it down overnight and let it cool off. The background on the temperature indicator had changed to orange, so I figured this was the best course.
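
If you want the actual temperature rather than the dashboard colour, Linux on the Pi exposes the SoC temperature in millidegrees Celsius through sysfs. A small sketch, assuming the usual `thermal_zone0` path; the function names are just for illustration.

```python
from pathlib import Path

# Usual location on Raspberry Pi OS based images; assumed here, adjust if yours differs.
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")


def millidegrees_to_celsius(raw):
    """Convert the raw sysfs text (for example '48312\n') to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_soc_temperature(path=THERMAL_ZONE):
    """Read the current SoC temperature in degrees Celsius from sysfs."""
    return millidegrees_to_celsius(Path(path).read_text())
```

Running `read_soc_temperature()` every minute or so while the load is high would show whether the temperature really tracks the processor usage.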

Interestingly, though, this is a Pi that I use exclusively for YSF. When I turned it on again today and went to the admin screen to check temps, I saw there is now a room selector (YSF Link Manager) similar to what there is on D-Star and the Brandmeister API on DMR. So maybe it was hung up in that update, and just needed a solid reboot to get it. I wish Andy would add a Link Manager to NXDN as well.

All seems well now with this particular Pi. Thanks for all the help.
KN2TOD
Posts: 264
Joined: Sun Nov 11, 2018 6:36 pm

Re: Thermal runaway/process overload

Post by KN2TOD »

Curious: for the hotspot with the thermal runaway/process overload, does it happen to be hard-wired to your network?
K9DJN
Posts: 38
Joined: Mon Dec 24, 2018 6:25 pm

Re: Thermal runaway/process overload

Post by K9DJN »

It's a Pi 3 Model B. Nothing exotic.

Now I see that Andy did update the NXDN dashboard to have a Link Manager drop down. This is awesome!
K9DJN
Posts: 38
Joined: Mon Dec 24, 2018 6:25 pm

Re: Thermal runaway/process overload

Post by K9DJN »

KN2TOD wrote: Thu May 07, 2020 12:49 am Curious: for the hotspot with the thermal runaway/process overload, does it happen to be hard-wired to your network?
Yes, all of my hotspots are hardwired. But I can repeat this behaviour almost anytime this particular Pi is in use, and the top command always shows the YSF Gateway process using 100% of the CPU. So it's definitely a bug in that program from what I can see. All I have to do to resolve it, at least temporarily, is to shut off the YSF mode completely and then re-start it later. If I leave it on long enough, the high CPU usage comes back.
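
To capture which process is hogging the CPU without sitting in front of top, something along these lines could be run from a cron job or a loop. It is only a sketch: the function names are made up, and the process names in the sample comment are illustrative, not taken from the hotspot.

```python
import subprocess


def busiest_processes(ps_text, limit=3):
    """Parse `ps -eo pcpu,pid,comm` output into (cpu%, pid, command) rows.

    Rows are sorted by CPU percentage, highest first. Sample input
    (illustrative values only):
        %CPU   PID COMMAND
        99.8   455 YSFGateway
    """
    rows = []
    for line in ps_text.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        try:
            rows.append((float(parts[0]), int(parts[1]), parts[2].strip()))
        except ValueError:
            continue  # ignore anything that does not look like a process row
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


def snapshot(limit=3):
    """Run ps once and return the current top CPU consumers."""
    result = subprocess.run(
        ["ps", "-eo", "pcpu,pid,comm"],
        capture_output=True, text=True, check=True,
    )
    return busiest_processes(result.stdout, limit)
```

Note that ps reports CPU use averaged over the life of each process, so the figures will not match top exactly, but a process stuck at 100% for hours should still float to the top of `snapshot()`.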
K9DJN
Posts: 38
Joined: Mon Dec 24, 2018 6:25 pm

Re: Thermal runaway/process overload

Post by K9DJN »

Final post on this will be the screenshot of the YSF Gateway program eating 101% of the CPU. There is clearly something wrong with this process if left running for a long time. Even a solid reboot won't fix it. Only a shutdown for several hours seems to clear it up.
Attachments
Screenshot from 2020-07-30 20-19-26.jpg