How to detect when a remote Rpi freezes and reboot it, using a watchdog timer or otherwise
I'm building a system with a Raspberry Pi located in a very remote area and connected to the internet with an internet stick. The tests are promising so far, but the Pi freezes every now and then, and when it does I can no longer connect to it. Because I don't want to make a 2-hour drive every time it freezes, I want to build a redundant system in which a second Pi checks on the first.
In the worst case, the working Pi should cut power to the frozen one to force a reboot.
Now the question, from a total noob when it comes to building electronics:
I checked out the ATXRaspi R3, but I'm not sure how the other Pi could "digitally" trigger the 6-second button press that makes that power controller cut the power...
What would be the easiest way for one Pi to cut power to another? Any hints are very welcome.
hardware power-supply
1
Not sure anyone is going to design this circuit for you. But one additional thing to consider: Whatever causes the first Pi to freeze might have a common failure mode to the second Pi. For example, if it's freezing because of a power fluctuation, you might end up with two frozen Pis instead of the independent redundancy that you want. Might be worth trying to understand why that first Pi freezes first.
– Brick
Jun 14 at 12:33
1
How quickly do you need the pi to come back online? A simple holiday light timer could cycle the power every X hours, as long as you don't mind waiting until the reset interval to have it back online again.
– Tim
Jun 14 at 17:30
@Jurudocs, I followed @berto's watchdog timer tutorial and found everything good. I don't quite understand what the watchdog is doing, but I am 90% sure that the watchdog timer method should solve your problem, and it is much cleaner than my proposed hardware solution.
– tlfong01
Jun 17 at 6:14
edited 1 hour ago by tlfong01
asked Jun 14 at 7:43 by Jurudocs
5 Answers
Before you go looking into additional hardware, please read up on what's called a "watchdog timer". The Raspberry Pi has a hardware watchdog built in that will power cycle it if the chip is not refreshed within a certain interval.
I have set up the watchdog on a Raspberry Pi 3 with a newish version of Raspbian, and it took very little configuration. The first thing to check is that the hardware watchdog is available (I checked my system and it looks like the version of Raspbian I have installed compiles watchdog support right into the kernel; no need to load a kernel module):
pi@unicornpi:~ $ ls -al /dev/watchdog*
crw------- 1 root root 10, 130 Nov 3 2016 /dev/watchdog
crw------- 1 root root 252, 0 Nov 3 2016 /dev/watchdog0
If you see /dev/watchdog, you're all set. All you have to do is configure the watchdog facility built into systemd.
In the file /etc/systemd/system.conf, set the following lines:
pi@unicornpi:~ $ grep Watchdog /etc/systemd/system.conf
RuntimeWatchdogSec=10
ShutdownWatchdogSec=10min
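What the lines above say is:
RuntimeWatchdogSec=10: arm the hardware watchdog with a 10-second timeout. systemd refreshes it at least once every half interval (every 5 seconds); if the refreshes stop for 10 seconds, for example because the system has frozen, the hardware resets the system.
ShutdownWatchdogSec=10min: on shutdown or reboot, if the system takes more than 10 minutes to go down, the hardware resets the system.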
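The two settings above can also be applied from a script rather than by hand. What follows is a minimal sketch, not part of the original answer: the helper name set_key is hypothetical, and it assumes each key appears at most once in the file, either active or commented out with a leading "#", as is usual in a stock system.conf. The demo runs against a scratch file so nothing on the system is touched.

```shell
#!/usr/bin/env bash
# set_key FILE KEY VALUE (hypothetical helper)
# Sets KEY=VALUE in a systemd-style config file. An existing line for KEY,
# whether active or commented out with "#", is replaced; otherwise the
# setting is appended to the end of the file.
set_key() {
    local file="$1" key="$2" value="$3" tmp
    tmp=$(mktemp)
    if grep -Eq "^#?[[:space:]]*${key}=" "$file"; then
        sed -E "s|^#?[[:space:]]*${key}=.*|${key}=${value}|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Write back through the original path so ownership and mode are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Demo on a scratch copy that mimics the relevant part of system.conf.
demo_conf=$(mktemp)
printf '[Manager]\n#RuntimeWatchdogSec=0\n#ShutdownWatchdogSec=10min\n' > "$demo_conf"
set_key "$demo_conf" RuntimeWatchdogSec 10
set_key "$demo_conf" ShutdownWatchdogSec 10min
grep Watchdog "$demo_conf"
rm -f "$demo_conf"
```

On a real Pi the same two set_key calls would be pointed at /etc/systemd/system.conf and run as root, followed by a reboot so that systemd picks up the new values.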
Once you have this configured and have rebooted, you will see something like this in the dmesg logs:
pi@orangepi:~ $ dmesg | grep -i watchdog
[ 0.763148] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
[ 1.997557] systemd[1]: Hardware watchdog 'Broadcom BCM2835 Watchdog timer', version 0
[ 2.000728] systemd[1]: Set hardware watchdog to 10s.
If you see "Set hardware watchdog to 10s", you're all set.
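For a remote machine it can be handy to make this check scriptable. The sketch below is an illustration only: the function name watchdog_timeout_from_log is hypothetical, and it assumes the log line has exactly the wording shown above ("Set hardware watchdog to 10s."); other systemd versions may word it differently. The demo feeds it the sample lines from this answer instead of a live kernel log.

```shell
#!/usr/bin/env bash
# watchdog_timeout_from_log (hypothetical helper)
# Reads kernel log text on stdin and prints the timeout from the first
# "Set hardware watchdog to <N>." line, or nothing if no such line exists.
watchdog_timeout_from_log() {
    sed -n -E 's/.*Set hardware watchdog to ([0-9]+[a-z]*)\..*/\1/p' | head -n 1
}

# Demo using the sample log lines quoted above.
sample_log='[ 0.763148] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
[ 2.000728] systemd[1]: Set hardware watchdog to 10s.'

wd_timeout=$(printf '%s\n' "$sample_log" | watchdog_timeout_from_log)
if [ -n "$wd_timeout" ]; then
    echo "watchdog armed, timeout $wd_timeout"
else
    echo "watchdog not armed"
fi
```

On the Pi itself the input would come from dmesg, as in: dmesg | watchdog_timeout_from_log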
The best way I've found to verify that the watchdog works is to overload the system. I've done this with a "fork bomb", which will completely saturate the system with garbage process forks. If you run this the Pi will become unresponsive and the watchdog should kick in. Your system should be up and running again after about a minute:
:(){ :|:& };:
Paste that into a shell and your system will be taken down. You've been warned.
More info on the watchdog system built into Systemd is on the author's website.
Many thanks for the advice. I have heard of the watchdog for a long time but never tried it because I had no need, until now, when I am building a smart rooftop garden away from home (actually 50 feet above home). Another reason I did not try it is that the tutorials are not newbie friendly. When I started with the Rpi1 years ago, I found terminal commands very scary (it took me more than three hours to download a zip (a tar, actually) and extract it, and then I did not know where to find the extracted files!). Now I find terminal commands not that scary, and sometimes very efficient, though I still love Win PowerShell terminal commands, ...
– tlfong01
Jun 15 at 4:03
And the advice at the beginning of your answer, to first read up on what a watchdog is, is very good. I did not know that "watchdog" is actually short for "watchdog TIMER". This is important because if I know beforehand that it is a timer, I can understand things better. As usual, I started with Wiki, which is always a good read for newbies. Now I know that the watchdog is actually some sort of hardware sitting alongside the Rpi, so even if the Rpi messes things up, the outside guy can come to the rescue (or "kick in"?). Reading Wiki let me know that "kick in" is not slang but a technical term.
– tlfong01
Jun 15 at 5:10
I also didn't know what a "daemon" is. When I was a child, I read in the Bible that a daemon is a bad guy, so righteous programmers like me should not use daemons, otherwise I might go to Hell. But then Wiki told me how the MIT/UNIX guys coined the name and why it is spelled "daemon", not "demon". It also clarifies that daemons can be good, and that even the righteous Socrates had a daemon. Anyway, I have finished reading the Wikis and am now ready to start your tutorial. :)
– tlfong01
Jun 15 at 5:16
So I have followed your very detailed watchdog tutorial and found everything OK to the point of setting the watchdog to 10 seconds. Next step is to try a fork bomb, perhaps late this evening or tomorrow.
– tlfong01
Jun 15 at 9:17
Thank you for suggesting to call it a “watchdog timer”. I’ve made the edit 👍🏽
– berto
Jun 15 at 13:17
Cutting power is a brute force method and has risks.
The conventional solution to lock-up problems is to use a watchdog.
There is a BCM hardware watchdog. If you want to start the hardware watchdog, include dtparam=watchdog=on in /boot/config.txt.
In and of itself this does little, although it should restart the system if not "kicked" regularly. You can write code which opens /dev/watchdog to kick it off.
There is also a watchdog daemon which you can configure to activate the watchdog; you should be able to start with sudo systemctl enable watchdog.
PS Incidentally, if you want to pursue the brute force approach - don't bother cutting power - just pull the Reset pin (labeled RUN) low. This is equivalent to powering off then on again.
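The "open /dev/watchdog and kick it" idea mentioned above can be sketched in a few lines of shell. This is an illustration under stated assumptions, not a tested production script: the function name kick_watchdog and the health check are hypothetical, the real device needs root, and as far as I know only one process can hold /dev/watchdog open at a time, so this would not be combined with the systemd or daemon approaches. The demo uses a scratch file in place of the device so it is safe to run anywhere.

```shell
#!/usr/bin/env bash
# kick_watchdog DEVICE INTERVAL HEALTH_CMD [ARGS...] (hypothetical helper)
# Opens DEVICE once, then writes one byte every INTERVAL seconds for as long
# as HEALTH_CMD succeeds. When HEALTH_CMD fails, the loop stops kicking and
# the function returns 1; with a real watchdog device the hardware would
# then reset the board once its timeout expires.
kick_watchdog() {
    local device="$1" interval="$2"
    shift 2
    {
        while "$@"; do
            printf '.' >&3      # any write counts as a kick
            sleep "$interval"
        done
    } 3> "$device"
    return 1
}

# Demo: a scratch file stands in for /dev/watchdog, and the (hypothetical)
# health check succeeds three times before reporting a failure.
demo_dev=$(mktemp)
demo_left=3
demo_healthy() {
    [ "$demo_left" -gt 0 ] || return 1
    demo_left=$((demo_left - 1))
}
kick_watchdog "$demo_dev" 0 demo_healthy || echo "health check failed, stopped kicking"
echo "kicks written: $(wc -c < "$demo_dev" | tr -d ' ')"
rm -f "$demo_dev"
```

On a Pi the call might look like kick_watchdog /dev/watchdog 5 some_health_check, where some_health_check is whatever test defines "alive" for the application, for example that the modem link is still up.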
Question
Remote Rpi's freeze from time to time. How to wake them up?
Answer
Update 2019jul27hkt1406
I recently upgraded from my Rpi3B+ with stretch to an Rpi4B with buster, and again I followed @berto's tutorial to set up the watchdog timer. I found that everything works as smoothly as before. In other words, no changes need to be made to @berto's tutorial when upgrading to the Rpi4.
Last time I knew nothing about the watchdog timer, so it took me more than 3 hours of googling to understand everything inside out (well, almost inside out). This time I knew what was going on, and all the Linux tricks, so it took me only a couple of minutes to complete @berto's tutorial.
2019jun18 Updates
After more thought, I have concluded that my answer is coming to an end.
My conclusion is that @berto's watchdog tutorial and suggested experiment are good, and that his answer is the real answer to the OP's question.
I carried out his suggested experiment successfully and verified the result with the fork bomb program. After more than 10 hours of googling and reading, I think I finally understand the idea of the watchdog timer thoroughly.
Earlier I wrongly thought that I still needed to learn how to set the timer to 10 seconds or more. But as @berto says, 10 seconds is all that needs to be set. I also read that I can set the timer to as long as 16 seconds, and that the Linux watchdog default is even one minute. But that is not critical.
I have removed all the long-winded reading notes in the appendices to make the answer shorter. I would suggest that newbies not try to understand all the details of the watchdog, not to mention the much more complicated systemd daemon, because life is short and those system things are too complicated for non-professionals.
I would like to add two points to end my answer.
(1) There are many reasons for an Rpi to hang after a couple of days (but usually not months). Often it is not the application program's fault; rather, drivers or library functions create too much garbage, e.g. sockets that are created and used but not properly disposed of. If the application program itself makes the garbage, the program can do "garbage collection" and the problem is solved. But it is hard to remove garbage sockets that were not generated by the application program, so a watchdog timer is useful here.
(2) Other ways to keep garbage from using up resources include rebooting every now and then, by software or hardware. I think rebooting every morning, and also using a software-switchable power supply to reset the system, adds another layer of protection. Using only one Rpi is not very safe; using two Rpi's as each other's watchdog (using UART for message passing, e.g.) adds one more layer of protection. Another method I have not explored is using ESP8266 WiFi sockets. I hope I can try that later.
This is the end of my answer. Cheers.
2019jun17 Updates
So I tried the fork bomb. The system rebooted after executing the program, in about 15 seconds.
2019jun16 Updates
I found @berto's fork bomb program a bit newbie scary, so I am learning Bash to find out what that fork bomb is doing. Basically it is just a function named ":", defined as a function that calls itself twice, thus forking indefinitely, multiplying exponentially like rabbits, using up all the resources, and crashing Linux.
I have also found the following interesting version of forkbomb using Unicode symbols:
💣(){ 💣|💣& };💣
2019jun14/15 Updates
@thesnow suggests a very nice layered approach using a smart plug. I think the smart plug, or smart IoT stuff, is the way to go. However, I am a not-so-smart newbie in smart stuff, though I am keen to learn. So I am going to buy a smart plug, do some research, and improve my answer afterwards. For now, I have added some related learning resources in the reference section below.
I found @berto's suggestion of using the Rpi's hardware watchdog timer also very good. I have not played with any watchdog stuff before, so I am going to try it now. @berto's instructions are very detailed, but still a bit hard for me, because I don't know the meaning of the commands "grep" and "dmesg" very well. So I googled and made some reading notes in the appendices below. Then I followed @berto's suggestion and struggled a bit to complete part 1. I have not yet rebooted, because I need to take a break to digest things. Anyway, here is the screen capture.
I rebooted and got the following dmesg:
I think I am going too fast and now need to take a break to first study more linux things, like systemd, before coming back to carry on the test on watchdog.
/ to continue, ...
The Answer
I have the same problem. I am building a rooftop garden with a couple of Rpi's, each of which connects to various wireless (Bluetooth, WiFi) sensors, relays, and solenoids. There are two huge motors nearby, controlling big water tanks and lifts. The motors generate EMI and from time to time freeze nearby electronics.
My plan is to use software-switchable PSUs (Power Supply Units) to power frozen Rpi's and other devices off and on. (Bluetooth devices freeze most often. The Bluetooth and other little devices have no software reset command or hardware reset pin, so switching their 5V Vcc off and on is a quick and dirty, but still safe, workaround.) In short, the Rpi's regularly watch each other and their devices, and POR (Power On Reset) any guy that has fallen asleep.
Of course I could also use a GPIO pin to trigger the Rpi's on-board hardware reset pin. But I am too lazy to do the extra wiring, and too poor a hobbyist to afford professional/industrial grade non-stop system devices such as the SwitchDoc Labs Dual WatchDog Timer (see the references below).
I modify ordinary DC-DC (12V to 5V) PSUs so that any Rpi or MCP23x17 GPIO pin can switch the PSU's LM2596/LM2941 voltage regulator chip on and off. (The LM2941 can be used for 1A current switches, the LM2596 for 5V 3A PSUs. The on/off pin is also connected to a push button, for manual power on/off testing.)
Actually each of my 7 Rpi3B+'s is connected to a cheap DS3231 Real Time Clock module, which has a hardware interrupt pin that can reset the PSU, the Rpi, or other devices.
Whenever possible and practical, I tie all the devices' reset pins together (removing some of the pull-up resistors, so as not to overload the GPIO pin).
Now the external DS3231 RTC wakes up everybody in the morning, and switches off lights at midnight, so everybody goes to bed.
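The "Rpi's watch each other" logic described above could look roughly like the following on the watching Pi. This is only a sketch of the control flow, not the code actually running on the rooftop: the names watch_peer, demo_check and demo_reset are hypothetical, and the real check (for example a ping or a UART heartbeat) and the real reset action (for example pulsing the GPIO pin wired to the peer's PSU on/off pin) are left as placeholders. Requiring several consecutive failures before resetting avoids power cycling the peer over a single lost packet.

```shell
#!/usr/bin/env bash
# watch_peer MAX_FAILS INTERVAL ROUNDS CHECK_CMD RESET_CMD (hypothetical helper)
# Runs CHECK_CMD every INTERVAL seconds for ROUNDS rounds. After MAX_FAILS
# consecutive failures it runs RESET_CMD once and starts counting again.
watch_peer() {
    local max_fails="$1" interval="$2" rounds="$3" check="$4" reset="$5"
    local fails=0 i=0
    while [ "$i" -lt "$rounds" ]; do
        if "$check"; then
            fails=0
        else
            fails=$((fails + 1))
            if [ "$fails" -ge "$max_fails" ]; then
                "$reset"
                fails=0
            fi
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Demo with simulated check results instead of a real peer.
demo_results="ok fail fail fail ok"
demo_check() {
    set -- $demo_results        # split the remaining results into $1 $2 ...
    local first="$1"
    shift
    demo_results="$*"
    [ "$first" = "ok" ]
}
demo_resets=0
demo_reset() {
    demo_resets=$((demo_resets + 1))
    echo "peer unresponsive: power cycling it (placeholder action)"
}
watch_peer 3 0 5 demo_check demo_reset
echo "resets triggered: $demo_resets"
```

In real use the check would be something like a ping to the peer's address, the interval tens of seconds, and the round count large or the call wrapped in an endless loop.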
References
LM2596/LM2941 Based Software Resettable PSU / Current Switches - Rpi StkEx Discussion
Rpi Hardware watchdog Discussion
SwitchDoc Labs Dual WatchDog Timer
ATXRaspi R3 - LowPowerLab US$14.95
A hackable ESP8266 inside a smart plug: Want to play with ESP8266 without worrying about the hardware? - Mat 2017aug06
Reverse Engineering 101 of the Xiaomi IoT ecosystem, HITCON Community 2018 - Dennis Giese
Xiaomi WiFi socket + MiHome app
espHome [ESP8266/ESP32]
AliExpress WiFi Smart Plug
Smart device - Wikipedia
WiFi Garage Door Opener using ESP8266 - Ray Wang 2016may13
Appendices
Appendix A - WatchDog Timer Reading Notes
Watchdog timer -Wikipedia
Linux WatchDog Man Page
Linux Watchdog - General Tests
Appendix B - Linux commands grep and dmesg reading notes
Appendix C - systemd references
systemd System and Service Manager - FreeDeskTop
systemd - Wikipedia
Appendix D - Fork and Fork Bomb References
Fork (system call) Wikipedia
Appendix E - Bash Learning Notes
Such a great answer! Thanks also for the pictures. Glad that you didn't take them just for this question :-D So I guess what I need is the LM2596S PSU, connected to the GPIO as you said. I will try!!! Good that I still have my old soldering iron...
– Jurudocs
Jun 14 at 8:55
@Jurudocs Thank you for your nice words. I cut, pasted, and modified my old answers for your question, so it did not take me much time. I am a PSU hobbyist, and I used to DIY PSUs using LM2596 chips, inductor coils, etc. But nowadays everything is SMD and assembled modules are dirt cheap, so I have become too lazy to "make" things. By the way, to mess around with the LM2596 PSU, you don't need to test using an Rpi GPIO pin. You can just test by hand! :) Good luck!
– tlfong01
Jun 14 at 9:15
I noticed you mentioned reading up on systemd. While I definitely recommend you do that, because it's a significant component of the way modern Linux systems work, fully understanding it is going to take a long time and is not necessary to try out the watchdog. :)
– berto
Jun 15 at 14:43
1
@berto, I agree it might take me a very long time to understand the complicated SystemD. As Poettering says: "[systemd] never finished, never complete, but tracking progress of technology". I remember Oliver Heaviside, saying: "Am I to refuse to eat because I do not fully understand the mechanism of digestion?" - en.wikiquote.org/wiki/Oliver_Heaviside So I will forget systemd now and come back to watchdog. Actually I need to learn Bash first, before I can understand the weird Bash script of Fork Bomb.
– tlfong01
Jun 16 at 6:30
The fork bomb line is pretty simple once you understand what you are looking at. It's a function named : that calls itself recursively and puts a copy of itself in the background, which also calls itself recursively. The Wikipedia page you have in your notes explains this further.
– berto
Jun 17 at 2:01
I have quite a few Pis. All of them except one ran flawlessly. The problem child would crash periodically, and after a power outage it would never recover without being power cycled again. I had it reboot itself every night via cron, and that helped somewhat.
What fixed it though was taking the SD card and sensor hardware and putting them into another Pi. It has run without error ever since. Maybe you too have a hardware issue.
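The nightly cron reboot mentioned above comes down to a single crontab line. As a small sketch (the helper name nightly_reboot_entry is hypothetical, and the path /sbin/shutdown is an assumption about where the command lives on the system), the following prints such a line for a given time so it can be checked before it goes into root's crontab.

```shell
#!/usr/bin/env bash
# is_number STRING: succeeds only for a non-empty string of digits.
is_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
}

# nightly_reboot_entry HOUR MINUTE (hypothetical helper)
# Prints a crontab line that reboots the machine every day at HOUR:MINUTE.
nightly_reboot_entry() {
    local hour="$1" minute="$2"
    if ! is_number "$hour" || ! is_number "$minute"; then
        echo "hour and minute must be numbers" >&2
        return 1
    fi
    if [ "$hour" -gt 23 ] || [ "$minute" -gt 59 ]; then
        echo "hour must be 0-23 and minute 0-59" >&2
        return 1
    fi
    # crontab field order: minute hour day-of-month month day-of-week command
    printf '%s %s * * * /sbin/shutdown -r now\n' "$minute" "$hour"
}

# Example: reboot every night at 03:30.
nightly_reboot_entry 3 30
```

The printed line would be added to root's crontab (sudo crontab -e). A nightly reboot only masks a slow leak or a flaky board; it does nothing for a freeze that happens mid-day, which is where the watchdog comes in.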
I didn't catch your second paragraph about the hardware problem. Did you mean that the SD card and sensor caused all the trouble, and replacing them solved the problem?
– tlfong01
Jun 15 at 2:44
No, The Pi itself was the problem. I had a spare one, so I transferred the SD card and the sensors to the spare and used it instead of the original. No problems since.
– Wildbill
Jun 16 at 11:42
I see. So it is always a good idea to have a spare Rpi for swap troubleshooting. Perhaps the OP should also consider this.
– tlfong01
Jun 16 at 13:02
If you have wi-fi and just need to power off / power on, you could also consider using a smart plug. Amazon makes one for ~$25, you can power it on / off remotely and also set up timer routines if that's preferable. I've had a few for several months and they're quite reliable. You don't actually need an Echo or any other dedicated device. I use my smart phone. Amazon Smart Plug
Edit: I realize this doesn't provide a solution to the first part of the question, but if I had the prospect of a 2 hour drive if something went wrong I'd consider a layered approach.
@thesnow, I very much appreciate your suggestion of a layered approach, with a smart plug at the top layer. Actually, for some months I have been trying to DIY a smart plug based on the ESP8266 WiFi controller. However, I found that the ESP8266 with NodeMCU Lua has a very steep learning curve. It took the newbie, i.e. me, over 100 hours just to blink an LED (compared to less than one hour to write an Arduino or Rpi blinky program). So I sadly gave up, and have now decided to cheat by buying an ESP8266-based XiaoMi smart plug and modifying it. I am going to add your suggestion to my answer soon. Many thanks again! :)
– tlfong01
Jun 15 at 2:17
add a comment
|
Your Answer
StackExchange.ifUsing("editor", function ()
return StackExchange.using("schematics", function ()
StackExchange.schematics.init();
);
, "cicuitlab");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "447"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fraspberrypi.stackexchange.com%2fquestions%2f99584%2fhow-to-monitor-if-a-remote-rpi-freezes-and-reboots-it-using-a-watchdog-timer-or%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
5 Answers
5
active
oldest
votes
5 Answers
5
active
oldest
votes
active
oldest
votes
active
oldest
votes
Before you go looking into additional hardware, please read up on what's called a "watchdog timer". The Raspberry Pi has a hardware watchdog built in that will power cycle it if the chip is not refreshed within a certain interval.
I have setup the watchdog on a Raspberry Pi 3 and a new'ish version of Raspbian with very little configuration. The first thing to check is that the hardware watchdog is available (I checked my system and it looks like the version of Raspbian I have installed compiles watchdog support right into the kernel; no need to load a kernel module):
pi@unicornpi:~ $ ls -al /dev/watchdog*
crw------- 1 root root 10, 130 Nov 3 2016 /dev/watchdog
crw------- 1 root root 252, 0 Nov 3 2016 /dev/watchdog0
If you see /dev/watchdog
you're all set. All you have to do is configure the watchdog facility built into Systemd.
In the file /etc/systemd/system.conf
, set the following lines:
pi@unicornpi:~ $ grep Watchdog /etc/systemd/system.conf
RuntimeWatchdogSec=10
ShutdownWatchdogSec=10min
What the lines above say is:
RuntimeWatchdogSec=10: set the hardware watchdog timeout to 10 seconds. Systemd refreshes the watchdog at least once per half interval (about every 5 seconds); if the refreshes stop and the timeout expires, the hardware resets the system.
ShutdownWatchdogSec=10min: on shutdown, if the system takes more than 10 minutes to reboot, the hardware resets the system.
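To double-check what is actually configured, the value can be read back from the file. This is a rough sketch under some assumptions: the function names are hypothetical, the arithmetic only handles plain second values such as 10, and drop-in files under /etc/systemd/system.conf.d/ are ignored. The halving reflects the systemd documentation, which describes the manager contacting the watchdog at least once per half of the configured interval.

```shell
# Print the RuntimeWatchdogSec value assigned in a systemd manager config file.
# Commented-out lines (starting with '#') are skipped; if the key is assigned
# more than once, this sketch simply reports the last assignment.
# $1 = config file (defaults to /etc/systemd/system.conf)
runtime_watchdog_sec() {
    local conf="${1:-/etc/systemd/system.conf}"
    sed -n 's/^[[:space:]]*RuntimeWatchdogSec=//p' "$conf" | tail -n 1
}

# Given a timeout in whole seconds, print the longest gap that should occur
# between two refreshes (half the timeout, rounded down).
refresh_gap_sec() {
    echo $(( $1 / 2 ))
}
```

With the settings above, runtime_watchdog_sec prints 10 and refresh_gap_sec 10 prints 5.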
Once you have this configured and have rebooted, you will see something like this in the dmesg logs:
pi@orangepi:~ $ dmesg | grep -i watchdog
[ 0.763148] bcm2835-wdt 3f100000.watchdog: Broadcom BCM2835 watchdog timer
[ 1.997557] systemd[1]: Hardware watchdog 'Broadcom BCM2835 Watchdog timer', version 0
[ 2.000728] systemd[1]: Set hardware watchdog to 10s.
If you see Set hardware watchdog to 10s, you're all set.
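For an unattended check after each boot, the timeout can be pulled out of the kernel log text. This is a small sketch that assumes the message wording shown above (other systemd versions may phrase it differently); watchdog_timeout_from_log is a made-up helper name.

```shell
# Read log text on stdin and print the timeout from the last
# "Set hardware watchdog to <value>." line, or nothing if there is none.
watchdog_timeout_from_log() {
    sed -n 's/.*Set hardware watchdog to \([0-9a-z]*\)\..*/\1/p' | tail -n 1
}
```

Typical use would be dmesg | watchdog_timeout_from_log. An empty result suggests that systemd did not arm the watchdog, or that the message has already scrolled out of the log buffer.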
The best way I've found to verify that the watchdog works is to overload the system. I've done this with a "fork bomb", which will completely saturate the system with garbage process forks. If you run this the Pi will become unresponsive and the watchdog should kick in. Your system should be up and running again after about a minute:
:(){ :|:& };:
Paste that into a shell and your system will be taken down. You've been warned.
More info on the watchdog system built into Systemd is on the author's website.
Many thanks for the advice. I have heard of the watchdog for a long time but never tried it because there was no need, until now: I am building a smart rooftop garden away from home (actually 50 feet above home). Another reason I did not try it is that the tutorials are not newbie friendly. When I started with the Rpi1 years ago, I found terminal commands very scary (it took me more than three hours to download a zip (tar, actually) and extract it, and then I did not know where to find the extracted files!). Now I find terminal commands not that scary, and sometimes very efficient, though I still love Win PowerShell terminal commands, ...
– tlfong01
Jun 15 at 4:03
The advice at the beginning of your answer, to first read up on what a watchdog is, is very good. I did not know that "watchdog" is actually short for "watchdog TIMER". This is important because if I know beforehand that it is a timer, I can understand things better. As usual, I started with Wiki, which is always a good read for newbies. Now I know that the watchdog is actually some sort of hardware sitting alongside the Rpi, so even if the Rpi messes things up, the outside guy can come to the rescue (or "kick in"?). Reading Wiki let me know that "kick in" is not slang, but a technical term.
– tlfong01
Jun 15 at 5:10
I also didn't know what a "daemon" is. When I was a child, I read in the Bible that a daemon is a bad guy, so righteous programmers like me should not use daemons, otherwise I might go to Hell. But then Wiki told me that the MIT/UNIX guys coined the name, and why it is spelled "daemon", not "demon". It also clarifies that daemons can be good, and that even the righteous Socrates had a daemon. Anyway, I have finished reading the Wikis and am now ready to start your tutorial :)
– tlfong01
Jun 15 at 5:16
So I have followed your very detailed watchdog tutorial and found everything OK to the point of setting the watchdog to 10 seconds. Next step is to try a fork bomb, perhaps late this evening or tomorrow.
– tlfong01
Jun 15 at 9:17
Thank you for suggesting to call it a “watchdog timer”. I’ve made the edit 👍🏽
– berto
Jun 15 at 13:17
edited Jun 15 at 13:14
answered Jun 15 at 3:09
berto
Cutting power is a brute force method and has risks.
The conventional solution to lock-up problems is to use a watchdog.
There is a BCM hardware watchdog. If you want to start the hardware watchdog, include dtparam=watchdog=on in /boot/config.txt
In and of itself this does little, although it should restart the system if not "kicked" regularly. You can write code which opens /dev/watchdog to kick it off.
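As an illustration of that last point, here is a minimal sketch of a kick loop in shell. It is not the author's code: pet_watchdog and its arguments are invented for this example. It assumes root privileges, that nothing else (for example systemd or the watchdog daemon) already holds the device open, and that the driver honours the conventional "magic close" character V to disarm the timer on a clean exit.

```shell
# Keep a watchdog device alive by writing to it at a fixed interval.
# $1 = device path, $2 = seconds between writes, $3 = number of writes (0 = forever)
pet_watchdog() {
    local dev="${1:-/dev/watchdog}"
    local interval="${2:-5}"
    local count="${3:-0}"
    local i=0
    {
        # Opening the device (the redirection at the end of this group)
        # arms the timer.
        while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
            printf '.' >&3          # any write refreshes ("kicks") the timer
            i=$((i + 1))
            sleep "$interval"
        done
        # Writing 'V' just before closing asks the driver to disarm the timer;
        # drivers configured with "nowayout" ignore this request.
        printf 'V' >&3
    } 3>"$dev"
}
```

Called as pet_watchdog /dev/watchdog 5, it refreshes the timer every 5 seconds until interrupted. If the loop is killed or the system hangs, the writes stop and the hardware should reset the Pi once the timeout expires.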
There is also a watchdog daemon which you can configure to activate the watchdog; you should be able to start with sudo systemctl enable watchdog
PS Incidentally, if you want to pursue the brute force approach - don't bother cutting power - just pull the Reset pin (labeled RUN) low. This is equivalent to powering off then on again.
edited Jun 14 at 8:44
answered Jun 14 at 8:21
Milliways
Question
Remote Rpi's freeze from time to time. How to wake them up?
Answer
Update 2019jul27hkt1406
I recently upgraded from my Rpi3B+ (stretch) to an Rpi4B (buster) and again followed @berto's tutorial to set the watchdog timer. Everything works as smoothly as before. In other words, no changes need to be made to @berto's tutorial when upgrading to the Rpi4.
Last time I knew nothing about the watchdog timer thing. So it took me more than 3 hours to google to understand everything inside out (well, almost inside out). This time I know what is going on, and all the linux tricks, so it took me only a couple of minutes to complete @berto's tutorial.
2019jun18 Updates
After more thought, I concluded that my answer is coming to an end. My conclusion is that @berto's watchdog tutorial and suggested experiment are good, and his answer is the real answer to the OP's question.
I did his suggested experiment successfully, verified the result with the fork bomb, and after a lot of googling and reading for more than 10 hours, I think I finally understand the idea of the watchdog timer thoroughly.
Earlier I wrongly thought that I still needed to learn how to set the timer to 10 seconds or more. But as @berto says, 10 seconds is all that needs to be set. I also read that I can set the timer to as long as 16 seconds, and that the Linux watchdog default is even one minute. But that is not critical.
I have removed all the long-winded reading notes from the appendices to make the answer shorter. I would suggest that newbies not try to understand all the details of the watchdog, not to mention the much more complicated systemd daemon, because life is short and those system things are too complicated for non-professionals.
I would like to add two points to end my answer.
(1) There are many reasons for an Rpi to hang after a couple of days (but usually not months). Often it is not the application program's fault, but that of drivers or library functions creating too much garbage, e.g. sockets that are created and used but not properly disposed of. If it is the application program itself making the garbage, the program can do "garbage collection" and the problem is solved. But it is hard to remove garbage sockets that were not generated by the application program. So a watchdog timer is useful here.
(2) Other ways to keep garbage from using up resources include rebooting every now and then by software or hardware. I do think rebooting every morning, and also using a software-switchable power supply to do the system reset, adds another layer of protection. Using only one Rpi is not very safe. Using two Rpi's as each other's watchdog (e.g. using UART for message passing) adds one more layer of protection. Another method I have not explored is using ESP8266 WiFi sockets. I hope I can try that later.
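The idea of two Pis watching each other can be sketched in shell. This is only an outline under stated assumptions: it swaps in a simple network ping for the serial message passing mentioned above, the address 192.168.1.50 and the function name watch_peer are placeholders, and the reset action is left as an echo because the actual power switching depends on the hardware.

```shell
# Probe a peer repeatedly; after several consecutive failures, run a reset
# action once and stop.
# $1 = probe command, $2 = reset command,
# $3 = consecutive failures allowed (default 3), $4 = seconds between probes
watch_peer() {
    local probe="$1"
    local reset_cmd="$2"
    local max_failures="${3:-3}"
    local interval="${4:-10}"
    local failures=0
    while :; do
        # $probe and $reset_cmd are deliberately unquoted so that a simple
        # "command arg arg" string is split into words.
        if $probe >/dev/null 2>&1; then
            failures=0              # peer answered: start counting again
        else
            failures=$((failures + 1))
            if [ "$failures" -ge "$max_failures" ]; then
                $reset_cmd          # hypothetical action, e.g. toggle a PSU pin
                return 0
            fi
        fi
        sleep "$interval"
    done
}
```

A possible call would be watch_peer "ping -c 1 -W 2 192.168.1.50" "echo peer is down". Requiring several consecutive failures avoids resetting the peer because of a single lost packet.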
This is the end of my answer. Cheers.
2019jun17 Updates
So I tried the fork bomb. The system rebooted after executing the program, in about 15 seconds.
2019jun16 Updates
I found @berto's fork bomb a bit scary for a newbie, so I am learning Bash to find out what it is doing. Basically it is just a function named ":" that is defined to call itself twice, thus forking indefinitely and multiplying exponentially like rabbits, using up all the resources and crashing Linux.
I have also found the following interesting version of forkbomb using Unicode symbols:
💣(){ 💣|💣& };💣
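To see why this exhausts the system so quickly without actually running it, the growth can be tabulated. This is a harmless sketch that assumes an idealised model in which every call starts exactly two new copies and no process limit intervenes; bomb_generation_size is a made-up name.

```shell
# Number of function calls in generation n when every call starts two more:
# generation 0 is the single call typed at the prompt, so the size is 2^n.
bomb_generation_size() {
    echo $(( 1 << $1 ))
}

# Print a few generations to show how fast the count doubles.
for n in 0 1 2 5 10 20; do
    printf 'generation %2d: %d processes\n' "$n" "$(bomb_generation_size "$n")"
done
```

After only 20 doublings the model already calls for more than a million processes, far beyond what a Pi can schedule, which is why the system stops responding almost immediately.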
2019jun14/15 Updates
@thesnow suggests a very nice layered approach using a smart plug. I think the smart plug or smart IoT stuff is the way to go. However, I am a not-so-smart newbie in smart stuff, though I am keen to learn. So I am going to buy a smart plug, do some research, and improve my answer afterwards. For now, I have added some related learning resources in the reference section below.
I found @berto's suggestion of using the Rpi's hardware watchdog timer also very good. I have not played with any watchdog stuff before, so I am going to try it now. @berto's instructions are very detailed, but still a bit hard for me, because I don't know the meaning of the commands "grep" and "dmesg" very well. So I googled and made some reading notes in the appendices below. Then I followed @berto's suggestion and struggled a bit to complete part 1. I have not yet rebooted, because I need to take a break to digest things. Anyway, here is the screen capture.
I rebooted and got the following dmesg:
I think I am going too fast and now need to take a break to first study more linux things, like systemd, before coming back to carry on the test on watchdog.
/ to continue, ...
The Answer
I have the same problem. I am building a rooftop garden with a couple of Rpi's each of which connects to various wireless stuff (BlueTooth, Wifi) sensors, relays, and solenoids. There are two huge motors near by, controlling big water tanks and lifts. The motors generate EMI and from time to time freeze nearby electronics things.
My plan is to use software-switchable PSUs (Power Supply Units) to power off/on frozen Rpi's and other devices. (Bluetooth devices freeze most often. The Bluetooth and other little devices do not have any software reset command or hardware reset pin, so powering off/on their 5V Vcc is a quick and dirty, but still safe, workaround.) In short, the Rpi's regularly watch each other and their devices and POR (Power On Reset) any guy that has fallen asleep.
Of course I can also use a GPIO pin to trigger the Rpi hardware on board reset pin. But I am too lazy to do extra wiring, and too poor a hobbyist to afford professional/industrial grade non stop system devices such as the SwitchDoc Labs Dual WatchDog Timer (see reference below)
I modify ordinary DC-DC (12V to 5V) PSUs so that any Rpi or MCP23x17 GPIO pin can power on/off the LM2596/LM2941 voltage regulator chip of the PSU. (The LM2941 can be used for 1A current switches, the LM2596 for a 5V 3A PSU. The on/off pin is also connected to a push button, for manual power on/off testing.)
Actually each of my 7 Rpi3B+'s is connected to a cheapy DS3231 Real Time Clock Module which has a hardware interrupt pin to reset PSU, Rpi, or other devices.
Whenever possible and practical I tie all the devices' reset pins together (removing some of the pull-up resistors, so as not to overload the GPIO pin).
Now the external DS3231 RTC wakes up everybody in the morning, and switches off lights at midnight, so everybody goes to bed.
References
LM2596/LM2941 Based Software Resettable PSU / Current Switches - Rpi StkEx Discussion
Rpi Hardware watchdog Discussion
SwitchDoc Labs Dual WatchDog Timer
ATXRaspi R3 - LowPowerLab US$14.95
A hackable ESP8266 inside a smart plug Want to play with ESP8266 without worrying about the hardware? - Mat 2017aug06
Reverse Engineering 101 of the Xiaomi IoT ecosystem HITCON Community 2018 – Dennis Giese
Xiaomi WiFi socket + MiHome app 21,307 views
espHome [ESP8266/ESP32]
AliExpress WiFi Smart Plug
Smart device -Wikipedia
WiFi Garage Door Opener using ESP8266 - Ray Wang 2016may13 56,335 views
Appendices
Appendix A - WatchDog Timer Reading Notes
Watchdog timer -Wikipedia
Linux WatchDog Man Page
Linux Watchdog - General Tests
Appendix B - Linux commands grep and dmesg reading notes
Appendix C - systemd references
systemd System and Service Manager - FreeDeskTop
systemd - Wikipedia
Appendix D - Fork and Fork Bomb References
Fork (system call) Wikipedia
Appendix E - Bash Learning Notes
Such a great answer! Thanks also for the pictures. Glad that you didn't take them just for this question :-D So I guess what I need is the LM2596S PSU to connect it to the GPIO as you said. I will try!!! Good that I still have my old soldering iron...
– Jurudocs
Jun 14 at 8:55
@Jurudocs Thank you for your nice words. I cut and pasted, and modified, my old answers for your question, so it did not take me much time. I am a PSU hobbyist, and I DIYed PSUs using LM2596 chips, inductor coils, etc. But nowadays everything has gone SMD and assembled modules are dirt cheap, so I have been too lazy to "make" things. By the way, to mess around with the LM2596 PSU, you don't need to test using the Rpi GPIO. You can just test by hand! :) Good luck!
– tlfong01
Jun 14 at 9:15
I noticed you mentioned reading up on Systemd. While I definitely recommend you do that because it's a significant component to the way modern Linux systems work, fully understanding it is going to take a long time and not necessary to try out the watchdog. :)
– berto
Jun 15 at 14:43
@berto, I agree it might take me a very long time to understand the complicated SystemD. As Poettering says: "[systemd] never finished, never complete, but tracking progress of technology". I remember Oliver Heaviside, saying: "Am I to refuse to eat because I do not fully understand the mechanism of digestion?" - en.wikiquote.org/wiki/Oliver_Heaviside So I will forget systemd now and come back to watchdog. Actually I need to learn Bash first, before I can understand the weird Bash script of Fork Bomb.
– tlfong01
Jun 16 at 6:30
The fork bomb line is pretty simple once you understand what you are looking at. It's a function named ":" that calls itself recursively and puts a copy of itself in the background, which also calls itself recursively. The Wikipedia page you have in your notes explains this further.
– berto
Jun 17 at 2:01
Question
Remote Rpi's freeze from time to time. How to wake them up?
Answer
Update 2019jul27hkt1406
I recently upgraded from an Rpi3B+ running stretch to an Rpi4B running buster, and again followed @berto's tutorial to set up the watchdog timer. Everything works as smoothly as before. In other words, no changes need to be made to @berto's tutorial when upgrading to the Rpi4.
Last time I knew nothing about the watchdog timer thing. So it took me more than 3 hours to google to understand everything inside out (well, almost inside out). This time I know what is going on, and all the linux tricks, so it took me only a couple of minutes to complete @berto's tutorial.
2019jun18 Updates
After more thought, I concluded that my answer is coming to an end.
My conclusion is that @berto's watchdog tutorial and suggested
experiment are good, and his answer is the real answer to the OP's
question.
I did his suggested experiment successfully, verified the results with
the forkbomb program, and after a lot of googling and reading for more
than 10 hours, I think I finally understand thoroughly the idea of
the watchdog timer.
Earlier I wrongly thought that I still needed to learn how to set the
timer to 10 seconds or more. But as @berto says, 10 seconds is all
that needs to be set. I also read that I can set the timer to as long
as 16 seconds, and the linux watchdog default is even one minute. But
that is not critical.
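To make the timer idea concrete, here is a minimal Python sketch of what a program does when it "pets" the watchdog itself: once the device is opened, something must write to it before the timeout expires, otherwise the hardware resets the board. This is only an illustration under assumptions, not @berto's method. The function names, the 5 second interval, and the `run_on_pi` helper are hypothetical. It assumes the standard Linux `/dev/watchdog` device, and opening it will fail if another process (for example a watchdog service) already holds it.

```python
import time


def pet_watchdog(dev, rounds, interval_s, sleep=time.sleep):
    """Write one byte to the watchdog device `rounds` times.

    Each write restarts the hardware countdown. `interval_s` must be
    shorter than the watchdog timeout, or the board resets.
    """
    for _ in range(rounds):
        dev.write(b"\0")  # any byte restarts the countdown
        dev.flush()
        sleep(interval_s)


def disarm_watchdog(dev):
    """Write the 'magic close' character so closing the device stops the timer.

    This has no effect if the driver was loaded with the nowayout option.
    """
    dev.write(b"V")
    dev.flush()


def run_on_pi():
    """Hypothetical usage on a real Pi (needs root); not called automatically."""
    with open("/dev/watchdog", "wb", buffering=0) as wd:
        pet_watchdog(wd, rounds=6, interval_s=5)  # stay alive for about 30 s
        disarm_watchdog(wd)                       # leave without a reboot
```

If the loop stops writing (because the program or the whole system is frozen), nothing restarts the countdown and the Rpi reboots, which is exactly the behaviour the fork bomb test provokes.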
I have removed all the long winded reading notes in the appendices, to
make the answer shorter. I would suggest newbies not to try to
understand all the details of watchdog, not to mention the much more
complicated daemon SystemD, because our life is short, and those
system things are too complicated for non professionals.
I would like to add two points to end my answer.
(1) There are many reasons for an Rpi to hang after a couple of days
(but usually not months). Often it is not the application program's
fault, but the drivers or library functions creating too much
garbage, eg. sockets created and used but not properly disposed of. If
it is the application program itself making garbage, the program can
do "garbage collection" and the problem is solved. But it is hard to
remove garbage sockets which are not generated by the application
program. So a watchdog timer is useful here.
(2) Other ways to avoid too much garbage using up resources include
rebooting every now and then by software or hardware. I do think
rebooting every morning, and also using a software switchable power
supply to do the system resetting, adds another layer of protection.
And using only one Rpi is not very safe. Using two Rpi's as each
other's watchdog (using UART for message passing, eg) adds one more
layer of protection. Another method I have not explored is using
ESP8266 Wifi sockets. I hope I can try that later.
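The idea of two Rpi's watching each other can be sketched as a simple heartbeat timeout. The following Python is only an illustration under assumptions: the class name `PeerMonitor`, the peer names, and the 30 second timeout are hypothetical, and how the heartbeat message actually arrives (UART, network, GPIO) is left out.

```python
import time


class PeerMonitor:
    """Track the last heartbeat from each peer and report the silent ones."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s
        self._now = now          # injectable clock, handy for testing
        self._last_seen = {}

    def heartbeat(self, peer):
        """Record that `peer` has just sent an 'I am alive' message."""
        self._last_seen[peer] = self._now()

    def frozen_peers(self):
        """Return the peers that have been silent for longer than the timeout."""
        t = self._now()
        return sorted(
            peer for peer, seen in self._last_seen.items()
            if t - seen > self.timeout_s
        )
```

Each Rpi would call `heartbeat()` whenever a message from the other one arrives, check `frozen_peers()` periodically, and power cycle or reset any peer that shows up in the list.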
This is the end of my answer. Cheers.
2019jun17 Updates
So I tried the fork bomb. The system rebooted after executing the program, in about 15 seconds.
2019jun16 Updates
I found @berto's fork bomb program is a bit newbie scary. So I am learning Bash to find out what that fork bomb is doing. Basically it is just a function named ":", which is defined as a function calling itself two times, thus forking indefinitely, as fast as rabbits growing exponentially, using up all the resources, and crashing linux.
I have also found the following interesting version of forkbomb using Unicode symbols:
💣 ( ) 💣 ; 💣
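To see why the fork bomb exhausts the system so quickly, here is a harmless Python sketch of the arithmetic only (it does not fork anything). It uses a simplified model, which is an assumption: every running copy starts two new copies per generation, so the count doubles each time. The function names and the limit of 32768 processes are also assumptions for illustration; the real limit depends on the system's settings.

```python
def processes_after(generations):
    """Number of copies after the given number of doublings, starting from one."""
    count = 1
    for _ in range(generations):
        count *= 2
    return count


def generations_to_exceed(limit):
    """Smallest number of doublings after which the count is above `limit`."""
    n = 0
    while processes_after(n) <= limit:
        n += 1
    return n


print(processes_after(10))           # 1024
print(generations_to_exceed(32768))  # 16
```

Under this model only 16 doublings are needed to pass 32768 processes, which fits the observation that the system became unresponsive and was rebooted by the watchdog within seconds.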
2019jun14/15 Updates
@thesnow suggests a very nice layered approach using a smart plug. I
think the smart plug or smart IoT stuff is the way to go. However, I
am a not so smart newbie in smart stuff, though I am keen to learn.
So I am going to buy a smart plug, do some research, and improve my
answer afterwards. For now, I have added some related learning
resources in the reference section below.
I found @berto's suggestion of using the Rpi's hardware watchdog timer also very good. I have not played with any watchdog stuff before, so I am going to try it now. @berto's instructions are very detailed, but still a bit hard for me, because I don't know very well the meaning of the commands "grep" and "dmesg". So I googled and made some reading notes in the appendices below. Then I followed @berto's suggestion, and struggled a bit to complete part 1. I have not yet rebooted, because I need to take a break to digest things. Anyway, here is the screen capture.
I rebooted and got the following dmesg:
I think I am going too fast and now need to take a break to first study more linux things, like systemd, before coming back to carry on the test on watchdog.
/ to continue, ...
The Answer
I have the same problem. I am building a rooftop garden with a couple of Rpi's, each of which connects to various wireless (BlueTooth, Wifi) sensors, relays, and solenoids. There are two huge motors nearby, controlling big water tanks and lifts. The motors generate EMI and from time to time freeze nearby electronic things.
My plan is to use software switchable PSUs (Power Supply Units) to power off/on frozen Rpi's and other devices. (Bluetooth devices freeze most often. The BlueTooth and other little devices do not have any software reset command or hardware reset pin, so powering off/on their 5V Vcc is a quick and dirty, but still safe, workaround.) In short, the Rpi's regularly watch each other and their devices, and POR (Power On Reset) any guy that has fallen asleep.
Of course I can also use a GPIO pin to trigger the Rpi's on board hardware reset pin. But I am too lazy to do extra wiring, and too poor a hobbyist to afford professional/industrial grade non stop system devices such as the SwitchDoc Labs Dual WatchDog Timer (see reference below).
I modify ordinary DC-DC (12V to 5V) PSUs so that any Rpi or MCP23x17 GPIO pin can power on/off the LM2596/LM2941 voltage regulator chip of the PSU. (LM2941 can be used for 1A current switches, LM2596 for a 5V 3A PSU. The on/off pin is also connected to a push button, for manual power on/off testing.)
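The power cycling step can be sketched in Python as follows. This is only an illustration under assumptions: `power_cycle` and `set_enable_pin` are hypothetical names, the 2 second off time is arbitrary, and the default `on_level=0` assumes the regulator's on/off pin is active low (low means output on), which should be checked against the datasheet of the actual chip and against any transistor stage in between.

```python
import time


def power_cycle(set_enable_pin, off_seconds=2.0, on_level=0, sleep=time.sleep):
    """Switch a PSU off, wait, then switch it back on.

    `set_enable_pin` is any callable that drives the regulator's on/off
    pin to the logic level it is given (0 or 1).
    """
    off_level = 1 - on_level
    set_enable_pin(off_level)  # cut power to the frozen device
    sleep(off_seconds)         # give the supply rail time to discharge
    set_enable_pin(on_level)   # restore power, forcing a power-on reset
```

On a real Rpi, `set_enable_pin` could be a small wrapper around a GPIO library call such as `RPi.GPIO.output(pin, level)`, with the pin number chosen to match the wiring.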
Actually each of my 7 Rpi3B+'s is connected to a cheapy DS3231 Real Time Clock Module which has a hardware interrupt pin to reset PSU, Rpi, or other devices.
Whenever possible and practical I tie up all the devices' reset pins together (removing some of the pull up resistors, so not to overload the GPIO pin).
Now the external DS3231 RTC wakes up everybody in the morning, and switches off lights at midnight, so everybody goes to bed.
References
1. LM2596/LM2941 Based Software Resettable PSU / Current Switches - Rpi StkEx Discussion
Rpi Hardware watchdog Discussion
SwitchDoc Labs Dual WatchDog Timer
ATXRaspi R3 - LowPowerLab US$14.95
A hackable ESP8266 inside a smart plug Want to play with ESP8266 without worrying about the hardware? - Mat 2017aug06
Reverse Engineering 101 of the Xiaomi IoT ecosystem HITCON Community 2018 – Dennis Giese
Xiaomi WiFi socket + MiHome app 21,307 views
espHome [ESP8266/ESP32]
AliExpress WiFi Smart Plug
Smart device -Wikipedia
WiFi Garage Door Opener using ESP8266 - Ray Wang 2016may13 56,335 views
Appendices
Appendix A - WatchDog Timer Reading Notes
Watchdog timer -Wikipedia
Linux WatchDog Man Page
Linux Watchdog - General Tests
Appendix B - Linux commands grep and dmesg reading notes
Appendix C - systemd references
systemd System and Service Manager - FreeDeskTop
systemd - Wikipedia
Appendix D - Fork and Fork Bomb References
Fork (system call) Wikipedia
Appendix E - Bash Learning Notes
edited Jul 27 at 6:11
answered Jun 14 at 8:27
tlfong01
2,206 2 gold badges 5 silver badges 18 bronze badges
I have quite a few Pis. All of them except one ran flawlessly. The problem child would crash periodically and would never recover after a power outage unless it was power cycled again. I had it reboot itself every night via cron, and that helped somewhat.
What fixed it, though, was taking the SD card and sensor hardware and putting them into another Pi. It has run without error ever since. Maybe you have a hardware issue too.
answered Jun 14 at 19:47
Wildbill
I didn't catch your second paragraph about the hardware problem. Did you mean that the SD card and sensor caused all the trouble, and replacing them solved the problem?
– tlfong01
Jun 15 at 2:44
No, the Pi itself was the problem. I had a spare one, so I transferred the SD card and the sensors to the spare and used it instead of the original. No problems since.
– Wildbill
Jun 16 at 11:42
I see. So it is always a good idea to have a spare Rpi for swap troubleshooting. Perhaps the OP should also consider this.
– tlfong01
Jun 16 at 13:02
If you have Wi-Fi and just need to power off and on, you could also consider a smart plug. Amazon makes one for ~$25; you can switch it on and off remotely and also set up timer routines if that's preferable. I've had a few for several months and they're quite reliable. You don't actually need an Echo or any other dedicated device; I use my smartphone. (Amazon Smart Plug)
Edit: I realize this doesn't solve the first part of the question, but if I faced a two-hour drive whenever something went wrong, I'd consider a layered approach.
edited Jun 14 at 20:41
answered Jun 14 at 20:15
thesnow
@thesnow, I appreciate very much your suggestion of a layered approach, with a smart plug as the top layer. Actually, for some months I have been trying to DIY a smart plug based on the ESP8266 WiFi controller. However, I found that the ESP8266 with NodeMCU Lua has a very steep learning curve. It took me, a newbie, over 100 hours just to blink an LED (compared to less than one hour to write an Arduino or Rpi blinky program). So I sadly gave up and have now decided to cheat by buying an ESP8266-based XiaoMi smart plug and modifying it. I am going to add your suggestion to my answer soon. Many thanks again! :)
– tlfong01
Jun 15 at 2:17
Not sure anyone is going to design this circuit for you. But one additional thing to consider: whatever causes the first Pi to freeze might be a failure mode it shares with the second Pi. For example, if it's freezing because of a power fluctuation, you might end up with two frozen Pis instead of the independent redundancy that you want. It might be worth trying to understand why that first Pi freezes first.
– Brick
Jun 14 at 12:33
How quickly do you need the pi to come back online? A simple holiday light timer could cycle the power every X hours, as long as you don't mind waiting until the reset interval to have it back online again.
– Tim
Jun 14 at 17:30
@Jurudocs, I followed @berto's watchdog timer tutorial and everything worked. I don't quite understand what the watchdog is doing, but I am 90% sure that the watchdog timer method should solve your problem, and much more cleanly than my proposed hardware solution.
– tlfong01
Jun 17 at 6:14
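For anyone else unsure what a watchdog timer actually does, here is a rough Python model of the idea. It is not the Pi's driver or the watchdog daemon, just a countdown that a healthy system keeps resetting ("kicking") and that counts as expired once the kicks stop. The class name and the 15-second timeout are assumptions chosen for illustration.

```python
class SimulatedWatchdog:
    """Toy model of a watchdog timer; time is passed in explicitly."""

    def __init__(self, timeout_s: float, now: float = 0.0) -> None:
        self.timeout_s = timeout_s  # how long the system may stay silent
        self.last_kick = now        # time of the most recent kick

    def kick(self, now: float) -> None:
        """Called regularly by a healthy system to restart the countdown."""
        self.last_kick = now

    def expired(self, now: float) -> bool:
        """True once no kick has arrived for timeout_s seconds."""
        return now - self.last_kick >= self.timeout_s


# Illustrative timeline (all values are assumptions, in seconds).
wd = SimulatedWatchdog(timeout_s=15.0)
wd.kick(now=10.0)                      # system is alive at t = 10
still_healthy = not wd.expired(20.0)   # only 10 s of silence so far
needs_reboot = wd.expired(30.0)        # 20 s of silence: time to reset
```

On real hardware the countdown runs in the SoC rather than in software, so it still fires when the kernel itself is frozen, which is something a script running on the same Pi cannot guarantee.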