My mining rig kept randomly shutting off the other day, so I researched the most common causes, and this is what I found.
Why does my mining rig keep turning off? Some of the leading causes of rig crashes are overclocking, faulty risers, and heat. However, to know for sure what’s causing the problem, it’s best to do a complete and thorough test of each mining rig component.
Over the last few years of my mining career, I’ve learned a few tricks that help me troubleshoot my rigs faster whenever I get random shutdowns, crashes, or freezes.
I took every trick I’ve learned from fixing my rigs and compiled them into this checklist. For some issues you’ll need further resources, which I’ll do my best to link to.
MINING RIG TROUBLESHOOTING CHECKLIST
First, we want to check for some commonly overlooked mistakes so we can rule them out early on.
If your room temps are getting warm, then there’s a good chance your cards are even warmer. Running machines in a poorly ventilated environment can lead to a long road of pain and misery. If your room’s temp
Step 1.a: If you’ve overclocked your cards, reset the clock speeds back to default. Then run DDU (Display Driver Uninstaller) and update your GPU drivers for good measure.
SIDENOTE: For AMD cards, there are specific mining drivers you’ll need to use for reliable results. For Nvidia cards, just use the most recent drivers.
Step 1.b: Now run a test for at least 24 hours to confirm the rig is stable. If this fixes your issue and you’re sure your overclock settings aren’t too high, go ahead and reapply the overclock settings to ONLY one card and run a 12-hour test. If the rig doesn’t crash, reapply the overclock settings to another card and run the 12-hour test again.
The idea is to keep adding the overclock settings to your GPUs one at a time until you find the weak GPUs that don’t agree with those settings. Once you locate them, dial their overclock settings back until each GPU is stable. Worst case
However, if this doesn’t fix your problem, leave the cards at default clock settings and move on to the next step.
Don’t worry, we’ll get back to overclocking your cards later on.
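If it helps to see the one-card-at-a-time idea from Step 1.b written out, here is a minimal Python sketch. It is only an illustration: `find_unstable_cards` and the `is_stable` callback are hypothetical names, and in real life `is_stable` is you running the 12-hour test and reporting whether the rig survived.

```python
from typing import Callable, List


def find_unstable_cards(
    card_count: int,
    is_stable: Callable[[List[int]], bool],
) -> List[int]:
    """Re-apply the overclock one card at a time and collect the cards that break stability.

    `is_stable` is a hypothetical callback. It receives the list of card
    indexes that are currently overclocked and returns True if the rig
    survived the 12-hour test with that set.
    """
    overclocked: List[int] = []
    unstable: List[int] = []
    for card in range(card_count):
        trial = overclocked + [card]
        if is_stable(trial):
            overclocked = trial    # rig survived, keep the overclock on this card
        else:
            unstable.append(card)  # rig crashed, this card needs milder settings
    return unstable


# Simulated example: pretend cards 2 and 5 can't handle the overclock.
weak_cards = {2, 5}
print(find_unstable_cards(6, lambda cards: not (set(cards) & weak_cards)))
```

The point of the design is that a card which crashes the rig is left at default clocks before the next card is tried, so one weak GPU doesn’t hide another.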
Windows 10 Settings
At this point, it’s best to do a standard check of some Windows settings to be sure it isn’t something simple.
Step 3.a: On a Windows operating system, check that the power settings are set to never power off the computer. It’s OK to have it turn the monitor off after so many minutes, but not the machine.
Step 3.b: Also make sure your virtual memory is set to about 20,000 MB (20k), maybe a little more if you’re mining ETH these days, and make sure Windows Defender is disabled, as it sometimes shuts down a miner.
Be sure to do a test run if you had to change any of the settings in Windows, and let it run for 24 hours to see if it fixes the issue.
If your mining rig seems stable, go ahead and set overclocks, but do it slowly. If your problem persists, we need to move on and check for more hardware errors.
Motherboard BIOS Settings
Step 2.a: If you’re still having issues after turning off the overclock, check that your motherboard’s BIOS is up to date.
Step 2.b: While you’re updating the BIOS, check whether 4G decoding is enabled, as it’s one of the more critical settings. If you’re new to mining, do yourself a favor and search to see if someone has posted BIOS settings for your specific motherboard.
The best way to find settings is to enter your motherboard’s name plus “BIOS settings” into YouTube or Google and see what comes up.
WARNING: If no mining BIOS settings for your board show up in the searches, you may need to cough it up and buy a mining-specific motherboard, as they’re designed for these types of applications.
I’m not saying your motherboard won’t work if you can’t find BIOS settings, but if you’re having issues, it could be the MOBO.
Check For Hardware Errors
Now that we’ve checked off some of the general issues, we can begin troubleshooting the GPUs and risers. The best place to start is by checking the miner logs for errors.
AMD CARDS ONLY: If you have AMD cards, you can check whether your GPUs are having any errors by using HWiNFO. If a card is throwing a lot of errors, try replacing its riser. However, it’s still best to follow along with the rest of the guide to ensure your problem gets resolved.
Step 5.a: Inside your mining software folder, there should be log files and lots of them if you’ve been crashing your rig.
Not all miners have log files, but ones like Claymore do. If by some chance your miner doesn’t have logs, skip to Step 7.a.
Step 5.b: Next, scroll down to the most recently dated files and check the logs for GPU failures or error messages.
Step 5.c: If you find error messages, take note of which card number caused the error first, then exit out and check the next 5 to 6 log files above the one you just exited from. Most times you’ll notice the same card is the first to fail when checking the logs.
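If you’d rather not eyeball half a dozen log files, the check in Step 5.c can be roughed out in Python. This is a sketch under assumptions, not a parser for any particular miner: I’m assuming an error line contains the word "error" or "fail" and mentions the card as "GPU" followed by its number. Real miner logs are worded differently, so adjust `GPU_PATTERN` and `ERROR_KEYWORDS` (both made-up names) to match what your miner writes.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Assumed format: an error line mentions "GPU <number>" or "GPU<number>".
# This is a guess at a typical layout, not the format of any specific miner.
GPU_PATTERN = re.compile(r"GPU\s*#?(\d+)", re.IGNORECASE)
ERROR_KEYWORDS = ("error", "fail")


def first_failing_gpu(log_text: str) -> Optional[int]:
    """Return the miner index of the first GPU named on an error line, or None."""
    for line in log_text.splitlines():
        lowered = line.lower()
        if any(word in lowered for word in ERROR_KEYWORDS):
            match = GPU_PATTERN.search(line)
            if match:
                return int(match.group(1))
    return None


def tally_first_failures(logs: Iterable[str]) -> Counter:
    """Count how often each GPU is the first one to fail across several logs."""
    counts: Counter = Counter()
    for log_text in logs:
        gpu = first_failing_gpu(log_text)
        if gpu is not None:
            counts[gpu] += 1
    return counts
```

Feed `tally_first_failures` the text of your 5 to 6 most recent logs. If one card number dominates the count, that’s the card to chase in the next step.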
Step 6.a: If you’ve found the card number that’s causing errors, close the miner if you have it running, then open the Device Manager. If the logs didn’t point to a card, go to Step 7.a.
Windows 10 TIP: You can shortcut to the Device Manager by right-clicking the Windows icon in the bottom-left corner and selecting Device Manager from the menu. Once inside, click on Display adapters to view the available GPUs.
SIDENOTE: It’s important to note that most miners index your cards from 0-7, but in Windows you count them from 1-8.
EXAMPLE: Let’s say the miner logs show GPU number two regularly causes an error message. Since the miner index starts at zero instead of one, I need to disable card number three in the Windows Device Manager, which is done by counting down the list of display adapters.
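The off-by-one in that example is easy to get backwards, so here it is as a tiny Python helper. `miner_to_windows_index` is a made-up name, and it assumes Windows lists the cards in the same order the miner does, which is exactly why the steps below double-check by locating the disabled card physically.

```python
def miner_to_windows_index(miner_index: int) -> int:
    """Convert a zero-based miner GPU index to its one-based position in Device Manager.

    Assumes both lists are in the same order, which is not guaranteed.
    """
    if miner_index < 0:
        raise ValueError("miner index cannot be negative")
    return miner_index + 1


print(miner_to_windows_index(2))  # prints 3
```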
Step 6.b: Once it’s disabled, you may want to restart your rig and make sure the card is still disabled in the Device Manager.
Step 6.c: If it’s still disabled after the restart, start the miner with no overclock settings and let it run until the GPUs begin to warm up. The reason is that you can then feel your way around the rig to locate the cold, disabled GPU. Once you find it, shut the mining rig down, unplug the disabled GPU’s riser, then restart the miner and run a test.
Step 6.d: If this fixes your problem, replace the disabled card’s riser and retest the rig, but don’t forget to enable the GPU before you test to see if your issues are solved. At this point you may search for stable overclock settings. Please read the warning below.
WARNING: Anytime you’re going to adjust, unplug, or plug in any hardware on your rig, you must always power down the machine entirely before proceeding. Don’t forget to wear your nerdy static guards too.
HARDWARE TIP: I used to buy cheap risers off eBay, but over the years I’ve noticed you get more stability with premium risers. If you do buy the cheaper risers, it’s totally fine, but do yourself a favor and keep an extra set for backups, as they tend to fail quite frequently.
At this point, if we’re still having issues, we’ll need to run a manual check on each GPU/riser.
WARNING: If you’re using an M.2 adapter, please keep a spare around. If you notice the faulty hardware was on the same channel, you’ll want to test the hardware on another channel immediately. To know for sure the M.2 adapter is faulty, try a few other GPUs on it and see if you can duplicate the error.
Step 7.a: By now you should have your mining rig powered down. Start by disconnecting all the risers from the motherboard, leaving only one GPU/riser plugged in.
Step 7.b: From here you’ll want to run a test to see if the rig shuts down or runs past the average time it would take for a shutdown to happen.
Step 7.c: If the card works, add in another card and repeat the test. If the GPU fails, test it on another riser to be sure, and if it still crashes, you may need to RMA the card. Then continue testing the other GPUs.
Be sure not to rush this, and only plug one card back in at a time, as we’re trying to narrow the issue down to the source. Let each test run at least 12 to 24 hours.
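To put a number on “the average time it would take for a shutdown to happen,” you can average the gaps between past crashes. Here is a small Python sketch of that arithmetic. The function names are mine, the crash timestamps are whatever you’ve noted down yourself, and the 12-hour floor simply mirrors the minimum test length suggested above.

```python
from datetime import datetime, timedelta
from typing import List


def average_uptime(crash_times: List[datetime]) -> timedelta:
    """Average gap between consecutive crashes. Needs at least two timestamps."""
    if len(crash_times) < 2:
        raise ValueError("need at least two crash timestamps")
    ordered = sorted(crash_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)


def minimum_test_length(crash_times: List[datetime], floor_hours: int = 12) -> timedelta:
    """Test at least as long as the average uptime, and never less than the floor."""
    return max(average_uptime(crash_times), timedelta(hours=floor_hours))
```

For example, crashes at midnight, 6 AM, and 6 PM on the same day give gaps of 6 and 12 hours, so the average uptime is 9 hours and the 12-hour floor still decides the test length.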
Eventually, you’ll narrow it down by testing all the cards and checking each suspect GPU on different risers to determine whether it’s a weak GPU or a faulty riser.
9.9 times out of 10 it’s going to be a riser, although I have had a few bad cards in my day.
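The GPU-versus-riser reasoning boils down to a simple rule of thumb, sketched here in Python. `classify_gpu` and the riser labels are hypothetical, and the verdicts are only as good as the tests behind them: a card that fails on one riser but passes on another points at the riser, while a card that fails on two or more risers points at the card.

```python
from typing import Dict


def classify_gpu(results_by_riser: Dict[str, bool]) -> str:
    """Interpret one GPU's test results.

    `results_by_riser` maps a riser label to True if the GPU ran stable on it.
    """
    if not results_by_riser:
        return "untested"
    if all(results_by_riser.values()):
        return "gpu and risers look fine"
    if any(results_by_riser.values()):
        # Passed somewhere, failed somewhere else: blame the risers it failed on.
        failed = sorted(r for r, ok in results_by_riser.items() if not ok)
        return "suspect riser: " + ", ".join(failed)
    if len(results_by_riser) >= 2:
        # Failed on every riser tried, and more than one was tried.
        return "suspect gpu"
    return "inconclusive: retest on another riser"


print(classify_gpu({"riser_a": False, "riser_b": True}))
```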
If at this point you’re still having issues, try a different version of the miner software and test. Another trick I’ve learned is to reinstall Windows, and if you still have errors, test the hardware again.
Other Related Questions
Why won’t my mining rig turn on? Most times when your mining rig won’t turn on you need to conduct a hardware check and make sure everything is installed correctly. For you to troubleshoot this issue faster, you need to power down the mining rig and disconnect every riser from the motherboard.
Check your RAM to make sure it’s pushed all the way into the motherboard. Next, I would check all the connections to and from the motherboard and power supply to be sure they’re all snapped in.
Another thing to watch out for is the rubber backings on the risers falling into your CPU fan. Also make sure the CPU fan is plugged into the correct port, and see if the fan spins freely.
If none of those fix the issue, then you’re going to have to test the CPU, RAM, MOBO, and PSU. This is why it’s imperative you keep at least a few risers, one CPU, and an extra stick of RAM to test each part.
Why are my mining rig
Test the fan: check whether it’s loose and whether it spins freely. If it seems OK, repeat the process in Step 1.a, and if that doesn’t work, try replacing the riser as a last-ditch effort. Otherwise, you might have a bad fan.
Don’t worry about replacing the fan; we’ll have a guide for DIY fan replacement soon.
Why does my mining rig keep restarting?
If your mining room is rather warm and you’re noticing these restarts, I would first check the miner logs to see if the GPU drivers are failing. If that’s the case, you can be almost sure it’s a GPU or riser.
On the other hand, if you listen and hear a click, be sure to check your power supply. Sometimes power supplies, especially the ones rated at 1000 watts, don’t do well when they start getting hot, which causes a thermal shutdown. One way to know for sure is to keep some spare parts around, swap in another power supply, and run tests.
I hope you learned something from this guide and if anyone has any questions about these steps be sure to drop a reply in the comment box below.