New PC shutting off - advice?

I posted this on Reddit, but am going to copy it here, because you guys are great and give good advice-

New build, brand new from scratch, about a month old. 13700, ASUS Strix 4070 Ti, 32 GB RAM, M.2 and SSD, MSI MPG PCIe 5.0 1000W PSU, ASUS Z790 motherboard.

It has done ok until a week or so ago, had one hiccup. It shut off while playing a random indie disc golf game. I turned it back on and it seemed fine after that.

Last night I was playing Fallout 76, not really a game I would consider a heavy load since it’s an older game, and it shut off again. I turned it back on and loaded the game, and when I went back into the server, the PC shut off again right as it put me back in the game where I had been. (The game loaded fine; it shut off when I logged back into the server.)

So then I turned it back on with the thought of doing some research on what to look for. At this point, I was just surfing the web. The PC would not run over 3-4 minutes without shutting off.

I turned my old PC on to research and left the new one off. I figured out some things to try, and I did think to unplug the PSU for a few minutes, then turn it back on.

At this point it seemed to run OK. I ran the Windows Memory Diagnostic, and it passed that. I did a virus scan and it passed that. It was late and I was tired, so I shut down normally and went to bed.

Other things of note and my questions-

Temperatures have seemed fine. So far I have not seen the CPU temps over the high 50s. It has averaged 35-45C for the past few weeks, mainly playing Fallout and 2K PGA golf, and last night it was still running that average. When I monitored the GPU temp last week, it seemed about the same.

At some point about a week ago I started messing with the Armory Crate “smart” fan control. That wouldn’t cause it to shut off, or would it? I put it back on standard/turbo speed just in case, but could that be a cause?

My obvious fear is the PSU having a problem, but I don’t know how to test that part, and it’s not like I have another high-end PSU lying around that would run my 4070.

I did notice the 4070 fans not running, but didn’t think anything of it because, as I understand it, they won’t run at idle. I do think they spin when the system turns on, but honestly, I haven’t paid much attention yet, because GPU temps haven’t seemed to be an issue.

Any thoughts or advice on what I should do next?

Sounds like possibly CPU overheating to me. What cooler did you get? And if you are wondering if the fan controller settings messed it up that would be easy to test.

It’s a Noctua NH-D15. Thing is, it’s got a temp monitor that I can see, and the CPU temp has not gone over 60C. It usually stays around 35-45C, even while I’m gaming.

I plan to test that fan control thing, probably tomorrow night.

Someone local told me that even though the CPU is reporting OK temps, something else that does not report its temp may be overheating, or the “smart” fan control may be inadvertently sending a command to turn off.

I feel like I had a similar problem. I’m trying to remember…

I want to say it was the automatic overclock settings in the BIOS. It was setting it to something weird for “game mode” type scenarios. Check your BIOS for stupid stuff?

Here’s a tip. Open up Event Viewer, expand Windows Logs, select System, and check the logs from when the system crashed. There should be a red error icon where the PC crashed and left a log. Check what it says in the description box below and let us (or Google) know what happened, so we can see if we can help you.

Keep in mind, Event Viewer logs events constantly, so in your case, the entry might be really hard to find.
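
Since the log is that busy, one way to narrow it down might be to export the System log to CSV (Event Viewer has a "Save All Events As..." action) and filter it with a small script. This is only a rough Python sketch: the column names, the date format, and the sample rows are all assumptions made up for illustration, so they would need adjusting to match a real export. It picks out each Critical event and lists whatever was logged in the few minutes before it.

```python
import csv
import io
from datetime import datetime, timedelta

# Assumed date format; a real export may use a different, locale-specific one.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Made-up rows in an assumed column layout, only here so the sketch runs.
SAMPLE_CSV = """Level,Date and Time,Source,Event ID
Information,2023-05-23 21:12:02,Kernel-Power,125
Information,2023-05-23 21:14:40,Service Control Manager,7036
Critical,2023-05-23 21:15:05,Kernel-Power,41
Information,2023-05-23 21:40:00,Service Control Manager,7036
"""


def load_events(text):
    """Parse CSV text into a list of rows sorted by timestamp."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        row["when"] = datetime.strptime(row["Date and Time"], TIME_FORMAT)
        events.append(row)
    return sorted(events, key=lambda e: e["when"])


def events_before_crashes(events, window_minutes=5):
    """Pair each Critical event with the events logged shortly before it."""
    window = timedelta(minutes=window_minutes)
    groups = []
    for crash in events:
        if crash["Level"] != "Critical":
            continue
        nearby = [
            e for e in events
            if crash["when"] - window <= e["when"] < crash["when"]
        ]
        groups.append((crash, nearby))
    return groups


for crash, nearby in events_before_crashes(load_events(SAMPLE_CSV)):
    print("Crash:", crash["Date and Time"], crash["Source"], crash["Event ID"])
    for e in nearby:
        print("  before:", e["Date and Time"], e["Source"], e["Event ID"])
```

To try it on a real export, replace `SAMPLE_CSV` with the contents of the saved file and widen `window_minutes` if nothing shows up.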

Thanks guys, I will check both of those things.

I had a chance to work on this last night. The PC has not been on since Tuesday.

I turned it on and looked at the BIOS, but couldn’t find anything that stood out as being changed or weird. Then I looked through the Event Viewer and found the red errors before the shut offs on Tuesday. Then I messed around a little more in Windows. Things seemed to be going OK, and it had been probably about 15 minutes or so, so I decided to try my golf game.

Golf game loaded ok, then Bam, got on the course, 1st hole, and PC just went dead, turned off.

Tried to power back up, and it immediately shut off again. Waited a few minutes, tried again, and it got into Windows, but shut off while opening the Event Viewer. I think I may have tried again. It would not run over two minutes.

OK, there was a lot of stuff in Event Viewer, but the one thing I saw before the shutoffs on Tuesday that seemed not normal, and I saw it several times, was this-

ACPI thermal zone \_TZ.TZ00 has been enumerated.

Then there were a bunch of numbers. I did take pics with my phone if someone wants to see all that. The event ID was 125.

Further testing- I did do some more testing after this, as follows.

I booted the PC into the BIOS. As long as I stayed in the BIOS, the PC stayed on. I probably had it on 20-30 minutes in the BIOS.
Next I got the PC to boot into safe mode. As long as I had it in safe mode, it stayed on, and did not shut down.

At this point I started thinking that it may be software related, if it would run in the BIOS and in safe mode, but not in regular Windows. In addition to messing with the AI fan controls, it dawned on me that I HAD been doing a lot of driver updates with Armory Crate right before this started happening. I was just clicking the button to automatically install any update it could find. Could this be causing it?

Anyways, thinking it may be Armory Crate, I tried to uninstall it from safe mode. It would not uninstall. Said it couldn’t uninstall it. Is that a safe mode thing? So then I went to startup and services and disabled every Armory Crate and ASUS thing I could find.

Let it start up regularly, with the ASUS stuff disabled, and it shut down within a minute.

At that point I was sleepy and tired so went to bed.

I’m still not sure if I have a true hardware failure, something in the PSU, something thermal, or something in Windows, or if a driver that I updated is causing a false shut off command. My next thought is to just format and reinstall Windows and see if it will run.

Any thoughts, with this new info?

My hunch is Armory Crate is causing the crashes. I searched Armory Crate causing crash and got many angry results. If you can’t uninstall it, which seems shady, do a Windows reinstall and never install that trash again.

Can you post the complete error log?

Looks like your CPU is overheating. Just so we can confirm.

ASUS fucked up last gen motherboards for both AMD (burning CPUs and MBs) and Intel platforms (your problem) with bad BIOS.

I’m using Armory Crate for RGB control without problem.

UPDATE:

Was able to work on this over the Holiday weekend-

-Started diagnosing things, but all of a sudden, the PC started shutting off in BIOS and safe mode… OK, that started telling me it was not Armory Crate or Windows. Started testing hardware at that point.

Took out the GPU, ran on integrated graphics. Still shutting off, so probably not the GPU. (whew!)

Reapplied thermal paste, redid the cooler. Reseated the RAM stick that was under the cooler. Still shutting off, so probably not a thermal issue with the CPU.

Went to Best Buy and bought a PSU (because I did not have an extra one to test with). Installed that (still on integrated graphics). Not only was it still shutting off, but now it got WORSE. With the new PSU it would not run over 30 seconds, and sometimes wouldn’t even boot before shutting off.

So now, the next thing to try might be RAM. So I took out the RAM stick I did not reseat, the one you could get to easily.

BAM. Everything comes up and works fine.

Cautious and surprised, I put the original PSU back in, and ran the PC the rest of the evening on integrated graphics, just surfing and stuff. It did not cut off.

Sunday, put the GPU back in. Everything fine. Ran the three games that I have installed, and all ran fine. Never shut off. Had fun.

Monday, ran another test. After I was satisfied that the PC was not shutting off, I updated the BIOS on the motherboard. Had to update the Intel ME firmware along with it, and Windows drivers. The Windows drivers seemed OK, and so did the BIOS and Intel ME. It was painless, went fine, and the PC came back up. Gamed the rest of the night, no problems. Cautiously optimistic.

So at this point I’m thinking it’s one of two things-
A) I have a bad stick of RAM
or
B) This is the one I’m hoping is the cause - Since the only things I can remember changing were running that AI Armory Crate fan expert stuff (which is rumored to make behind-the-scenes changes to the BIOS) and letting it mass update drivers, my thought is that maybe it changed a memory voltage or something, or maybe a driver got updated that was not compatible with the older BIOS. Since this is DDR5-6000, it’s all pretty new stuff.
Does that sound plausible?

Next step (probably will be Thursday before I can have time to try this) is to put the RAM back in, and see what happens and go from there.

At least at the moment, it is running stable, and I know what piece of the puzzle to look at.

Still odd that it was getting that error that pointed to a CPU thermal issue…

Option B feels likely. But bad RAM is somewhat common.

It is possible that the two RAM sticks are NOT the same, so when Armory Crate does some shenanigans, it makes the RAM sad.

Especially if it’s DDR5 RAM. It probably stopped retraining the RAM, and it failed at some point and kept failing. Also, Armory Crate requires kernel access on your system, which means access to the controllers for your RAM if it has RGB. If the RAM didn’t retrain properly and ACPI was trying to do some stuff with your RAM, then the issue could be that the app just went nuts and shut down your PC. Or I’m just talking out of my ass and it’s none of that :stuck_out_tongue:

If I were you, I’d just remove anything ASUS from the system and use one of the open-source fan/RGB controllers to manage these things. Or go the better and more expensive route…replacing the ASUS mobo with something more reliable.

I’ll admit I don’t know a lot about DDR5, or even what the retraining means. The only thing I did was turn on the XMP setting.

I’d be surprised if the two sticks were not the same, they came as a package, together. But who knows, people do make mistakes.

And, barring a complete meltdown (which is apparently not out of the question right now with ASUS!) I am stuck with the board, as I don’t have the funds to replace that. LOL

The BIOS shenanigans in the background where it changes RAM clock when you play games is what happened to me so that’s why I figured it was armory crate related.

Any luck?

Sorry, I have been swamped the last couple days at work and was also gonna give it another day or two before posting.

But here is the latest update-

After updating the BIOS and stuff, I ran for a couple days, and everything worked fine.

I put the second stick of RAM back in. The PC ran for 2 days, maybe three, and I was thinking OK, that was it! Then (after several days of being OK) it started doing it again. Consistently.
Pulled the second stick of RAM back out, and it went back to having no problems.

So now I disabled XMP. Put the second stick back in.

I also checked my RAM against the motherboard’s QVL (qualified vendor list) to make sure my RAM model is approved for my model of mobo. It’s on there, listed as QVL approved, or however it’s worded.

I’m about 3 days into running both sticks of RAM, but with XMP disabled. Sort of waiting to see, since so far it has run ok for a couple days after reinserting the second RAM before failing.

If it continues to work though, what steps should I take then?

Should I run some type of memtest with XMP on and off? I noticed there is a memtest in the BIOS. Should I just run with XMP disabled? I mean, not like I’m going to really notice the difference in speed, since I’m not an overclocker, but the other thing is, you want things to work like they should. Should I buy another set of RAM and try it? Anything else you guys would suggest?
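
One way to keep that kind of testing organized might be to write out every combination of stick, slot, and XMP setting, test them one at a time, and then see which factor shows up only in the failing runs. The Python below is just a sketch of that bookkeeping: the stick and slot labels are hypothetical placeholders (use whatever is printed on the board), and "passed" means whatever test you settle on, such as a full memtest pass or an evening without a shutdown.

```python
from itertools import product

STICKS = ["stick 1", "stick 2"]  # the two modules from the kit
SLOTS = ["slot A", "slot B"]     # hypothetical labels, not real board names
XMP = [False, True]              # XMP disabled / enabled


def single_stick_plan():
    """List every run with one stick installed: stick x slot x XMP."""
    return [
        {"stick": stick, "slot": slot, "xmp": xmp}
        for stick, slot, xmp in product(STICKS, SLOTS, XMP)
    ]


def suspects(results):
    """Return factor values that appear only in failing runs.

    results maps (stick, slot, xmp) -> True if the run passed.
    """
    found = []
    for index, values in enumerate((STICKS, SLOTS, XMP)):
        for value in values:
            runs = [ok for key, ok in results.items() if key[index] == value]
            if runs and not any(runs):
                found.append(value)
    return found


for run in single_stick_plan():
    print(run)
```

If every run with one particular stick fails no matter the slot or XMP setting, that points at the stick. If failures follow a slot instead, that points at the board. Since the shutdowns here have taken days to show up, each run would need to go long enough to mean something.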

So it is the overclocking settings in the BIOS. I mean, technically your RAM is meant to run at the speed listed on it, and overclocking is pushing it beyond that. Just leave it off and have fun. I doubt you need to overclock your RAM.

He paid extra for a motherboard which has XMP functionality. I’d be furious if I paid for something that I cannot use. But then again, I’m Slavic and my fuse is short by default :stuck_out_tongue:

Now he learned not to spend money on things he doesn’t need

It has started doing it again. I knew I had posted too soon.

Yes, I am not happy.

I came home after the last message, three days ago, and the PC oddly would not turn on at all. I mean, the mobo light was on, but pressing the power button would not do anything. I unplugged the power, plugged it back in, and then it came on. It worked fine the rest of the night.

Friday night it worked fine all evening.

Yesterday, it shut off again, right after I booted up. I took the second stick of RAM out, (XMP still off, been off since the last message) and booted it back up, and it worked fine the rest of the night.

Today, it will not stay on long enough for me to check an error log. No matter what I do. I boot up and it stays on about a minute and shuts off.

I’m at a loss. :frowning: Yes, I spent money but all I was after was to have a machine that I could use to play up to date stuff for a while. I mean, I’ve been running a 4570 chip for years.

Edit: After sitting dejected for a bit, I powered the PC back on and went into safe mode. It does not seem to shut off (YET) in safe mode. The only critical error I keep seeing from today in the Windows log says something about the cplspon service terminating, event ID 7023.