Question: Anyone else noticing a crazy number of NVMe power cycles per running hour? (250,000 cycles for 278 hours)

abufrejoval

Reputable
Jun 19, 2020
507
359
5,260
One of my notebooks is a Lenovo Yoga Slim 7 13ACN5 with a Ryzen 5800U APU, which has actually seen relatively little action: it had only around 280 hours on the clock when it started to run into problems booting Windows. Ubuntu still worked, but also reported drive errors (both OSes shared the same 500GB WDC SN730 drive).

When I tried to find out what was going on I saw that there was an insane number of power cycles on the drive, close to 250,000 power cycles for those 280 hours! Only around 7.3TB had been read and 6.6TB written, not much use by any count.

SMART gave a critical warning so I transferred the Windows installation to a new Micron NVMe drive which came with another Lenovo laptop, while Ubuntu was let go.

But a similar story keeps repeating on that other drive: after 1061 hours of operation the power cycles have risen to almost 80,000, and the other day the machine wouldn't come back from hibernation, evidently because the hibernation file contained a replaced sector. The available spare percentage was 83%, which seems quite low.

I also noticed that out of the nearly 80,000 power cycles, around 60,000 were listed as "unsafe" and that there were 30,000 error log entries, which I haven't yet tried to look at (no nice Windows GUI that I know of for that).
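For anyone wanting to look at the same counters without a GUI: smartmontools can print them as JSON (`smartctl -a -j <device>`). The following is a minimal Python sketch of summarizing that output. The key names are assumed to match what smartctl emits under `nvme_smart_health_information_log`, so check them against your own output, and the sample values are just the rounded figures from this post, not real tool output.

```python
import json

# Sample shaped like `smartctl -a -j` output for an NVMe drive.
# ASSUMPTION: key names follow smartctl's JSON layout; verify on your system.
# Values are the rounded figures quoted in the post, not captured output.
SAMPLE = json.loads("""
{
  "nvme_smart_health_information_log": {
    "available_spare": 83,
    "power_cycles": 80000,
    "power_on_hours": 1061,
    "unsafe_shutdowns": 60000,
    "num_err_log_entries": 30000
  }
}
""")


def summarize(report: dict) -> dict:
    """Pick out the counters discussed here and the share of unsafe shutdowns."""
    log = report["nvme_smart_health_information_log"]
    cycles = log["power_cycles"]
    return {
        # Fraction of all power cycles that were logged as unsafe shutdowns.
        "unsafe_share": log["unsafe_shutdowns"] / cycles if cycles else None,
        "available_spare_pct": log["available_spare"],
        "error_log_entries": log["num_err_log_entries"],
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

With the sample figures, three out of four power cycles count as unsafe. To run this on real data you would save the smartctl JSON to a file and pass it through `json.load` instead of using `SAMPLE`. The error log entries themselves should be listable with `smartctl -l error <device>` or `nvme error-log <device>` from nvme-cli, though I'd treat the exact output format as something to check on your own machine.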

The count seems to increase roughly once every couple of minutes while the machine is in an energy-saving state; without power (shutdown or hibernation) little seems to change.

I've compared this with the various other laptops and systems I operate, and their numbers seem entirely sane. The other extreme is a corporate laptop which has run 24x7 for years and has only 134 power cycles for nearly 17,000 hours of operation, but most machines tend to have 2-3x as many hours as power cycles and rarely numbers this high.
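As a rough illustration of how far apart these machines are, here is a small Python sketch that turns the counts quoted above into power cycles per power-on hour. The helper name `cycles_per_hour` and the machine labels are made up for this example, and the inputs are the rounded numbers from this post (the corporate laptop's hours taken as 17,000), so the results are approximate.

```python
# Rough comparison of power cycles per power-on hour.
# Inputs are the rounded figures quoted in the post; labels are illustrative.

def cycles_per_hour(power_cycles: int, power_on_hours: int) -> float:
    """Return the average number of power cycles per power-on hour."""
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return power_cycles / power_on_hours


MACHINES = {
    "WDC SN730 (original drive)": (250_000, 278),
    "Micron (replacement drive)": (80_000, 1_061),
    "corporate laptop (24x7)": (134, 17_000),
}

if __name__ == "__main__":
    for name, (cycles, hours) in MACHINES.items():
        print(f"{name}: {cycles_per_hour(cycles, hours):.3f} cycles/hour")
```

With these inputs the original drive comes out at roughly 900 cycles per hour, the replacement at about 75, and the corporate laptop at well under 0.01.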

My impression is that every time Windows wants to tell the SSD that little is currently going on and that it may want to save some power, it's actually cutting power, and unsafely, too.

It's the only AMD notebook I have in operation, but currently the majority of my desktops are Ryzens and show no such behavior.

All BIOSes are checked and updated on those monthly patch days, and likewise the NVMe drives all have their most up-to-date firmware. Energy saving is set to "balanced" and/or "intelligent" wherever I'm given a choice.

I'm also running the newest drivers from Lenovo (laptops) or AMD (desktops).

Of course the machine is from 2021 and out of warranty. Lenovo's online support chat people feign "currently experiencing technical difficulties" when I try that avenue, and I haven't really found a lot of similar stories.

When the topic is raised at all, most responses say not to worry, but at nearly 1000 power cycles per hour, clearly some drives are throwing in the towel.

To me it sounds like a firmware issue where the wrong commands are sent to the NVMe drive during power management, but as an end user I don't see how I could diagnose that.

So, could you guys have a look and see if you notice similarly high power cycles/operation hours on some of your machines and if certain Lenovos or AMD systems stick out?
 

Ralston18

Titan
Moderator
Windows OS - correct?

Look in Task Manager, Resource Monitor, and Process Explorer to see what the system is doing or trying to do with all of that activity.

Use one tool at a time to look. You may need to download Process Explorer (Microsoft, free):

https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer

Other things you can do:

Turn off/disable all screen/power savers.

Run the built-in Windows troubleshooters. The troubleshooters may find and fix something.

Run "dism" and "sfc /scannow".

https://www.windowscentral.com/how-use-dism-command-line-utility-repair-windows-10-image

https://www.lifewire.com/how-to-use-sfc-scannow-to-repair-windows-system-files-2626161

Could be some app, utility, or even malware running in the background. The key is to identify potential culprits and investigate further. It may be just a matter of stopping something from launching at startup or later being triggered via Task Scheduler.
 

abufrejoval

Yes, Windows 10 22H2 Enterprise originally, now Windows 11 23H2 Enterprise.

And it was identical for both.

And the issue is really when the box is at idle, not really running anything. I clean out all the usual cruft, reduce M$ nasties to a bare minimum, no Edge, OneDrive, Teams, Copilot, Weather etc. no apps per se running, no browser open or in the background.

I run CrystalDiskInfo, record the power cycles count, then put the machine into energy savings mode, wait for half an hour, touch the keyboard to have it come back and run CrystalDiskInfo again... only to find that the power cycles have increased at the rate of one every couple of minutes.
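That before/after check boils down to simple arithmetic, sketched here in Python. The `Reading` class and `minutes_per_cycle` helper are hypothetical names, and the two readings in the example are invented placeholder numbers, not values recorded from this machine.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Reading:
    """One manual reading of the drive's power cycle counter."""
    taken_at: datetime
    power_cycles: int


def minutes_per_cycle(before: Reading, after: Reading) -> Optional[float]:
    """Average minutes between power cycles, or None if the count did not move."""
    added = after.power_cycles - before.power_cycles
    elapsed_min = (after.taken_at - before.taken_at).total_seconds() / 60
    if added < 0 or elapsed_min <= 0:
        raise ValueError("readings must be in chronological order")
    if added == 0:
        return None
    return elapsed_min / added


if __name__ == "__main__":
    # HYPOTHETICAL readings: half an hour in energy saving, 15 extra cycles.
    before = Reading(datetime(2024, 1, 1, 14, 0), 79_000)
    after = Reading(datetime(2024, 1, 1, 14, 30), 79_015)
    print(minutes_per_cycle(before, after))
```

A result of `None` is what a healthy machine should give over an idle half hour, since the counter should not move at all while the system merely sleeps.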

Power LED was slowly blinking all the time, indicating that the machine really wasn't waking up (at least not fully) due to some I/O.

I did the same in parallel on a slightly older Lenovo with an Intel Whiskey Lake CPU running the same Windows 11 and very nearly the same software in general: there are zero extra power cycles there.

Actually, it might even have done the same on Linux, as it was running Proxmox installed on top of a standard Debian for a while. That's why I think it's perhaps a firmware issue, not even Windows related.

But the replacement NVMe was only ever run with Windows, and I didn't put it into energy saving on purpose. It mostly ran during conferences, with large gaps when I didn't use it, so Windows would have told the storage to save some power.

I noticed the issue probably more than a year ago (still within the warranty period), but had no real time to dig deeper or to document it properly. I did the Windows 11 upgrade after that so that goes a bit further than DISM and SFC...

Searching the web for clues didn't turn up anything, so this is a bit of a last ditch effort before turning this machine into something "special purpose" and replacing it with a new model that hopefully doesn't have that issue.

Actually, when it had trouble coming back from hibernation it seemed very much dead: it wouldn't even let me enter the BIOS, and I was checking prices for a motherboard replacement.

Then I ordered a new Thinkpad X13 instead because it seemed like better bang for the buck...

Just to make sure it wasn't 100% dead I removed the NVMe and saw that it now booted into the BIOS. I ran Fedora 40 from a USB stick to confirm that the machine itself was OK, then put the NVMe back and had Windows clean up the flaky hibernation by booting into debug mode once and then booting cleanly.

Disabled the "fast startup" option, which I hate madly and which keeps creeping in on updates...

It seems good, fast and doesn't complain.

But those power cycles still climb in a very unnatural manner, so sooner or later it will fail again. I'm also pretty sure that fully power cycling the NVMe drive isn't doing much good for the battery, either: the drive should really just go to an energy-saving state, not go through a power cut and reboot.
 