I cannot get further than GRUB, except into rescue mode. When I attempt to boot the main Fedora OS, it gets stuck searching for a disk indefinitely. The console shows "Job dev-disk-by<many symbols>.device/start running (1h / no limit)".
I have a Windows partition on the same drive, and it doesn’t boot either. Its rescue command prompt (from where you are instructed to open Notepad to rescue files) doesn’t “see” any disk but C: and X: (emergency boot).
I tried booting this machine with two live OS USBs: Fedora and SystemRescue. Neither of them lists the SSD (or anything but the USB drive’s own filesystem) in lsblk or the file manager.
Due to a lack of storage media, I haven’t done a backup in a while. How can I rescue the files? Many passwords are also stuck there in the Firefox password manager, which I wasn’t able to sync because I lost access to the 2FA email.
Edit: SOLVED! Needed to switch disk mode in BIOS/UEFI from RAID to AHCI. IDK how it got to RAID in the first place.
Like others said, sounds like hardware. But before you toss the drive: you said SSD, so if it’s a cabled SSD, try different cables and different ports. If it’s NVMe, try reseating it at least, or moving it to another slot if you have a second one. Just saying that sometimes ports and cables fail, so make sure you rule those out before losing hope.
Also possible it’s a BIOS thing, like maybe the port itself got disabled in the BIOS or the controller got switched to RAID mode.
As I wrote, GRUB with all customisations and rescue modes stored on this drive for both Windows and Linux work fine, so I find it unlikely to be a connector problem. Unless such a problem may lead to part of drive working fine and the other not. When SSD is out of socket, BIOS refuses to boot at all and makes loud sounds.
Sorry, I think my reading comprehension was shit there… I got fixated on rescue usb not seeing the disk.
No, I wouldn’t expect it to be a bad port if grub is loading (and the grub partition is on the same disk). Bios not booting at all with disk removed is strange too, I’d expect it to just boot the usb if that were plugged in while disk is not.
You said usb rescue lsblk doesn’t list the disk, guessing it doesn’t show up under /dev/disk/by-id either? lspci? How about with a windows install usb, does it see the disk?
I think I tested the SSD-out scenario without the live USB plugged in.
Only USB drive itself shows under by-id. I don’t have a Windows install USB, the windows I talked about is a partition on the broken disk. It does see the Linux partition with DiskPart but can’t mount it or extract files from BTRFS.
LSPCI lists many cryptic names, “RAID bus controller” sounds like the most promising one. https://termbin.com/287u
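To make the "cryptic names" easier to scan, here is a minimal Python sketch that picks the storage controllers out of lspci output. It assumes the default `slot class: description` line layout; the class names in `STORAGE_CLASSES` are a partial list chosen for illustration, and the sample line is made up, not taken from the linked paste.

```python
# Partial list of lspci class names that indicate a storage controller
# (an assumption for illustration; extend as needed).
STORAGE_CLASSES = (
    "RAID bus controller",
    "SATA controller",
    "Non-Volatile memory controller",
    "IDE interface",
    "SCSI storage controller",
)


def storage_controllers(lspci_text):
    """Return (slot, class, description) tuples for storage controllers."""
    found = []
    for line in lspci_text.splitlines():
        # Default lspci lines look like "00:17.0 <class>: <description>".
        slot, _, rest = line.partition(" ")
        device_class, sep, description = rest.partition(": ")
        if sep and device_class in STORAGE_CLASSES:
            found.append((slot, device_class, description))
    return found


# Made-up sample lines, only to show the expected input shape.
sample = (
    "00:02.0 VGA compatible controller: Example Vendor Graphics\n"
    "00:17.0 RAID bus controller: Example Vendor SATA Controller [RAID mode]\n"
)
for controller in storage_controllers(sample):
    print(controller)
```

A "RAID bus controller" entry with no separate "Non-Volatile memory controller" entry can hint that an NVMe drive is hidden behind the RAID controller rather than exposed directly.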
Having the bios able to see the disk, but a live boot can’t, makes no real sense to me. If a partition was messed up I’d get that, but to not even see there is a disk to partition, doesn’t feel right. I know it’s probably a dumb question, but you didn’t happen to be messing around in BIOS settings or something right? Is it possible you changed some settings a while ago but haven’t rebooted in a while, and this issue was waiting for you this whole time?
If you don’t have any other slots on the mother board to try the disk in, you could buy an external adapter for whatever kind of disk you have, which would allow you to use this thing as a USB drive. That should at least allow the live boots to see it.
Also also, is it possible you have two disks, and grub is on one and your data is on the other? Again, kinda weird question, but it’s a kinda weird situation…
I don’t think I changed anything relevant in the bios, I’m always quite careful with changing things there. Maybe I changed boot order to test live USBs a couple times. Reboots are frequent due to system crashes.
Also also, is it possible you have two disks, and grub is on one and your data is on the other? Again, kinda weird question, but it’s a kinda weird situation…
No, only one drive. I tried removing this drive, and it failed to get to GRUB. Plus, the rescue mode of the Windows partition on the same drive boots and sees its own 200 GB of files; they definitely can’t be anywhere but on the 512 GB SSD.
This might not be your issue, but it might be useful to know: if you connect two disks with the same GPT UUID, they might start misbehaving. Perhaps GRUB might not care, but Linux does. Did you ever clone a disk or partition without changing the UUID?
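If you want to check for clashing identifiers, a rough Python sketch follows. It assumes "GPT UUID" means the partition-table UUID (lsblk's PTUUID column) and that the input is the JSON printed by `lsblk -d -J -o NAME,PTUUID`; the sample JSON and its UUID are invented for illustration.

```python
import json
from collections import defaultdict


def duplicate_ptuuids(lsblk_json):
    """Map each partition-table UUID shared by several disks to their names."""
    groups = defaultdict(list)
    for dev in json.loads(lsblk_json).get("blockdevices", []):
        ptuuid = dev.get("ptuuid")
        if ptuuid:  # disks without a partition table report null
            groups[ptuuid].append(dev["name"])
    return {uuid: names for uuid, names in groups.items() if len(names) > 1}


# Invented sample: two disks cloned without changing the disk GUID.
sample = json.dumps({
    "blockdevices": [
        {"name": "sda", "ptuuid": "11111111-2222-3333-4444-555555555555"},
        {"name": "sdb", "ptuuid": "11111111-2222-3333-4444-555555555555"},
        {"name": "sdc", "ptuuid": None},
    ]
})
print(duplicate_ptuuids(sample))
```

An empty result means no two listed disks share a partition-table UUID. Filesystem UUIDs would need a separate check.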
Don’t recall
I’m a btrfs noob, so I’m skipping any tools that may fix the disk; try those first.
If the media isn’t dying, try a file carving tool like photorec. Idk if testdisk supports btrfs but it’s worth a try, it’s my go-to for undelete and finding lost partitions.
Edit: missed most of your post. First you need to check the kernel logs using the dmesg command. Look for errors that may explain why the disk doesn’t show up, especially if it lists scsi or sata in the message.
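Kernel logs are long, so a small filter helps. This is a minimal Python sketch, not a definitive tool: the keyword list in `STORAGE_PATTERN` is an assumption about which subsystem names are worth looking at, and the sample lines are made up to show the input shape.

```python
import re

# Storage-related subsystem names to look for (an assumed, non-exhaustive list).
STORAGE_PATTERN = re.compile(
    r"\b(nvme\w*|ahci|ata\d+|scsi|sata|raid)\b", re.IGNORECASE
)


def storage_lines(dmesg_text):
    """Return the dmesg lines that mention a storage subsystem."""
    return [
        line for line in dmesg_text.splitlines()
        if STORAGE_PATTERN.search(line)
    ]


# Made-up sample lines; real input would be the text printed by dmesg.
sample = (
    "usb 1-1: new high-speed USB device number 2 using xhci_hcd\n"
    "ata1: SATA link down (SStatus 0 SControl 300)\n"
    "nvme0n1: p1 p2 p3\n"
)
for line in storage_lines(sample):
    print(line)
```

To use it on a live system, save the output of `dmesg` to a file (root may be required) and pass the file's contents to `storage_lines`.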
Edit again: you may want to check the disk’s self-reported health using SMART data. Many BIOS menus show this info, and there are programs to get it on Linux. If there are too many read or write errors, you need to decide how important the data is. Professional recovery can probably get all of your data if you stop using the drive now and send it in. DIY recovery using a file carving tool would work best if you have another disk to make an image of the failing one with. ddrescue would be the tool for the job to create the image. If you don’t have another disk large enough, and the files aren’t super important, you can run file carving on the failing disk directly, but the more you use it, the greater the chance the disk will corrupt more data.
Latest DMESG logs are several days old, I’m not sure I had this problem back then. Nothing seems to be useful there, standard USB connect/disconnect stuff.
I enabled S.M.A.R.T. in BIOS, it seems it doesn’t see an issue with the drive (PredictFailure in wmic is FALSE)
PhotoRec and TestDisk do not see the disk when booted from live media, only the 15GB drive itself and 804MB loop0.
Can you post your dmesg output from booting a live USB? Maybe there will be a clue in there.
How is this ssd connected? Nvme? Sata?
It appears the time was broken, the logs were new.
Here are logs from the live USB. This link expires in 2 weeks; I will preserve it if you find anything relevant there: https://termbin.com/975x
NVMe
When you boot a recovery image does it have any dmesg logs though? You should see something there when you try to mount it
Does “fdisk -l” show all your partitions on that drive?
No, fdisk shows only the drive itself.
I didn’t find any obvious errors in dmesg logs, but again I don’t know much about them. You may check them out here: https://termbin.com/975x
This seems suspicious
Found 1 remapped NVMe devices. Switch your BIOS from RAID to AHCI mode to use them.
Have you set up a RAID?
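For anyone hitting the same symptom, the message quoted above can be searched for directly. A minimal Python sketch, assuming the wording in your kernel log matches the line quoted in this thread:

```python
import re

# Matches the kernel message quoted in this thread.
REMAPPED = re.compile(r"Found (\d+) remapped NVMe devices")


def remapped_nvme_count(dmesg_text):
    """Sum the remapped-NVMe counts reported in the log text (0 if none)."""
    return sum(int(count) for count in REMAPPED.findall(dmesg_text))


# The sample is the line quoted from the poster's live-USB dmesg output.
sample = (
    "Found 1 remapped NVMe devices. "
    "Switch your BIOS from RAID to AHCI mode to use them."
)
if remapped_nvme_count(sample) > 0:
    print("NVMe drive is hidden behind RAID mode; check the BIOS storage setting.")
```

A non-zero count means the drive exists but is being presented through the RAID controller, which is why lsblk on a live USB shows nothing.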
Don’t recall doing so. I think I saw something about automatic switch to/from RAID and AHCI in boot logs?
I’ve actually seen “something/AHCI/RAID” switch in BIOS set to RAID. Will try switching to AHCI.
Edit: IT WORKED! I changed RAID to AHCI, now the system boots as expected. Thank you. Will change to solved.
CachyOS is very btrfs savvy. I would try booting from a CachyOS ISO.
All distros are. 🙄
Yes, but some do things better than others. CachyOS is btrfs by default, and does btrfs better than most. Btrfs is a bit more complex than your vanilla ext4.
Bazzite excels at gaming related things. Alpine at lightweight stuff, Nix at immutability, etc…
That is one of the defining characteristics of Linux.
Yes, but some do things better than others.
Ehhhhhh, kinda. “better” is highly subjective. Distros are “bundles of software” and a philosophy about how things are installed / when they are installed / what their default settings are / etc.
A lot of people, especially newbs and less technical folks, grossly misunderstand what those differences are and what they mean.
CachyOS is btrfs by default, and does btrfs better than most.
Bullshit. Not that it’s btrfs by default but that it does it “better” than anyone else is ridiculous. It uses the same kernel driver and user-land tools as my Pop_OS install which is based on Ubuntu 24.04 and which, believe it or not, is running btrfs just fine.
Btrfs is a bit more complex than your vanilla ext4
Kid - I remember when ext4 was released. Very exciting to have a journaling filesystem at the time…
Bazzite excels at gaming related things. Alpine at lightweight stuff, Nix at immutability, etc…
“excels” at meaning “has steam installed by default” and “makes nvidia drivers easier to install” you mean.
My Pop_OS laptop runs games just fine.
All these distros with their various design goals and bundled packages are variations on a theme. Like different Lego sets that include parts from a common box. But when you use the “btrfs” block you use the same one everyone else does.
And none of them will deal gracefully with a failing disk that the OS has been told not to ignore errors on.
What you describe sounds like a hardware failure (one where btrfs plays no specific role).
If that’s indeed the case, you can only bring the drive to a data recovery service and see what they say (if it’s a spinning disk, they’ll probably recover the data for an exorbitant fee, if it’s an SSD idk).
PS: this is unlikely to work, but… you can try cleaning the drive’s contacts to see if it makes any difference, and also try moving the drive to a different connector (or use it on another computer)
I already removed the drive, contacts seemed clean. Connecting to another computer is the next thing I’m going to try.
If they don’t see the drive at all it’s fucked. If they see the drive but not partition I normally use testdisk to try and recover the partitions, photorec to look for files if that doesn’t work and it’s not encrypted.
If it isn’t visible at all then you are looking at drive repair, if you are asking this question that probably isn’t something to attempt. Try another set of cables and port just in case that is the issue though.
External live OSes don’t see the drive, but some things on drive when it is operating (like GRUB) work. DiskPart on Windows partition sees all partitions.
Every live OS I have ever run can see physical drives if they exist, sounds very odd to not be able to see it from there and yet grub loads.
If grub loads, can you boot a recovery image from there? Are you sure grub isn’t installed anywhere else too?
You mean emergency console image? Yeah, that’s what I’ve been working with. Older kernels also don’t work.
For the second one, I checked by removing the SSD from its slot: it refused to boot to GRUB, and the BIOS gave an error. So it’s on the SSD. Plus, the Windows partition console sees all its files, ~200GB.



