I have a problem with a Western Digital SATA drive when running under Linux. Sometimes errors get thrown and the device stops responding for a while. The drive performed perfectly under Windows. The errors show up in 'dmesg' like this:
ata8.00: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
ata8: SError: { PHYRdyChg }
ata8.00: failed command: WRITE DMA EXT
ata8.00: cmd 35/00:80:3f:47:12/00:03:00:00:00/e0 tag 0 dma 458752 out
res d8/d8:d8:d8:d8:d8/d8:d8:d8:d8:d8/d8 Emask 0x12 (ATA bus error)
ata8.00: status: { Busy }
ata8.00: error: { ICRC UNC IDNF }
ata8: hard resetting link
ata8: COMRESET failed (errno=-32)
ata8: reset failed (errno=-32), retrying in 8 secs
Based on the following posts, it became apparent that the Western Digital NCQ (Native Command Queuing) implementation is not exactly up to scratch. This is what was causing the errors, not a bad power supply or a faulty disk.
https://bugzilla.redhat.com/show_bug.cgi?id=404851
http://www.axelog.de/2010/05/9-sata-phyrdychg-exception/
While Axel's blog suggests patching the kernel to resolve the issue, I'd like to propose an easier option: boot parameters. Simply add the following to your kernel line, where the number before the colon is the ATA port and device that is having trouble:
libata.force=1.00:noncq
As you can see from my log output, the problem is with ata8.00, so I adjusted the line to 'libata.force=8.00:noncq'. This way NCQ is disabled only for the drive that is having issues, and every other libata device keeps its NCQ performance.
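If you would rather not read the port number off the log by hand, here is a rough sketch of a helper that does it for you. The function name noncq_params is my own invention, and it assumes the failing drive logs a "failed command" line in the same format as my output above; if your log looks different, the pattern will need adjusting.

```shell
# Hypothetical helper (not part of any tool): reads dmesg-style lines on
# stdin and prints a libata.force parameter for each distinct port.device
# that reported a "failed command" line.
noncq_params() {
    sed -n 's/.*ata\([0-9][0-9]*\.[0-9][0-9]*\): failed command.*/libata.force=\1:noncq/p' | sort -u
}
```

Run it as 'dmesg | noncq_params'. If it prints more than one line, more than one drive is affected, and the values would have to be combined into a single comma-separated libata.force list rather than passed as separate parameters.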
An easy way to set the boot parameter in Ubuntu is to edit the GRUB defaults so that the option is passed on the kernel command line:
sudo nano /etc/default/grub
and add the parameter to the GRUB_CMDLINE_LINUX line, keeping any options that are already there:
GRUB_CMDLINE_LINUX="libata.force=8.00:noncq"
then update grub
sudo update-grub
And hey presto, the errors have never been seen again.
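If you want to confirm the option actually took effect after rebooting, something like the following should do it. This is only a sketch: both function names are made up for illustration, and the assumption that a queue depth of 1 means NCQ is off is my understanding of how libata reports it, not something I have checked on every kernel.

```shell
# Hypothetical helpers for checking the setting after a reboot.

# Print any libata.force parameter found on a kernel command line.
# $1: file to read (defaults to the running kernel's command line).
show_libata_force() {
    grep -o 'libata\.force=[^ ]*' "${1:-/proc/cmdline}"
}

# Print the queue depth the kernel uses for a disk.
# $1: block device name such as sdb (placeholder; use your own drive).
# Assumption: a value of 1 indicates NCQ is not in use for that drive.
show_queue_depth() {
    cat "/sys/block/$1/device/queue_depth"
}
```

Calling show_libata_force with no argument should print the parameter you added, and show_queue_depth with the name of the affected disk should tell you whether commands are still being queued.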
I recently purchased a few 500 GB WD SATA drives for a RAID setup (but have not yet installed them). Could you share the disk model you refer to in your trouble description?
Thank you.
Hey,
Just wanted to say I ran into this trouble with a RAID I built out of 3 Western Digital drives and your solution fixed everything!
Thank you!
Hi there, thanks for your post! I ran into the same trouble and had been looking for a solution for more than a month.
Actually, once I figured out what the real problem was, the solution was quite close. As always, if you know the right question, you get a proper answer.
Cheers, gERD