
I’m trying to read data from a faulty external SSD to create an image for data recovery. The drive is an Apacer Panther SSD connected to a USB port via an ICY BOX SATA to USB connector on Ubuntu.

When I run the MWE below, read hangs at some address. The address is mostly stable between consecutive runs, but it can vary (e.g. on different days). With a block size of 1, read hangs on the first byte of some sector. The program then freezes and no signal interrupts the read: Ctrl-C simply prints "^C" to the terminal but does not kill the program, and the alarm’s handler is never called.

If I close the terminal and re-run the program in a new terminal, no read completes (it hangs on the first iteration). Only by disconnecting and reconnecting the SSD can I read from the disk again. However, if I disconnect the drive while read is blocked, the program continues.

If I modify the program to use stdin as the file descriptor, both SIGINT and SIGALRM interrupt read.

So the questions are:
a) Why does read block indefinitely, even though the man page says it is interrupted by signals?
b) Is there any way to fix this?

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>

void sig_handler(int signum){
    printf("Alarm handler\n");
}

int main(int argc, char *argv[]) {

    // Register ALARM signal handler to prevent read() from blocking indefinitely
    struct sigaction alarm_int_handler = {.sa_handler=sig_handler};
    sigaction(SIGALRM, &alarm_int_handler, 0);
    
    char* disk_name = "/dev/sdb";
    const int block_size = 512;
    int offset = 0;
    
    char block[block_size];

    // Open disk to read as binary file via file descriptor
    int fd = open(disk_name, O_RDONLY | O_NONBLOCK);
    if (fd == -1){
        perror(disk_name);
        exit(EXIT_FAILURE);
    }

    int i;
    int position = offset;

    for (i=0; i<100000; i++){

        // Reset alarm to 1 sec (to interrupt blocked read)
        alarm(1);

        // Seek to current position
        int seek_pos = lseek(fd, position, SEEK_SET);
        if (seek_pos == -1){
            perror("Seek");
        }

        printf("Reading... ");
        fflush(stdout);
        int len = read(fd, block, block_size);
        printf("Read %d chars at %d\n", len, position);

        if (len == -1){
            if (errno != EINTR){
                perror("Read");
            }
            else {
                printf("Read aborted due to interrupt\n");
                // TODO: handle it
            }
        }

        // Advance only on success; a failed read returns -1
        if (len > 0){
            position += len;
        }
        
    }

    close(fd);

    printf("Position %d (%d)\n", position, i * block_size);
    printf("Done\n");
    return 0;
}

Output on the terminal (with a block size of 1) looks like this:

.
.
.
Reading... Read 1 chars at 29642749
Reading... Read 1 chars at 29642750
Reading... Read 1 chars at 29642751
Reading...

2 Answers


  1. It might be a kernel driver bug.

    Did you try non-blocking reads? Regular files cannot be polled, but the descriptor can still be made non-blocking.

  2. That sounds like your SSD might be defective (fails to respond to a request, e.g. its firmware hangs while trying to recover from corrupt data in flash memory) or the kernel driver has a bug.

    As to why the process does not respond to signals: There is a process state called "uninterruptible sleep" (abbreviated as state D in top and htop). Processes go into this state when they are blocked inside the kernel (i.e. during a system call like read) on an operation that cannot be interrupted, for example waiting for data from a disk or network (NFS mounts are infamous for this during a network outage). Signals sent in the meantime stay pending until the process leaves this state, which is why neither Ctrl-C nor the alarm handler has any effect. If your SSD does not reply to a data request, then the process would wait for data indefinitely, since the kernel will not ask the SSD a second time. Or maybe it does, and the SSD always refuses to answer, or might even time out after a few hours of trying… who knows.
