
In a tarball, entries are stored in the order in which they were added. In the following example they are listed in insertion order, not alphabetically.

# tar tvf file.tgz
-rw-r--r-- 0/0       962 2023-01-17 13:40:17 6fe3b8b5-a4dc-4976-bea9-227434c11cda
-rw-r--r-- 0/0       962 2023-01-17 13:40:17 febe009e-8ce0-4027-abda-e27f5f949dde
-rw-r--r-- 0/0       962 2023-01-17 13:40:17 4fea07a6-a90c-4cb4-9969-4cac08422ccf
-rw-r--r-- 0/0       962 2023-01-17 13:40:17 01610297-c577-4e6c-9ce9-7825a40e6d0c
-rw-r--r-- 0/0       962 2023-01-17 13:40:26 01b498ca-3d87-426b-b01d-d4e75921cb00
-rw-r--r-- 0/0       962 2023-01-17 13:40:26 45821111-f69d-4331-87d8-f3afd9918c91

Surprisingly, when I read the above tarball with PharData and RecursiveIteratorIterator in PHP 8.1, the files are iterated alphabetically.

$p = new PharData($path);
foreach (new RecursiveIteratorIterator($p) as $file) {
    echo $file . "\n";
}

The above code results in the output below:

phar:///path/to/file/01610297-c577-4e6c-9ce9-7825a40e6d0c
phar:///path/to/file/01b498ca-3d87-426b-b01d-d4e75921cb00
phar:///path/to/file/45821111-f69d-4331-87d8-f3afd9918c91
phar:///path/to/file/4fea07a6-a90c-4cb4-9969-4cac08422ccf
phar:///path/to/file/6fe3b8b5-a4dc-4976-bea9-227434c11cda
phar:///path/to/file/febe009e-8ce0-4027-abda-e27f5f949dde

Is there a way in PHP to iterate over tarball content in the original order?

2 Answers


  1. Chosen as BEST ANSWER

    Thanks to Alex Howansky's answer, I found a working solution based on exec() that takes advantage of the fact that tar lists entries in their stored order:

    $sortedFileList = [];
    exec('tar tf ' . escapeshellarg($path), $sortedFileList);
    $sortedFileList = array_flip($sortedFileList);
    
    $p = new PharData($path);
    $files = iterator_to_array($p);
    usort($files,
        fn($a, $b) => $sortedFileList[$a->getFilename()] <=> $sortedFileList[$b->getFilename()]
    );
    
    foreach ($files as $file) {
        printf("%s %s\n", $file->getPathname(), $file->getMTime());
    }
    

    I'm not a big fan of using exec(), but the only alternative I found would be to add an external library to the project. Since this really ought to be possible with the plain PharData iterator, I hope a future version will allow iterating over tarball entries in their raw order.


  2. I believe this is caused by the PharData class, not by RecursiveIteratorIterator. You can impose your own order by converting the iterator to an array first and then calling usort() on whatever field you want:

    $p = new PharData(...);
    $files = iterator_to_array($p);
    usort($files, fn($a, $b) => $a->getMTime() <=> $b->getMTime());
    foreach ($files as $file) {
        printf("%s %s\n", $file->getPathname(), $file->getMTime());
    }
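
The accepted answer relies on exec() to get the stored order. As a possible alternative that needs neither exec() nor an external library, the entry names can be read straight from the raw tar headers. The following is only a minimal sketch, not part of either answer: the function name tarEntryNamesInOrder is made up for illustration, and it assumes a gzip-compressed tarball whose entry names fit in the 100-byte name field (GNU long names, the ustar prefix field and pax extended headers are not handled). Each tar header is a 512-byte block with the name at bytes 0-99 and the size as an octal string at bytes 124-135; the entry data follows, padded to a multiple of 512 bytes.

```php
<?php

/**
 * Return the entry names of a gzip-compressed tarball in stored order.
 * Sketch only: ignores GNU long names, the ustar prefix field and pax headers.
 */
function tarEntryNamesInOrder(string $path): array
{
    // compress.zlib:// decompresses the gzip stream while reading.
    $fh = fopen('compress.zlib://' . $path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $names = [];
    while (true) {
        $header = stream_get_contents($fh, 512);
        // A short read or an all-NUL block marks the end of the archive.
        if ($header === false || strlen($header) < 512 || trim($header, "\0") === '') {
            break;
        }

        // Name: bytes 0-99, NUL-padded.
        $names[] = rtrim(substr($header, 0, 100), "\0");
        // Size: octal digits in bytes 124-135.
        $size = (int) octdec(trim(substr($header, 124, 12), " \0"));

        // Entry data is padded to a multiple of 512 bytes; read and discard it.
        $remaining = (int) (ceil($size / 512) * 512);
        while ($remaining > 0) {
            $chunk = fread($fh, min($remaining, 8192));
            if ($chunk === false || $chunk === '') {
                break 2; // truncated archive
            }
            $remaining -= strlen($chunk);
        }
    }
    fclose($fh);

    return $names;
}
```

The returned list could stand in for the output of the exec() call in the accepted answer: pass it through array_flip() and use the result as the lookup table in the usort() callback.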
    