
I’m trying to extract the Compression attribute from several thousand jpeg files.


How can I do this with either Excel VBA or PowerShell (or some other method)?

Some users haven’t used the correct technique to convert tiff or png files to jpegs. They just edited the file extension directly in Explorer instead of using an app like Photoshop to properly change the file format. This is causing trouble in a downstream process.

After checking a few files, the problematic ones have ‘Uncompressed’ in the Compression field, and I want to isolate these so they can be corrected.

Note that the linked answer does not provide the solution I need. The Compression attribute is not in the list of 308 attributes output by that method.

5 Answers


  1. You can get it with the GetFileAttributes API (this reports the NTFS file-system compression flag).

    Option Explicit
    #If VBA7 Then
        Private Declare PtrSafe Function GetFileAttributes Lib "kernel32" Alias "GetFileAttributesA" (ByVal lpFileName As String) As Long
    #Else
        Private Declare Function GetFileAttributes Lib "kernel32" Alias "GetFileAttributesA" (ByVal lpFileName As String) As Long
    #End If
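    Private Const FILE_ATTRIBUTE_COMPRESSED As Long = &H800
    Private Const INVALID_FILE_ATTRIBUTES As Long = -1 ' returned when the call fails, e.g. file not found
    Sub GetCompressionStatus()
        Dim filePath As String
        Dim fileAttributes As Long
        filePath = "d:\temp\my.jpg"
        fileAttributes = GetFileAttributes(filePath)
        Debug.Print fileAttributes ' for debug
        ' Check for failure first: -1 has every bit set and would otherwise test as "COMPRESSED"
        If fileAttributes = INVALID_FILE_ATTRIBUTES Then
            MsgBox "FILE NOT FOUND OR NOT READABLE"
        ElseIf (fileAttributes And FILE_ATTRIBUTE_COMPRESSED) = FILE_ATTRIBUTE_COMPRESSED Then
            MsgBox "COMPRESSED"
        Else
            MsgBox "NOT COMPRESSED"
        End If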
    End Sub
    
  2. The simplest method in my opinion would be via WMIC, which is native to Windows and usable from both CMD and PowerShell. Here’s an example of the command and its output:

    wmic datafile where "name='C:\\test.jpg'"
    AccessMask  Archive  Caption      Compressed  CompressionMethod  CreationClassName  CreationDate               CSCreationClassName   CSName           Description  Drive  EightDotThreeFileName  Encrypted  EncryptionMethod  Extension  FileName  FileSize  FileType    FSCreationClassName  FSName  Hidden  InstallDate                InUseCount  LastAccessed               LastModified               Manufacturer  Name         Path  Readable  Status  System  Version  Writeable
    1507775     TRUE     C:\test.jpg  FALSE                          CIM_LogicalFile    20231128113827.761885-480  Win32_ComputerSystem  DESKTOP-XXXXXXX  C:\test.jpg  c:     c:\test.jpg            FALSE                        jpg        test      5440375   JPEG Image  Win32_FileSystem     NTFS    FALSE   20231128113827.761885-480              20231128114415.151472-480  20231108204657.602147-480                C:\test.jpg  \     TRUE      OK      FALSE            TRUE
    

    This output can be tweaked, summarized or isolated as well, like so:

    PS C:\> wmic datafile where "name='C:\\test.jpg'" list brief
    Compressed  Encrypted  FileSize  Hidden  Name         Readable  System  Version  Writeable
    FALSE       FALSE      5440375   FALSE   C:\test.jpg  TRUE      FALSE            TRUE
    

    See more in the wmic help 🙂

    wmic /?
    

    EDIT:
    Figured I’d add how to return just that one value, since you’re only looking for the compression state:

    wmic datafile where "name='C:\\test.jpg'" get Compressed
    

    Returns:

    Compressed
    FALSE
    

    You can parse this within your script however you like.

  3. For your purposes it may be sufficient to distinguish genuine JPEGs from mangled files that actually contain some other image format by looking for the SOI marker that should be at the start of a genuine classical JPEG file, namely 0xFFD8. The text "JFIF" a few bytes in is also a pretty good indication of a JPEG (but no guarantee). MS bitmap files mostly begin with "BM", but there are other possibilities. PNG starts with 0x89504E47, which includes the letters "PNG". ISTR GIF files start with "GIF". Finally, TIFF begins with either "II" or "MM". This should be simple enough: open the file and read the first few bytes to check what you actually have in the file data (you may need additional heuristics for other, more esoteric image formats). Then set the file extension back to being consistent with the file contents and you should be OK to go.
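
A minimal sketch of the signature check described above, written in Python as one possible "other method". The names `SIGNATURES`, `sniff_image_type` and `sniff_file` are made up for illustration, and the table only covers the formats mentioned in this answer, so treat it as a starting point rather than a complete detector.

```python
# Leading "magic" bytes for the formats named above (illustrative, not exhaustive).
SIGNATURES = [
    (b"\xff\xd8", "JPEG"),  # SOI marker
    (b"\x89PNG", "PNG"),    # 0x89504E47
    (b"GIF", "GIF"),
    (b"II", "TIFF"),        # little-endian TIFF
    (b"MM", "TIFF"),        # big-endian TIFF
    (b"BM", "BMP"),
]


def sniff_image_type(header: bytes) -> str:
    """Guess the image format from the first few bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of the file and classify them."""
    with open(path, "rb") as fh:
        return sniff_image_type(fh.read(8))
```

Calling `sniff_file` on a file that was merely renamed from `.tiff` to `.jpg` should return "TIFF" rather than "JPEG", since only the contents are inspected.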

    You could also try re-educating your users with a baseball bat.

  4. Here is a PowerShell answer. Remember that the WMI cmdlets (such as Get-WmiObject) are not available in PowerShell Core; use the CIM cmdlets instead.

    (Get-CimInstance -ClassName CIM_DataFile -Filter "Name='C:\\Test.jpg'").Compressed
    

    Are filesystem compression and image compression two separate things?
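
On that question: the `Compressed` flag returned by WMI/CIM and by GetFileAttributes describes NTFS file-system compression, which is independent of how the image data inside the file is encoded. For a TIFF file, the image-level setting is stored in tag 259 (Compression) of the image file directory, where the value 1 means no compression. Whether Explorer's Compression field is read from exactly this tag is an assumption on my part. The Python sketch below (the function name `tiff_compression` is hypothetical) reads that tag from the first IFD of a file's bytes and returns `None` for anything that is not a TIFF.

```python
import struct


def tiff_compression(data: bytes):
    """Return the value of TIFF tag 259 (Compression) from the first IFD, or None."""
    if len(data) < 8:
        return None
    if data[:2] == b"II":
        endian = "<"  # little-endian TIFF
    elif data[:2] == b"MM":
        endian = ">"  # big-endian TIFF
    else:
        return None   # not a TIFF header
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42 or ifd_offset + 2 > len(data):
        return None
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        if entry + 12 > len(data):
            return None
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry)
        if tag == 259:
            # A single SHORT value is stored in the first 2 bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry + 8)
            return value
    return None
```

The function expects the file's bytes (for example `open(path, "rb").read()`), because the IFD can sit anywhere in the file. A result of 1 would indicate uncompressed TIFF data hiding behind a `.jpg` extension.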

  5. From my earlier comment: if your aim is really to identify the actual file type (rather than relying on the file extension), something like this would work:

    Sub Tester()
    
        Const fldr As String = "C:\Temp\pics\"
        
        Debug.Print FileTypeId(fldr & "photo.jpg")
        Debug.Print FileTypeId(fldr & "unlock.gif")
        Debug.Print FileTypeId(fldr & "unlock2.png")
        Debug.Print FileTypeId(fldr & "sample.tiff")
    
    End Sub
    
    
    Function FileTypeId(fPath As String) As String
    
        Dim bytes() As Byte, ff As Integer, i As Long
        
        ff = FreeFile
        Open fPath For Binary Access Read As ff
        ReDim bytes(0 To 7) 'the longest signature checked below (PNG) is 8 bytes, so only the header is read
        Get ff, , bytes
        Close ff
        
        Select Case True
            Case ByteMatch(bytes, Array(&HFF, &HD8))
                FileTypeId = "JPEG"
            Case ByteMatch(bytes, Array(&H47, &H49, &H46, &H38, &H37, &H61)), _
                 ByteMatch(bytes, Array(&H47, &H49, &H46, &H38, &H39, &H61))
                FileTypeId = "GIF"
            Case ByteMatch(bytes, Array(&H89, &H50, &H4E, &H47, &HD, &HA, &H1A, &HA))
                FileTypeId = "PNG"
            Case ByteMatch(bytes, Array(&H49, &H49, &H2A, &H0)), _
                 ByteMatch(bytes, Array(&H4D, &H4D, &H0, &H2A)) 'little-endian "II" or big-endian "MM"
                FileTypeId = "TIFF"
            Case Else
                FileTypeId = "unknown"
        End Select
        
    '    Debug.Print fPath
    '    Debug.Print FileTypeId
    '    For i = LBound(bytes) To 6
    '        Debug.Print Hex(bytes(i))
    '    Next i
    End Function
    
    'do the first elements in `bytes` match `sig` ? 
    Function ByteMatch(bytes, sig) As Boolean
        Dim i As Long
        For i = LBound(sig) To UBound(sig)
            If bytes(i) <> sig(i) Then
               ByteMatch = False
               Exit Function
            End If
        Next i
        ByteMatch = True
    End Function
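
Since the question involves several thousand files, here is a hedged Python sketch of how the same check might be run over a whole folder tree. It only tests for the JPEG SOI marker (FF D8) and reports files with a `.jpg`/`.jpeg` extension that fail it. The function name `find_fake_jpegs` and the constant `JPEG_EXTENSIONS` are illustrative, not part of any library.

```python
from pathlib import Path

JPEG_EXTENSIONS = {".jpg", ".jpeg"}


def find_fake_jpegs(folder):
    """Yield files with a JPEG extension whose first two bytes are not FF D8."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in JPEG_EXTENSIONS:
            with path.open("rb") as fh:
                # A genuine JPEG starts with the SOI marker.
                if fh.read(2) != b"\xff\xd8":
                    yield path
```

Something like `for p in find_fake_jpegs(folder): print(p)` would then list the candidates for re-conversion. Only two bytes per file are read, so the scan should stay cheap even for large images.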
    