
Here is a piece of code (running on Linux CentOS 7.7.1908, x86_64):

import torch    #v1.3.0
import numpy as np  #v1.14.3
import matplotlib.pyplot as plt
from astropy.io.fits import getdata   #v3.0.2
data, hdr = getdata("afile.fits", 0, header=True) #gives dtype=float32 2d array
plt.imshow(data)
plt.show()

This gives a nice 512×512 image

Now, I would like to convert “data” into a PyTorch tensor:

a = torch.from_numpy(data)

However, PyTorch raises:

ValueError: given numpy array has byte order different from the native
byte order. Conversion between byte orders is currently not supported.

I have tried various manipulations, e.g. byteswap() and copy(), with no success.

Any ideas?

PS: the same error occurs when I transfer my data to Mac OSX (Mojave), while matplotlib still works fine there.

2 Answers


  1. Chosen as BEST ANSWER

    Well, I found a workaround, applied after reading the data array from FITS:

    data = data.astype(np.float32)
    a = torch.from_numpy(data)
    

    No error is thrown and everything is ok...


  2. FITS stores data in big-endian byte ordering (at the time FITS was developed this was a more common machine architecture; sadly the standard has never been updated to allow flexibility on this, although it could easily be done with a single header keyword to indicate endianness of the data…)
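To make the byte-ordering difference concrete, here is a minimal sketch using only Python's standard `struct` module (not astropy or FITS itself). It packs the same float32 value in big-endian order, as FITS would store it, and in little-endian order, as an x86_64 machine holds it in memory.

```python
import struct
import sys

# The same 32-bit float, serialized in the two byte orders.
big_endian = struct.pack(">f", 1.0)     # most significant byte first (FITS convention)
little_endian = struct.pack("<f", 1.0)  # least significant byte first (x86_64 native)

print(big_endian.hex())     # 3f800000
print(little_endian.hex())  # 0000803f
print(sys.byteorder)        # little (on x86_64)
```

The two byte strings are exact mirrors of each other, which is why the data must be swapped rather than just relabeled.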

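    According to the Numpy docs, Numpy arrays report the endianness of the underlying data as part of their dtype (e.g. a dtype of ‘>i’ means big-endian ints, and ‘<i’ means little-endian ints). To convert between byte orders, you swap the bytes of the data (e.g. with byteswap()) and change the array’s dtype to reflect the new byte order.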
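As a minimal sketch of that swap-and-relabel step, assuming a small synthetic big-endian array as a stand-in for the FITS image (the names `big` and `swapped` are illustrative): `byteswap()` on its own reorders the bytes in memory but keeps the old dtype, so the values would be misread. The dtype has to be flipped as well for the array to hold the same numbers in the other byte order.

```python
import numpy as np

# Synthetic stand-in for FITS image data: explicitly big-endian float32.
big = np.arange(6, dtype=">f4").reshape(2, 3)
print(big.dtype.byteorder)  # '>' on a little-endian machine

# Step 1: swap the bytes in memory (returns a copy; the dtype label is unchanged).
# Step 2: reinterpret the swapped bytes with the opposite byte-order dtype.
swapped = big.byteswap().view(big.dtype.newbyteorder())

print(swapped.dtype.isnative)        # True on a little-endian machine
print(np.array_equal(big, swapped))  # True: same values, different byte order
```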

    Your solution of calling .astype(np.float32) should work. That’s because the np.float32 dtype uses the machine’s native byte order (little-endian on x86_64), and .astype(...) copies an existing array and converts the data in that array, if necessary, to match that dtype. I just wanted to explain exactly why that works, since it might otherwise be unclear why you’re doing that.
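One way to package this up is a small helper that converts only when needed. This is a sketch, not part of NumPy or PyTorch: `to_native` is a hypothetical name, and the synthetic array again stands in for the FITS data.

```python
import numpy as np

def to_native(arr):
    """Return arr in the machine's native byte order, copying only if needed."""
    if arr.dtype.isnative:
        return arr
    # newbyteorder("=") gives the same dtype with native byte order;
    # astype then copies the data and swaps bytes so the values are preserved.
    return arr.astype(arr.dtype.newbyteorder("="))

big = np.array([1.5, -2.0, 3.25], dtype=">f4")  # stand-in for FITS data
native = to_native(big)

print(native.dtype.isnative)        # True
print(np.array_equal(big, native))  # True
```

The resulting array could then be handed to `torch.from_numpy(native)`, as in the accepted answer. Unlike a hard-coded `astype(np.float32)`, this keeps the original element type for integer or float64 images.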

    As for matplotlib, it doesn’t really have much to do with your question. Numpy arrays can transparently perform operations on data that does not match the endianness of your machine architecture, by automatically performing byte swaps as necessary. Matplotlib and many other scientific Python libraries work directly with Numpy arrays and thus automatically benefit from its transparent handling of endianness.
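A short sketch of that transparency, again with a synthetic big-endian array in place of the FITS image: NumPy arithmetic and reductions give the expected values regardless of how the bytes are laid out in memory.

```python
import numpy as np

big = np.arange(6, dtype=">f4").reshape(2, 3)  # values 0..5, stored big-endian

print(big.sum())           # 15.0
print(big.mean())          # 2.5
print((big * 2).tolist())  # [[0.0, 2.0, 4.0], [6.0, 8.0, 10.0]]
```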

    It just happens that PyTorch (in part because of its very high performance and GPU-centric data handling model) requires you to hand it data that’s already in the machine’s native byte order (little-endian on x86_64), perhaps just to avoid ambiguity. But that’s particular to PyTorch and is not specifically a contrast with matplotlib.
