How do I write the contents of an S3 object into SharedMemory?
MB=100
mem = SharedMemory(create=True, size=MB*2**20)
response = s3_client.get_object(Bucket='my_bucket', Key='path/to/obj')
mem.buf[:] = response['Body'].read()
However, I then get an error:
memoryview assignment: lvalue and rvalue have different structure
Printing the memoryview shape gives this:
(105906176,)
When I then try this:
mem.buf[0] = response['Body'].read()
I get a different error:
memoryview: invalid type for format 'B'
How can I write the contents of an S3 file into SharedMemory? I don’t want to write to disk.
2 Answers
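If you want to use slice notation, the slice on the left has to be exactly the same size as the data you assign to it. In other words, read the body into a variable first, then assign it to the slice of mem.buf that runs from 0 to len(data).
The reason is how
mem.buf[:] = data
is interpreted. With a resizable container such as a list, full-slice assignment resizes the container on the left to match data. A memoryview over a fixed-size buffer cannot be resized, so the assignment fails unless both sides have the same length. Your second attempt fails for a related reason: mem.buf[0] is a single byte (format 'B'), so it accepts only an integer, not a bytes object.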
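As a minimal sketch of the sized-slice assignment described above: the snippet uses a made-up bytes payload in place of response['Body'].read() so that it runs without S3. The helper name copy_into and the 1 MiB block size are illustrative assumptions, not part of the question.

```python
from multiprocessing.shared_memory import SharedMemory


def copy_into(buf, data):
    """Copy data into the start of buf and return the number of bytes written."""
    if len(data) > len(buf):
        raise ValueError("data does not fit in the shared memory block")
    # The slice on the left is exactly len(data) bytes long, so the shapes match.
    buf[:len(data)] = data
    return len(data)


# Stand-in for response['Body'].read(); the payload is invented for illustration.
data = b"example payload " * 1000

mem = SharedMemory(create=True, size=2**20)
try:
    # A list is simply resized by a full-slice assignment...
    items = [0] * 10
    items[:] = [1, 2, 3]

    # ...but a fixed-size memoryview is not, so mismatched lengths raise.
    try:
        mem.buf[:] = data
        full_slice_error = None
    except ValueError as exc:
        full_slice_error = str(exc)

    written = copy_into(mem.buf, data)
    round_trip = bytes(mem.buf[:written])
finally:
    mem.close()
    mem.unlink()
```

Keep the returned byte count around: the block is usually larger than the object, so readers of the shared memory need to know where the real data ends.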
Of course, reading the whole body first requires enough auxiliary memory to hold a duplicate of the whole object (your original approach did as well, so I assume this is OK).
To do this memory-efficiently, you can iterate over the streaming body, which yields 1 KiB (1024-byte) chunks, and copy each chunk into the buffer at a running offset.
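Something to the effect of the sketch below, which assumes only that the body can be iterated to yield bytes chunks. The function name write_chunks is hypothetical, and a plain list of byte strings stands in for response['Body'] so the example runs without S3.

```python
from multiprocessing.shared_memory import SharedMemory


def write_chunks(buf, chunks):
    """Copy an iterable of bytes chunks into buf back to back.

    Returns the total number of bytes written.
    """
    offset = 0
    for chunk in chunks:
        end = offset + len(chunk)
        if end > len(buf):
            raise ValueError("object is larger than the shared memory block")
        buf[offset:end] = chunk  # slice length equals len(chunk)
        offset = end
    return offset


# With boto3 this would be: write_chunks(mem.buf, response['Body'])
# Here a list of byte strings stands in for the streaming body.
fake_body = [b"a" * 1024, b"b" * 1024, b"c" * 100]

mem = SharedMemory(create=True, size=4096)
try:
    total = write_chunks(mem.buf, fake_body)
    stored = bytes(mem.buf[:total])
finally:
    mem.close()
    mem.unlink()
```

Only one chunk is held in memory at a time, on top of the shared block itself.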
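If you want to fiddle with the chunk size (in bytes), you can call read with an explicit size repeatedly, for instance through the two-argument form of iter with an empty bytes object as the sentinel.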
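A rough sketch of that idea, assuming the body exposes read(size) and returns b"" at end of stream. The function name copy_with_chunk_size and the default chunk size are made up for illustration, and io.BytesIO stands in for response['Body'].

```python
import io
from functools import partial
from multiprocessing.shared_memory import SharedMemory


def copy_with_chunk_size(body, buf, chunk_size=64 * 1024):
    """Read body in chunk_size pieces and write them into buf; return bytes written."""
    offset = 0
    # iter(callable, sentinel) keeps calling body.read(chunk_size)
    # until it returns the sentinel b"" (end of stream).
    for chunk in iter(partial(body.read, chunk_size), b""):
        end = offset + len(chunk)
        if end > len(buf):
            raise ValueError("object is larger than the shared memory block")
        buf[offset:end] = chunk
        offset = end
    return offset


# io.BytesIO stands in for response['Body'] (an assumption for this sketch).
payload = bytes(range(256)) * 1000
fake_body = io.BytesIO(payload)

mem = SharedMemory(create=True, size=2**20)
try:
    total = copy_with_chunk_size(fake_body, mem.buf, chunk_size=4096)
    stored = bytes(mem.buf[:total])
finally:
    mem.close()
    mem.unlink()
```

botocore's StreamingBody also offers iter_chunks(chunk_size), which should work as the iterable here if you prefer not to build the iterator yourself.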
Or use an equivalent while-loop, if the two-arg form of iter is too arcane: call read with the chunk size until it returns an empty bytes object.

You need to ensure that the data you're placing in the SharedMemory buffer is the same size as the part of the buffer you're writing to. Further, if you want to avoid keeping a second copy of the data around, you can read from S3 in chunks and write to the buffer as you download the data.
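The chunked download can be sketched as a plain while-loop. This is one possible shape, not a definitive implementation: stream_to_shared_memory is a hypothetical name, the 1 MiB default chunk size is arbitrary, and io.BytesIO again stands in for response['Body'] so the code runs without S3.

```python
import io
from multiprocessing.shared_memory import SharedMemory


def stream_to_shared_memory(body, mem, chunk_size=1024 * 1024):
    """Read body chunk by chunk into mem.buf and return the number of bytes written."""
    offset = 0
    while True:
        chunk = body.read(chunk_size)
        if not chunk:  # b"" signals the end of the stream
            break
        end = offset + len(chunk)
        if end > len(mem.buf):
            raise ValueError("object is larger than the shared memory block")
        # The target slice is exactly as long as the chunk being written.
        mem.buf[offset:end] = chunk
        offset = end
    return offset


# io.BytesIO stands in for response['Body'] (an assumption for this sketch).
payload = b"0123456789" * 50_000
fake_body = io.BytesIO(payload)

mem = SharedMemory(create=True, size=2**20)
try:
    total = stream_to_shared_memory(fake_body, mem, chunk_size=8192)
    stored = bytes(mem.buf[:total])
finally:
    mem.close()
    mem.unlink()
```

With a real S3 response you would pass response['Body'] as body and size the block from the object's length before downloading.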