Most optimized way to filter patch positions in an image
So my problem is this: I have an RGB image as a numpy array of shape (4086, 2048, 3). I split this image into 96x96 patches and get back the positions of those patches in a numpy array. Every patch is exactly 96x96. If the image dimensions don't divide evenly into 96x96 patches along the x or y axis, I shift the last patch back so that it ends at the border and overlaps a bit with the patch before it.
Now, with these positions in hand, I want to get rid of every 96x96 patch whose pixels are all 255 in all three RGB channels, in the fastest way possible, and get back the positions of all the patches which are not pure white.
I would like to know:
- What is the fastest way to extract the 96x96 patch positions from the image dimensions? (for now I have a for loop)
- What is the most efficient way to get rid of the pure white patches (value 255 on all 3 channels)? (for now I have a for loop)
I have a lot of these images to process, with resolutions going up to (39706, 94762, 3), so my for loops quickly become inefficient here. Thanks for your help! (Solutions that make use of the GPU are welcome too.)
Here is the pseudocode to give you an idea of how it's done for now:
import numpy as np

def extract_patch_positions(image_slide, crop_size=96):
    # image_slide is indexed (x, y, channel) to match the loop below
    slide_width, slide_height = image_slide.shape[:2]
    patches = []
    patch_y = 0
    y_limit = False
    while patch_y < slide_height:
        patch_x = 0
        x_limit = False
        while patch_x < slide_width:
            # True if every pixel of the patch at this position is 255
            # in all three RGB channels (PatchExtractor.is_white is my helper)
            is_white = PatchExtractor.is_white(patch_x, patch_y, image_slide)
            # Keep the position only if the patch is not pure white
            if not is_white:
                patches.append((patch_x, patch_y))
            if not x_limit and patch_x + crop_size > slide_width - crop_size:
                # shift the last patch back so it ends exactly at the border
                patch_x = slide_width - crop_size
                x_limit = True
            else:
                patch_x += crop_size
        if not y_limit and patch_y + crop_size > slide_height - crop_size:
            patch_y = slide_height - crop_size
            y_limit = True
        else:
            patch_y += crop_size
    return patches

slide_width = 4086
slide_height = 2048
# Let's imagine this image_slide has some 96x96 patches whose value is 255
image_slide = np.random.rand(slide_width, slide_height, 3)
patches = extract_patch_positions(image_slide)
Ideally, I would like to get my patch positions without a for loop, and then, once I have them, test whether they are white or not without a for loop as well, with as few numpy calls as possible (so the work is processed in the C layer of numpy and doesn't go back and forth to Python).
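A minimal sketch, assuming the overlap scheme described above, of how those positions might be generated without an explicit Python loop (patch_starts is a hypothetical helper, not code from this question):

import numpy as np

def patch_starts(length, crop_size=96):
    # regular starts every crop_size pixels ...
    starts = np.arange(0, length, crop_size)
    # ... with any start whose patch would cross the border clamped so the
    # last patch ends exactly at the border (it overlaps the previous one)
    starts[starts > length - crop_size] = length - crop_size
    return np.unique(starts)

xs = patch_starts(4086)  # starts along x: 0, 96, ..., 3936, 3990
ys = patch_starts(2048)  # starts along y: 0, 96, ..., 1920, 1952
# all (x, y) patch positions as an (n, 2) array
positions = np.stack(np.meshgrid(xs, ys, indexing='ij'), axis=-1).reshape(-1, 2)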
python performance numpy vectorization
Share your working loopy solutions? – Divakar, yesterday
I just edited my question :) – E-Kami, yesterday
I made a gpu solution and realized ~90% of time is spent on transferring data from ram to gpu :( – Shihab Shahriar, yesterday
@ShihabShahriar you used numba or something to move your arrays to the GPU? – E-Kami, 5 hours ago
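As context for the GPU exchange above, a minimal CuPy sketch (an assumption of what such a solution could look like, not code posted in this thread; it requires CuPy and a CUDA-capable GPU) showing where the transfer cost sits:

import numpy as np
import cupy as cp  # assumed installed

image = np.random.randint(0, 256, size=(4086, 2048, 3), dtype=np.uint8)
image_gpu = cp.asarray(image)           # host-to-device copy: the ~90% cost noted above
white_gpu = (image_gpu == 255).all(-1)  # per-pixel whiteness mask, computed on the GPU
white = cp.asnumpy(white_gpu)           # copy back only the much smaller boolean mask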
asked yesterday by E-Kami, edited yesterday by Andras Deak
1 Answer
As you suspected, you can vectorize all of what you're doing. It takes roughly a small integer multiple of the memory your original image needs. The algorithm is quite straightforward: pad your image so that an integer number of patches fits in it, cut it up into patches, check whether each patch is all white, and keep the rest:
import numpy as np

# generate some dummy data and shapes
imsize = (1024, 2048)
patchsize = 96
image = np.random.randint(0, 256, size=imsize + (3,), dtype=np.uint8)
# seed a white patch: whiten the bottom-right quadrant of the noise
# (the out-of-range stop index is clamped by numpy slicing)
image[image.shape[0]//2:3*image.shape[0]//2, image.shape[1]//2:3*image.shape[1]//2] = 255

# pad the image to the necessary size; memory imprint similar to the input image
# white padding for simplicity for now
nx, ny = (np.ceil(dim / patchsize).astype(int) for dim in imsize)  # number of patches
if imsize[0] % patchsize or imsize[1] % patchsize:
    # we need to pad along at least one dimension
    padded = np.pad(image, ((0, nx * patchsize - imsize[0]),
                            (0, ny * patchsize - imsize[1]), (0, 0)),
                    mode='constant', constant_values=255)
else:
    # no padding needed
    padded = image

# reshape the padded image according to patches; doesn't copy memory
patched = padded.reshape(nx, patchsize, ny, patchsize, 3).transpose(0, 2, 1, 3, 4)
# patched has shape (nx, ny, patchsize, patchsize, 3)
# appending .copy() as a last step to the above will copy memory but might speed up
# the next step; time it to find out

# check for white patches; memory imprint the same size as the padded image
filt = ~(patched == 255).all((2, 3, 4))
# filt is a bool array with one value per patch that tells us whether the patch
# is _not_ all white (i.e. we want to keep it)
patch_x, patch_y = filt.nonzero()    # patch indices of non-whites, 0..nx-1 and 0..ny-1
patch_pixel_x = patch_x * patchsize  # pixel index of each patch's top-left corner
patch_pixel_y = patch_y * patchsize
patches = np.array([patch_pixel_x, patch_pixel_y]).T
# shape (npatch, 2), compatible with a list of tuples

# if you want the actual patches as well:
patch_images = patched[filt, ...]
# shape (npatch, patchsize, patchsize, 3);
# patch_images[i, ...] is a patchsize-by-patchsize image
As you can see, above I used white padding to get a padded image commensurate with the patch size. I believe this is in line with the philosophy of what you're trying to do. If you want to replicate what your loop does exactly, you can instead pad your image manually with the overlapping pixels near the edge: allocate a padded image of the right size, then slice the overlapping pixels of the original image into the edge regions of the padded result.
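A minimal sketch of that manual overlap padding (one reading of the paragraph above, with hypothetical example sizes): the last grid row/column of patches is filled with the image's last patchsize pixels, and kept patch indices map back to clamped pixel positions.

import numpy as np

image = np.random.randint(0, 256, size=(1000, 1300, 3), dtype=np.uint8)
patchsize = 96
h, w = image.shape[:2]
nx, ny = (int(np.ceil(d / patchsize)) for d in (h, w))

padded = np.empty((nx * patchsize, ny * patchsize, 3), dtype=image.dtype)
padded[:h, :w] = image
# make the last grid row/column contain the image's last patchsize pixels,
# mirroring the overlapping patches of the loop version
padded[-patchsize:, :w] = image[-patchsize:, :]
padded[:h, -patchsize:] = image[:, -patchsize:]
padded[-patchsize:, -patchsize:] = image[-patchsize:, -patchsize:]

# pixel positions per grid index: the last row/column is clamped to the border
xs = np.minimum(np.arange(nx) * patchsize, h - patchsize)
ys = np.minimum(np.arange(ny) * patchsize, w - patchsize)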
answered yesterday by Andras Deak

+1, great solution. Can you please elaborate this line: patched = padded.reshape(nx, patchsize, ny, patchsize, 3).transpose(0, 2, 1, 3, 4)? – Shihab Shahriar, yesterday
@Shihab thanks. The first part (padded.reshape(nx, patchsize, ny, patchsize, 3)) does the actual chunking from a 2d array into a 4d collection of 2d patches (ignoring the colour channel dimension). Consider arr = np.arange(2*3*4*3).reshape(2*3, 4*3). Then arr_chunked = arr.reshape(2, 3, 4, 3) is a similar chunked array: arr_chunked[0,:,0,:] is the 3-by-3 chunk at position [0,0] among chunks, arr_chunked[0,:,1,:] is the chunk next to it on the right, and so on. In total there are 2*4 chunks arr_chunked[i,:,j,:]. The final transpose reorders the axes so that the two chunk indices come first, next to each other. – Andras Deak, yesterday
Learnt a new thing, thanks. – Shihab Shahriar, 14 hours ago
This looks fantastic, thanks a lot! Most of my images are of size (39706, 94762, 3) and the processing is done in under a minute. The only downside is that it takes ~45GB of RAM; I'll have to find a way to reduce this number. – E-Kami, 5 hours ago
Thanks so much for this @AndrasDeak. I imagine if I introduce Python loops again I could make use of numba to speed things up a bit. – E-Kami, 3 hours ago
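On the ~45GB footprint mentioned above, one possible mitigation (a suggestion, not from this thread) is to process one strip of patches at a time, so only a patchsize-tall slice of the image is ever expanded at once:

import numpy as np

def nonwhite_positions(image, patchsize=96):
    # assumes image dimensions are already exact multiples of patchsize
    nx, ny = image.shape[0] // patchsize, image.shape[1] // patchsize
    for i in range(nx):  # one Python iteration per row of patches, not per patch
        strip = image[i * patchsize:(i + 1) * patchsize]  # (patchsize, W, 3) view
        patched = strip.reshape(patchsize, ny, patchsize, 3)
        keep = ~(patched == 255).all((0, 2, 3))           # one bool per patch in the strip
        for j in np.flatnonzero(keep):
            yield (i * patchsize, j * patchsize)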