More efficient solution to find longest series based on boolean in NumPy ndArray
I search my ndArray to find longest series based on True values. Is there an option to find longest series without looping through array?
I've already wrote my own solution with numpy.nonzero, but there is probably better one.
import numpy as np
arr = np.array([[[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15],
[16,17,18,19,20],
[21,22,23,24,25]],
[[True,True,True,False,True],
[True,True,True,True,False],
[True,True,False,True,True],
[True,True,True,False,True],
[True,True,True,False,True]]])
def getIndices(arr):
arr_to_search = np.nonzero(arr)
arrs =
prev_el0 = 0
prev_el1 = -1
activ_long =
for i in range(len(arr_to_search[0])):
if arr_to_search[0][i] == prev_el0:
if arr_to_search[1][i] != prev_el1 + 1:
arrs.append(activ_long)
activ_long =
else:
arrs.append(activ_long)
activ_long =
activ_long.append((arr_to_search[0][i],arr_to_search[1][i]))
prev_el0 = arr_to_search[0][i]
prev_el1 = arr_to_search[1][i]
max_len = len(max(arrs,key=len))
longest_arr_list = [a for a in arrs if len(a) == max_len]
return longest_arr_list
print(getIndices(arr[1,:,:]))
print(getIndices(arr[1,:,:].T))
[[(1, 0), (1, 1), (1, 2), (1, 3)]]
[[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4)]]
python numpy multidimensional-array
add a comment |
I search my ndArray to find longest series based on True values. Is there an option to find longest series without looping through array?
I've already wrote my own solution with numpy.nonzero, but there is probably better one.
import numpy as np
arr = np.array([[[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15],
[16,17,18,19,20],
[21,22,23,24,25]],
[[True,True,True,False,True],
[True,True,True,True,False],
[True,True,False,True,True],
[True,True,True,False,True],
[True,True,True,False,True]]])
def getIndices(arr):
arr_to_search = np.nonzero(arr)
arrs =
prev_el0 = 0
prev_el1 = -1
activ_long =
for i in range(len(arr_to_search[0])):
if arr_to_search[0][i] == prev_el0:
if arr_to_search[1][i] != prev_el1 + 1:
arrs.append(activ_long)
activ_long =
else:
arrs.append(activ_long)
activ_long =
activ_long.append((arr_to_search[0][i],arr_to_search[1][i]))
prev_el0 = arr_to_search[0][i]
prev_el1 = arr_to_search[1][i]
max_len = len(max(arrs,key=len))
longest_arr_list = [a for a in arrs if len(a) == max_len]
return longest_arr_list
print(getIndices(arr[1,:,:]))
print(getIndices(arr[1,:,:].T))
[[(1, 0), (1, 1), (1, 2), (1, 3)]]
[[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4)]]
python numpy multidimensional-array
If your code works, might be more appropriate for code review
– user3483203
Dec 30 '18 at 15:13
What do you mean withfind longest series based on True values
? What is the expected output?
– yatu
Dec 30 '18 at 15:14
By series i mean longest subarray where value is True, like in example below: [True,True,True,False,True] [True,True,True,True,False] [True,True,False,True,True] [True,True,True,False,True] [True,True,True,False,True] here, longest series is in col 0 and col 1, 5 x True, False is breaking series
– grudzien
Dec 30 '18 at 15:25
add a comment |
I search my ndArray to find longest series based on True values. Is there an option to find longest series without looping through array?
I've already wrote my own solution with numpy.nonzero, but there is probably better one.
import numpy as np
arr = np.array([[[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15],
[16,17,18,19,20],
[21,22,23,24,25]],
[[True,True,True,False,True],
[True,True,True,True,False],
[True,True,False,True,True],
[True,True,True,False,True],
[True,True,True,False,True]]])
def getIndices(arr):
arr_to_search = np.nonzero(arr)
arrs =
prev_el0 = 0
prev_el1 = -1
activ_long =
for i in range(len(arr_to_search[0])):
if arr_to_search[0][i] == prev_el0:
if arr_to_search[1][i] != prev_el1 + 1:
arrs.append(activ_long)
activ_long =
else:
arrs.append(activ_long)
activ_long =
activ_long.append((arr_to_search[0][i],arr_to_search[1][i]))
prev_el0 = arr_to_search[0][i]
prev_el1 = arr_to_search[1][i]
max_len = len(max(arrs,key=len))
longest_arr_list = [a for a in arrs if len(a) == max_len]
return longest_arr_list
print(getIndices(arr[1,:,:]))
print(getIndices(arr[1,:,:].T))
[[(1, 0), (1, 1), (1, 2), (1, 3)]]
[[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4)]]
python numpy multidimensional-array
I search my ndArray to find longest series based on True values. Is there an option to find longest series without looping through array?
I've already wrote my own solution with numpy.nonzero, but there is probably better one.
import numpy as np
arr = np.array([[[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15],
[16,17,18,19,20],
[21,22,23,24,25]],
[[True,True,True,False,True],
[True,True,True,True,False],
[True,True,False,True,True],
[True,True,True,False,True],
[True,True,True,False,True]]])
def getIndices(arr):
arr_to_search = np.nonzero(arr)
arrs =
prev_el0 = 0
prev_el1 = -1
activ_long =
for i in range(len(arr_to_search[0])):
if arr_to_search[0][i] == prev_el0:
if arr_to_search[1][i] != prev_el1 + 1:
arrs.append(activ_long)
activ_long =
else:
arrs.append(activ_long)
activ_long =
activ_long.append((arr_to_search[0][i],arr_to_search[1][i]))
prev_el0 = arr_to_search[0][i]
prev_el1 = arr_to_search[1][i]
max_len = len(max(arrs,key=len))
longest_arr_list = [a for a in arrs if len(a) == max_len]
return longest_arr_list
print(getIndices(arr[1,:,:]))
print(getIndices(arr[1,:,:].T))
[[(1, 0), (1, 1), (1, 2), (1, 3)]]
[[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], [(1, 0), (1, 1), (1, 2), (1, 3), (1, 4)]]
python numpy multidimensional-array
python numpy multidimensional-array
asked Dec 30 '18 at 15:05
grudziengrudzien
61
61
If your code works, might be more appropriate for code review
– user3483203
Dec 30 '18 at 15:13
What do you mean withfind longest series based on True values
? What is the expected output?
– yatu
Dec 30 '18 at 15:14
By series i mean longest subarray where value is True, like in example below: [True,True,True,False,True] [True,True,True,True,False] [True,True,False,True,True] [True,True,True,False,True] [True,True,True,False,True] here, longest series is in col 0 and col 1, 5 x True, False is breaking series
– grudzien
Dec 30 '18 at 15:25
add a comment |
If your code works, might be more appropriate for code review
– user3483203
Dec 30 '18 at 15:13
What do you mean withfind longest series based on True values
? What is the expected output?
– yatu
Dec 30 '18 at 15:14
By series i mean longest subarray where value is True, like in example below: [True,True,True,False,True] [True,True,True,True,False] [True,True,False,True,True] [True,True,True,False,True] [True,True,True,False,True] here, longest series is in col 0 and col 1, 5 x True, False is breaking series
– grudzien
Dec 30 '18 at 15:25
If your code works, might be more appropriate for code review
– user3483203
Dec 30 '18 at 15:13
If your code works, might be more appropriate for code review
– user3483203
Dec 30 '18 at 15:13
What do you mean with
find longest series based on True values
? What is the expected output?– yatu
Dec 30 '18 at 15:14
What do you mean with
find longest series based on True values
? What is the expected output?– yatu
Dec 30 '18 at 15:14
By series i mean longest subarray where value is True, like in example below: [True,True,True,False,True] [True,True,True,True,False] [True,True,False,True,True] [True,True,True,False,True] [True,True,True,False,True] here, longest series is in col 0 and col 1, 5 x True, False is breaking series
– grudzien
Dec 30 '18 at 15:25
By series i mean longest subarray where value is True, like in example below: [True,True,True,False,True] [True,True,True,True,False] [True,True,False,True,True] [True,True,True,False,True] [True,True,True,False,True] here, longest series is in col 0 and col 1, 5 x True, False is breaking series
– grudzien
Dec 30 '18 at 15:25
add a comment |
1 Answer
1
active
oldest
votes
Here is a numpy solution which avoids explicit loops based on this previous question.
I'm assuming the boolean array is named a
. Essentially we find the indices for where rows change from 0 to 1 or from 1 to 0 and look at the difference between these. By padding 0s on the frong and back, we ensure that for every transition from a 0 to a 1, there is another transition from a 1 to a 0.
For convenience I process a
and a.T
at the same time, but you could do them separately if you want.
m,n = a.shape
A = np.zeros((2*m,n+2))
A[:m,1:-1] = a
A[m:,1:-1] = a.T
dA = np.diff(A)
start = np.where(dA>0)
end = np.where(dA<0)
argmax_run = np.argmax(end[1]-start[1])
row = start[0][argmax_run]
col_start = start[1][argmax_run]
col_end= end[1][argmax_run]-1
max_len = col_end - col_start + 1
print('max run of length {}'.format(max_len))
print('in '+('row' if row<m else'col')+' {}'.format(row%m)+' from '+('col' if row<m else'row')+' {} to {}'.format(col_start,col_end))
To improve performance and storage we can change A
to a boolean array. Since the -1
and 1
in dA
above always come in pairs, we can find start
and end
as below.
nz = np.nonzero(dA)
start = (nz[0][::2], nz[1][::2])
end = (nz[0][1::2], nz[1][1::2])
Note that you can then remove the variables start
and end
entirely as they aren't really needed.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53978707%2fmore-efficient-solution-to-find-longest-series-based-on-boolean-in-numpy-ndarray%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Here is a numpy solution which avoids explicit loops based on this previous question.
I'm assuming the boolean array is named a
. Essentially we find the indices for where rows change from 0 to 1 or from 1 to 0 and look at the difference between these. By padding 0s on the frong and back, we ensure that for every transition from a 0 to a 1, there is another transition from a 1 to a 0.
For convenience I process a
and a.T
at the same time, but you could do them separately if you want.
m,n = a.shape
A = np.zeros((2*m,n+2))
A[:m,1:-1] = a
A[m:,1:-1] = a.T
dA = np.diff(A)
start = np.where(dA>0)
end = np.where(dA<0)
argmax_run = np.argmax(end[1]-start[1])
row = start[0][argmax_run]
col_start = start[1][argmax_run]
col_end= end[1][argmax_run]-1
max_len = col_end - col_start + 1
print('max run of length {}'.format(max_len))
print('in '+('row' if row<m else'col')+' {}'.format(row%m)+' from '+('col' if row<m else'row')+' {} to {}'.format(col_start,col_end))
To improve performance and storage we can change A
to a boolean array. Since the -1
and 1
in dA
above always come in pairs, we can find start
and end
as below.
nz = np.nonzero(dA)
start = (nz[0][::2], nz[1][::2])
end = (nz[0][1::2], nz[1][1::2])
Note that you can then remove the variables start
and end
entirely as they aren't really needed.
add a comment |
Here is a numpy solution which avoids explicit loops based on this previous question.
I'm assuming the boolean array is named a
. Essentially we find the indices for where rows change from 0 to 1 or from 1 to 0 and look at the difference between these. By padding 0s on the frong and back, we ensure that for every transition from a 0 to a 1, there is another transition from a 1 to a 0.
For convenience I process a
and a.T
at the same time, but you could do them separately if you want.
m,n = a.shape
A = np.zeros((2*m,n+2))
A[:m,1:-1] = a
A[m:,1:-1] = a.T
dA = np.diff(A)
start = np.where(dA>0)
end = np.where(dA<0)
argmax_run = np.argmax(end[1]-start[1])
row = start[0][argmax_run]
col_start = start[1][argmax_run]
col_end= end[1][argmax_run]-1
max_len = col_end - col_start + 1
print('max run of length {}'.format(max_len))
print('in '+('row' if row<m else'col')+' {}'.format(row%m)+' from '+('col' if row<m else'row')+' {} to {}'.format(col_start,col_end))
To improve performance and storage we can change A
to a boolean array. Since the -1
and 1
in dA
above always come in pairs, we can find start
and end
as below.
nz = np.nonzero(dA)
start = (nz[0][::2], nz[1][::2])
end = (nz[0][1::2], nz[1][1::2])
Note that you can then remove the variables start
and end
entirely as they aren't really needed.
add a comment |
Here is a numpy solution which avoids explicit loops based on this previous question.
I'm assuming the boolean array is named a
. Essentially we find the indices for where rows change from 0 to 1 or from 1 to 0 and look at the difference between these. By padding 0s on the frong and back, we ensure that for every transition from a 0 to a 1, there is another transition from a 1 to a 0.
For convenience I process a
and a.T
at the same time, but you could do them separately if you want.
m,n = a.shape
A = np.zeros((2*m,n+2))
A[:m,1:-1] = a
A[m:,1:-1] = a.T
dA = np.diff(A)
start = np.where(dA>0)
end = np.where(dA<0)
argmax_run = np.argmax(end[1]-start[1])
row = start[0][argmax_run]
col_start = start[1][argmax_run]
col_end= end[1][argmax_run]-1
max_len = col_end - col_start + 1
print('max run of length {}'.format(max_len))
print('in '+('row' if row<m else'col')+' {}'.format(row%m)+' from '+('col' if row<m else'row')+' {} to {}'.format(col_start,col_end))
To improve performance and storage we can change A
to a boolean array. Since the -1
and 1
in dA
above always come in pairs, we can find start
and end
as below.
nz = np.nonzero(dA)
start = (nz[0][::2], nz[1][::2])
end = (nz[0][1::2], nz[1][1::2])
Note that you can then remove the variables start
and end
entirely as they aren't really needed.
Here is a numpy solution which avoids explicit loops based on this previous question.
I'm assuming the boolean array is named a
. Essentially we find the indices for where rows change from 0 to 1 or from 1 to 0 and look at the difference between these. By padding 0s on the frong and back, we ensure that for every transition from a 0 to a 1, there is another transition from a 1 to a 0.
For convenience I process a
and a.T
at the same time, but you could do them separately if you want.
m,n = a.shape
A = np.zeros((2*m,n+2))
A[:m,1:-1] = a
A[m:,1:-1] = a.T
dA = np.diff(A)
start = np.where(dA>0)
end = np.where(dA<0)
argmax_run = np.argmax(end[1]-start[1])
row = start[0][argmax_run]
col_start = start[1][argmax_run]
col_end= end[1][argmax_run]-1
max_len = col_end - col_start + 1
print('max run of length {}'.format(max_len))
print('in '+('row' if row<m else'col')+' {}'.format(row%m)+' from '+('col' if row<m else'row')+' {} to {}'.format(col_start,col_end))
To improve performance and storage we can change A
to a boolean array. Since the -1
and 1
in dA
above always come in pairs, we can find start
and end
as below.
nz = np.nonzero(dA)
start = (nz[0][::2], nz[1][::2])
end = (nz[0][1::2], nz[1][1::2])
Note that you can then remove the variables start
and end
entirely as they aren't really needed.
answered Dec 30 '18 at 16:21
tchtch
42525
42525
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53978707%2fmore-efficient-solution-to-find-longest-series-based-on-boolean-in-numpy-ndarray%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
If your code works, might be more appropriate for code review
– user3483203
Dec 30 '18 at 15:13
What do you mean with
find longest series based on True values
? What is the expected output?– yatu
Dec 30 '18 at 15:14
By series i mean longest subarray where value is True, like in example below: [True,True,True,False,True] [True,True,True,True,False] [True,True,False,True,True] [True,True,True,False,True] [True,True,True,False,True] here, longest series is in col 0 and col 1, 5 x True, False is breaking series
– grudzien
Dec 30 '18 at 15:25