How to properly implement multiprocessing in an application with 4 different functions?
Have a look at the code snippet below. In the main function I create a list, jobs. projects is a list of project objects, each of which contains multiple target objects. For each target I want to execute four different functions, so I start a Process that points to the function run, append it to the list, and start it. The current code produces zombie processes, which I am trying to avoid.

    import multiprocessing

    def main():
        jobs = []
        for project in projects:
            for target in project.getTargets():
                p = multiprocessing.Process(target=run, args=(target.getX(),
                                                              target.getY(),))
                jobs.append(p)
                p.start()
        for job in jobs:
            job.join()

    def run(x, y):
        a(x, y)
        b(x, y)
        c(x, y)
        d(x, y)

The goal is to handle approximately five targets in parallel and then use a mechanism such as a FIFO queue to start a new target once another target has finished.
python python-2.7 python-multiprocessing
edited Jan 3 at 21:25 by martineau
asked Jan 3 at 21:03 by ssd
Your code is somewhat confusing. You don't seem to be using target. Is it supposed to be split into x and y? The return values of the functions a up to d aren't used. What's the point of calling them? Normally I would suggest using a multiprocessing.Pool to apply a function to an iterable of values in parallel, but I'm not sure how that would fit here.
– Roland Smith
Jan 3 at 21:13
I edited the code sample to show how target is being used. The purpose of the functions a to d is not relevant in this case, I guess. You only need to know that I execute a bash command inside these functions.
– ssd
Jan 3 at 21:18
Apparently this is just pseudo code for some hypothetical question? 9 times out of 10 in multitasking the problem is how to split tasks and targets properly. One function needs a lot of RAM, another spends 90% of its time waiting for hard disk reads, and another is busy calculating floats or waiting for input from another function. In general, start with the smallest number of tasks and targets and test in practice what happens and where the time is spent when running a certain task. Does more CPU help, or will one thread do because your RAM is full or your hard disk is slow?
– Stacking For Heap
Jan 3 at 21:24
2 Answers
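Pass the processing function as part of the collection you iterate over:

    from multiprocessing import Pool

    def fun(args):
        # pool.map passes each (function, x, y) tuple as a single argument
        proc, p, q = args
        return proc(p, q)

    # a, b, c and d must be module-level functions so that they can be pickled;
    # x and y are the values for one target
    data = [(f, x, y) for f in (a, b, c, d)]
    pool = Pool(4)
    results = pool.map(fun, data)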
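
For the setup described in the question (about five targets at a time, with a, b, c and d run in order for each target), the same Pool idea could be applied per target instead of per function. The following is only a rough sketch under assumptions: the bodies of a to d are placeholder stand-ins, process_targets is a hypothetical helper name, and each target is reduced to a plain (x, y) tuple.

```python
from multiprocessing import Pool


# Placeholder stand-ins for the question's a, b, c and d (assumed bodies).
def a(x, y):
    return x + y


def b(x, y):
    return x - y


def c(x, y):
    return x * y


def d(x, y):
    return max(x, y)


def run(coords):
    """Run a, b, c and d in order for one target; coords is an (x, y) tuple."""
    x, y = coords
    return [step(x, y) for step in (a, b, c, d)]


def process_targets(coords_list, workers=5):
    """Process all targets with at most `workers` of them running at once."""
    pool = Pool(processes=workers)
    try:
        # chunksize=1: a free worker takes the next queued target, one at a time
        return pool.map(run, coords_list, chunksize=1)
    finally:
        pool.close()
        pool.join()  # wait for the worker processes to exit


if __name__ == "__main__":
    # Hypothetical targets; in the question these would come from
    # (target.getX(), target.getY()) for every target of every project.
    targets = [(i, i + 1) for i in range(12)]
    for coords, result in zip(targets, process_targets(targets)):
        print(coords, result)
```

The pool keeps its five workers busy from an internal queue, so no manual list of Process objects is needed, and joining the pool waits for the workers before the program moves on.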
answered Jan 3 at 21:27 by scrutari
From your comment I gather that you are calling bash in the functions a .. d. I would suggest not doing that.
- If you are doing calculations in bash, that's better done in Python.
- If you use bash to start programs, that is also better done directly in Python.
If you can use Python 3, I would recommend using concurrent.futures.ThreadPoolExecutor to have a bunch of threads iterate over your data. In each thread, you can then use the subprocess module to start external programs.
My dicom2jpg.py script is an example of how to do this. It runs ImageMagick's convert program in parallel to convert DICOM x-ray images into PNG format.
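
As a rough illustration of the thread-pool idea (not taken from the script mentioned above), a minimal Python 3 sketch might look like this. The helper names run_command and run_all are hypothetical, and the commands are placeholders that only start a Python interpreter exiting with a given status.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(args):
    """Run one external program and return (args, exit status)."""
    return args, subprocess.call(args)


def run_all(commands, workers=5):
    """Run the commands with at most `workers` external programs at a time."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map yields the results in the order of `commands`
        return list(executor.map(run_command, commands))


if __name__ == "__main__":
    # Placeholder commands: each one starts an interpreter that exits with status n.
    commands = [[sys.executable, "-c", "import sys; sys.exit(%d)" % n]
                for n in range(3)]
    for args, status in run_all(commands):
        print(status, args[-1])
```

Threads are sufficient here because each thread spends its time waiting for an external program, so the real work happens in the child processes.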
If you need to use Python 2.7, then I would make a list of subprocesses (by calling subprocess.Popen). Continuously iterate over this list and check if a subprocess has finished. If so, remove it from the list. If you have not run out of tasks, start a new subprocess and append it to the list. The list should have as many subprocesses as your machine has cores. More is generally not useful.
This approach is shown in an older version of dicom2png.py.
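
The polling loop described above could be sketched roughly as follows. This is an assumption-laden illustration, not the code from the script: run_limited, max_procs and poll_interval are hypothetical names, and the commands are placeholders. It sticks to calls that exist in both Python 2.7 and Python 3.

```python
import subprocess
import sys
import time


def run_limited(commands, max_procs=5, poll_interval=0.05):
    """Run `commands` with at most `max_procs` subprocesses alive at once.

    Returns the exit statuses in the order the subprocesses were seen to finish.
    """
    pending = list(commands)  # FIFO queue of commands not yet started
    running = []              # Popen objects that are currently alive
    statuses = []
    while pending or running:
        # Start new subprocesses while there is room and work left.
        while pending and len(running) < max_procs:
            running.append(subprocess.Popen(pending.pop(0)))
        # poll() returns None while a subprocess is still running; once it
        # returns an exit status the child has been reaped.
        still_running = []
        for proc in running:
            status = proc.poll()
            if status is None:
                still_running.append(proc)
            else:
                statuses.append(status)
        running = still_running
        if running:
            time.sleep(poll_interval)  # avoid a busy loop
    return statuses


if __name__ == "__main__":
    # Placeholder commands that do nothing and exit with status 0.
    cmds = [[sys.executable, "-c", "pass"] for _ in range(8)]
    print(run_limited(cmds, max_procs=3))
```

Setting max_procs to the number of cores follows the advice above; the question's figure of five would be passed the same way.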
answered Jan 3 at 22:04 by Roland Smith
Interesting! But how would you handle this if you have, say, three other functions next to startconvert? I'm talking about Python 2.7.
– ssd
Jan 6 at 17:22
@ssd As long as those other functions can have the same interface as startconvert, that is, they take one filename argument and return a (str, Process) tuple, it would be OK. They would fit into the manageprocs framework.
– Roland Smith
Jan 6 at 18:45