Preventing deadlock during distributed calculation












0















I am currently struggling with the following exercise:
There is a given, undirected and connected random graph of nodes. Every node represents an own process and knows its neighbor processes (nodes). Every node can only communicate with its neighbor processes.
Initially, every node has a random integer value. The goal is, that all nodes calculate the middle value of this integer.
To do so, I have to implement the following algorithm:




  1. Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.


  2. The receiving node N(2) calculates the middle value of the received and its own variable.


  3. The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).


  4. N(1) receives the calculated value and stores it as its current value.


  5. N(1) and N(2) hold now the same integer value.


  6. N(1) initiates the next calculation with N(x)...


  7. N(2) selects up to two further neighbor nodes and the process begins again.



A Node terminates if it participated in a specified amount of calculations.



I have implemented this so far in JAVA on an event-based approach: Every incoming communication Message will be added to the end of an Event-Queue to ensure, that a node doesn't carry out two calculations at the same time.



Therefore, the steps should always be executed as listed above.



I'm using the following events:




  • "REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.


  • "ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.



So far, the distributes application works ok, but the processes running into deadlocks on a regular base: Due to the sequential handling of the events, the queued events cannot be carried out, since a locking neighbor is not sending its result.



I tried to resolve this on different ways:




  1. If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
    This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.


  2. Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.


  3. To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.



I guess that the deadlocks are just a side-effect of a failure in the base concept of my application. Therefore, I would like to ask you for some recommendations of how to put this distributed calculating process into action, also regardless my previous described implementation.



Thanks in advance for your help, input and feedback.










share|improve this question


















  • 1





    Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

    – Progman
    Jan 1 at 16:52











  • Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

    – Euestros
    Jan 2 at 18:07


















0















I am currently struggling with the following exercise:
There is a given, undirected and connected random graph of nodes. Every node represents an own process and knows its neighbor processes (nodes). Every node can only communicate with its neighbor processes.
Initially, every node has a random integer value. The goal is, that all nodes calculate the middle value of this integer.
To do so, I have to implement the following algorithm:




  1. Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.


  2. The receiving node N(2) calculates the middle value of the received and its own variable.


  3. The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).


  4. N(1) receives the calculated value and stores it as its current value.


  5. N(1) and N(2) hold now the same integer value.


  6. N(1) initiates the next calculation with N(x)...


  7. N(2) selects up to two further neighbor nodes and the process begins again.



A Node terminates if it participated in a specified amount of calculations.



I have implemented this so far in JAVA on an event-based approach: Every incoming communication Message will be added to the end of an Event-Queue to ensure, that a node doesn't carry out two calculations at the same time.



Therefore, the steps should always be executed as listed above.



I'm using the following events:




  • "REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.


  • "ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.



So far, the distributes application works ok, but the processes running into deadlocks on a regular base: Due to the sequential handling of the events, the queued events cannot be carried out, since a locking neighbor is not sending its result.



I tried to resolve this on different ways:




  1. If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
    This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.


  2. Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.


  3. To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.



I guess that the deadlocks are just a side-effect of a failure in the base concept of my application. Therefore, I would like to ask you for some recommendations of how to put this distributed calculating process into action, also regardless my previous described implementation.



Thanks in advance for your help, input and feedback.










share|improve this question


















  • 1





    Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

    – Progman
    Jan 1 at 16:52











  • Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

    – Euestros
    Jan 2 at 18:07
















0












0








0








I am currently struggling with the following exercise:
There is a given, undirected and connected random graph of nodes. Every node represents an own process and knows its neighbor processes (nodes). Every node can only communicate with its neighbor processes.
Initially, every node has a random integer value. The goal is, that all nodes calculate the middle value of this integer.
To do so, I have to implement the following algorithm:




  1. Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.


  2. The receiving node N(2) calculates the middle value of the received and its own variable.


  3. The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).


  4. N(1) receives the calculated value and stores it as its current value.


  5. N(1) and N(2) hold now the same integer value.


  6. N(1) initiates the next calculation with N(x)...


  7. N(2) selects up to two further neighbor nodes and the process begins again.



A Node terminates if it participated in a specified amount of calculations.



I have implemented this so far in JAVA on an event-based approach: Every incoming communication Message will be added to the end of an Event-Queue to ensure, that a node doesn't carry out two calculations at the same time.



Therefore, the steps should always be executed as listed above.



I'm using the following events:




  • "REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.


  • "ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.



So far, the distributes application works ok, but the processes running into deadlocks on a regular base: Due to the sequential handling of the events, the queued events cannot be carried out, since a locking neighbor is not sending its result.



I tried to resolve this on different ways:




  1. If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
    This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.


  2. Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.


  3. To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.



I guess that the deadlocks are just a side-effect of a failure in the base concept of my application. Therefore, I would like to ask you for some recommendations of how to put this distributed calculating process into action, also regardless my previous described implementation.



Thanks in advance for your help, input and feedback.










share|improve this question














I am currently struggling with the following exercise:
There is a given, undirected and connected random graph of nodes. Every node represents an own process and knows its neighbor processes (nodes). Every node can only communicate with its neighbor processes.
Initially, every node has a random integer value. The goal is, that all nodes calculate the middle value of this integer.
To do so, I have to implement the following algorithm:




  1. Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.


  2. The receiving node N(2) calculates the middle value of the received and its own variable.


  3. The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).


  4. N(1) receives the calculated value and stores it as its current value.


  5. N(1) and N(2) hold now the same integer value.


  6. N(1) initiates the next calculation with N(x)...


  7. N(2) selects up to two further neighbor nodes and the process begins again.



A Node terminates if it participated in a specified amount of calculations.



I have implemented this so far in JAVA on an event-based approach: Every incoming communication Message will be added to the end of an Event-Queue to ensure, that a node doesn't carry out two calculations at the same time.



Therefore, the steps should always be executed as listed above.



I'm using the following events:




  • "REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.


  • "ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.



So far, the distributes application works ok, but the processes running into deadlocks on a regular base: Due to the sequential handling of the events, the queued events cannot be carried out, since a locking neighbor is not sending its result.



I tried to resolve this on different ways:




  1. If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
    This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.


  2. Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.


  3. To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.



I guess that the deadlocks are just a side-effect of a failure in the base concept of my application. Therefore, I would like to ask you for some recommendations of how to put this distributed calculating process into action, also regardless my previous described implementation.



Thanks in advance for your help, input and feedback.







java deadlock distributed-computing






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Jan 1 at 15:48









EuestrosEuestros

284




284








  • 1





    Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

    – Progman
    Jan 1 at 16:52











  • Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

    – Euestros
    Jan 2 at 18:07
















  • 1





    Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

    – Progman
    Jan 1 at 16:52











  • Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

    – Euestros
    Jan 2 at 18:07










1




1





Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52





Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52













Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07







Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07














0






active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53996806%2fpreventing-deadlock-during-distributed-calculation%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























0






active

oldest

votes








0






active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53996806%2fpreventing-deadlock-during-distributed-calculation%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Monofisismo

Angular Downloading a file using contenturl with Basic Authentication

Olmecas