Preventing deadlock during distributed calculation

I am currently struggling with the following exercise:
There is a given, undirected and connected random graph of nodes. Every node represents an own process and knows its neighbor processes (nodes). Every node can only communicate with its neighbor processes.
Initially, every node has a random integer value. The goal is, that all nodes calculate the middle value of this integer.
To do so, I have to implement the following algorithm:

Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.

The receiving node N(2) calculates the middle value of the received and its own variable.

The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).

N(1) receives the calculated value and stores it as its current value.

N(1) and N(2) hold now the same integer value.

N(1) initiates the next calculation with N(x)...

N(2) selects up to two further neighbor nodes and the process begins again.

A Node terminates if it participated in a specified amount of calculations.

I have implemented this so far in JAVA on an event-based approach: Every incoming communication Message will be added to the end of an Event-Queue to ensure, that a node doesn't carry out two calculations at the same time.

Therefore, the steps should always be executed as listed above.

I'm using the following events:

"REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.

"ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.

So far, the distributes application works ok, but the processes running into deadlocks on a regular base: Due to the sequential handling of the events, the queued events cannot be carried out, since a locking neighbor is not sending its result.

I tried to resolve this on different ways:

If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.

Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.

To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.

I guess that the deadlocks are just a side-effect of a failure in the base concept of my application. Therefore, I would like to ask you for some recommendations of how to put this distributed calculating process into action, also regardless my previous described implementation.

Thanks in advance for your help, input and feedback.

asked Jan 1 at 15:48

Euestros

284

1

Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52

Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07

add a comment |

Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.

The receiving node N(2) calculates the middle value of the received and its own variable.

The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).

N(1) receives the calculated value and stores it as its current value.

N(1) and N(2) hold now the same integer value.

N(1) initiates the next calculation with N(x)...

N(2) selects up to two further neighbor nodes and the process begins again.

A Node terminates if it participated in a specified amount of calculations.

Therefore, the steps should always be executed as listed above.

I'm using the following events:

"REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.

"ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.

I tried to resolve this on different ways:

If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.

Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.

To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.

Thanks in advance for your help, input and feedback.

asked Jan 1 at 15:48

Euestros

284

1

Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52

Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07

add a comment |

Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.

The receiving node N(2) calculates the middle value of the received and its own variable.

The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).

N(1) receives the calculated value and stores it as its current value.

N(1) and N(2) hold now the same integer value.

N(1) initiates the next calculation with N(x)...

N(2) selects up to two further neighbor nodes and the process begins again.

A Node terminates if it participated in a specified amount of calculations.

Therefore, the steps should always be executed as listed above.

I'm using the following events:

"REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.

"ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.

I tried to resolve this on different ways:

If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.

Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.

To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.

Thanks in advance for your help, input and feedback.

asked Jan 1 at 15:48

Euestros

284

Two random nodes (N(1) and N(x)) initiate this distributed algorithm. They choose up to two neighbor nodes (N(2) and N(y)) and send them the value of their variable.

The receiving node N(2) calculates the middle value of the received and its own variable.

The receiving node N(2) stores the calculated value as their current value and sends it to the initiator N(1).

N(1) receives the calculated value and stores it as its current value.

N(1) and N(2) hold now the same integer value.

N(1) initiates the next calculation with N(x)...

N(2) selects up to two further neighbor nodes and the process begins again.

A Node terminates if it participated in a specified amount of calculations.

Therefore, the steps should always be executed as listed above.

I'm using the following events:

"REQUEST_CALCULATION": Node 1 requests the calculation to Node 2, by sending its current value.

"ACK_CALCULATION": Node 2 carried out the calculation and sends the new value back to Node 1.

I tried to resolve this on different ways:

If a node waited x seconds for a result, I assume a deadlock, cancel the current calculation and proceed with the next queued event.
This work-around does its job, but the distributed calculating terminates quite quickly: All nodes go into idle mode since many calculations (and subsequently the forwarding to two further neighbors after a successful calculation) get lost.

Like above, but instead of canceling the event completely, I reinitiate it and put it at the end of the queue. This ends normally in an endless loop, where the same messages between a set of nodes are repeated constantly.

To avoid 2), I added a random delay on the consumption of the queued events. Although this decreases the number of deadlocks, this hack cannot be the final solution.

Thanks in advance for your help, input and feedback.

java deadlock distributed-computing

asked Jan 1 at 15:48

Euestros

284

asked Jan 1 at 15:48

Euestros

284

asked Jan 1 at 15:48

Euestros

284

asked Jan 1 at 15:48

Euestros

284

asked Jan 1 at 15:48

Euestros

284

1

Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52

Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07

add a comment |

1

Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52

Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07

Your question is way too broad. Your 2. and 3. approach is similar to en.wikipedia.org/wiki/…, so you might want to check this out. Also, your nodes should be in some kind of state to indicate if they are in a process of synchronize their value with a neighbor or are free for work. Additionally the responses from their neighbors (in the synchronize process) should be handled with priority before the other "request calculation" messages.

– Progman
Jan 1 at 16:52

Thanks, that was just the hint I needed. While I already had a state indication implemented, I splitted the queue up to two separate ones; one for every event type. Consequently, I just start with the event handling of the "REQUEST_CALCULATION"-Events, if the "ACK_CALCULATION" queue is empty. This in combination with the "CSMA/CA mechanism" solved the issues. Running 100 nodes parallel, without any occurring deadlocks.

– Euestros
Jan 2 at 18:07

add a comment |

0

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53996806%2fpreventing-deadlock-during-distributed-calculation%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

0

active

oldest

votes

0

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Bdtjtk