Number of parallel sockets/TCP connections to open for optimal use by application in C++
So, I am working on a C++ application that currently uses C sockets to transfer data between peers. There are n peers, all running the same code. In the application logic, any peer may need to transfer (possibly large) data to any other peer, so connections are first opened between all possible pairs of peers. The requirement is that the application logic and the network transfers of (possibly large) data should be as fast as possible.
At present, between any two peers (say A and B), the application opens two types of connections: one where A is the server and B is the client, and vice versa. This was presumably done so that if A needs to transfer data to B and B to A concurrently, the whole thing finishes faster than with just one connection type from A to B. For each connection type (say, A as the server and B as the client), the application then opens 3 TCP connections (using C sockets). However, as presently coded, it ends up using only one of these 3 connections.
Upon seeing this, I began to wonder whether, to make optimal use of N open connections, one could use round-robin or some other policy to break the data into chunks and transfer them at the same time (see the sketch below). However, it is not clear to me how many parallel TCP connections should be open and what policy should be used to spread data across them. What factors does the answer depend on? For example, if I have 1000 TCP connections open, what's the harm? (Ignoring system constraints like running out of ports, etc.)
If someone can shed light on how applications today make use of multiple parallel TCP connections to be most performant, that would be great. A quick Google search leads me to several research papers, but I am also interested in knowing how, for example, web browsers solve this problem.
Thanks!
UPDATE: After talking to a few people with more knowledge of TCP, I have a better picture.
Firstly, my premise that opening two types of connections between A and B (one where A is the client and B the server, and vice versa) will increase net throughput seems wrong. Opening one type of TCP connection between A and B should suffice, because TCP is full duplex: data can flow from A to B and from B to A at the same time. I found this link useful: Is TCP bidirectional or full-duplex?
Also, to make use of the full bandwidth available to me, it may be better to open multiple TCP connections. I found this highly relevant link: TCP: is it possible to achieve higher transfer rate with multiple connections?
But the question of how many such connections should be open still remains. It would be great if someone could answer that.
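To make the round-robin idea above concrete, here is a rough sketch of the kind of striping I have in mind. It is only an illustration under assumptions (the sockets are assumed to be already connected, and a real version would also need to frame each chunk so the receiver can reassemble them in order); it is not the application's actual code.

```cpp
#include <sys/socket.h>
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: stripe `len` bytes of `data` across several already-
// connected TCP sockets in round-robin chunks of `chunk_size` bytes.
// Error handling is minimal and there is no chunk framing, so the receiver
// would need extra metadata (e.g. an offset header) to reassemble the data.
bool send_striped(const std::vector<int>& socks, const char* data,
                  std::size_t len, std::size_t chunk_size)
{
    std::size_t offset = 0, turn = 0;
    while (offset < len) {
        std::size_t n = std::min(chunk_size, len - offset);
        int fd = socks[turn++ % socks.size()];      // round-robin over the open connections
        std::size_t sent = 0;
        while (sent < n) {                          // send() may write less than requested
            ssize_t r = send(fd, data + offset + sent, n - sent, 0);
            if (r <= 0) return false;
            sent += static_cast<std::size_t>(r);
        }
        offset += n;
    }
    return true;
}
```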
c++ c sockets networking tcp
Take a look here at how select() works.
– Michi Jan 4 at 15:45
More TCP connections will make things worse. It means more state to keep track of and modify and it will often mean that two packets are needed where one would suffice. (For example, ACKs can't piggyback on data if they're for different TCP connections, but they can if they're the same one.) It will mean much more work discovering latency and bandwidth, more instances of slow start, and so on.
– David Schwartz Jan 6 at 7:11
2 Answers
You didn't specify the OS, so I will assume it's Linux we're talking about.
I think you need to do some research on non-blocking I/O, e.g. epoll or asio. It is currently the most effective and scalable way to work with many connections simultaneously.
You can start here, for example.
Some performance analysis can be found here or here.
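To give a rough idea of what the epoll approach looks like in practice, here is a minimal sketch of an event loop that waits for readability on a set of already-connected sockets. It is a hedged illustration of the pattern, not production code: connection setup, write readiness, and error recovery are all omitted.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

// Sketch of a read loop with epoll over sockets that were created,
// set non-blocking, and connected elsewhere.
void epoll_read_loop(const std::vector<int>& socks)
{
    int ep = epoll_create1(0);
    if (ep < 0) return;

    for (int fd : socks) {
        epoll_event ev{};
        ev.events = EPOLLIN;                       // wake up when data is readable
        ev.data.fd = fd;
        epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
    }

    char buf[64 * 1024];
    for (;;) {
        epoll_event events[64];
        int n = epoll_wait(ep, events, 64, -1);    // block until at least one socket is ready
        if (n < 0) break;
        for (int i = 0; i < n; ++i) {
            int fd = events[i].data.fd;
            ssize_t r = read(fd, buf, sizeof(buf));
            if (r <= 0) {                          // peer closed the connection (or error)
                epoll_ctl(ep, EPOLL_CTL_DEL, fd, nullptr);
                close(fd);
            } else {
                std::printf("read %zd bytes from fd %d\n", r, fd);
            }
        }
    }
    close(ep);
}
```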
I think that select() works well in this situation.
– Michi Jan 4 at 15:40
select() is OK until you need to scale. I would say that epoll should be a default choice.
– grungegurunge Jan 4 at 15:45
Sure. I see how async programming and non-blocking network calls will help. But my question is a little different. Say you have 1 GB of data to transfer, and you open one connection and start writing 100 MB of data using one thread. Non-blocking I/O means my application thread doesn't remain blocked on the I/O call and can continue processing. But for better efficiency, if I know the last 100 MB write is going to take some time, I should open another connection and start writing another 100 MB on it, and so on. My question is: how many such new connections should be opened?
– wholesaleLion Jan 4 at 18:30
I am sure modern applications, including, say, browsers, open parallel connections in addition to doing non-blocking I/O - how many connections do they open? It's also going to be a function of the OS and system parameters, but theoretically, what should the ideal value be?
– wholesaleLion Jan 4 at 18:30
How are multiple parallel connections going to increase speed? I would say that browsers use parallel connections because if one resource download gets stuck, it doesn't affect the display of the whole page. Another case is when there is a per-connection bandwidth limit on the remote side and you use parallel connections to mitigate it, but that's a very specific case.
– grungegurunge Jan 4 at 20:01
When transferring data between two hosts, there is unlikely to be any significant throughput advantage to be obtained by using more than one TCP socket. With proper programming, a single TCP connection can saturate the link's bandwidth in both directions simultaneously (i.e. it can do full-duplex/2-way transfers at line speed). Splitting the data across multiple TCP connections merely adds overhead; in the best-case scenario, each of the N connections will transfer at 1/N the speed of the single connection (and in real life, less than that, due to additional packet headers, bandwidth contention, etc).
There is one potential (minor) benefit that can be realized by using multiple TCP streams, however -- that benefit is seen only in the case where the data being transferred in stream A is logically independent of the data being transferred in stream B. If that is the case (i.e. if the receiver can immediately make use of data in stream A, without having to wait for data in stream B to arrive first), then having multiple streams can make your data transfer somewhat more resilient to packet-dropouts.
For example, if stream A drops a packet, that will cause stream A to have to briefly pause while it retransmits the dropped packet, but in the meantime stream B's data may continue to flow without interruption, since stream B is operating independently from stream A. (If the A-data and the B-data were both being sent over the same TCP stream, OTOH, the B-data would be forced to wait for the lost A-packet to be retransmitted, since strict FIFO-ordering is always enforced within a TCP stream).
Note that this benefit is likely smaller than you might think, though, since in many cases the problem that caused one TCP stream to lose packets will also simultaneously cause any other TCP streams going over the same network path to lose packets too.
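To illustrate the full-duplex point, a single connected TCP socket can be driven in both directions at once, e.g. with one thread sending while another receives. The following is only a hedged sketch under that assumption (no framing, minimal error handling), not the answer author's code:

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>
#include <cstdio>
#include <thread>

// One TCP socket, both directions at once: a sender thread pushes our data
// while the main thread receives the peer's data on the same connected fd.
void full_duplex_transfer(int fd, const char* out_data, std::size_t out_len)
{
    std::thread sender([&] {
        std::size_t sent = 0;
        while (sent < out_len) {
            ssize_t r = send(fd, out_data + sent, out_len - sent, 0);
            if (r <= 0) break;
            sent += static_cast<std::size_t>(r);
        }
        shutdown(fd, SHUT_WR);              // we are done sending; reads can continue
    });

    char buf[64 * 1024];
    for (;;) {
        ssize_t r = recv(fd, buf, sizeof(buf), 0);
        if (r <= 0) break;                  // peer finished sending (or error)
        std::printf("received %zd bytes\n", r);
    }
    sender.join();
}
```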
Thank you for your answer! However, as I have come to understand it, to optimally utilize the bandwidth you should either change the TCP window size so that the link from A to B is full of data at all times, or, if you don't change the window size, having multiple streams might be useful to saturate the link. Check out this: link.
– wholesaleLion Jan 6 at 16:14
I agree with your point that with the right programming (like setting the correct window size), one can do optimally with one TCP connection. However, increasing the window size, as I understand it, has other unintended consequences, like increased buffer sizes, etc. I wonder how network-optimizing applications in general get around this - do they work with only a single connection - any idea?
– wholesaleLion Jan 6 at 16:17
I can't speak for most network apps, but in mine I use multiple TCP connections only where it makes logical sense to do so (e.g. when communicating with multiple servers at once), together with the default settings, and rely on the assumption that the TCP stack implementers chose the defaults they felt were best for the most common cases (rather than trying to second-guess their decisions). This yields results that are good enough for my purposes (e.g. ~100 MB/sec across a local gigabit Ethernet LAN).
– Jeremy Friesner Jan 7 at 2:06
I looked again at my program that does 100 MB/sec file transfers, and I see that I do set the send-buffer and receive-buffer sizes to 64 KB each in that program. (I don't think doing so affects the behavior of the modified TCP sockets beyond causing them to allocate some additional RAM, making them less likely to drop a received packet because the receive buffer is full, and making them less likely to pause on sending because the send buffer runs empty before the socket's I/O thread can add more data to it.)
– Jeremy Friesner Jan 7 at 2:17
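For anyone who wants to try the buffer sizing mentioned in the comment above, the usual mechanism is setsockopt() with SO_SNDBUF and SO_RCVBUF. This is a generic sketch, not the commenter's program; the 64 KB figure is simply the value quoted above, and the kernel may round or cap whatever you pass in:

```cpp
#include <sys/socket.h>
#include <cstdio>

// Set the kernel send/receive buffer sizes for a socket. Best done before
// connect()/listen(), since the receive buffer influences the advertised
// TCP window. The kernel may adjust the value you request.
bool set_socket_buffers(int fd, int bytes /* e.g. 64 * 1024 */)
{
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) != 0) {
        std::perror("setsockopt(SO_SNDBUF)");
        return false;
    }
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0) {
        std::perror("setsockopt(SO_RCVBUF)");
        return false;
    }
    return true;
}
```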