dd is producing a 32 MB random file instead of 1 GB
I wanted to produce a 1 GB random file, so I used the following command.
dd if=/dev/urandom of=output bs=1G count=1
But instead, every time I launch this command I get a 32 MB file:
<11:58:40>$ dd if=/dev/urandom of=output bs=1G count=1
0+1 records in
0+1 records out
33554431 bytes (34 MB, 32 MiB) copied, 0,288321 s, 116 MB/s
What is wrong?
Tags: script, dd, random-number-generator

asked Dec 27 at 11:01 by Trismegistos, edited Dec 28 at 12:30 by Peter Mortensen
IMHO I don't think there are many valid use cases for dd at all. I'd use head, cat or rsync in its place almost always. And your question is one of the reasons why the alternatives are usually safer. – Bakuriu, 2 days ago

@Bakuriu - also, if you just want to produce a file full of zeroes (or rather you do not care about what is inside it), use truncate. It is much faster. – Konrad Gajewski, 2 days ago

@KonradGajewski FYI, truncate tries to make a sparse file (if that matters). – Xen2050, yesterday

@Bakuriu head cannot do this task without the -c option, which isn't in POSIX. I don't know any version of cat which can solve this. rsync is a completely nonstandard utility. That is neither here nor there; skimming through its man page, I don't see how it can solve this problem, either. – Kaz, 15 hours ago

Technically, /dev/urandom isn't in POSIX either... – grawity, 6 hours ago
2 Answers
bs, the buffer size, means the size of a single read() call done by dd. (For example, both bs=1M count=1 and bs=1k count=1k will result in a 1 MiB file, but the first version will do it in a single step, while the second will do it in 1024 small chunks.)
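To see that equivalence concretely, here is a minimal sketch (assuming GNU coreutils on Linux; the output file names are arbitrary):

dd if=/dev/urandom of=out1 bs=1M count=1     # one 1 MiB read
dd if=/dev/urandom of=out2 bs=1k count=1k    # 1024 reads of 1 KiB each
ls -l out1 out2                              # both are 1048576 bytes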
Regular files can be read at nearly any buffer size (as long as that buffer fits in RAM), but devices and "virtual" files often work very close to the individual calls and have some arbitrary restriction of how much data they'll produce per read() call.
For /dev/urandom, this limit is defined in urandom_read() in drivers/char/random.c:
#define ENTROPY_SHIFT 3
static ssize_t
urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
{
    nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3));
    ...
}
This means that every time the function is called, it will clamp the requested size to 33554431 bytes.
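As a quick sanity check of that number (a throwaway shell calculation, nothing more):

# ENTROPY_SHIFT + 3 == 6, and INT_MAX is 2147483647:
echo $(( 2147483647 >> 6 ))    # prints 33554431, one byte short of 32 MiB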
By default, unlike most other tools, dd will not retry after receiving less data than requested – you get the 32 MiB and that's it. (To make it retry automatically, as in Kamil's answer, you'll need to specify iflag=fullblock.)
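For example, the original command does produce the full 1 GiB once the flag is added (a sketch assuming GNU dd, since iflag=fullblock is a GNU extension):

# dd now re-reads until each 1 GiB block is actually full
dd if=/dev/urandom of=output bs=1G count=1 iflag=fullblock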
Note also that "the size of a single read()" means that the whole buffer must fit in memory at once, so massive block sizes also correspond to massive memory usage by dd.
And it's all pointless because you usually won't gain any performance when going above ~16–32 MiB blocks – syscalls aren't the slow part here, the random number generator is.
So for simplicity, just use head -c 1G /dev/urandom > output.
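Either way, it is easy to confirm you really got 1 GiB afterwards (sketch; stat -c is GNU coreutils syntax):

head -c 1G /dev/urandom > output
stat -c %s output    # 1073741824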
answered Dec 27 at 11:29 by grawity (edited Dec 27 at 12:30)

"... you usually won't gain any performance when going above ~16–32 MiB blocks" - In my experience, you tend not to gain much, or even lose performance, above 64-128 kilobytes. At that point, you're well into diminishing returns wrt syscall cost, and cache contention starts to play a role. – marcelm, Dec 27 at 20:43

@marcelm I've helped architect high-performance systems where IO performance would improve as block size increased to 1-2 MB blocks, and in some cases up to 8 MB or so. Per LUN. And as filesystems were constructed using multiple parallel LUNs, getting the best performance meant using multiple threads for IO, each doing 1 MB+ blocks. Sustained IO rates were over 1 GB/sec. And those were all spinning disks, so I can see high-performance arrays of SSDs swallowing or generating data faster and faster as the block size grows to 16 or even 32 MB blocks. Easily. Maybe even larger. – Andrew Henle, Dec 28 at 11:01

I'll explicitly note that iflag=fullblock is a GNU extension to the POSIX dd utility. As the question doesn't specify Linux, I think the use of Linux-specific extensions should probably be explicitly noted lest some future reader trying to solve a similar issue on a non-Linux system be confused. – Andrew Henle, Dec 28 at 12:37
@AndrewHenle Ah, interesting! I did a quick test with dd on my machine, with block sizes from 1k to 512M. Reading from an Intel 750 SSD, optimal performance (about 1300 MiB/s) was achieved at 2 MiB blocks, roughly matching your results. Larger block sizes neither helped nor hindered. Reading from /dev/zero, optimal performance (almost 20 GiB/s) was at 64 KiB and 128 KiB blocks; both smaller and larger blocks decreased performance, roughly matching my previous comment. Bottom line: benchmark for your actual situation. And of course, neither of us benchmarked /dev/random :P – marcelm, 2 days ago
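That kind of block-size sweep is easy to reproduce with a small loop (a rough sketch only, not marcelm's actual test; it reads /dev/zero and discards the output, and dd itself prints the throughput):

for bs in 4096 65536 1048576 16777216 134217728; do
    echo "block size: $bs bytes"
    # each pass moves exactly 1 GiB, so the reported rates are comparable
    dd if=/dev/zero of=/dev/null bs=$bs count=$(( 1073741824 / bs ))
done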
@Xen2050 I did some more quick tests, and it appears dd is faster. A quick strace showed that head uses 8 KiB reads and two 4 KiB writes, which is interesting (GNU coreutils 8.26 on Debian 9.6 / Linux 4.8). head speeds are indeed somewhere between dd bs=4k and dd bs=8k. head speeds are down ~40% compared to dd if=/dev/zero bs=64k and down ~25% compared to dd if=/dev/nvme0n1 bs=2M. The reads from /dev/zero are of course more CPU-limited, but for the SSD, I/O queuing also plays a role. It's a bigger difference than I expected. – marcelm, 2 days ago
dd may read less than ibs (note: bs specifies both ibs and obs), unless iflag=fullblock is specified. 0+1 records in indicates that 0 full blocks and 1 partial block were read. However, any full or partial block increases the counter.
I don't know the exact mechanism that makes dd read a block that is less than 1G in this particular case. I guess any block is read into memory before it's written, so memory management may interfere (but this is only a guess). Edit: this concurrent answer explains the mechanism that makes dd read a block that is less than 1G in this particular case.
Anyway, I don't recommend such a large bs. I would use bs=1M count=1024. The most important thing is: without iflag=fullblock, any read attempt may read less than ibs (unless ibs=1, I think, but that is quite inefficient).
So if you need to read some exact amount of data, use iflag=fullblock. Note that iflag is not required by POSIX, so your dd may not support it. According to this answer, ibs=1 is probably the only POSIX way to read an exact number of bytes. Of course, if you change ibs then you will need to recalculate the count. In your case, lowering ibs to 32M or less will probably fix the issue, even without iflag=fullblock.
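To illustrate the recalculation (a sketch only; any combination works as long as each read stays under the /dev/urandom per-read limit described in the other answer, just under 32 MiB, and bs × count comes to 1 GiB):

dd if=/dev/urandom of=output bs=1M  count=1024   # 1024 x 1 MiB = 1 GiB
dd if=/dev/urandom of=output bs=16M count=64     # 64 x 16 MiB  = 1 GiB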
In my Kubuntu I would fix your command like this:
dd if=/dev/urandom of=output bs=1M count=1024 iflag=fullblock
answered Dec 27 at 11:29 by Kamil Maciorowski (edited Dec 27 at 11:34)