Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?

In the x86-64 Tour of Intel Manuals, I read

Perhaps the most surprising fact is that an instruction such as MOV EAX, EBX automatically zeroes upper 32 bits of RAX register.

The Intel documentation (3.4.1.1 General-Purpose Registers in 64-Bit Mode in manual Basic Architecture) quoted at the same source tells us:

64-bit operands generate a 64-bit result in the destination general-purpose register.

32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.

8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64 bits.

In x86-32 and x86-64 assembly, 16 bit instructions such as

mov ax, bx

don't show this kind of "strange" behaviour that the upper word of eax is zeroed.
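
As a rough illustration of the rules quoted above, here is a minimal Python sketch that models a general-purpose register as a 64-bit integer. The helper name write_reg is made up for this example and is not part of any real tool, and the 8-bit case models only a low-byte write (AL-style), not AH.

```python
MASK64 = (1 << 64) - 1

def write_reg(old, value, width):
    """Model writing `value` to the low `width` bits of a 64-bit register.

    Hypothetical helper for illustration only. Returns the new 64-bit value.
    """
    if width not in (8, 16, 32, 64):
        raise ValueError("width must be 8, 16, 32 or 64")
    low_mask = (1 << width) - 1
    result = value & low_mask
    if width in (8, 16):
        # 8- and 16-bit writes merge: the old upper bits survive.
        return (old & MASK64 & ~low_mask) | result
    # 32-bit writes zero-extend; 64-bit writes replace the whole register.
    return result

rax = 0x1122334455667788
print(hex(write_reg(rax, 0xAABBCCDD, 32)))  # like mov eax, imm32
print(hex(write_reg(rax, 0xCCDD, 16)))      # like mov ax, imm16
print(hex(write_reg(rax, 0xDD, 8)))         # like mov al, imm8
```

Only the 32-bit case discards the old upper half; the 8- and 16-bit cases need the old value to build the result.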

Thus: why was this behaviour introduced? At first glance it seems illogical (but the reason might be that I am used to the quirks of x86-32 assembly).

edited Aug 1 '18 at 16:49

asked Jun 24 '12 at 11:40

Nubok


16

If you Google for "Partial register stall", you'll find quite a bit of information about the problem they were (almost certainly) trying to avoid.

– Jerry Coffin
Jun 24 '12 at 14:38

3

stackoverflow.com/questions/25455447/…

– Hans Passant
Aug 27 '15 at 7:16

2

Not just "most". AFAIK, all instructions with an r32 destination operand zero the high 32, rather than merging. For example, some assemblers will replace pmovmskb r64, xmm with pmovmskb r32, xmm, saving a REX, because the 64bit destination version behaves identically. Even though the Operation section of the manual lists all 6 combinations of 32/64bit dest and 64/128/256b source separately, the implicit zero-extension of the r32 form duplicates the explicit zero-extension of the r64 form. I'm curious about the HW implementation...

– Peter Cordes
May 26 '16 at 23:38

2

@HansPassant, the circular reference begins.

– kchoi
Jul 15 '16 at 23:26

1

Related: xor eax,eax or xor r8d,r8d is the best way to zero RAX or R8 (saving a REX prefix for RAX, and 64-bit XOR isn't even handled specially on Silvermont). Related: How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent

– Peter Cordes
Nov 21 '17 at 23:44
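
To make the REX-prefix saving concrete, the following sketch hand-assembles xor reg, reg. The function encode_xor_same is a hypothetical helper written for this note. It covers only the register-to-register form with opcode 31 /r and is an illustration, not an assembler.

```python
def encode_xor_same(reg, wide):
    """Encode `xor reg, reg` for register number 0..15 (0 = (R|E)AX, 8 = R8).

    Hypothetical helper for illustration; `wide` selects 64-bit operand size.
    """
    if not 0 <= reg <= 15:
        raise ValueError("register number must be 0..15")
    rex = 0x40
    if wide:
        rex |= 0x08          # REX.W: 64-bit operand size
    if reg >= 8:
        rex |= 0x04 | 0x01   # REX.R and REX.B: extend both ModRM register fields
    modrm = 0xC0 | ((reg & 7) << 3) | (reg & 7)  # mod=11, reg, rm
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x31, modrm])

print(encode_xor_same(0, wide=False).hex())  # xor eax, eax
print(encode_xor_same(0, wide=True).hex())   # xor rax, rax
print(encode_xor_same(8, wide=False).hex())  # xor r8d, r8d
```

The 32-bit form on EAX needs no prefix at all, while R8D needs a REX prefix either way just to name the register.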

assembly x86 x86-64 cpu-registers zero-extension


2 Answers

I'm not AMD or speaking for them, but I would have done it the same way, because zeroing the high half doesn't create a dependency on the previous value that the CPU would have to wait on. The register renaming mechanism would essentially be defeated if it wasn't done that way. This way you can write fast 32-bit code in 64-bit mode without having to explicitly break dependencies all the time. Without this behaviour, every instruction that writes a 32-bit register in 64-bit mode would have to wait on whatever last wrote that register, even though the high part would almost never be used.

The behaviour for 16bit instructions is the strange one. The dependency madness is one of the reasons that 16bit instructions are avoided now.
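
A toy model may make the dependency argument easier to see. It assumes a mov-style instruction that only writes its destination, and it lists which full registers must be read to produce the new 64-bit value. The function input_registers is hypothetical, and the model describes only the architectural rule, not how any particular CPU implements renaming.

```python
def input_registers(dest, dest_width, *sources):
    """Return the set of full registers a `mov`-style write has to read.

    Hypothetical helper: `dest` and `sources` are full-register names
    such as "rax"; `dest_width` is the operand size in bits.
    """
    deps = set(sources)
    if dest_width in (8, 16):
        # A merging write keeps the old upper bits, so the previous
        # value of the destination becomes an extra input.
        deps.add(dest)
    # 32- and 64-bit writes define all 64 bits, so the old value is not needed.
    return deps

print(sorted(input_registers("rax", 32, "rbx")))  # mov eax, ebx
print(sorted(input_registers("rax", 16, "rbx")))  # mov ax, bx
```

In this model mov eax, ebx depends only on rbx, whereas mov ax, bx also depends on the old rax.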

edited Jun 24 '12 at 12:03

answered Jun 24 '12 at 11:53

harold


7

I don't think it's strange, I think they didn't want to break too much and kept the old behavior there.

– Alexey Frunze
Jun 24 '12 at 11:56

4

@Alex when they introduced 32-bit mode, there was no old behaviour for the high part. There was no high part before. Of course after that it couldn't be changed anymore.

– harold
Jun 24 '12 at 11:59

1

I was speaking about 16-bit operands, why the top bits don't get zeroed in that case. They don't in non-64-bit modes. And that's kept in 64-bit mode too.

– Alexey Frunze
Jun 24 '12 at 12:04

3

I interpreted your "The behaviour for 16bit instructions is the strange one" as "it's strange that zero-extension doesn't happen with 16-bit operands in 64-bit mode". Hence my comments about keeping it the same way in 64-bit mode for better compatibility.

– Alexey Frunze
Jun 24 '12 at 12:09

8

@Alex oh I see. Ok. I don't think it's strange from that perspective. Just from a "looking back, maybe it wasn't such a good idea"-perspective. Guess I should have been clearer :)

– harold
Jun 24 '12 at 12:12


It simply saves space in the instructions, and the instruction set. You can move small immediate values to a 64-bit register by using existing (32-bit) instructions.

It also saves you from having to encode 8 byte values for MOV RAX, 42, when MOV EAX, 42 can be reused.

This optimization is not as important for 8 and 16 bit ops (because they are smaller), and changing the rules there would also break old code.
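
For a sense of the sizes involved, this sketch builds the machine-code bytes for loading 42 into RAX in three ways. The function names are invented for this example and only EAX/RAX are handled. Besides the 8-byte-immediate form mentioned above, x86-64 also has a form with a sign-extended 32-bit immediate, which is included for comparison.

```python
import struct

def mov_eax_imm32(imm):
    # B8+rd id: MOV r32, imm32 (rd = 0 selects EAX); upper half of RAX is zeroed.
    return b"\xB8" + struct.pack("<I", imm & 0xFFFFFFFF)

def mov_rax_imm64(imm):
    # REX.W B8+rd io: MOV r64, imm64 with a full 8-byte immediate.
    return b"\x48\xB8" + struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF)

def mov_rax_simm32(imm):
    # REX.W C7 /0 id: MOV r/m64, imm32 sign-extended to 64 bits.
    return b"\x48\xC7\xC0" + struct.pack("<i", imm)

for name, code in [
    ("mov eax, 42", mov_eax_imm32(42)),
    ("mov rax, 42 (imm64)", mov_rax_imm64(42)),
    ("mov rax, 42 (sign-extended imm32)", mov_rax_simm32(42)),
]:
    print(name, code.hex(), len(code), "bytes")
```

The 32-bit form is the shortest of the three, and it gives the same value in RAX only because the write is zero-extended.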

answered Jun 24 '12 at 11:50

Bo Persson


6

If that's correct, wouldn't it have made more sense for it to sign-extend rather than 0 extend?

– Damien_The_Unbeliever
Jun 24 '12 at 11:54

2

@Alex: And sign-extension isn't? Both can be done very cheaply in hardware.

– jalf
Jun 24 '12 at 11:59

2

@Alex: no it's not. It would be a bit slower if done in software, sure, but in hardware it would, at worst, cost a few more transistors, which on a chip the size and complexity of a modern CPU is really not an issue.

– jalf
Jun 24 '12 at 14:03

14

Sign extension is slower, even in hardware. Zero extension can be done in parallel with whatever computation produces the lower half, but sign extension can't be done until (at least the sign of) the lower half has been computed.

– Jerry Coffin
Jun 24 '12 at 14:26
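
The difference between the two extensions can be shown at the value level with a short sketch. The function names are made up for this example, and the code says nothing about real circuit timing. It only shows that the zero-extended upper half is a constant, while the sign-extended upper half depends on bit 31 of the computed low half.

```python
def zero_extend_32_to_64(low):
    # Upper 32 bits are always 0, whatever the low half turns out to be.
    return low & 0xFFFFFFFF

def sign_extend_32_to_64(low):
    low &= 0xFFFFFFFF
    sign = low >> 31                   # must know bit 31 of the result first
    upper = 0xFFFFFFFF if sign else 0  # upper half is a copy of the sign bit
    return (upper << 32) | low

for value in (0x7FFFFFFF, 0x80000000):
    print(hex(value), hex(zero_extend_32_to_64(value)), hex(sign_extend_32_to_64(value)))
```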

10

Another related trick is to use XOR EAX, EAX because XOR RAX, RAX would need an REX prefix.

– Neil
Oct 2 '13 at 9:12
