Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?












88















In the x86-64 Tour of Intel Manuals, I read:

"Perhaps the most surprising fact is that an instruction such as MOV EAX, EBX automatically zeroes upper 32 bits of RAX register."

The Intel documentation (3.4.1.1 General-Purpose Registers in 64-Bit Mode in manual Basic Architecture) quoted at the same source tells us:

  • 64-bit operands generate a 64-bit result in the destination general-purpose register.

  • 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.

  • 8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64-bits.

In x86-32 and x86-64 assembly, 16-bit instructions such as

mov ax, bx

don't show this kind of "strange" behaviour: the upper word of eax is left unchanged, not zeroed.

Thus: what is the reason why this behaviour was introduced? At first glance it seems illogical (but the reason might be that I am used to the quirks of x86-32 assembly).
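To make the three quoted rules concrete, here is a small Python model of a register write. It is only a sketch of the documented behaviour, not anything that runs on the hardware: the name write_gpr and the sample values are made up for illustration, and the high-byte registers (AH and friends) are deliberately not modelled.

```python
MASK64 = (1 << 64) - 1

def write_gpr(old: int, result: int, width: int) -> int:
    """Return the new 64-bit register value after a write of `width` bits.

    old:    previous 64-bit contents of the destination register
    result: value produced by the instruction (truncated to `width` bits)
    width:  operand size in bits: 8, 16, 32 or 64
    """
    if width not in (8, 16, 32, 64):
        raise ValueError("width must be 8, 16, 32 or 64")
    low_mask = (1 << width) - 1
    result &= low_mask
    if width >= 32:
        # 64-bit: the full result is written.
        # 32-bit: the result is zero-extended, the old upper half is discarded.
        return result
    # 8-bit and 16-bit: the old upper bits are kept and merged with the result.
    return (old & ~low_mask & MASK64) | result

rax = 0x1122334455667788
rbx = 0xAAAABBBBCCCCDDDD

after_mov_eax_ebx = write_gpr(rax, rbx, 32)  # models: mov eax, ebx
after_mov_ax_bx = write_gpr(rax, rbx, 16)    # models: mov ax, bx

print(hex(after_mov_eax_ebx))  # 0xccccdddd
print(hex(after_mov_ax_bx))    # 0x112233445566dddd
```

Note that only the 8/16-bit branch reads `old` at all; the 32-bit and 64-bit branches never look at the previous register contents.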










  • 16
    If you Google for "Partial register stall", you'll find quite a bit of information about the problem they were (almost certainly) trying to avoid.
    – Jerry Coffin
    Jun 24 '12 at 14:38

  • 3
    stackoverflow.com/questions/25455447/…
    – Hans Passant
    Aug 27 '15 at 7:16

  • 2
    Not just "most". AFAIK, all instructions with an r32 destination operand zero the high 32, rather than merging. For example, some assemblers will replace pmovmskb r64, xmm with pmovmskb r32, xmm, saving a REX, because the 64-bit destination version behaves identically. Even though the Operation section of the manual lists all 6 combinations of 32/64-bit dest and 64/128/256-bit source separately, the implicit zero-extension of the r32 form duplicates the explicit zero-extension of the r64 form. I'm curious about the HW implementation...
    – Peter Cordes
    May 26 '16 at 23:38

  • 2
    @HansPassant, the circular reference begins.
    – kchoi
    Jul 15 '16 at 23:26

  • 1
    Related: xor eax,eax or xor r8d,r8d is the best way to zero RAX or R8 (saving a REX prefix for RAX, and 64-bit XOR isn't even handled specially on Silvermont). Related: How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent
    – Peter Cordes
    Nov 21 '17 at 23:44
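The REX-prefix saving mentioned in the comments can be checked by encoding the instructions by hand. The following Python sketch builds the bytes for a register-to-register XOR (opcode 31 /r); the helper name encode_xor_reg_reg is hypothetical, and it only covers the 32-bit and 64-bit register forms, nothing else.

```python
def encode_xor_reg_reg(dst: int, src: int, wide: bool) -> bytes:
    """Encode `xor dst, src` (opcode 31 /r) for register numbers 0..15.

    wide=False selects 32-bit operand size, wide=True selects 64-bit (REX.W).
    """
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be in 0..15")
    # REX = 0100WRXB: W = 64-bit operand size, R extends ModRM.reg (src),
    # B extends ModRM.rm (dst). X is unused for register operands.
    rex = 0x40 | (int(wide) << 3) | ((src >> 3) << 2) | (dst >> 3)
    modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)
    # For 32/64-bit register operands a REX with no bits set can be omitted.
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x31, modrm])

RAX, R8 = 0, 8
forms = [
    ("xor eax, eax", encode_xor_reg_reg(RAX, RAX, wide=False)),
    ("xor rax, rax", encode_xor_reg_reg(RAX, RAX, wide=True)),
    ("xor r8d, r8d", encode_xor_reg_reg(R8, R8, wide=False)),
    ("xor r8, r8", encode_xor_reg_reg(R8, R8, wide=True)),
]
for name, code in forms:
    print(name, " ".join(f"{b:02x}" for b in code), f"({len(code)} bytes)")
```

Only the EAX form drops to two bytes; R8D still needs a REX prefix to name the register, so there the 32-bit form is the same length as the 64-bit one.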


















assembly x86 x86-64 cpu-registers zero-extension






asked Jun 24 '12 at 11:40 by Nubok (edited Aug 1 '18 at 16:49)








2 Answers
72














I'm not AMD or speaking for them, but I would have done it the same way, because zeroing the high half doesn't create a dependency on the previous value that the CPU would have to wait on. The register renaming mechanism would essentially be defeated if it wasn't done that way. This way you can write fast 32-bit code in 64-bit mode without having to explicitly break dependencies all the time. Without this behaviour, every single 32-bit instruction in 64-bit mode would have to wait on whatever last wrote its destination register, even though that high part would almost never be used.

The behaviour for 16-bit instructions is the strange one. The dependency madness is one of the reasons that 16-bit instructions are avoided now.
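The dependency argument can be illustrated with a deliberately crude timing model. This Python sketch assumes an idealised out-of-order core (unlimited execution units, no other hazards) and a made-up workload of four independent 4-cycle operations that all write the same architectural register; the function name completion_cycles and the latencies are hypothetical. Real CPUs handle partial-register writes in various ways, so treat this as a toy, not a description of any actual pipeline.

```python
def completion_cycles(latencies, needs_old_value):
    """Cycle at which each write to one architectural register completes.

    latencies:       latency in cycles of each instruction, in program order
    needs_old_value: True if a write must merge with the register's previous
                     contents (the previous writer becomes an input);
                     False if it replaces the whole register, so renaming
                     lets it start immediately
    """
    done = []
    previous_done = 0
    for latency in latencies:
        # A merging write cannot start before the old value exists.
        start = previous_done if needs_old_value else 0
        previous_done = start + latency
        done.append(previous_done)
    return done

latencies = [4, 4, 4, 4]  # hypothetical: four unrelated 4-cycle operations

zero_extending = completion_cycles(latencies, needs_old_value=False)
merging = completion_cycles(latencies, needs_old_value=True)

print(zero_extending)  # [4, 4, 4, 4]
print(merging)         # [4, 8, 12, 16]
```

With full-register replacement the four operations overlap; with merging they form a serial chain even though none of them logically needs the earlier results.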






  • 7
    I don't think it's strange; I think they didn't want to break too much and kept the old behavior there.
    – Alexey Frunze
    Jun 24 '12 at 11:56

  • 4
    @Alex when they introduced 32-bit mode, there was no old behaviour for the high part. There was no high part before. Of course after that it couldn't be changed anymore.
    – harold
    Jun 24 '12 at 11:59

  • 1
    I was speaking about 16-bit operands, and why the top bits don't get zeroed in that case. They don't in non-64-bit modes. And that's kept in 64-bit mode too.
    – Alexey Frunze
    Jun 24 '12 at 12:04

  • 3
    I interpreted your "The behaviour for 16bit instructions is the strange one" as "it's strange that zero-extension doesn't happen with 16-bit operands in 64-bit mode". Hence my comments about keeping it the same way in 64-bit mode for better compatibility.
    – Alexey Frunze
    Jun 24 '12 at 12:09

  • 8
    @Alex oh I see. Ok. I don't think it's strange from that perspective. Just from a "looking back, maybe it wasn't such a good idea"-perspective. Guess I should have been clearer :)
    – harold
    Jun 24 '12 at 12:12



















8














It simply saves space in the instructions and in the instruction set. You can move small immediate values to a 64-bit register by using existing (32-bit) instructions.

It also saves you from having to encode 8-byte values for MOV RAX, 42, when MOV EAX, 42 can be reused.

This optimization is not as important for 8- and 16-bit ops (because they are smaller), and changing the rules there would also break old code.
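For a rough idea of the sizes involved, this Python sketch assembles three ways of putting 42 into RAX by hand. The function names are made up for illustration, only the RAX/EAX destination is handled, and which form a given assembler actually emits for MOV RAX, 42 is not something the sketch claims to know.

```python
import struct

def mov_eax_imm32(value: int) -> bytes:
    """mov eax, imm32 -> B8 id (the CPU zeroes the upper half of RAX)."""
    return b"\xB8" + struct.pack("<I", value)

def mov_rax_imm32_sign_extended(value: int) -> bytes:
    """mov rax, imm32 (sign-extended to 64 bits) -> REX.W C7 /0 id."""
    return b"\x48\xC7\xC0" + struct.pack("<i", value)

def mov_rax_imm64(value: int) -> bytes:
    """mov rax, imm64 -> REX.W B8 io (full 8-byte immediate)."""
    return b"\x48\xB8" + struct.pack("<Q", value)

for name, code in [
    ("mov eax, 42", mov_eax_imm32(42)),
    ("mov rax, 42 (imm32)", mov_rax_imm32_sign_extended(42)),
    ("mov rax, 42 (imm64)", mov_rax_imm64(42)),
]:
    print(name, " ".join(f"{b:02x}" for b in code), f"({len(code)} bytes)")
```

The three encodings come out at 5, 7 and 10 bytes, and because of the implicit zero-extension all three leave the same value in RAX.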






  • 6
    If that's correct, wouldn't it have made more sense for it to sign-extend rather than zero-extend?
    – Damien_The_Unbeliever
    Jun 24 '12 at 11:54

  • 2
    @Alex: And sign-extension isn't? Both can be done very cheaply in hardware.
    – jalf
    Jun 24 '12 at 11:59

  • 2
    @Alex: no it's not. It would be a bit slower if done in software, sure, but in hardware it would, at worst, cost a few more transistors, which is really not an issue on a chip the size and complexity of a modern CPU.
    – jalf
    Jun 24 '12 at 14:03

  • 14
    Sign extension is slower, even in hardware. Zero extension can be done in parallel with whatever computation produces the lower half, but sign extension can't be done until (at least the sign of) the lower half has been computed.
    – Jerry Coffin
    Jun 24 '12 at 14:26

  • 10
    Another related trick is to use XOR EAX, EAX because XOR RAX, RAX would need a REX prefix.
    – Neil
    Oct 2 '13 at 9:12
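The difference between the two kinds of extension discussed in these comments shows up even in a software model. In this Python sketch (function names are illustrative only), the zero-extended upper half is a constant, while the sign-extended upper half cannot be written down until bit 31 of the low half is known.

```python
def zero_extend_32_to_64(low: int) -> int:
    """Upper 32 bits are always 0, whatever `low` turns out to be."""
    return low & 0xFFFFFFFF

def sign_extend_32_to_64(low: int) -> int:
    """Upper 32 bits are copies of bit 31 of `low`, so they depend on it."""
    low &= 0xFFFFFFFF
    sign = (low >> 31) & 1
    upper = 0xFFFFFFFF if sign else 0
    return (upper << 32) | low

for value in (0x7FFFFFFF, 0x80000000):
    print(hex(value), hex(zero_extend_32_to_64(value)), hex(sign_extend_32_to_64(value)))
```

For 0x80000000 the two results differ only in the upper half: 0x80000000 versus 0xffffffff80000000.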











First answer (score 72): answered Jun 24 '12 at 11:53 by harold, edited Jun 24 '12 at 12:03








8












8








8







It simply saves space in the instructions, and the instruction set. You can move small immediate values to a 64-bit register by using existing (32-bit) instructions.



It also saves you from having to encode 8 byte values for MOV RAX, 42, when MOV EAX, 42 can be reused.



This optimization is not as important for 8 and 16 bit ops (because they are smaller), and changing the rules there would also break old code.






share|improve this answer













It simply saves space in the instructions, and the instruction set. You can move small immediate values to a 64-bit register by using existing (32-bit) instructions.



It also saves you from having to encode 8 byte values for MOV RAX, 42, when MOV EAX, 42 can be reused.



This optimization is not as important for 8 and 16 bit ops (because they are smaller), and changing the rules there would also break old code.







share|improve this answer












share|improve this answer



share|improve this answer










answered Jun 24 '12 at 11:50









Bo Persson

























