ARMv8 floating point output inline assembly

For adding two integers, I write:

int sum;

asm volatile("add %0, x3, x4" : "=r"(sum) : :);

How can I do this with two floats?
I tried:

float sum;

asm volatile("fadd %0, s3, s4" : "=r"(sum) : :);

But it gives me an error:

Error: operand 1 should be a SIMD vector register -- `fadd x0,s3,s4'

Any ideas?

edited Dec 31 '18 at 0:54

Peter Cordes


asked Dec 28 '18 at 14:41

今天春天


gcc floating-point arm inline-assembly arm64


2 Answers

Because each FP/SIMD register has several names in AArch64 (v0, b0, h0, s0 and d0 all refer to the same register), you need to add an operand modifier in the asm template string to select which name gets printed:

On Godbolt

float foo()
{
    float sum;
    asm volatile("fadd %s0, s3, s4" : "=w"(sum) : :);
    return sum;
}

double dsum()
{
    double sum;
    asm volatile("fadd %d0, d3, d4" : "=w"(sum) : :);
    return sum;
}

Will produce:

foo:
        fadd s0, s3, s4 // sum
        ret
dsum:
        fadd d0, d3, d4 // sum
        ret
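As a hedged sketch (not part of the answer above), the same %s and %d modifiers can be combined with input operands so that the compiler picks all three registers. The function names add_f32 and add_f64 are made up for illustration, the non-AArch64 branch is only a portable fallback so the snippet builds on other targets, and the asm is not marked volatile on the assumption that the result depends only on the inputs:

```c
/* Hypothetical helpers: scalar FP add via AArch64 inline asm.
   __asm__ is used instead of asm so this also builds with -std=c99. */
float add_f32(float a, float b)
{
#if defined(__aarch64__)
    float sum;
    /* "w" asks for an FP/SIMD register; %s prints its 32-bit name (s0..s31). */
    __asm__("fadd %s0, %s1, %s2" : "=w"(sum) : "w"(a), "w"(b));
    return sum;
#else
    return a + b; /* fallback for non-AArch64 targets, no asm involved */
#endif
}

double add_f64(double a, double b)
{
#if defined(__aarch64__)
    double sum;
    /* %d prints the 64-bit name (d0..d31) of the same register class. */
    __asm__("fadd %d0, %d1, %d2" : "=w"(sum) : "w"(a), "w"(b));
    return sum;
#else
    return a + b; /* fallback for non-AArch64 targets, no asm involved */
#endif
}
```

Called as add_f32(x, y), this no longer depends on whatever happens to be in s3 and s4.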

answered Jan 3 at 23:30

James Greenhalgh


Nice, I figured there was probably a modifier while writing my answer, but I didn't see it in the GCC manual. Is this documented anywhere? I only know of the modifiers for x86 registers being in the gcc manual, at the bottom of the Extended asm section: gcc.gnu.org/onlinedocs/gcc/…

– Peter Cordes
Jan 4 at 2:49

@PeterCordes The problem is that the gcc people are reluctant to document these things. If they're documented, then they're not allowed to change them (which they probably don't do much anyway). One could argue that there are already enough people using them that changing them would cause wide-spread consternation, but that has not yet proven to be a good enough argument to overcome the inertia (but feel free to try!). x86 got doc'ed cuz I was changing the asm docs and added them, and no one felt strongly enough about it to argue with me. I only did x86 cuz that's what I know. Sorry.

– David Wohlferd
Jan 4 at 3:24


As an aside to James and OP: Be aware that since these modifiers AREN'T doc'ed, using them is unsupported. As with any undocumented feature, gcc can change them at any time. They probably won't, but they can.

– David Wohlferd
Jan 4 at 3:26


If you wanted to document the subset of modifiers that we shouldn’t change, I’ll approve the patch on list. Some (these, %w0 for printing the w rather than x name) should just be documented and fixed. Especially where behavior is consistent between GCC and Clang. To my great shame, the best documentation we have is at github.com/gcc-mirror/gcc/blob/master/gcc/config/aarch64/…

– James Greenhalgh
Jan 4 at 13:34

@PeterCordes - Is there any way you can take James up on his offer here? I'd love to help, but I know almost zero about arm, so selecting the subset is beyond what I can offer. I could help with the texinfo, but that's probably the least challenging part of this.

– David Wohlferd
Jan 4 at 23:40
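The %w0 modifier mentioned in the comments can be sketched for the integer case from the question. This is an illustration under assumptions, not something posted in the thread: add_i32 and add_i64 are hypothetical names, and the non-AArch64 branch is only a portable fallback so the snippet builds elsewhere.

```c
/* Hypothetical helpers: integer add via AArch64 inline asm. */
int add_i32(int a, int b)
{
#if defined(__aarch64__)
    int sum;
    /* %w prints the 32-bit name (w0..w30) of a general-purpose register. */
    __asm__("add %w0, %w1, %w2" : "=r"(sum) : "r"(a), "r"(b));
    return sum;
#else
    return a + b; /* fallback for non-AArch64 targets, no asm involved */
#endif
}

long long add_i64(long long a, long long b)
{
#if defined(__aarch64__)
    long long sum;
    /* Without a modifier, an "r" operand is printed with its x name. */
    __asm__("add %0, %1, %2" : "=r"(sum) : "r"(a), "r"(b));
    return sum;
#else
    return a + b; /* fallback for non-AArch64 targets, no asm involved */
#endif
}
```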


"=r" is the constraint for GP integer registers.

The GCC manual claims that "=w" is the constraint for an FP / SIMD register on AArch64. But if you try that with a plain %0, you get v0 rather than s0, which won't assemble. I don't know a workaround here; you should probably report on the gcc bugzilla that the constraint documented in the manual doesn't work for scalar FP.

On Godbolt I tried this source:

float foo()
{
    float sum;
#ifdef __aarch64__
    asm volatile("fadd %0, s3, s4" : "=w"(sum) : :);   // AArch64
#else
    asm volatile("fadds %0, s3, s4" : "=t"(sum) : :);  // ARM32
#endif
    return sum;
}

double dsum()
{
    double sum;
#ifdef __aarch64__
    asm volatile("fadd %0, d3, d4" : "=w"(sum) : :);   // AArch64
#else
    asm volatile("faddd %0, d3, d4" : "=w"(sum) : :);  // ARM32
#endif
    return sum;
}

clang 7.0 (with its built-in assembler) requires the asm to actually be valid. For gcc, though, we're only compiling to asm, and Godbolt doesn't have a "binary mode" for non-x86 targets.

# AArch64 gcc 8.2  -xc -O3 -fverbose-asm -Wall
# INVALID ASM, errors if you try to actually assemble it.
foo:
    fadd v0, s3, s4 // sum
    ret
dsum:
    fadd v0, d3, d4 // sum
    ret

clang produces the same asm, and its built-in assembler errors with:

<source>:5:18: error: invalid operand for instruction
    asm volatile("fadd %0, s3, s4" : "=w"(sum) : :);
                 ^
<inline asm>:1:11: note: instantiated into assembly here
        fadd v0, s3, s4
             ^

On 32-bit ARM, "=t" for single works, but "=w" for double (which the manual says you should use for double-precision) also gives you s0 with gcc. It works with clang, though. You have to use -mfloat-abi=hard and a -mcpu= setting for a core with an FPU, e.g. -mcpu=cortex-a15.

# clang7.0 -xc -O3 -Wall --target=arm -mcpu=cortex-a15 -mfloat-abi=hard
# valid asm for ARM 32
foo:
        vadd.f32        s0, s3, s4
        bx      lr
dsum:
        vadd.f64        d0, d3, d4
        bx      lr

But gcc fails:

# ARM gcc 8.2  -xc -O3 -fverbose-asm -Wall -mfloat-abi=hard -mcpu=cortex-a15
foo:
        fadds s0, s3, s4        @ sum
        bx      lr  @
dsum:
        faddd s0, d3, d4        @ sum    @@@ INVALID
        bx      lr  @

So you can use "=t" for single just fine with gcc, but for double you presumably need a %something0 modifier to print the register name as d0 instead of s0 when using a "=w" output.

Obviously these asm statements would only be useful for anything beyond learning the syntax if you add constraints to specify the input operands as well, instead of reading whatever happened to be sitting in s3 and s4.
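The point about input operands can be sketched for the 32-bit ARM single-precision case, where "t" worked in the test above. This is a minimal illustration under assumptions, not tested output from the answer: add_f32_arm32 is a hypothetical name, the __ARM_PCS_VFP check assumes a hard-float build like the -mfloat-abi=hard one used here, and every other target (including AArch64, which needs the %s-style modifiers from the other answer) takes a plain C fallback.

```c
/* Hypothetical helper: single-precision add with register inputs on ARM32. */
float add_f32_arm32(float a, float b)
{
#if defined(__arm__) && defined(__ARM_PCS_VFP)
    float sum;
    /* "t" asks for a VFP single-precision register (s0..s31), so the
       compiler chooses the registers and places a and b in them for us. */
    __asm__("vadd.f32 %0, %1, %2" : "=t"(sum) : "t"(a), "t"(b));
    return sum;
#else
    return a + b; /* fallback for other targets, no asm involved */
#endif
}
```

With the inputs listed as operands, the result no longer depends on what was left in s3 and s4, and the compiler is free to pick any suitable registers.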

See also https://stackoverflow.com/tags/inline-assembly/info

edited Dec 31 '18 at 0:25

answered Dec 30 '18 at 5:56

Peter Cordes


@DavidWohlferd: I meant that without input constraints, this toy inline asm is only useful for learning the syntax, not doing anything useful. To go beyond that, you need input constraints like "w" (input1).

– Peter Cordes
Dec 30 '18 at 23:52

@DavidWohlferd: I think a valid parsing of my sentence is that change is required for the statements to be useful for anything other than learning/testing the syntax. In my last edit I changed the wording of the rest of the sentence to try to make that clearer, but feel free to edit if you're still convinced it's confusing. Probably you aren't the only one that parsed it differently from how I intended.

– Peter Cordes
Dec 31 '18 at 5:20
