Bug 1170376 - Enable -moutline-atomics flag for GCC in Tumbleweed
Enable -moutline-atomics flag for GCC in Tumbleweed
Status: RESOLVED UPSTREAM
Classification: openSUSE
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Other
Current
aarch64 Linux
: P5 - None : Normal (vote)
: ---
Assigned To: Aaron Puchert
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2020-04-23 19:03 UTC by Guillaume GARDET
Modified: 2020-05-04 09:51 UTC (History)
6 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Guillaume GARDET 2020-04-23 19:03:40 UTC
GCC 10 and GCC 9.3.1+ support '-moutline-atomics' [0], so we could enable this flag in Tumbleweed to improve LSE atomic operations on armv8.1+ hardware, with very small overhead for armv8.0 hardware.

The problem is llvm does not support this flag and returns an error if used:
  clang-10.0: error: unknown argument: '-moutline-atomics'

The problem is optflags in Tumbleweed PrjConf targets both GCC and LLVM, so adding '-moutline-atomics' would break LLVM builds.
I can see few solutions:
1) Enable '-moutline-atomics' by default in GCC packages [1]
2) Add '-moutline-atomics' to the list of ignored flags in LLVM
3) Split optflags to optflags_common, optflags_gcc and optflags_llvm

[0]: https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options
[1]: https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01035.html
Comment 1 Dirk Mueller 2020-04-23 20:37:19 UTC
I have difficulties imaging 3) to work, because we'd have to patch everything to use optflags_llvm if and only if clang is called, which is very intensive. 

the patch in 1) didn't land upstream, right? so thats also not nice. 

It seems like 2) is the only option.
Comment 2 Aaron Puchert 2020-04-23 21:52:12 UTC
(In reply to Guillaume GARDET from comment #0)
> 2) Add '-moutline-atomics' to the list of ignored flags in LLVM

(In reply to Dirk Mueller from comment #1)
> It seems like 2) is the only option.

Wouldn't be the first time we're doing this so it's fine for me. [1]

Another option would be to remove it manually in the affected packages, I think there are few packages that build with Clang. But it's probably not a bad idea to go with 2) either way so that Clang supports our default flags.

Silly question: why can this behavior not be achieved with -mtune=armv8.1-a?

[1] https://build.opensuse.org/package/view_file/devel:tools:compiler/llvm10/clang-ignore-stack-clash-protector.patch
Comment 3 Richard Biener 2020-04-24 06:50:59 UTC
My comment on the clang "issue" is that packages should be built with GCC, not clang.

Now, I would appreciate upstream GCC to either make the default of
-moutline-atomics configurable at build time (--with-outline-atomics?)
or as suggested based on tuning (while we have explicit default arch/tuning
for 32bit arm we leave things as default for aarch64, no --with-arch
or --with-tune is done at configure time).  We can of course patch our GCC package as well.

Note that any change of GCCs _default_ behavior also affects users and
would _differ_ from default upstream behavior which might be confusing to
them.  I'm also not sure whether -march=armv8.1-a -moutline-atomics will
outline atomics or use inline LSE which users might expect.

--> So IMHO using RPM_OPT_FLAGS is the way to go.

Or simply adjust selected packages where it matters at all and where we
know GCC is used for compiling.
Comment 4 Guillaume GARDET 2020-04-24 06:58:56 UTC
(In reply to Aaron Puchert from comment #2)
> (In reply to Guillaume GARDET from comment #0)
> > 2) Add '-moutline-atomics' to the list of ignored flags in LLVM
> 
> (In reply to Dirk Mueller from comment #1)
> > It seems like 2) is the only option.
> 
> Wouldn't be the first time we're doing this so it's fine for me. [1]
> 
> Another option would be to remove it manually in the affected packages, I
> think there are few packages that build with Clang. But it's probably not a
> bad idea to go with 2) either way so that Clang supports our default flags.

So, going with 2) seems to be ok for everyone!

> Silly question: why can this behavior not be achieved with -mtune=armv8.1-a?

AFAIK, '-mtune=armv8.1-a' does not exist, and '-march=armv8.1-a' would break all armv8.0 hardware as it will use instructions only available on v8.1, including LSE atomics.
'-moutline-atomics' will make use of new LSE instructions only if hardware support it, and will fallback to legacy instructions if hardware does not support the new ones.

@Aaron, would you take care of the patches or should I prepare a SR?
Comment 5 Aaron Puchert 2020-04-24 21:21:03 UTC
(In reply to Richard Biener from comment #3)
> Now, I would appreciate upstream GCC to either make the default of
> -moutline-atomics configurable at build time (--with-outline-atomics?)
> or as suggested based on tuning (while we have explicit default arch/tuning
> for 32bit arm we leave things as default for aarch64, no --with-arch
> or --with-tune is done at configure time).  We can of course patch our GCC
> package as well.
So unless upstream makes it the default, we have to either differ from upstream in the defaults or build packages with non-default flags. Not sure which one is the lesser evil. Upstream didn't go into much detail about not making this the default (https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01104.html). Given our reason to enable it (comment 0), what would be reasons not to?

And if we think it's fine as package default, maybe a user might want this as default for compiling their own stuff as well?
 
> I'm also not sure whether -march=armv8.1-a -moutline-atomics will
> outline atomics or use inline LSE which users might expect.
According to https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html, the option is not "applicable" then:

    This option is only applicable when compiling for the base ARMv8.0
    instruction set. If using a later revision, e.g. -march=armv8.1-a or
    -march=armv8-a+lse, the ARMv8.1-Atomics instructions will be used directly.

Not sure what this means.

> Or simply adjust selected packages where it matters at all and where we
> know GCC is used for compiling.
Yeah, would be interesting to know for which packages this makes a noticeable difference. Atomics aren't used very widely, are they? (Apart from futexes of courses, but for that compiling pthreads with the flag should be enough, unless I'm missing something.)

(In reply to Guillaume GARDET from comment #4)
> AFAIK, '-mtune=armv8.1-a' does not exist, and '-march=armv8.1-a' would break
> all armv8.0 hardware as it will use instructions only available on v8.1,
> including LSE atomics.
Sure, I get that -march doesn't do it, but why not use the -mtune mechanism instead of inventing a new flag? Because -mtune doesn't usually insert fallbacks? This is not a rhetorical question, I'm genuinely curious.

> @Aaron, would you take care of the patches or should I prepare a SR?
No problem, I can do that. Just want to explore a bit what the options are.
Comment 6 Michael Matz 2020-04-24 21:53:22 UTC
(In reply to Aaron Puchert from comment #5)
> So unless upstream makes it the default, we have to either differ from
> upstream in the defaults or build packages with non-default flags. Not sure
> which one is the lesser evil. Upstream didn't go into much detail about not
> making this the default
> (https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01104.html). Given our
> reason to enable it (comment 0), what would be reasons not to?

There aren't really.  I would argue for enabling it upstream, and we should
work on making that so.

> And if we think it's fine as package default, maybe a user might want this
> as default for compiling their own stuff as well?

Right, but that's only the second best solution, with upstream being the better 
one.

> > I'm also not sure whether -march=armv8.1-a -moutline-atomics will
> > outline atomics or use inline LSE which users might expect.
> According to https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html, the
> option is not "applicable" then:
> 
>     This option is only applicable when compiling for the base ARMv8.0
>     instruction set. If using a later revision, e.g. -march=armv8.1-a or
>     -march=armv8-a+lse, the ARMv8.1-Atomics instructions will be used
> directly.
> 
> Not sure what this means.

The code sequences that would make -moutline-atomics have any effect simply aren't used with -march=armv8.1-a or later.  I don't know if using the option
nevertheless leads to an error (I would hope not), or is silently ignored.

> (In reply to Guillaume GARDET from comment #4)
> > AFAIK, '-mtune=armv8.1-a' does not exist, and '-march=armv8.1-a' would break
> > all armv8.0 hardware as it will use instructions only available on v8.1,
> > including LSE atomics.
> Sure, I get that -march doesn't do it, but why not use the -mtune mechanism
> instead of inventing a new flag? Because -mtune doesn't usually insert
> fallbacks? This is not a rhetorical question, I'm genuinely curious.

You are right that in principle generating code for v8.0 (march) but tuning for v8.1 (mtune) could enable -moutline-atomics.  It's just not designed that way
right now.  march is for generating code that can require at least that
architecture level, while mtune is for using code sequences that are preferred
on that architecture level, which still within the constraints of the march
level.  That is exactly what moutline-atomics is about: generating v8.0
compatible code that works magically faster on v8.1.

But there's a slight difference to traditional mtune behaviour: in this case
there are calls to external routines inserted, not just some slight preferences in using (say)  shl x,1 vs. an add x,x instructions.

In any case: I'd pretty much like for upstream to make this default.  I think
really anyone would want it.
Comment 7 Guillaume GARDET 2020-04-30 15:31:30 UTC
GCC 10.1 will enbale '-moutline-atomics' by default.

Commit: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=cd4b68527988f42c10c0d6c10e812d299887e0c2
Comment 8 Aaron Puchert 2020-04-30 20:16:46 UTC
(In reply to Michael Matz from comment #6)
> But there's a slight difference to traditional mtune behaviour: in this case
> there are calls to external routines inserted, not just some slight
> preferences in using (say)  shl x,1 vs. an add x,x instructions.
That makes sense indeed, it's a bit different.

(In reply to Guillaume GARDET from comment #7)
> GCC 10.1 will enbale '-moutline-atomics' by default.
So we don't need to explicitly enable the flag, and I don't need to patch Clang to ignore it, right?
Comment 9 Guillaume GARDET 2020-05-04 08:28:09 UTC
> (In reply to Guillaume GARDET from comment #7)
> > GCC 10.1 will enbale '-moutline-atomics' by default.
> So we don't need to explicitly enable the flag, and I don't need to patch
> Clang to ignore it, right?

GCC10 should land in Tumbleweed as default GCC in few weeks from now, so no need to patch anything for Tumbleweed. We could revisit this if we want to enable '-moutline-atomics' for SLE or Leap, later.
Comment 10 Dirk Mueller 2020-05-04 09:51:13 UTC
fixed upstream by flipping the default in GCC