      std::array with aggregate initialization generates a large amount of code on g++

      Date: 2023-09-16
                Question

                On g++ 4.9.2 and 5.3.1, this code takes several seconds to compile and produces a 52,776-byte executable:

                #include <array>
                #include <iostream>
                
                int main()
                {
                    constexpr std::size_t size = 4096;
                
                    struct S
                    {
                        float f;
                        S() : f(0.0f) {}
                    };
                
                    std::array<S, size> a = {};  // <-- note aggregate initialization
                
                    for (auto& e : a)
                        std::cerr << e.f;
                
                    return 0;
                }
                

                Increasing size seems to increase compilation time and executable size linearly. I cannot reproduce this behaviour with either clang 3.5 or Visual C++ 2015. Using -Os makes no difference.

                $ time g++ -O2 -std=c++11 test.cpp
                real    0m4.178s
                user    0m4.060s
                sys     0m0.068s
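
One way to check the linear-growth observation is to make the element count a preprocessor macro and rebuild with different values. The sketch below is only an assumption about how one might set that up: the macro name ARRAY_SIZE and the helper function names are hypothetical and not part of the original test case, while the struct and the `= {}` initialization are the ones from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// ARRAY_SIZE is a hypothetical macro; override it on the command line,
// e.g. g++ -O2 -std=c++11 -DARRAY_SIZE=8192 -c test.cpp
#ifndef ARRAY_SIZE
#define ARRAY_SIZE 4096
#endif

struct S
{
    float f;
    S() : f(0.0f) {}  // user-provided constructor, as in the question
};

// Number of elements the array type holds for the chosen ARRAY_SIZE.
constexpr std::size_t element_count()
{
    return std::tuple_size<std::array<S, ARRAY_SIZE>>::value;
}

// Builds the array exactly as the question does and sums the elements
// so that the result can be checked by a caller.
float sum_elements()
{
    std::array<S, ARRAY_SIZE> a = {};  // aggregate initialization under test
    assert(a.size() == element_count());
    float sum = 0.0f;
    for (const auto& e : a)
        sum += e.f;
    return sum;
}
```

Timing the compile of this translation unit for a few values of -DARRAY_SIZE, and comparing the resulting object sizes, should show whether the growth is linear on a given compiler version.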
                

                Inspecting the assembly code reveals that the initialization of a is unrolled, generating 4096 movl instructions:

                main:
                .LFB1313:
                    .cfi_startproc
                    pushq   %rbx
                    .cfi_def_cfa_offset 16
                    .cfi_offset 3, -16
                    subq    $16384, %rsp
                    .cfi_def_cfa_offset 16400
                    movl    $0x00000000, (%rsp)
                    movl    $0x00000000, 4(%rsp)
                    movq    %rsp, %rbx
                    movl    $0x00000000, 8(%rsp)
                    movl    $0x00000000, 12(%rsp)
                    movl    $0x00000000, 16(%rsp)
                       [...skipping 4000 lines...]
                    movl    $0x00000000, 16376(%rsp)
                    movl    $0x00000000, 16380(%rsp)
                

                This only happens when the element type (S here) has a non-trivial constructor and the array is initialized using {}. If I do any of the following, g++ generates a simple loop:

                1. Remove S::S();
                2. Remove S::S() and initialize S::f in-class;
                3. Remove the aggregate initialization (= {});
                4. Compile without -O2.
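
For concreteness, variants 2 and 3 from the list above could look roughly like the following. This is a minimal sketch, not code from the question: the name S2 and the two function names are hypothetical, and the functions return a sum rather than printing so the result is easy to check.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t size = 4096;

// The struct from the question, with its user-provided constructor.
struct S
{
    float f;
    S() : f(0.0f) {}
};

// Variant 2: no user-provided constructor, member initialized in-class.
// "S2" is a hypothetical name used only for this sketch.
struct S2
{
    float f = 0.0f;
};

// Variant 2 in use: "= {}" is kept, but the element type has changed.
float sum_in_class_init()
{
    std::array<S2, size> a = {};
    assert(a.size() == size);
    float sum = 0.0f;
    for (const auto& e : a)
        sum += e.f;
    return sum;
}

// Variant 3: the original S, default-initialized without "= {}".
// S::S() still runs for every element, so each f is 0.0f.
float sum_no_aggregate_init()
{
    std::array<S, size> a;
    assert(a.size() == size);
    float sum = 0.0f;
    for (const auto& e : a)
        sum += e.f;
    return sum;
}
```

Both functions leave every element at 0.0f, so they are drop-in alternatives as far as the program's behaviour is concerned; only the code g++ emits for the initialization differs, according to the observations above.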

                I'm all for loop unrolling as an optimization, but I don't think this is a very good one. Before I report this as a bug, can someone confirm whether this is the expected behaviour?

                Recommended answer

                There appears to be a related bug report, Bug 59659 - large zero-initialized std::array compile time excessive. It was considered "fixed" for 4.9.0, so I consider this test case either a regression or an edge case not covered by the patch. For what it's worth, two of the bug report's test cases exhibit symptoms for me on both GCC 4.9.0 and 5.3.1.

                There are two more related bug reports:

                Bug 68203 - About infinite compilation time on struct with nested array of pairs with -std=c++11

                Andrew Pinski 2015-11-04 07:56:57 UTC

                This is most likely a memory hog which is generating lots of default constructors rather than a loop over them.

                That one claims to be a duplicate of this one:

                Bug 56671 - Gcc uses large amounts of memory and processor power with large C++11 bitsets

                Jonathan Wakely 2016-01-26 15:12:27 UTC

                Generating the array initialization for this constexpr constructor is the problem:

                  constexpr _Base_bitset(unsigned long long __val) noexcept
                  : _M_w{ _WordT(__val)
                   } { }
                

                Indeed, if we change it to S a[4096] {}; we don't get the problem.
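
As a minimal sketch of that change, the built-in array variant could be written as below. The struct is the one from the question; the function name is hypothetical, and it returns a sum instead of printing so the values can be checked.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t size = 4096;

// The struct from the question, unchanged.
struct S
{
    float f;
    S() : f(0.0f) {}
};

// Same data and same "{}" initializer, but a built-in array
// takes the place of std::array<S, size>.
float sum_builtin_array()
{
    S a[size] {};
    assert(sizeof(a) / sizeof(a[0]) == size);
    float sum = 0.0f;
    for (const auto& e : a)  // range-for also works on built-in arrays
        sum += e.f;
    return sum;
}
```

The trade-off is losing the std::array interface (size(), iterators, value semantics when copying), so this is more of a diagnostic than a general replacement.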

                Using perf we can see where GCC is spending most of its time. First:

                perf record g++ -std=c++11 -O2 test.cpp

                Then perf report:

                  10.33%  cc1plus   cc1plus                 [.] get_ref_base_and_extent
                   6.36%  cc1plus   cc1plus                 [.] memrefs_conflict_p
                   6.25%  cc1plus   cc1plus                 [.] vn_reference_lookup_2
                   6.16%  cc1plus   cc1plus                 [.] exp_equiv_p
                   5.99%  cc1plus   cc1plus                 [.] walk_non_aliased_vuses
                   5.02%  cc1plus   cc1plus                 [.] find_base_term
                   4.98%  cc1plus   cc1plus                 [.] invalidate
                   4.73%  cc1plus   cc1plus                 [.] write_dependence_p
                   4.68%  cc1plus   cc1plus                 [.] estimate_calls_size_and_time
                   4.11%  cc1plus   cc1plus                 [.] ix86_find_base_term
                   3.41%  cc1plus   cc1plus                 [.] rtx_equal_p
                   2.87%  cc1plus   cc1plus                 [.] cse_insn
                   2.77%  cc1plus   cc1plus                 [.] record_store
                   2.66%  cc1plus   cc1plus                 [.] vn_reference_eq
                   2.48%  cc1plus   cc1plus                 [.] operand_equal_p
                   1.21%  cc1plus   cc1plus                 [.] integer_zerop
                   1.00%  cc1plus   cc1plus                 [.] base_alias_check
                

                This won't mean much to anyone but GCC developers but it's still interesting to see what's taking up so much compilation time.

                Clang 3.7.0 does a much better job at this than GCC. At -O2 it takes less than a second to compile, produces a much smaller executable (8960 bytes) and this assembly:

                0000000000400810 <main>:
                  400810:   53                      push   rbx
                  400811:   48 81 ec 00 40 00 00    sub    rsp,0x4000
                  400818:   48 8d 3c 24             lea    rdi,[rsp]
                  40081c:   31 db                   xor    ebx,ebx
                  40081e:   31 f6                   xor    esi,esi
                  400820:   ba 00 40 00 00          mov    edx,0x4000
                  400825:   e8 56 fe ff ff          call   400680 <memset@plt>
                  40082a:   66 0f 1f 44 00 00       nop    WORD PTR [rax+rax*1+0x0]
                  400830:   f3 0f 10 04 1c          movss  xmm0,DWORD PTR [rsp+rbx*1]
                  400835:   f3 0f 5a c0             cvtss2sd xmm0,xmm0
                  400839:   bf 60 10 60 00          mov    edi,0x601060
                  40083e:   e8 9d fe ff ff          call   4006e0 <_ZNSo9_M_insertIdEERSoT_@plt>
                  400843:   48 83 c3 04             add    rbx,0x4
                  400847:   48 81 fb 00 40 00 00    cmp    rbx,0x4000
                  40084e:   75 e0                   jne    400830 <main+0x20>
                  400850:   31 c0                   xor    eax,eax
                  400852:   48 81 c4 00 40 00 00    add    rsp,0x4000
                  400859:   5b                      pop    rbx
                  40085a:   c3                      ret    
                  40085b:   0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]
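
Read as C++, the clang listing above amounts to a single memset over the 0x4000-byte buffer followed by a plain loop over the elements. The following is only a hand-written approximation of that shape, not compiler output. The function name is hypothetical, it returns a sum where the original program prints each element, and it assumes a 4-byte float whose all-zero bit pattern represents 0.0f (as with IEEE 754).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t size = 4096;

float memset_then_loop()
{
    float a[size];
    // 4096 four-byte floats, i.e. the 0x4000 bytes seen in the listing
    // (assumes sizeof(float) == 4).
    assert(sizeof(a) == 0x4000);

    // One library call clears the whole buffer instead of 4096 stores.
    std::memset(a, 0, sizeof(a));

    // A simple, non-unrolled loop over the elements.
    float sum = 0.0f;
    for (std::size_t i = 0; i < size; ++i)
        sum += a[i];
    return sum;
}
```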
                

                On the other hand, GCC 5.3.1 with no optimizations compiles very quickly but still produces a 95,328-byte executable. Compiling with -O2 reduces the executable size to 53,912 bytes, but compilation takes 4 seconds. I would definitely report this to their Bugzilla.
