A weird C++ problem, experts please come in - Programming board - 未名存档


N******K (posts: 10202) #1

Compiler: VS2013, Release mode.
Tempfunction computes a number and outputs it; N controls the computational cost.
I use a function pointer: TempfunctionPtr = &Tempfunction
Then I compare calling Tempfunction directly with calling it indirectly through the pointer.
The results are weird:
N=1000 , Tempfunction_time=1, TempfunctionPtr_time=2;
N=2000 , Tempfunction_time=2, TempfunctionPtr_time=6;
N=4000 , Tempfunction_time=4, TempfunctionPtr_time=11;
N=8000 , Tempfunction_time=8, TempfunctionPtr_time=21;
TempfunctionPtr_time - Tempfunction_time should be a constant, since N does not change the number of function calls.
=================
    #include <cstdlib>   // std::system
    #include <ctime>     // std::time
    #include <iostream>  // std::cout

    void Tempfunction(double& a, int N)
    {
        a = 0;
        for (double i = 0; i < N; ++i) {
            a += i;
        }
    }

    int main()
    {
        int N = 1000; // from 1000 to 8000
        double Value = 0;

        // Direct calls
        auto t0 = std::time(0);
        for (int i = 0; i < 1000000; ++i) {
            Tempfunction(Value, N);
        }
        auto t1 = std::time(0);
        auto Tempfunction_time = t1 - t0;
        std::cout << "Tempfunction_time = " << Tempfunction_time << '\n';

        // Indirect calls through a function pointer
        auto TempfunctionPtr = &Tempfunction;
        Value = 0;
        t0 = std::time(0);
        for (int i = 0; i < 1000000; ++i) {
            (*TempfunctionPtr)(Value, N);
        }
        t1 = std::time(0);
        auto TempfunctionPtr_time = t1 - t0;
        std::cout << "TempfunctionPtr_time = " << TempfunctionPtr_time << '\n';

        std::system("pause");
    }
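A note on the measurement itself: std::time reports whole seconds, so readings of 1 or 2 are at the limit of its resolution. The following is a minimal sketch of a finer-grained harness, assuming the C++11 <chrono> header is available. The helper name time_calls_ms is hypothetical and does not appear in the thread.

```cpp
#include <chrono>

// Same function as in the post above.
void Tempfunction(double& a, int N)
{
    a = 0;
    for (double i = 0; i < N; ++i) {
        a += i;
    }
}

// Hypothetical helper: calls f(value, N) `repeats` times and returns the
// elapsed wall-clock time in milliseconds, using a monotonic clock.
template <typename F>
double time_calls_ms(F f, double& value, int N, int repeats)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        f(value, N);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Passing &Tempfunction measures the call through a pointer; passing a lambda such as [](double& a, int n) { Tempfunction(a, n); } gives the compiler a direct call to work with. Whether either one is inlined is still up to the compiler.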
n*****t (posts: 22014) #2 (in reply to N******K, #1)

The experts are busy, so a Lanxiang Vocational School student will have a go:
An indirect call costs slightly more, but it should not differ by that much.
n*****t (posts: 22014) #3 (in reply to N******K, #1)

You could try it with O0 and see.
N******K (posts: 10202) #4 (in reply to n*****t, #2)

The overhead should be a constant, independent of what the function actually does (N).
This is too weird.
n*****t (posts: 22014) #5 (in reply to N******K, #4)

What OS? Maybe the longer it runs, the more often it gets interrupted? Try making N 100 times larger and see.
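a*****e (posts: 1700) #6

It is the compiler. I tried it with clang++ and the two times are the same.
With clang++ -O2 you also need to print the value of Value, otherwise the whole computation is optimized away.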
n*****t (posts: 22014) #7 (in reply to a*****e, #6)

Wow, that compiler is bold, LOL.
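a*****e (posts: 1700) #8 (in reply to n*****t, #7)

It is nothing special, really. As long as the compiler can establish that (1) the loop always terminates, (2) it has no side effects, and (3) it returns no value, the code is dead code. All the common compilers have a pass like this, and they behave in much the same way.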
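To illustrate the point about dead code: if the result of a side-effect-free loop is never observed, an optimizer may delete the loop, and the benchmark then measures nothing. Below is a minimal sketch of keeping the result observable by folding it into a checksum that the caller prints. The name run_benchmark is hypothetical, and this is one possible approach rather than what anyone in the thread ran.

```cpp
// Same function as in the original post.
void Tempfunction(double& a, int N)
{
    a = 0;
    for (double i = 0; i < N; ++i) {
        a += i;
    }
}

// Hypothetical driver: adds every call's result to a checksum so that the
// computed values are actually used by the caller.
double run_benchmark(int N, int repeats)
{
    double value = 0;
    double checksum = 0;
    for (int i = 0; i < repeats; ++i) {
        Tempfunction(value, N);
        checksum += value;
    }
    return checksum;
}
```

Printing the returned checksum, for example with std::cout << run_benchmark(1000, 1000000), gives the computation an observable effect. It does not stop the compiler from simplifying the loop in other ways, so the generated code is still worth inspecting.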
a***n (posts: 538) #9

Use volatile double and the two become the same. In the first case the optimizer probably used a register, since the address of a is known at compile time.
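The thread does not show the exact change behind the volatile suggestion. One plausible reading, sketched here under that assumption, is to make the accumulator a volatile double so that every update must go through memory whether or not the call is inlined. The name TempfunctionVolatile is hypothetical.

```cpp
// Assumed variant: the accumulator is accessed through a volatile reference,
// so each iteration performs a real load of `a` and a real store to `a`.
void TempfunctionVolatile(volatile double& a, int N)
{
    a = 0;
    for (double i = 0; i < N; ++i) {
        a = a + i;  // explicit read, add, write
    }
}
```

With this variant the direct and indirect calls would be expected to run the same load/store loop, which is consistent with the two times coming out equal, although both would be the slower of the two.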
w*******e (posts: 285) #10 (in reply to N******K, #1)

This is because, in release mode, the first (direct) call is inlined, while the second call through the function pointer is not inlined by the VS compiler.
d*****u (posts: 43) #11

The difference comes from the compiler's inlining optimization.
When the function is inlined, the compiler has more information and more opportunities to optimize the code. When inlining, the compiler can emit different code for the innermost loop body a += i;.

On my VS2013, a += i compiles to the following in Tempfunction():

    movsd xmm0, QWORD PTR [eax]   // load a
    addsd xmm0, xmm1              // a += i
    addsd xmm1, xmm2              // i += 1
    movsd QWORD PTR [eax], xmm0   // save a
    comisd xmm3, xmm1             // cmp N, i
    ja SHORT $LL3@Tempfuncti

While in main(), when inlined, a += i looks like this:

    movapd xmm0, xmm1             // temp = i
    addsd xmm1, xmm3              // i += 1
    addsd xmm2, xmm0              // a += temp
    comisd xmm4, xmm1             // cmp N, i
    ja SHORT $LL13@main

You can see that the difference is that Tempfunction() has to load and save "a" on every iteration. In main(), after inlining, the compiler knows that "a" is not used anywhere else, so it is free to keep it in a register for all iterations and only save it to memory when the loop is done.

The execution time of the non-inlined code is 1000000 * (N*T1 + T2), and of the inlined code 1000000 * (N*T3 + T4). If T1 == T3, the difference is 1000000 * (T2 - T4), which does not depend on N. That assumes the only difference is the function call overhead, which is obviously not true according to your timing data.

The actual difference has two parts: 1000000 * N * (T1 - T3) + 1000000 * (T2 - T4). Given your timing data, it seems that T1 ~= 2.5e-9, T2 ~= 1e-6, T3 ~= 1e-9, T4 ~= 0. The load/save part is a big deal: it accounts for roughly 1.5e-9 per iteration, making the inner loop body of the non-inlined version 150% slower.

PS: If the function takes "a" by value and then returns "a", the loop body will not have the load/save overhead.

PS2: The code should use int i = 0 as the loop variable in Tempfunction() in this example. Using double i prevents the compiler from unrolling the loop, because double cannot represent arbitrary integers exactly and the compiler has to stay faithful to the choice the code writer made. If you use int i, VS2013 tries hard to unroll the loop and creates very complex assembly code that also handles the corner cases where N is not a multiple of the chosen unrolling factor.
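The cost model in this post can be checked against the reported timings with a few lines of arithmetic. The sketch below simply encodes the two formulas. The constants are the poster's rough estimates, not measurements, and the function names are hypothetical.

```cpp
// Cost model from the post above, in seconds, for 1,000,000 calls.
// T1, T3: per-iteration cost of the loop body (non-inlined, inlined).
// T2, T4: per-call overhead (non-inlined, inlined).
const double kCalls = 1000000.0;
const double T1 = 2.5e-9;
const double T2 = 1e-6;
const double T3 = 1e-9;
const double T4 = 0.0;

// Predicted total time when the call goes through the pointer (not inlined).
double predicted_noninlined(int N) { return kCalls * (N * T1 + T2); }

// Predicted total time when the direct call is inlined.
double predicted_inlined(int N) { return kCalls * (N * T3 + T4); }
```

For N = 2000, 4000 and 8000 the model gives about 6, 11 and 21 seconds non-inlined and 2, 4 and 8 seconds inlined, in line with the figures in the original post. For N = 1000 it gives about 3.5 seconds non-inlined against a reported 2, so the estimates are only approximate.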
N******K (posts: 10202) #12 (in reply to d*****u, #11)

Thanks, that is a detailed explanation.
Regarding:
PS: If the function takes "a" by value and then returns "a", the loop body will not have the load/save overhead.
I tested this: in that case the two ways of calling take the same time.
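The thread does not include the code for the pass-by-value test. A minimal sketch of what such a variant might look like, with the hypothetical name TempfunctionByValue:

```cpp
// Assumed pass-by-value variant: `a` is a local copy, so no other code can
// observe it during the loop and the compiler is free to keep it in a
// register. The result is handed back through the return value.
double TempfunctionByValue(double a, int N)
{
    a = 0;
    for (double i = 0; i < N; ++i) {
        a += i;
    }
    return a;
}
```

A caller would write Value = TempfunctionByValue(Value, N); in place of Tempfunction(Value, N);.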
N******K (posts: 10202) #13 (in reply to N******K, #1)

Changed it as follows:

    void Tempfunction(double& a, int N)
    {
        a = 0;
        double b = 0;  // local accumulator
        for (double i = 0; i < N; ++i) {
            b += i;
            //a += i;
        }
        a = b;  // single store after the loop
    }

Results:
N=1000 , Tempfunction_time=1, TempfunctionPtr_time=1;
N=2000 , Tempfunction_time=2, TempfunctionPtr_time=2;
N=4000 , Tempfunction_time=4, TempfunctionPtr_time=4;
N=8000 , Tempfunction_time=8, TempfunctionPtr_time=8;
