x*****u posts: 3419 | 1 http://www.linux-mag.com/2004-03/extreme_01.html
Linux Magazine / March 2004 / EXTREME LINUX
Using OpenMP, Part 3
by Forrest Hoffman
This is the third and final column in a series on shared-memory parallelization using OpenMP. Often used to improve the performance of scientific models on symmetric multi-processor (SMP) machines or SMP nodes in a Linux cluster, OpenMP consists of a portable set of compiler directives, library calls, and environment variables. |
|
l******9 posts: 579 | 2 Hi,
I am trying to parallelize a compute-intensive problem.
I am working on a Linux cluster where each node is a multicore machine,
e.g. 2 or 4 quad-core processors per node.
I want to reduce latency and improve performance as much as possible.
I plan to use multiprocessing and multithreading at the same time:
each process runs on a distinct node and spawns many threads
on that node. This is a two-level parallelism.
For multiprocessing, I would like to choose MPI.
For multithre... [full post truncated] |
|
y**b posts: 10166 | 6 I parallelized a simulation program with OpenMP and found that once the run is long enough, OpenMP gives a different result on every run, while the serial version always produces the same answer. Is this normal?
My intuition is that because the threads finish in a different (effectively random) order on each run, the order of the floating-point operations changes and with it the way rounding error accumulates. How is this kind of problem usually handled? Thanks. |
|
x*****u posts: 3419 | 9 http://www.linux-mag.com/2004-02/extreme_01.html
Linux Magazine / February 2004 / EXTREME LINUX
OpenMP Multi-Processing, Part 2
by Forrest Hoffman
This month, we continue our focus on shared-memory parallelism using OpenMP. As a quick review, remember that OpenMP consists of a set of compiler directives, a handful of library calls, and a set of environment variables that can be used to specify run-time parameters. Available for both FORTRAN an... |
|
b*****l posts: 9499 | 10 [Forwarded from the Thoughts board]
From: bigsail (河马·旋木), Board: Thoughts
Subject: OpenMP, help...
Posted: BBS 未名空间站 (Sat Apr 30 02:18:47 2011, US Eastern)
I'm learning OpenMP and stuck at step one: setting the number of threads fails.
The code in TestOMP.cpp is trivial: start 5 threads, have each one introduce itself, done.
#include <omp.h>
#include <iostream>
using namespace std;
int main() {
    omp_set_num_threads(5);
    cout << "Fork! " << endl;
    #pragma omp parallel
    {
        // Obtain and print thread id
        cout << "Hello World from thread = " << omp_get_thread_num()
             << " of " << omp_get_num_threads() << endl;
    ... [full post truncated] |
|
O*******d posts: 20343 | 11 You have to activate OpenMP in your compiler. For Visual Studio 2008:
Project -> "your project" Properties -> C/C++ -> Language -> OpenMP Support |
|
O*******d posts: 20343 | 12 The default number of threads in OpenMP is the number of CPUs on your
computer if you do not call omp_set_num_threads(). Of course, you have to
activate OpenMP support in your compiler. |
|
O*******d 发帖数: 20343 | 13 比较新的compiler一般都支持OpenMP。 但是可能需要激活,至少Visual Studio 2008
是这样的。 激活就是把compiler
支持OpenMP的功能调用起来。 如果不激活,不管你有几个CPU,就只有一个thread。你
call omp_set_num_threads()
在没有激活的compiler下是无效的,但也不会给错。 这是为了backward
compatibility. |
|
y****n 发帖数: 15 | 14 有一个关于openmp的问题想请教各位大牛。原始程序(如A)需要分配一个临时数组再释
放。用OpenMP改成并行实现后(如B),不同线程不能共享这个数组,每个线程需要独立
分配这段内存。
如果在循环体内分配内存,那一共分配了nk=121次,效率很低。实际上如果存在4个线
程,只要在每个线程中分配一次就行了。不知道应该如何实现,请大牛们指点。
非常感谢。
-------------------------------------
Program A:
-------------------------------------
float* pfSdx = (float *) calloc( N );
for (int k = 0; k < nk; k++)
{
...
}
free( (float *) pfSdx );
-------------------------------------
Program B:
-------------------------------------
#pragma omp parallel for
for (int k = 0; k < ... 阅读全帖 |
|
x*****u posts: 3419 | 15 http://www.linux-mag.com/2004-01/extreme_01.html
Linux Magazine / January 2004 / EXTREME LINUX
Multi-Processing with OpenMP
by Forrest Hoffman
In this column's previous discussions of parallel programming, the focus has been on distributed memory parallelism, since most Linux clusters are best suited to this programming model. Nevertheless, today's clusters often contain two or four (or more) processors per node. While one could simply start mult... |
|
b*****l 发帖数: 9499 | 16 在学 OpenMP,第一步就不通:设多线程失败。。。
TestOMP.cpp 的 code 很简单:开 5 个线程,每个介绍一下自己,就完事了.
#include
#include
using namespace std;
main () {
omp_set_num_threads(5);
cout << "Fork! " << endl;
#pragma omp parallel
{
// Obtain and print thread id
cout<< "Hello World from thread = " << omp_get_thread_num()
<< " of " << omp_get_num_threads() << endl;
// Only master thread does this
if (omp_get_thread_num() == 0)
cout << "Master thread: number of threads = " <<
omp... 阅读全帖 |
|
|
x*z posts: 1010 | 18 Most MPI libraries have a shared-memory transport implemented, which actually has
less overhead than OpenMP or threading. |
|
l******9 posts: 579 | 19 In MPI libraries with shared memory implemented, do we have inter-process
communication or inter-thread communication?
If it is the former, why does a process have less overhead than a thread?
If it is the latter, why does it have less overhead than OpenMP and threading?
Does MPI have some built-in advantages over them?
Any help is really appreciated.
Thanks |
|
Q*T posts: 263 | 20 Enable OpenMP support when compiling and linking:
g++ -fopenmp -c -o TestOMP.o TestOMP.cpp
g++ -fopenmp -o TestOMP TestOMP.o |
|
|
|
y****e posts: 23939 | 23 Thanks for the reply, but I'm still a bit confused. I'm compiling with g++ on Linux, and the compile succeeds; what do you mean by activating OpenMP?
My system is an Intel dual core, which should count as two processors.
And I did call omp_set_num_threads(),
but only one thread came up. |
|
p******m posts: 353 | 24 I tried compiling OpenMP code with the Intel 9 compiler in the VC 6.0 environment, but one of the threads keeps getting executed repeatedly. I don't know why. Has anyone run into a similar problem? |
|
p******m posts: 353 | 25 Has anyone here used OpenMP?
Can it be compiled into a DLL? Does the DLL keep its parallelism when called? |
|
s*******e posts: 664 | 27 ☆─────────────────────────────────────☆
petersam (google) wrote on (Fri Oct 2 16:06:00 2009, US Eastern):
I tried compiling OpenMP code with the Intel 9 compiler in the VC 6.0 environment, but one of the threads keeps getting executed repeatedly. I don't know why. Has anyone run into a similar problem?
☆─────────────────────────────────────☆
petersam (google) wrote on (Fri Oct 2 16:36:24 2009, US Eastern):
Here is my test code:
#include <stdio.h>
#include <omp.h>
int main(){
    int i;
    omp_set_num_threads(2);
    #pragma omp parallel for
    for(i = 0; i < 6; i++ )
        printf("i = %d\n", i);
    return 0;
}
☆───────────────────────────────────── |
|
O*******d 发帖数: 20343 | 28 我个人比较喜欢OpenMP。 不需要加很多code,最简单的就只需要加一行, compiler就
可以自动把for loop平行。 线程的数目自动和你的CPU核的数目一致,每个核执行for
loop的不同index。 这些全都是自动的,不需要你操心。 你可以做data parallelism
和task parallelism. |
|
m***x posts: 492 | 29 For data parallelism, use OpenMP. |
|
y**b posts: 10166 | 30 An update: using GCC's quad-precision math library (libquadmath), preliminary results show that every OpenMP run now gives exactly the same result (at the original double output precision), whereas double or long double showed clear deviations under the same computation.
So the effort wasn't wasted. It's striking that before 64-bit computation has even become universal, there is already real demand for 128-bit; the many high-precision libraries are probably evidence of that. Unfortunately the quadmath library is slow for now; my runs show roughly a 30x slowdown, which is plenty slow. |
|
t****t 发帖数: 6806 | 31 我不懂fortran, 但是第一, 这种小事没必要搞什么openmp这么复杂, 你不就是要一次
开十七八个进程吗? shell就可以搞定了, 看你的程序本来就是shell的包装, 可是这包
装有什么用呢?
第二, 同时跑十七八个进程, 输入可以是同一个文件(但是注意不要exclusive open),
输出如果是同一个文件那就是自找麻烦. 看你的程序, 调用mymodel.exe的时候命令行
完全没有变化, 多半就是麻烦的根源了吧 |
|
O*******d posts: 20343 | 32 Why use OpenMP for reading input files? The bottleneck for input files is not the CPU but hardware I/O. |
|
y****n 发帖数: 15 | 33 下面这段程序使用openmp执行一个类似图像线性插值的算法。
输入为Z(图像),X(坐标),Y(坐标),输出为F(图像)
为了避免同时写入数组F的某个元素,使用了#pragma omp atomic
我遇到的问题是,当把线程数设为1和2时,运行程序会得到不同的结果。实在想不出问
题出在什么地方。肯请大牛们帮忙看一看。
#pragma omp parallel for
for (int n = 0; n < MN; n++)
{
double y = Y[n];
double x = X[n];
int fx = (int)floor(x);
int fy = (int)floor(y);
if (fx<1 || x>nw || fy<1 || y>nh) // image index is [1...nw]
{
for (int i = 0; i < ndim; i++)
{
#pragma omp atomic
F[n+i*MN] += Z... 阅读全帖 |
|
t****t 发帖数: 6806 | 34 不懂openmp, 但是浮点数支持atomic吗? I actually don't think so... |
|
p***o 发帖数: 1252 | 35 纠结这个不如上TBB。再说难道openmp会笨到每次都重新建立新线程而不用线程池? |
|
g****n 发帖数: 13 | 36 Hi
I am new to openMP. now I have some question about it.
I wrote a very simple program in C++.
#include
#include
main ()
{
int nthreads, tid;
int i;
omp_set_num_threads(2);
printf("Number of CPUS:%d\n",omp_get_num_procs());
/* Fork a team of threads giving them their own copies of variables */
#pragma omp parallel private(tid)
{
tid = omp_get_thread_num();
if(tid==0)
{
printf("tid=%d thread = %d\n", 0,tid);
printf("there are %d threads\n",omp_get_num_threads |
|
t*******t 发帖数: 1067 | 37 请问这里有人在用openmp吗?我有个弱问题请教,在下面这行程序里,如果我有很多变
量是
private,至少超过一行,请问怎么换行,谢谢
!$OMP PARALLEL DO SHARED(n,a), PRIVATE(i,j,k,su,....) |
|
t******0 发帖数: 629 | 38 我在网上找到如下手册http://openmp.org/mp-documents/omp-hands-on-SC08.pdf
编写出如下Hello World程序,在VC2012下跑。
#include
#include
#include // system("pause")
int main()
{
omp_set_num_threads(4);
# pragma omp parallel
{
int ID=omp_get_thread_num();
printf("Hello(%d)",ID);
printf("World(%d)n",ID);
}
system("pause"); //课件里没有这句
return(0); //课件里没有这句
}
运行结果就是:
Hello(0)World(0)
Press any key to continue...
说好的1,2,3都没看见了。。。请问我是哪里编... 阅读全帖 |
|
|
y**b 发帖数: 10166 | 40 mpi一直可以做shared memory计算,在一台机器的内存里面通讯,性能能不好吗。
用mpi比mpi+openmp性能还好,很多情况是这样的,我做的情况也是如此。但是不能排
除有些情况不是如此。
关键是,mpi从设计到完成比openmp复杂太多。一个项目,时间上很可能不允许做mpi(
没个半年设计、开发、调试、大规模测试很难搞定),但是openmp很简单,几天几周基
本都能搞定。
mpi一旦做好了,就不是openmp能比的了。openmp只能运行在一个节点或一台工作站上
,mpi就没这个限制了,几百几千个节点并行的威力没法比。 |
|
s******u 发帖数: 501 | 41 烂。OpenMP的scaling明显有问题,72核心280线程但是scaling能到50-60x就很不错了
。总而言之,OpenMP对海量线程的优化还是不行,sweet spot停留在8-32线程并行。也
许是kernel thread的模型决定了OpenMP thread的overhead太高,不像GPU那么
lightweight。MPI倒是能做的不错,但是要这么多的进程内存又不够。最大的优点是可
以直接用现有的x86代码(绝大多数已经支持MPI+OpenMP了),不用像GPU需要重新
fork出来写CUDA,然后maintain两套codebase |
|
y**b 发帖数: 10166 | 42 有啥解释吗?
是总体上跟以下因素有关?
mpi靠手工分块(分区)决定计算粒度,这个常常就是一种优化;
而openmp靠机器决定计算粒度,通常太细而overhead太大。
还是跟编译器和底层硬件更有关系?
我做的一种密集颗粒碰撞模拟,也是mpi明显优于openmp,原计划在几千个
节点上采用hybrid mpi/openmp模式,最后发现还是pure mpi模式快得多,
跨五个数量级的模拟都给出同样结论。当然我这个模拟跟那些专门的测试
有所区别,毕竟有其它因素影响:比如有小量代码不适合openmp化,有些
地方加锁,算法还可进一步改进等等。 |
|
l******9 posts: 579 | 43 I am also thinking about OpenMP.
But how do I make sure that OpenMP makes full use of the available
cores?
Suppose I have 24 CPUs, each of them with 6 cores (each core
supporting hyperthreading).
I have 10,000 computing tasks, each of which needs 0.001 second.
Some of the tasks need to exchange data, which is very small.
Which task needs to send/receive data to/from which task is pre-defined; it
is known before the program is run.
But the exchange frequency may be very high.
I want to schedule task... [full post truncated] |
|
y**b 发帖数: 10166 | 44 【 以下文字转载自 Linux 讨论区 】
发信人: yanb (大象,多移动一点点), 信区: Linux
标 题: 如何查看一个程序/进程使用了哪些cpu?
发信站: BBS 未名空间站 (Tue Sep 25 01:10:18 2007), 站内
该程序使用了MPI或OpenMP, 在一个有8个Intel Quad-core(也就是32个core)的
linux服务器上运行.请问有什么命令能看出这个程序使用了哪些cpu及占用率?
目的主要是想直接看看该程序是否真正利用上了MPI或OpenMP。比如OpenMP,
设置OMP_NUM_THREADS=4或8或16...皆能运行,但从处理器结构来看应该是4
才有实际意义,8、16、32究竟是怎么回事? 还有MPI,用下面命令运行
mpirun -np 8或16或32...究竟是否分配到不同cpu上面了? |
|
c******n 发帖数: 16666 | 45 说来比较悲催 非cs专业,搞了个小程序跑模拟,数据量小的时候还好,数据量一大先
是内存挂了。后来跑去ec2租了个大内存服务器发现跑得还是很慢,仔细一看,有个
function算得特别慢,因为是n*n的复杂度,数据量上去了计算时间马上跳了等量级上
升。自己又是一知半解的,不知道哪位能帮着改进下算法然后提示下OpenMP该怎么做。
简而言之,是个关于水文的模拟,计算流域面积,所以数据的基本单位/对象就是node
。 有两个linked-list(求别吐槽用这个而不用vector,摊子摊太大了 改起来不容易
,或者如果我现在添加一个vector,复制现有list行不?)里面存的都是node之间的指
针。
第一个linked-list存的所有node的指针,按照node的ID存放,方便遍历所有node
第二个linked-list,其实不止一个,存的是所有在当前node的下游的node的指针,遍
历的话可以从当前node一直走到当前mesh的边界
流域面积的具体计算,就是当前node自己的面积加上其所有上有点面积的总和
比如在下图中,
a b c d e
... 阅读全帖 |
|
W***o posts: 6519 | 46 try:
gcc -fopenmp -lpthread xxx.cpp
OpenMP work is easier on Linux, though some Linux installs don't ship the OpenMP and MPI libraries. Last time I used OpenMP and MPI for multithreaded synchronization/barrier locks, I found Ubuntu was missing both. |
|
k**********g posts: 989 | 47
Step into the disassembly, or use a CPU instruction profiler like AMD CodeAnalyst or Intel VTune.
If this 0.5-second delay only occurs on the first call after application launch, I think it is an inevitable cost of using OpenMP. If it happens on every call, then it needs investigating.
With the debugger attached, check how many OpenMP threads are created. Also make sure the EXE and DLL are linking against the correct OpenMP library. |
|
w***g 发帖数: 5958 | 48 你有benchmark吗? 你这么说我很涨见识. 我见过的几个, openblas有openmp或者
thread版,
opencv用tbb, fftw用openmp, 还没见过哪个单机跑的轮子用MPI的. 你没有用32MPI我
觉得
就是一个证据, 就是MPI还做不到底. 但是即使是4x8或8x4能把OpenMP干掉我觉得也很
牛. |
|
t*****z 发帖数: 812 | 49 假设稀疏矩阵用CRS方式存储,为什么我的openmp并行不好?
#pragma omp parallel for private(i,j,t)
for(i=0; i
t = 0.0;
for(j=A.ptr[i];j
t += A.value[j] * x[A.index[j]];
y[i] = t;
}
n=400,000. 2,4,8threads 运行的时间差不多,比1thread w/ openmp快,根1thread w
/o openmp差不错
做iterative solver 大家出出点子? |
|
z*******h 发帖数: 346 | 50 也许是我孤陋寡闻了,我怎么没听说过在Hadoop cluster上用openMP or MPI的。MPI根
本就不可能用,openMP也没必要啊。 |
|