Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I am experimenting with MPI and I keep getting this error when I run my program through mpirun on the command line.

----------------------------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
----------------------------------------------------------------------------------------------

I'm not sure why, though, because other MPI programs run perfectly fine.

Here is my code.

#include <stdio.h>
#include <mpi.h>

int func(int num){
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (num == 0){
        num = 5;
        MPI_Bcast(&num, 1, MPI_INT, rank, MPI_COMM_WORLD);
    }
    return num;
}

int main(int argc, char **argv){
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("On processor %d, func returns %d\n", rank, func(rank));
    MPI_Finalize();
    return 0;
}

The program is still giving me the same error. Is calling MPI_Bcast inside an if statement simply not valid? Does it still work if you try broadcasting when you're not the root?

1 Answer

The signature of MPI_Bcast, as given in the reference documentation, is int MPI_Bcast(void* buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm). However, you are passing only four arguments, and it looks like you forgot either the first or the second argument.

What is num in your case, and what is your buffer? Answering that will likely resolve your question, although I am also not sure why your code even compiles. If num is what you want to broadcast, check whether MPI_Bcast(&num, 1, MPI_INT, rank, MPI_COMM_WORLD) works for you.

There is another, independent and very serious problem. You declare int rank; on the stack and pass it to MPI_Bcast before you ever initialize it. Who is sending? If the root process (rank 0) is the sender, you could just as well pass 0, or initialize the variable properly with int rank = 0;.

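MPI_Bcast is a collective operation: every process in the communicator must call it, and all of them must pass the same root. Indeterminate values of rank are almost certainly why your job aborts, because the processes will disagree on the root and end up sending or receiving at random.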

