2.1.1-beta (revision 4703)
Usage in writing mode - MPI example

This is a short example of how to use the OTF2 writing interface with MPI. The example is available as source code in the file otf2_mpi_writer_example.c.

We start by including some standard headers.

#include <stdlib.h>
#include <stdio.h>
#include <inttypes.h>

Then include the MPI and OTF2 headers.

#include <mpi.h>
#include <otf2/otf2.h>

Now prepare the inclusion of the <otf2/OTF2_MPI_Collectives.h> header. As it is a header-only interface, it needs some information about the MPI environment in use, in particular the MPI datatypes which match the C99 types uint64_t and int64_t. If you have an MPI 3.0 conforming implementation, you can skip this. If not, provide #defines for the following macros prior to the #include statement. In this example, we assume an LP64 platform.

#if MPI_VERSION < 3
#define OTF2_MPI_UINT64_T MPI_UNSIGNED_LONG
#define OTF2_MPI_INT64_T  MPI_LONG
#endif

After this preparatory step, we can include the <otf2/OTF2_MPI_Collectives.h> header.

#include <otf2/OTF2_MPI_Collectives.h>

We use MPI_Wtime to get timestamps for our events but need to convert the seconds to an integral value. We use a nanosecond resolution.

static OTF2_TimeStamp
get_time( void )
{
    double t = MPI_Wtime() * 1e9;
    return ( uint64_t )t;
}

Define a pre and a post flush callback. If no memory is left in OTF2's internal memory buffer, or the writer handle is closed, a buffer flush is triggered. The pre flush callback is invoked right before the flush. It must return either OTF2_FLUSH, to flush the recorded data to file, or OTF2_NO_FLUSH, to suppress flushing. The post flush callback is invoked right after the flush. It has to return a current timestamp, which is recorded to mark the time spent in the flush. The callbacks are passed to OTF2 via a struct.

static OTF2_FlushType
pre_flush( void* userData,
           OTF2_FileType fileType,
           OTF2_LocationRef location,
           void* callerData,
           bool final )
{
    return OTF2_FLUSH;
}

static OTF2_TimeStamp
post_flush( void* userData,
            OTF2_FileType fileType,
            OTF2_LocationRef location )
{
    return get_time();
}

static OTF2_FlushCallbacks flush_callbacks =
{
    .otf2_pre_flush  = pre_flush,
    .otf2_post_flush = post_flush
};

Now everything is prepared to begin with the main program.

int
main( int argc,
      char** argv )

First initialize the MPI environment and query the size and rank.

MPI_Init( &argc, &argv );
int size;
MPI_Comm_size( MPI_COMM_WORLD, &size );
int rank;
MPI_Comm_rank( MPI_COMM_WORLD, &rank );

Create a new archive handle.

OTF2_Archive* archive = OTF2_Archive_Open( "ArchivePath",
                                           "ArchiveName",
                                           OTF2_FILEMODE_WRITE,
                                           1024 * 1024 /* event chunk size */,
                                           4 * 1024 * 1024 /* def chunk size */,
                                           OTF2_SUBSTRATE_POSIX,
                                           OTF2_COMPRESSION_NONE );

Set the previously defined flush callbacks.

OTF2_Archive_SetFlushCallbacks( archive, &flush_callbacks, NULL );

Now we provide the OTF2 archive object with the MPI collectives. As all ranks in MPI_COMM_WORLD write into the archive, we use this communicator as the global one. We set the local communicator to MPI_COMM_NULL, as we don't care about file optimization here.

OTF2_MPI_Archive_SetCollectiveCallbacks( archive,
                                         MPI_COMM_WORLD,
                                         MPI_COMM_NULL );

Now we can create the event files. Note that the physical files aren't created yet.

OTF2_Archive_OpenEvtFiles( archive );

Each rank now requests an event writer with its rank number as the location id.

OTF2_EvtWriter* evt_writer = OTF2_Archive_GetEvtWriter( archive,
rank );

We note the start time in each rank; this is later used to determine the global epoch.

uint64_t epoch_start = get_time();

Write an enter and a leave record for region 0 to the local event writer.

OTF2_EvtWriter_Enter( evt_writer,
                      NULL,
                      get_time(),
                      0 /* region */ );

We also record an MPI_Barrier in the trace. For this we generate an event before we make the MPI call.

OTF2_EvtWriter_MpiCollectiveBegin( evt_writer,
                                   NULL,
                                   get_time() );

Now we can do the MPI_Barrier call.

MPI_Barrier( MPI_COMM_WORLD );

After we have passed the MPI_Barrier, we can note the end of the collective operation in the event stream.

OTF2_EvtWriter_MpiCollectiveEnd( evt_writer,
                                 NULL,
                                 get_time(),
                                 OTF2_COLLECTIVE_OP_BARRIER,
                                 0 /* communicator */,
                                 OTF2_UNDEFINED_UINT32 /* root */,
                                 0 /* bytes provided */,
                                 0 /* bytes obtained */ );

Finally we leave the region again with a leave record.

OTF2_EvtWriter_Leave( evt_writer,
                      NULL,
                      get_time(),
                      0 /* region */ );

The event recording is now done, note the end time in each rank.

uint64_t epoch_end = get_time();

Now close the event writer, before closing the event files collectively.

OTF2_Archive_CloseEvtWriter( archive, evt_writer );

After we have written all events, we close the event files again.

OTF2_Archive_CloseEvtFiles( archive );

We now collect all of the epoch_start and epoch_end timestamps by calculating the minimum and maximum, and provide these to the root rank.

uint64_t global_epoch_start;
MPI_Reduce( &epoch_start, &global_epoch_start, 1,
            OTF2_MPI_UINT64_T, MPI_MIN, 0, MPI_COMM_WORLD );
uint64_t global_epoch_end;
MPI_Reduce( &epoch_end, &global_epoch_end, 1,
            OTF2_MPI_UINT64_T, MPI_MAX, 0, MPI_COMM_WORLD );

Only the root rank will write the global definitions, thus only this rank requests a writer object from the archive.

if ( 0 == rank )
OTF2_GlobalDefWriter* global_def_writer = OTF2_Archive_GetGlobalDefWriter( archive );

We need to define the clock used for this trace and the overall timestamp range.

OTF2_GlobalDefWriter_WriteClockProperties( global_def_writer,
                                           1000000000 /* 1 GHz resolution */,
                                           global_epoch_start,
                                           global_epoch_end - global_epoch_start + 1 );

Now we can start writing the referenced definitions, starting with the strings.

OTF2_GlobalDefWriter_WriteString( global_def_writer, 0, "" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 1, "Master Thread" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 2, "MPI_Barrier" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 3, "PMPI_Barrier" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 4, "barrier" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 5, "MyHost" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 6, "node" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 7, "MPI" );
OTF2_GlobalDefWriter_WriteString( global_def_writer, 8, "MPI_COMM_WORLD" );

Write the definition for the code region which was just entered and left to the global definition writer.

OTF2_GlobalDefWriter_WriteRegion( global_def_writer,
                                  0 /* id */,
                                  2 /* region name */,
                                  3 /* alternative name */,
                                  4 /* description */,
                                  OTF2_REGION_ROLE_BARRIER,
                                  OTF2_PARADIGM_MPI,
                                  OTF2_REGION_FLAG_NONE,
                                  7 /* source file */,
                                  0 /* begin lno */,
                                  0 /* end lno */ );

Write the system tree to the global definition writer.

OTF2_GlobalDefWriter_WriteSystemTreeNode( global_def_writer,
                                          0 /* id */,
                                          5 /* name */,
                                          6 /* class */,
                                          OTF2_UNDEFINED_SYSTEM_TREE_NODE /* parent */ );

For each rank we define a new location group and one location. We also provide a unique string for each location group.

for ( int r = 0; r < size; r++ )
{
    char process_name[ 32 ];
    sprintf( process_name, "MPI Rank %d", r );
    OTF2_GlobalDefWriter_WriteString( global_def_writer,
                                      9 + r,
                                      process_name );

    OTF2_GlobalDefWriter_WriteLocationGroup( global_def_writer,
                                             r /* id */,
                                             9 + r /* name */,
                                             OTF2_LOCATION_GROUP_TYPE_PROCESS,
                                             0 /* system tree */ );

    OTF2_GlobalDefWriter_WriteLocation( global_def_writer,
                                        r /* id */,
                                        1 /* name */,
                                        OTF2_LOCATION_TYPE_CPU_THREAD,
                                        4 /* # events */,
                                        r /* location group */ );
}

The last step is to define the MPI communicator. This is a three-step process. First we define that this trace was actually recorded in the MPI paradigm and enumerate all locations which participate in this paradigm. As we used the MPI ranks directly as the location ids, the array with the locations is the identity.

uint64_t comm_locations[ size ];
for ( int r = 0; r < size; r++ )
comm_locations[ r ] = r;
OTF2_GlobalDefWriter_WriteGroup( global_def_writer,
                                 0 /* id */,
                                 7 /* name */,
                                 OTF2_GROUP_TYPE_COMM_LOCATIONS,
                                 OTF2_PARADIGM_MPI,
                                 OTF2_GROUP_FLAG_NONE,
                                 size,
                                 comm_locations );

Now we can define sub-groups of the previously defined list of communication locations. For MPI_COMM_WORLD this is the whole group. Note that these sub-groups are created by using indices into the list of communication locations, not by enumerating location ids again. In this example the sub-group is the identity again.

OTF2_GlobalDefWriter_WriteGroup( global_def_writer,
                                 1 /* id */,
                                 0 /* name */,
                                 OTF2_GROUP_TYPE_COMM_GROUP,
                                 OTF2_PARADIGM_MPI,
                                 OTF2_GROUP_FLAG_NONE,
                                 size,
                                 comm_locations );

Finally we can write the definition of the MPI_COMM_WORLD communicator. This finalizes the writing of the global definitions and we can also close the writer object.

OTF2_GlobalDefWriter_WriteComm( global_def_writer,
0 /* id */,
8 /* name */,
1 /* group */,
OTF2_UNDEFINED_COMM /* parent */ );
OTF2_Archive_CloseGlobalDefWriter( archive,
                                   global_def_writer );

All the other ranks wait at this barrier, so that root can finish writing the global definitions.

MPI_Barrier( MPI_COMM_WORLD );

At the end, close the archive, finalize the MPI environment, and exit.

OTF2_Archive_Close( archive );

MPI_Finalize();

return EXIT_SUCCESS;

To compile your program use a command like the following. Note that we need to activate the C99 standard explicitly for GCC.

mpicc -std=c99 `otf2-config --cflags` \
-c otf2_mpi_writer_example.c \
-o otf2_mpi_writer_example.o

Now you can link your program with:

mpicc otf2_mpi_writer_example.o \
`otf2-config --ldflags` \
`otf2-config --libs` \
-o otf2_mpi_writer_example