Sun N1 Grid Engine 6.1 User's Guide

Developing with the C Language Binding

Important Files for the C Language Binding

To use the DRMAA C language binding implementation included with N1 Grid Engine 6.1, you need to know where to find the important files. The most important file is the DRMAA header file, which you include from your C application to make the DRMAA functions available. The DRMAA header file resides in sge-root/include/drmaa.h, where sge-root defaults to /usr/SGE. For detailed reference information about the DRMAA functions, see section 5 of the N1 Grid Engine man pages, located in the sge-root/man directory. To compile and link your application, use the DRMAA shared library at sge-root/lib/arch/libdrmaa.so.

Including the DRMAA Header File

To use the DRMAA functions in your application, every source file that uses a DRMAA function must include the DRMAA header file. To include the DRMAA header file in your source file, add the following line to your source code, usually near the top:

#include "drmaa.h"

Compiling Your C Application

When you compile your DRMAA application, you need to include some additional compiler directives to direct the compiler and linker to use DRMAA. The following directions apply for the Sun Studio Compiler Collection and for gcc. These instructions might not apply for other compilers and linkers. Consult the documentation for your specific compiler and linker products.

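You must include two directives:

    • Tell the compiler where to find the DRMAA header file by adding the sge-root/include directory to the include path with the -I option.

    • Tell the linker to link against the DRMAA library with -ldrmaa.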

You also need to verify that the sge-root/lib/arch directory is included in your library search path (LD_LIBRARY_PATH on the Solaris Operating Environment and Linux). The sge-root/lib/arch directory is not included automatically when you set your environment using the settings.sh or settings.csh files.


Example 6–1 Compiling Your C Application Using Sun Studio Compiler

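The following example shows how you would compile your DRMAA application using the Sun Studio Compiler. The following assumptions apply:

    • You are using the csh shell.

    • N1 Grid Engine is installed in /sge, with a cell named default.

    • The application source file is app.c.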

Sample commands would look like the following:

% source /sge/default/common/settings.csh
% cc -I/sge/include -ldrmaa app.c

Running Your C Application

To run your compiled DRMAA application, verify the following:

The sge-root/lib/arch directory must be included in the library search path (LD_LIBRARY_PATH on the Solaris Operating Environment and Linux). The sge-root/lib/arch directory is not included automatically when you set your environment using the settings.sh or settings.csh files.

You must be logged into a machine that is an N1 Grid Engine submit host. If the machine is not an N1 Grid Engine submit host, all DRMAA function calls will fail, returning DRMAA_ERRNO_DRM_COMMUNICATION_FAILURE.
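The first of these checks can be automated before the application is launched. The following is a minimal sketch, not part of the DRMAA distribution: the helper name path_list_contains is hypothetical, and it only tests whether a colon-separated search path, such as the value of LD_LIBRARY_PATH, contains a given directory as a complete entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a DRMAA function.
 * Returns 1 if the colon-separated list path_list contains dir as a
 * complete entry, and 0 otherwise. A NULL list (variable not set) or an
 * empty dir never matches. */
int path_list_contains(const char *path_list, const char *dir)
{
    size_t dir_len;
    const char *entry;

    if (path_list == NULL || dir == NULL || dir[0] == '\0') {
        return 0;
    }

    dir_len = strlen(dir);
    entry = path_list;

    for (;;) {
        /* Each entry runs up to the next ':' or to the end of the string. */
        const char *end = strchr(entry, ':');
        size_t entry_len = (end != NULL) ? (size_t)(end - entry)
                                         : strlen(entry);

        if (entry_len == dir_len && strncmp(entry, dir, dir_len) == 0) {
            return 1;
        }

        if (end == NULL) {
            break;
        }

        entry = end + 1;
    }

    return 0;
}
```

An application could call path_list_contains(getenv("LD_LIBRARY_PATH"), "/sge/lib/sol-sparc64") at startup (with stdlib.h included for getenv()) and print a hint if the result is 0. The directory shown is only the sample location used in this chapter. The comparison is exact, so an entry with a trailing slash is treated as a different directory.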

How to Use the DRMAA 0.95 C Language Binding

The DRMAA shared library, which is enabled by default, supports version 1.0 of the DRMAA C Language Binding Specification. For reasons of backward compatibility, however, Grid Engine also includes an implementation of the 0.95 version of the DRMAA C Language Binding Specification. You should develop all new applications with the 1.0 shared library, but you might occasionally discover an application that requires the 0.95 implementation.

To enable the 0.95 version of the shared library, follow these steps:

  1. Log in as a user that has permissions to modify the Grid Engine installation.

    % su -

  2. Change to the sge-root/lib/arch directory.

    % cd /sge/lib/sol-sparc64

  3. Remove the libdrmaa.so symbolic link.

    % rm libdrmaa.so

  4. Create a new symbolic link to the 0.95 library.

    % ln -s libdrmaa.so.0.95 libdrmaa.so

    On the Solaris and Linux platforms, the shared library is tagged with a version number. If the 0.95 version of the shared library is enabled, applications compiled and linked against the 1.0 version fail with an error stating that the library could not be found, and vice versa. On other platforms, a 1.0 application will load the 0.95 shared library successfully but might fail due to unknown symbols. A 0.95 application will load the 1.0 shared library successfully but will likely fail because DRMAA functions return unexpected error codes.

    • To restore the 1.0 version of the shared library, perform steps 1 through 3 and create a new symbolic link to the 1.0 library.


      % ln -s libdrmaa.so.1.0 libdrmaa.so

C Application Examples

The following examples illustrate some application interactions that use the C language binding. You can find additional examples in the “How To” section of the Grid Engine Community Site.


Example 6–2 Starting and Stopping a Session

The following code segment shows the most basic DRMAA C binding program.

Every call to a DRMAA function returns an error code. If everything goes well, that code is DRMAA_ERRNO_SUCCESS. If an error occurs, an appropriate error code is returned. Every DRMAA function also takes at least two parameters: a string to populate with an error message in case of an error, and an integer representing the maximum length of the error string.

On line 8, the example calls drmaa_init(). This function sets up the DRMAA session and must be called before most other DRMAA functions. Some functions, like drmaa_get_contact(), can be called before drmaa_init(), but these functions only provide general information. Any function that performs an action, such as drmaa_run_job() or drmaa_wait(), must be called after drmaa_init() returns. If such a function is called before drmaa_init() returns, it will return the error code DRMAA_ERRNO_NO_ACTIVE_SESSION.

The drmaa_init() function creates a session and starts an event client listener thread. The session is used for organizing jobs submitted through DRMAA, and the thread is used to receive updates from the queue master about the state of jobs and the system in general. Once drmaa_init() has been called successfully, the calling application must also call drmaa_exit() before terminating. If an application does not call drmaa_exit() before terminating, the queue master might be left with a dead event client handle, which can decrease queue master performance.

At the end of the program, on line 17, drmaa_exit() cleans up the session and stops the event client listener thread. Most other DRMAA functions must be called before drmaa_exit(). Some functions, like drmaa_get_contact(), can be called after drmaa_exit(), but these functions only provide general information. Any function that performs an action, such as drmaa_run_job() or drmaa_wait(), must be called before drmaa_exit() is called. If such a function is called after drmaa_exit() is called, it will return the error code DRMAA_ERRNO_NO_ACTIVE_SESSION.

01: #include <stdio.h>
02: #include "drmaa.h"
03: 
04: int main(int argc, char **argv) {
05:    char error[DRMAA_ERROR_STRING_BUFFER];
06:    int errnum = 0;
07: 
08:    errnum = drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER);
09: 
10:    if (errnum != DRMAA_ERRNO_SUCCESS) {
11:       fprintf(stderr, "Could not initialize the DRMAA library: %s\n", error);
12:       return 1;
13:    }
14: 
15:    printf("DRMAA library was started successfully\n");
16:    
17:    errnum = drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER);
18: 
19:    if (errnum != DRMAA_ERRNO_SUCCESS) {
20:       fprintf(stderr, "Could not shut down the DRMAA library: %s\n", error);
21:       return 1;
22:    }
23: 
24:    return 0;
25: }
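
The ordering rule described for this example, together with the error-string convention, can be illustrated with a small toy model. This is only a sketch for reasoning about call order: every name in it (toy_init, toy_action, toy_exit, and the TOY_ constants) is a hypothetical stand-in and is not declared in drmaa.h, and the model leaves out everything else the real library does.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy model only. These names and values are hypothetical stand-ins,
 * NOT the declarations from drmaa.h. */
enum { TOY_SUCCESS = 0, TOY_NO_ACTIVE_SESSION = 1 };

/* 1 between a successful toy_init() and the following toy_exit(). */
static int toy_session_active = 0;

/* Stands in for the call that opens the session. */
int toy_init(char *error, size_t error_len)
{
    (void)error;
    (void)error_len;
    toy_session_active = 1;
    return TOY_SUCCESS;
}

/* Stands in for the call that closes the session. */
int toy_exit(char *error, size_t error_len)
{
    (void)error;
    (void)error_len;
    toy_session_active = 0;
    return TOY_SUCCESS;
}

/* Stands in for any call that performs an action, such as submitting a
 * job. Outside an active session it fails and fills the caller's buffer,
 * never writing more than error_len bytes. */
int toy_action(char *error, size_t error_len)
{
    if (!toy_session_active) {
        if (error != NULL && error_len > 0) {
            snprintf(error, error_len, "no active session");
        }
        return TOY_NO_ACTIVE_SESSION;
    }

    return TOY_SUCCESS;
}
```

In this model, toy_action() succeeds only between toy_init() and toy_exit(). Calling it before the session is opened or after it is closed returns the failure code and leaves a message in the caller's buffer, which mirrors the pattern of checking the return value first and printing the error string only on failure.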


Example 6–3 Running a Job

The following code segment shows how to use the DRMAA C binding to submit a job to N1 Grid Engine. The beginning and end of this program are the same as in Example 6–2. The differences are on lines 16-59. On line 16, DRMAA allocates a job template. A job template is a structure used to store information about a job to be submitted. The same template can be reused for multiple calls to drmaa_run_job() or drmaa_run_bulk_jobs().

On line 22, the DRMAA_REMOTE_COMMAND attribute is set. This attribute tells DRMAA where to find the program to run. Its value is the path to the executable. The path can be relative or absolute. If relative, the path is relative to the DRMAA_WD attribute, which defaults to the user's home directory. For more information on DRMAA attributes, see the drmaa_attributes man page. For this program to work, the script sleeper.sh must be in your default path.

On line 32, the DRMAA_V_ARGV attribute is set. This attribute tells DRMAA what arguments to pass to the executable. For more information on DRMAA attributes, refer to the drmaa_attributes man page.

On line 43, drmaa_run_job() submits the job. DRMAA places the ID assigned to the job into the character array that is passed to drmaa_run_job(). The job is now running as though submitted by qsub. At this point, calling drmaa_exit() or terminating the program will have no effect on the job.

To clean things up, the job template is deleted on line 54. This frees the memory DRMAA set aside for the job template, but has no effect on submitted jobs.

Finally, on line 61, drmaa_exit() is called. The call to drmaa_exit() is outside of the if structure started on line 18 because once drmaa_init() is called, drmaa_exit() must be called before terminating, regardless of whether the other commands succeed.

01: #include <stdio.h>
02: #include "drmaa.h"
03: 
04: int main(int argc, char **argv) {
05:    char error[DRMAA_ERROR_STRING_BUFFER];
06:    int errnum = 0;
07:    drmaa_job_template_t *jt = NULL;
08: 
09:    errnum = drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER);
10: 
11:    if (errnum != DRMAA_ERRNO_SUCCESS) {
12:       fprintf(stderr, "Could not initialize the DRMAA library: %s\n", error);
13:       return 1;
14:    }
15: 
16:    errnum = drmaa_allocate_job_template(&jt, error, DRMAA_ERROR_STRING_BUFFER);
17: 
18:    if (errnum != DRMAA_ERRNO_SUCCESS) {
19:       fprintf(stderr, "Could not create job template: %s\n", error);
20:    }
21:    else {
22:       errnum = drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "sleeper.sh",
23:                                     error, DRMAA_ERROR_STRING_BUFFER);
24: 
25:       if (errnum != DRMAA_ERRNO_SUCCESS) {
26:          fprintf(stderr, "Could not set attribute \"%s\": %s\n",
27:                   DRMAA_REMOTE_COMMAND, error);
28:       }
29:       else {
30:          const char *args[2] = {"5", NULL};
31:          
32:          errnum = drmaa_set_vector_attribute(jt, DRMAA_V_ARGV, args, error,
33:                                               DRMAA_ERROR_STRING_BUFFER);
34:       }
35:       
36:       if (errnum != DRMAA_ERRNO_SUCCESS) {
37:          fprintf(stderr, "Could not set job template attributes: %s\n",
38:                   error);
39:       }
40:       else {
41:          char jobid[DRMAA_JOBNAME_BUFFER];
42: 
43:          errnum = drmaa_run_job(jobid, DRMAA_JOBNAME_BUFFER, jt, error,
44:                                  DRMAA_ERROR_STRING_BUFFER);
45: 
46:          if (errnum != DRMAA_ERRNO_SUCCESS) {
47:             fprintf(stderr, "Could not submit job: %s\n", error);
48:          }
49:          else {
50:             printf("Your job has been submitted with id %s\n", jobid);
51:          }
52:       } /* else */
53: 
54:       errnum = drmaa_delete_job_template(jt, error, DRMAA_ERROR_STRING_BUFFER);
55: 
56:       if (errnum != DRMAA_ERRNO_SUCCESS) {
57:          fprintf(stderr, "Could not delete job template: %s\n", error);
58:       }
59:    } /* else */
60: 
61:    errnum = drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER);
62: 
63:    if (errnum != DRMAA_ERRNO_SUCCESS) {
64:       fprintf(stderr, "Could not shut down the DRMAA library: %s\n", error);
65:       return 1;
66:    }
67: 
68:    return 0;
69: }
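
The args array on line 30 ends with a NULL entry, which marks the end of the vector in place of a separate length argument. As a sketch of that convention, the following helper counts the values in such an array. The name count_vector_values is hypothetical and is not part of DRMAA. It only shows how a NULL-terminated vector is walked.

```c
#include <stddef.h>

/* Hypothetical helper, not a DRMAA function.
 * Counts the entries of a NULL-terminated string vector, such as the
 * args array passed for DRMAA_V_ARGV. The terminating NULL itself is
 * not counted. A NULL vector is treated as empty. */
size_t count_vector_values(const char *values[])
{
    size_t count = 0;

    if (values == NULL) {
        return 0;
    }

    while (values[count] != NULL) {
        count++;
    }

    return count;
}
```

For the array {"5", NULL} used in the example, the helper returns 1. Leaving out the final NULL would cause a walk like this one to read past the end of the array, which is why the terminator must always be present.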