Introduction

The ONNX-MLIR project comes with an executable, onnx-mlir, that compiles ONNX models into shared libraries. This document demonstrates how to interact programmatically with a compiled shared library using ONNX-MLIR's Runtime API.

C Runtime API

Data Structures

OMTensor is the data structure that describes the runtime information (rank, shape, data type, etc.) associated with a tensor input or output.

OMTensorList is the data structure used to hold a list of pointers to OMTensor so that they can be passed into and out of the compiled model as inputs and outputs.

Model Entry Point Signature

Every compiled model has the same C function signature, equivalent to:

OMTensorList* run_main_graph(OMTensorList*);

Intuitively, the model takes a list of tensors as input and returns a list of tensors as output.
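Because every compiled model exposes the same signature, calling code can treat an entry point as a plain function pointer and stay independent of any particular model. The sketch below illustrates the idea with a hypothetical opaque type DemoList and a hypothetical identity_model standing in for a compiled model; neither is part of ONNX-MLIR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opaque list type standing in for OMTensorList. */
typedef struct DemoList DemoList;

/* Function pointer type with the same shape as the model entry point. */
typedef DemoList *(*DemoEntryPoint)(DemoList *);

/* Hypothetical "model" that returns its input list unchanged. */
static DemoList *identity_model(DemoList *inputs) { return inputs; }

/* Driver that works with any entry point of this shape. */
static DemoList *run_model(DemoEntryPoint entry, DemoList *inputs) {
  assert(entry != NULL);
  return entry(inputs);
}
```

A driver written this way would not need to change when a different model is substituted, since only the function pointer passed to run_model differs.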

Invoke Models Using C Runtime API

We demonstrate using the API functions to run a simple ONNX model consisting of a single add operation. To create such an ONNX model, use this Python script.

To compile the above model, run onnx-mlir add.onnx; a shared library, add.so, should appear. The following C code calls into the compiled function to compute the sum of two inputs:

#include <OnnxMlirRuntime.h>
#include <stdio.h>
OMTensorList *run_main_graph(OMTensorList *);
int main() {
  // Shared shape & rank. The shape must describe six elements to match the
  // data buffers below; {3, 2} is assumed here and must agree with the
  // input shape declared in the model.
  int64_t shape[] = {3, 2};
  int64_t rank = 2;
  // Construct x1 omt filled with 1.
  float x1Data[] = {1., 1., 1., 1., 1., 1.};
  OMTensor *x1 = omTensorCreate(x1Data, shape, rank, ONNX_TYPE_FLOAT);
  // Construct x2 omt filled with 2.
  float x2Data[] = {2., 2., 2., 2., 2., 2.};
  OMTensor *x2 = omTensorCreate(x2Data, shape, rank, ONNX_TYPE_FLOAT);
  // Construct a list of omts as input.
  OMTensor *list[2] = {x1, x2};
  OMTensorList *input = omTensorListCreate(list, 2);
  // Call the compiled onnx model function.
  OMTensorList *outputList = run_main_graph(input);
  // Get the first omt as output.
  OMTensor *y = omTensorListGetOmtByIndex(outputList, 0);
  float *outputPtr = (float *)omTensorGetDataPtr(y);
  // Print its content, should be all 3.
  for (int i = 0; i < 6; i++)
    printf("%f ", outputPtr[i]);
  return 0;
}

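Compile with gcc main.c add.so -o add; an executable named add should appear. If OnnxMlirRuntime.h is not on the compiler's default include path, add an -I flag pointing to the directory that contains it. Run the executable, and the output should be: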

3.000000 3.000000 3.000000 3.000000 3.000000 3.000000

Exactly as it should be.

Reference

For the full reference of the available C Runtime API, see include/onnx-mlir/Runtime/OMTensor.h and include/onnx-mlir/Runtime/OMTensorList.h.