Renderscript Computation

Renderscript offers a high performance computation API at the native level that you write in C (C99 standard). Renderscript gives your apps the ability to run operations with automatic parallelization across all available processor cores. It also supports different types of processors such as the CPU, GPU or DSP. Renderscript is useful for apps that do image processing, mathematical modeling, or any operations that require lots of mathematical computation.

In addition, you have access to all of these features without having to write code to support different architectures or a different number of processing cores. You also do not need to recompile your application for different processor types, because Renderscript code is compiled on the device at runtime.

Deprecation Notice: Earlier versions of Renderscript included an experimental graphics engine component. This component is now deprecated as of Android 4.1 (most of the APIs in rs_graphics.rsh and the corresponding APIs in android.renderscript). If you have apps that render graphics with Renderscript, we highly recommend you convert your code to another Android graphics rendering option.

Renderscript System Overview

The Renderscript runtime operates at the native level and still needs to communicate with the Android VM, so the way a Renderscript application is set up is different from a pure VM application. An application that uses Renderscript is still a traditional Android application that runs in the VM, but you write Renderscript code for the parts of your program that require it. No matter what you use it for, Renderscript remains platform independent, so you do not have to target multiple architectures (for example, ARM v5, ARM v7, x86).

The Renderscript system adopts a control and slave architecture where the low-level Renderscript runtime code is controlled by the higher level Android system that runs in a virtual machine (VM). The Android VM still retains all control of memory management and binds memory that it allocates to the Renderscript runtime, so the Renderscript code can access it. The Android framework makes asynchronous calls to Renderscript, and the calls are placed in a message queue and processed as soon as possible. Figure 1 shows how the Renderscript system is structured.

Figure 1. Renderscript system overview

When using Renderscript, there are three layers of APIs that enable communication between the Renderscript runtime and Android framework code:

The Renderscript runtime APIs allow you to do the computation that is required by your application.
The reflected layer APIs are a set of classes that are reflected from your Renderscript runtime code. This layer is essentially a wrapper around the Renderscript code that allows the Android framework to interact with the Renderscript runtime. The Android build tools automatically generate the classes for this layer during the build process. These classes eliminate the need to write JNI glue code, as you would have to with the NDK.
The Android framework layer calls the reflected layer to access the Renderscript runtime.

Because of the way Renderscript is structured, the main advantages are:

Portability: Renderscript is designed to run on many types of devices with different processor architectures (CPU, GPU, and DSP, for instance). It supports all of these architectures without requiring you to target each device, because the code is compiled and cached on the device at runtime.
Performance: Renderscript provides a high performance computation API with seamless parallelization across the cores available on the device.
Usability: Renderscript simplifies development when possible, such as eliminating JNI glue code.

The main disadvantages are:

Development complexity: Renderscript introduces a new set of APIs that you have to learn.
Debugging visibility: Renderscript can potentially execute (planned feature for later releases) on processors other than the main CPU (such as the GPU), so if this occurs, debugging becomes more difficult.

For a more detailed explanation of how all of these layers work together, see Advanced Renderscript.

Filterscript

Introduced in Android 4.2 (API Level 17), Filterscript defines a subset of Renderscript that focuses on image processing operations, such as those that you would typically write with an OpenGL ES fragment shader. You still write your scripts using the standard Renderscript runtime APIs, but within stricter constraints that ensure wider compatibility and improved optimization across CPUs, GPUs, and DSPs. At compile time, the precompiler evaluates Filterscript files and applies a more stringent set of warnings and errors than it does for standard Renderscript files. The following list describes the major constraints of Filterscript when compared to Renderscript:

Inputs and return values of root functions cannot contain pointers. The default root function signature contains pointers, so you must use the __attribute__((kernel)) attribute to declare a custom root function when using Filterscript.
Built-in types cannot exceed 32 bits.
Filterscript must always use relaxed floating point precision by using the rs_fp_relaxed pragma.
Filterscript files must end with an .fs extension, instead of an .rs extension.

Creating a Renderscript

Renderscript scales to the number of processing cores available on the device. This is enabled through a function named rsForEach() (or the forEach_root() method at the Android framework level) that automatically partitions work across the available processing cores.
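
To make the idea of partitioning concrete, here is a minimal sketch in plain C (not Renderscript). It splits an index range into contiguous chunks, one per worker, and applies a per-element function to each chunk. The names for_each_chunked, element_fn, and double_it are hypothetical, and the chunks run sequentially here. How the real runtime schedules work is not described in this document and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element function: reads one input, writes one output. */
typedef void (*element_fn)(const int *in, int *out);

/* Example element function: doubles the input value. */
static void double_it(const int *in, int *out) {
    *out = *in * 2;
}

/*
 * Splits [0, n) into `workers` contiguous chunks and applies `fn` to every
 * element of each chunk. Chunks run one after another in this sketch; a
 * parallel runtime would hand each chunk to a different core.
 * Returns the number of non-empty chunks that were processed.
 */
static size_t for_each_chunked(element_fn fn, const int *in, int *out,
                               size_t n, size_t workers) {
    assert(fn != NULL && workers > 0);
    size_t base = n / workers;   /* minimum number of elements per chunk */
    size_t extra = n % workers;  /* the first `extra` chunks get one more */
    size_t start = 0;
    size_t used = 0;
    for (size_t w = 0; w < workers; ++w) {
        size_t len = base + (w < extra ? 1 : 0);
        if (len == 0) {
            continue;            /* more workers than elements */
        }
        for (size_t i = start; i < start + len; ++i) {
            fn(&in[i], &out[i]);
        }
        start += len;
        ++used;
    }
    return used;
}
```

Because each element is processed independently of the others, the chunks could run in any order or at the same time without changing the result, which is the property that makes this style of computation easy to parallelize.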

Implementing a Renderscript involves creating a .rs file that contains your Renderscript code and calling it either at the Android framework level with the forEach_root() method or at the Renderscript runtime level with the rsForEach() function. The following diagram describes how a typical Renderscript is set up:

Figure 2. Renderscript overview

The following sections describe how to create a simple Renderscript and use it in an Android application. This example uses the HelloCompute Renderscript sample that is provided in the SDK as a guide (some code has been modified from its original form for simplicity).

Creating the Renderscript file

Your Renderscript code resides in .rs and .rsh files in the <project_root>/src/ directory. This code contains the computation logic and declares all necessary variables and pointers. Every .rs file generally contains the following items:

A pragma declaration (#pragma rs java_package_name(package.name)) that declares the package name of the .java reflection of this Renderscript.
A pragma declaration (#pragma version(1)) that declares the version of Renderscript that you are using (1 is the only value for now).
A root function (or kernel) that is the main entry point to your Renderscript. The default root() function must return void and accept the following arguments:
- Pointers to memory allocations that are used for the input and output of the Renderscript. Both of these pointers are required for Android 3.2 (API level 13) platform versions or older. Android 4.0 (API level 14) and later requires one or both of these allocations.
The following arguments are optional, but both must be supplied if you choose to use them:
- A pointer for user-defined data that the Renderscript might need to carry out computations in addition to the necessary allocations. This can be a pointer to a simple primitive or a more complex struct.
- The size of the user-defined data.
Starting in Android 4.1 (API Level 16), you can choose to define your own root function arguments without adhering to the default root function signature described previously. In addition, you can declare multiple root functions in the same Renderscript. To do this, use the __attribute__((kernel)) attribute to define a custom root function. For example, here's a root function that returns a uchar4 and accepts two uint32_t arguments:
```
  uchar4 __attribute__((kernel)) root(uint32_t x, uint32_t y) {
    ...
  }
  
```
An optional init() function. This allows you to do any initialization before the root function runs, such as initializing variables. This function runs once and is called automatically when the Renderscript starts, before anything else in your Renderscript.
Any variables, pointers, and structures that you wish to use in your Renderscript code. You can declare these in .rsh files if desired.

The following code shows how the mono.rs file is implemented:

#pragma version(1)
#pragma rs java_package_name(com.example.android.rs.hellocompute)

//multipliers to convert RGB colors to black and white
const static float3 gMonoMult = {0.299f, 0.587f, 0.114f};

void root(const uchar4 *v_in, uchar4 *v_out) {
  //unpack a color to a float4
  float4 f4 = rsUnpackColor8888(*v_in);
  //take the dot product of the color and the multiplier
  float3 mono = dot(f4.rgb, gMonoMult);
  //repack the float to a color
  *v_out = rsPackColorTo8888(mono);
}
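
For readers who want to check the arithmetic outside of Renderscript, the following is a rough plain-C equivalent of the per-pixel computation in mono.rs. The type pixel_rgba and the functions to_channel and mono_pixel are hypothetical stand-ins for uchar4, rsPackColorTo8888, and the root function. The rounding and the opaque alpha value are assumptions of this sketch rather than documented behavior of the runtime functions.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C stand-in for uchar4: one RGBA pixel with 8 bits per channel. */
typedef struct {
    uint8_t r, g, b, a;
} pixel_rgba;

/* Same luminance weights as gMonoMult in mono.rs. */
static const float MONO_MULT[3] = {0.299f, 0.587f, 0.114f};

/* Converts a float in [0, 1] to an 8-bit channel value.
 * Clamping and rounding to nearest are assumptions of this sketch. */
static uint8_t to_channel(float v) {
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint8_t)(v * 255.0f + 0.5f);
}

/* Computes the gray value of one pixel and writes it to all color channels. */
static pixel_rgba mono_pixel(pixel_rgba in) {
    /* unpack each 8-bit channel to a float in [0, 1] */
    float r = in.r / 255.0f;
    float g = in.g / 255.0f;
    float b = in.b / 255.0f;
    /* dot product of the color and the multipliers */
    float mono = r * MONO_MULT[0] + g * MONO_MULT[1] + b * MONO_MULT[2];
    /* repack; alpha is set to fully opaque (assumption) */
    uint8_t m = to_channel(mono);
    pixel_rgba out = {m, m, m, 255};
    return out;
}
```

The three weights sum to 1, so a pure white input stays white and a pure black input stays black, while saturated colors map to gray levels that reflect their perceived brightness.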

Setting floating point precision

You can define the floating point precision required by your compute algorithms. This is useful if you require less precision than the IEEE 754-2008 standard (used by default). You can define the floating-point precision level of your script with the following pragmas:

#pragma rs_fp_full (default if nothing is specified) - For apps that require floating point precision as outlined by the IEEE 754-2008 standard.
#pragma rs_fp_relaxed - For apps that don’t require strict IEEE 754-2008 compliance and can tolerate less precision. This mode enables flush-to-zero for denorms and round-towards-zero.
#pragma rs_fp_imprecise - For apps that don’t have stringent precision requirements. This mode enables everything in rs_fp_relaxed along with the following:
- Operations resulting in -0.0 can return +0.0 instead.
- Operations on INF and NAN are undefined.
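
As an illustration of what "flush-to-zero for denorms" means, the plain-C sketch below replaces any subnormal (denormal) float with zero and leaves every other value untouched. The function name flush_denorm_to_zero is hypothetical, and keeping the sign of the flushed value is an assumption of this sketch. In Renderscript the behavior is selected by the pragma and applied by the runtime, not written by hand like this.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/*
 * Returns x unchanged unless it is subnormal, in which case a zero with the
 * same sign is returned. Subnormals are the values whose magnitude lies
 * between 0 and FLT_MIN.
 */
static float flush_denorm_to_zero(float x) {
    if (fpclassify(x) == FP_SUBNORMAL) {
        return signbit(x) ? -0.0f : 0.0f;
    }
    return x;
}
```

Under full IEEE 754 handling a value such as FLT_MIN / 2 keeps a small non-zero magnitude. With flushing it becomes zero, which trades a little accuracy near zero for simpler and often faster hardware paths.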

Script intrinsics

Renderscript adds support for a set of script intrinsics, which are pre-implemented filtering primitives that reduce the amount of code that you need to write. They are also implemented to give your app the maximum performance gain possible.

Intrinsics are available for the following:

Blends
Blur
Color matrix
3x3 convolve
5x5 convolve
Per-channel lookup table
Converting an Android YUV buffer to RGB
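
To give a sense of the kind of work an intrinsic saves you from writing, here is a plain-C sketch of the math behind a 3x3 filter on a single-channel float image. It is not the intrinsic's implementation. The names convolve3x3 and clamp_index are hypothetical, and repeating edge pixels at the image border is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Clamps an index into [0, limit - 1] so that edge pixels are repeated. */
static size_t clamp_index(long i, size_t limit) {
    if (i < 0) return 0;
    if ((size_t)i >= limit) return limit - 1;
    return (size_t)i;
}

/*
 * For every pixel, computes the weighted sum of the pixel and its eight
 * neighbors. `coeff` holds the nine weights in row-major order, so coeff[4]
 * is the weight of the center pixel. `in` and `out` must not overlap.
 */
static void convolve3x3(const float *in, float *out, size_t w, size_t h,
                        const float coeff[9]) {
    assert(in != NULL && out != NULL && in != out && w > 0 && h > 0);
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    size_t sx = clamp_index((long)x + kx, w);
                    size_t sy = clamp_index((long)y + ky, h);
                    sum += coeff[(ky + 1) * 3 + (kx + 1)] * in[sy * w + sx];
                }
            }
            out[y * w + x] = sum;
        }
    }
}
```

A kernel whose only non-zero weight is a 1 in the center leaves the image unchanged, while nine equal weights of 1/9 produce a simple box blur.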

Calling the Renderscript code

You can call the Renderscript from your Android framework code by instantiating the reflected ScriptC_script_name class. This class contains a method, forEach_root(), that lets you invoke rsForEach. You give it the same parameters that you would if you were invoking it at the Renderscript runtime level. This technique allows your Android application to offload intensive mathematical calculations to Renderscript. See the HelloCompute sample to see how a simple Android application can utilize Renderscript.

To call Renderscript at the Android framework level:

Allocate memory that is needed by the Renderscript in your Android framework code. You need an input and output Allocation for Android 3.2 (API level 13) platform versions and older. Android 4.0 (API level 14) and later require only one of the two Allocations, or both.
Create an instance of the ScriptC_script_name class.
Call forEach_root(), passing in the allocations, the Renderscript, and any optional user-defined data. The output allocation will contain the output of the Renderscript.

The following example, taken from the HelloCompute sample, processes a bitmap and outputs a black and white version of it. The createScript() method carries out the steps described previously. This method calls the Renderscript, mono.rs, passing in memory allocations that store the bitmap to be processed as well as the eventual output bitmap. It then displays the processed bitmap onto the screen:

package com.example.android.rs.hellocompute;

import android.app.Activity;
import android.os.Bundle;
import android.graphics.BitmapFactory;
import android.graphics.Bitmap;
import android.renderscript.RenderScript;
import android.renderscript.Allocation;
import android.widget.ImageView;

public class HelloCompute extends Activity {
  private Bitmap mBitmapIn;
  private Bitmap mBitmapOut;

  private RenderScript mRS;
  private Allocation mInAllocation;
  private Allocation mOutAllocation;
  private ScriptC_mono mScript;

  @Override
  protected void onCreate(Bundle savedInstanceState) {
      super.onCreate(savedInstanceState);
      setContentView(R.layout.main);

      mBitmapIn = loadBitmap(R.drawable.data);
      mBitmapOut = Bitmap.createBitmap(mBitmapIn.getWidth(), mBitmapIn.getHeight(),
                                       mBitmapIn.getConfig());

      ImageView in = (ImageView) findViewById(R.id.displayin);
      in.setImageBitmap(mBitmapIn);

      ImageView out = (ImageView) findViewById(R.id.displayout);
      out.setImageBitmap(mBitmapOut);

      createScript();
  }
  private void createScript() {
      mRS = RenderScript.create(this);
      mInAllocation = Allocation.createFromBitmap(mRS, mBitmapIn,
          Allocation.MipmapControl.MIPMAP_NONE,
          Allocation.USAGE_SCRIPT);
      mOutAllocation = Allocation.createTyped(mRS, mInAllocation.getType());
      mScript = new ScriptC_mono(mRS, getResources(), R.raw.mono);
      mScript.forEach_root(mInAllocation, mOutAllocation);
      mOutAllocation.copyTo(mBitmapOut);
  }

  private Bitmap loadBitmap(int resource) {
      final BitmapFactory.Options options = new BitmapFactory.Options();
      options.inPreferredConfig = Bitmap.Config.ARGB_8888;
      return BitmapFactory.decodeResource(getResources(), resource, options);
  }
}

To call Renderscript from another Renderscript file:

Allocate memory that is needed by the Renderscript in your Android framework code. You need an input and output Allocation for Android 3.2 (API level 13) platform versions and older. Android 4.0 (API level 14) and later require only one of the two Allocations, or both.
Call rsForEach(), passing in the allocations and any optional user-defined data. The output allocation will contain the output of the Renderscript.

rs_script script;
rs_allocation in_allocation;
rs_allocation out_allocation;
UserData_t data;
...
rsForEach(script, in_allocation, out_allocation, &data, sizeof(data));

In this example, assume that the script and memory allocations have already been allocated and bound at the Android framework level and that UserData_t is a struct declared previously. Passing a pointer to a struct and the size of the struct to rsForEach is optional, but useful if your Renderscript requires additional information other than the necessary memory allocations.
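
The pattern of passing a pointer to user data together with its size can be sketched in plain C as follows. This is not Renderscript code: kernel_fn, for_each, add_offset, and the user_data struct are hypothetical names that stand in for the root function, rsForEach, and UserData_t, and the driver loop is a simplification of what the runtime does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user data, standing in for UserData_t. */
typedef struct {
    int offset;
} user_data;

/* Kernel-style function: one input, one output, optional user data. */
typedef void (*kernel_fn)(const uint8_t *v_in, uint8_t *v_out,
                          const void *usr_data, size_t usr_len);

/* Example kernel: adds the user-supplied offset and clamps to [0, 255]. */
static void add_offset(const uint8_t *v_in, uint8_t *v_out,
                       const void *usr_data, size_t usr_len) {
    int offset = 0;  /* used when no user data is supplied */
    if (usr_data != NULL && usr_len == sizeof(user_data)) {
        offset = ((const user_data *)usr_data)->offset;
    }
    int v = *v_in + offset;
    if (v < 0) v = 0;
    if (v > 255) v = 255;
    *v_out = (uint8_t)v;
}

/* Simplified driver: calls the kernel once per element. */
static void for_each(kernel_fn kernel, const uint8_t *in, uint8_t *out,
                     size_t count, const void *usr_data, size_t usr_len) {
    assert(kernel != NULL);
    for (size_t i = 0; i < count; ++i) {
        kernel(&in[i], &out[i], usr_data, usr_len);
    }
}
```

Passing the size alongside the pointer lets the kernel confirm that it received the struct it expects before reading from it, and passing NULL with a size of 0 covers the case where no extra data is needed.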

Script groups

You can group Renderscript scripts together and execute them all with a single call as though they were part of a single script. This allows Renderscript to optimize execution of the scripts in ways that it could not do if the scripts were executed individually.

To build a script group, use the ScriptGroup.Builder class to create a ScriptGroup defining the operations. At execution time, Renderscript optimizes the run order and the connections between these operations for best performance.

Important: The script group must be a directed acyclic graph for this feature to work.
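
The acyclic requirement can be checked with a standard topological sort. The plain-C sketch below uses Kahn's algorithm on a hypothetical list of connections between numbered kernels. It only illustrates why a cycle leaves no valid run order; it is not how ScriptGroup is implemented, and the names connection, execution_order, and MAX_NODES are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical connection: the output of kernel `from` feeds kernel `to`. */
typedef struct {
    size_t from, to;
} connection;

/*
 * Kahn's algorithm: repeatedly pick a kernel with no unprocessed inputs.
 * If every kernel can be picked, the graph is acyclic and `order` holds one
 * valid run order. Returns false if the connections contain a cycle.
 */
static bool execution_order(size_t nodes, const connection *edges,
                            size_t edge_count, size_t *order) {
    assert(nodes <= MAX_NODES);
    size_t indegree[MAX_NODES] = {0};
    bool done[MAX_NODES] = {false};
    for (size_t e = 0; e < edge_count; ++e) {
        assert(edges[e].from < nodes && edges[e].to < nodes);
        indegree[edges[e].to]++;
    }
    size_t placed = 0;
    while (placed < nodes) {
        size_t pick = nodes;  /* sentinel: no ready kernel found yet */
        for (size_t n = 0; n < nodes; ++n) {
            if (!done[n] && indegree[n] == 0) {
                pick = n;
                break;
            }
        }
        if (pick == nodes) {
            return false;     /* every remaining kernel waits on another */
        }
        done[pick] = true;
        order[placed++] = pick;
        for (size_t e = 0; e < edge_count; ++e) {
            if (edges[e].from == pick) {
                indegree[edges[e].to]--;
            }
        }
    }
    return true;
}
```

For a chain of three kernels (0 to 1, 1 to 2) the function reports the order 0, 1, 2. Adding a connection from 2 back to 0 makes every kernel wait on another one, so no order exists and the function returns false.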
