Conceptual Model

An introduction to Grainite

Application

An application developed in Grainite is composed of Topics, Tables (that contain Grains), Action Handlers (associated with Grains), and Tasks.

Grainite stores data in Topics and Tables. Topics are collections of Events, and Tables are collections of Grains. Topics are similar to topics in queue-based systems such as Kafka and IBM MQ. Tables are similar to tables in most databases in that they allow structured data to be stored, mutated, and accessed efficiently. Tables contain Grains, and Grains have attached Action Handlers that process requests to the Grain.

Developers provide the list of Topics, Tables, Grains, their Action Handlers, and the relationships between them in a config file that is submitted to the Grainite server to load a new version of the application.

At this time, Grainite exposes Java and Python APIs for Client Applications and for implementing Grain Action Handlers. Additional language support is coming for JavaScript, Go, and other languages.

The concepts of Topics, Events, Tables, Grains, Tasks, and Action Handlers are further described in the sections below.

Topics and Events

A Topic is used to store Events. Clients can send (or produce) Events to be stored in Topics and other clients can pull (or consume) Events from Topics. Topics may have multiple producers and consumers.

Events are immutable. Usually, an Event represents a user interaction, a stock tick, an inventory change, or another business interaction that needs to be stored and accessed at a later point.

Events stored inside Topics may have a key (or multiple keys, in the future). When a primary (or first) key is provided, the Topic provides ordering guarantees by that key. In other words, when reading from the Topic, events for a particular key are returned in the order they were added to the system. No ordering is guaranteed across events that belong to different keys.
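The per-key ordering guarantee can be illustrated with a small in-memory model. This is only a sketch in plain Java and does not use the Grainite client API: the class `KeyedTopicModel` and its `append` and `read` methods are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a keyed Topic. This is NOT the Grainite API.
class KeyedTopicModel {
    // One append-only log per key: order is tracked within a key only.
    private final Map<String, List<String>> logByKey = new HashMap<>();

    // Append an event payload under the given key.
    void append(String key, String payload) {
        logByKey.computeIfAbsent(key, k -> new ArrayList<String>()).add(payload);
    }

    // Return a read-only copy of the events for one key, in append order.
    List<String> read(String key) {
        List<String> log = logByKey.get(key);
        if (log == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(new ArrayList<String>(log));
    }
}
```

In this model, appending `tick-1` and then `tick-2` under the key `AAPL` always reads back in that order, no matter how many events for other keys were appended in between. The model says nothing about where an `MSFT` event falls relative to the `AAPL` events, which mirrors the absence of a cross-key guarantee.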

Tables and Grains

A Table is used to store Grains. A Grain includes data (similar to a row in a database table) and may also include behaviors, represented as Grain Action Handlers. Grains may be used to store business entities such as Shopping Carts, Customer data, and Product Catalogs.

Each grain consists of:

  1. a Key - unique identifier for the grain in the Table.

  2. a Value - blob storage. Good for storing frequently accessed data.

  3. 200 Ordered Maps - can contain arbitrarily large amounts of data (as if each map were a relational table).
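The three parts listed above can be pictured as a simple data structure. The following is a sketch under stated assumptions, not the Grainite API: `GrainModel` is a hypothetical class, the map index range 0 to 199 is assumed, and string keys and values stand in for whatever types the real Ordered Maps accept.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of what a Grain stores. This is NOT the Grainite API.
class GrainModel {
    static final int MAX_MAPS = 200;  // number of Ordered Maps per Grain

    final String key;                 // unique identifier within its Table
    byte[] value = new byte[0];       // opaque blob for frequently accessed data
    private final Map<Integer, SortedMap<String, String>> maps = new HashMap<>();

    GrainModel(String key) {
        this.key = key;
    }

    // Return the Ordered Map with the given index, creating it on first use.
    // The 0..199 index range is an assumption made for this sketch.
    SortedMap<String, String> map(int id) {
        if (id < 0 || id >= MAX_MAPS) {
            throw new IllegalArgumentException("map id out of range: " + id);
        }
        return maps.computeIfAbsent(id, i -> new TreeMap<String, String>());
    }
}
```

A shopping cart Grain in this model could keep a summary in `value` and one entry per line item in `map(0)`. Entries iterate in key order, which is what makes each map behave somewhat like a small table.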

Grains may subscribe to Topics (to receive and process new events), receive messages from other Grains or external applications, post messages to other Grains, and invoke external systems.

Grain Action Handlers

Grain Action Handlers are user-provided code used to represent behaviors on a Grain. For instance, when a Grain subscribes to a Topic, an Action Handler provided by the developer will be invoked for (each batch of) events so that it may extract the relevant data from the events, process it, maybe make external calls, save any data derived from the events into its own storage, and send any messages to other Grains or Topics.

Grain Action Handlers are invoked directly within the Grainite Cluster. In a future release, Grainite will be able to invoke remote Action Handlers that might be installed as REST endpoints, gRPC service endpoints or even serverless functions like AWS Lambdas, Azure Durable Functions, or equivalent.

Each Grain may have multiple action handlers, potentially implemented in multiple different languages. For example, one Action Handler might respond to Shopping Cart events, while another Action Handler might respond to Shipping events.

Grainite provides a guarantee that only one action handler is active at any time for a given Grain (identified by a key). With this guarantee, developers writing Action Handlers do not have to worry about locking or concurrency for updating the Grain's state.
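To see why this guarantee removes the need for locking in handler code, consider a toy model that serializes handler invocations per Grain key. This sketch does not describe how the Grainite runtime is implemented: `SingleActiveHandlerModel`, its per-key lock, and the `demo` method are all hypothetical and exist only to illustrate the effect.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongUnaryOperator;

// Hypothetical model of per-Grain serialization. This is NOT the Grainite runtime.
class SingleActiveHandlerModel {
    private final Map<String, Object> locks = new ConcurrentHashMap<>();
    private final Map<String, Long> state = new ConcurrentHashMap<>();

    // Run a handler with exclusive access to the state stored under grainKey.
    void invoke(String grainKey, LongUnaryOperator handler) {
        Object lock = locks.computeIfAbsent(grainKey, k -> new Object());
        synchronized (lock) {
            long current = state.getOrDefault(grainKey, 0L);
            // Plain read-modify-write: safe only because of the per-key lock.
            state.put(grainKey, handler.applyAsLong(current));
        }
    }

    long get(String grainKey) {
        return state.getOrDefault(grainKey, 0L);
    }

    // Four threads each add 1 to the same key 1000 times; returns the final total.
    static long demo() {
        SingleActiveHandlerModel model = new SingleActiveHandlerModel();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    model.invoke("bank-A", total -> total + 1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return model.get("bank-A");
    }
}
```

The handler lambda `total -> total + 1` contains no synchronization of its own, yet `demo()` returns 4000 because the model never runs two handlers for the same key at once. Handlers for different keys use different locks and can still run in parallel.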

An example Grain Action Handler:

public class SenderBankStatsHandler {
  // handleStats is a batch-style action handler
  public ActionResult handleStats(Action action, GrainContext context)
      throws Exception {

    // Retrieve the currently stored aggregate value in the Grain
    TotalAggregates aggs = context.getValue().asType(TotalAggregates.class);

    // This Grain expects all requests to be messages from other Grains that 
    // send an incremental message to be added to the aggregate value.
    GrainRequest request = (GrainRequest) action;

    // Extract the object from the request.
    TransactionResult txn = request.getPayload()
                                   .asType(TransactionResult.class);

    // update the Grain aggregate
    aggs.getAggregates().addPayment(txn.amount);

    // save the updated aggregate to the Grain. Note that updates are buffered 
    // to allow for efficient writes.
    context.setValue(Value.of(aggs));

    // Action completed successfully
    return ActionResult.success(action);
  } 
}

Tasks

Currently, Task development is supported for Java only, though Tasks can be used along with Python apps. Support for developing Tasks in Python is coming soon.

Tasks are user-provided code that is executed on a continuous basis. Unlike Action Handlers that are triggered by an external event or stimulus (e.g. an API invocation, a Topic Event, or a Grain-to-Grain Message), Tasks are continuously scheduled. Tasks are especially useful when repeated work needs to be done. As an example, a task can be created for reading from a passive data source like a database or a Kafka topic.

Tasks, like Action Handlers, run fully managed within the Grainite cluster. Restarts and failures are automatically handled by the Grainite platform. Grainite provides pre-configured extensions containing Tasks, for example, Tasks that can read changed data in real time from Microsoft SQL Server, Kafka topics, Salesforce, and many others. Users can also implement Tasks within their application by implementing the com.grainite.api.tasks.Task and com.grainite.api.tasks.TaskInstance interfaces.

Tasks support flexible sharding. For instance, to read all the partitions of a Kafka topic in parallel, the KafkaReader Task gets the number of partitions on the topic and then starts KafkaReaderInstance TaskInstances, one for each partition. Grainite manages the parallel invocations and the state of each Task and TaskInstance. Tasks that are pre-written and provided by Grainite can simply be configured via the application's app.yaml. Here is a sample app.yaml snippet showing the configuration of a Task that reads change data from Microsoft SQL Server.

# Read from the SQL Server CDC capture instances for cards and transactions.
# This task is provided by Grainite; users simply add it to their
# application's app.yaml and set its config properties.
#
# In the below example, the task will read from SQLServer and post any
# records received in real time to the cdc_topic defined elsewhere in 
# the app.yaml.
#
# The combine_txn property ensures that changes for all tables belonging
# to the same transaction are sent together.

tasks:
  - task_name: sqlserver_cdc
    type: SQLServerCDCReader
    config:
      connectionUrl: <connection url>
      database: <database name>
      capture_instances: <semicolon-separated list of capture instances>
      userid: <user id>
      password: <password>
      combine_txn: true
    output:
      topic: cdc_topic
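The partition-per-instance sharding described earlier for the Kafka reader can be sketched in plain Java. This example deliberately does not implement com.grainite.api.tasks.Task or TaskInstance, whose method signatures are not shown here. `ShardedReaderModel`, `ReaderInstance`, and `plan` are hypothetical names, and the partition count is passed in rather than looked up from a real Kafka cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of Task sharding. This does NOT implement
// com.grainite.api.tasks.Task or com.grainite.api.tasks.TaskInstance.
class ShardedReaderModel {
    // Stand-in for a TaskInstance bound to a single partition.
    static final class ReaderInstance {
        final String topic;
        final int partition;

        ReaderInstance(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }

        String describe() {
            return topic + "/" + partition;
        }
    }

    // Stand-in for the Task: given the partition count, plan one
    // instance per partition so the partitions can be read in parallel.
    static List<ReaderInstance> plan(String topic, int partitionCount) {
        if (partitionCount < 0) {
            throw new IllegalArgumentException("partitionCount must be >= 0");
        }
        List<ReaderInstance> instances = new ArrayList<>();
        for (int p = 0; p < partitionCount; p++) {
            instances.add(new ReaderInstance(topic, p));
        }
        return instances;
    }
}
```

Calling `ShardedReaderModel.plan("orders", 3)` yields three instances, for partitions 0, 1, and 2. In Grainite itself, scheduling those instances and tracking their state is handled by the platform rather than by application code.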
