General

What is Grainite?

Grainite is a software platform for building modern, event-driven, microservice-based applications.

What makes Grainite compelling for building & running modern, event-driven, microservice-based applications?

  • Grainite includes event streaming & storage, the ability to run business logic in multiple languages, and data storage and querying. You do not need to combine multiple components (from possibly different vendors).

  • Grainite only needs a K8s cluster and storage. It can run on-premises or on public clouds, with no lock-in to any infrastructure or cloud provider.

  • Grainite handles all parallelism & concurrency. There is no need for application developers to concern themselves with multi-threading or locking for concurrency control.

  • Grainite scales horizontally so you can add nodes to deal with more traffic. You can also run business logic in separate compute engines like EKS/ECS/Lambda/functions. Grainite will schedule the execution of the business logic.

What kind of data should be stored in Topics vs. Tables?

Most modern applications need to process incoming messages or events. One property of events is that they are immutable. Events are usually read once, saved, and then processed to produce a new state.

Topics are best for storing these events. Topics naturally order messages, so replaying the history of events for what-if analysis is straightforward. Events stored in a Topic are accessed in order, and Grainite provides that ordering on a per-key basis: events for Customer ID 1 are all presented in order, as are events for Customer ID 2, but there is no ordering relationship between events across Customer IDs 1 and 2.

Subscribers/Consumers of Topics receive the events for each key in order and may process events for different keys in parallel.
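To make per-key ordering concrete, here is a minimal sketch in plain Python. It does not use the Grainite API; the function name `group_by_key` and the sample events are hypothetical, chosen only to show that events keep their relative order within a key while nothing is implied about order across keys.

```python
from collections import defaultdict


def group_by_key(events):
    """Split one event stream into per-key sequences, preserving arrival order."""
    per_key = defaultdict(list)
    for key, payload in events:
        per_key[key].append(payload)
    return dict(per_key)


# Interleaved stream of (key, payload) pairs, listed in arrival order.
stream = [
    ("customer-1", "login"),
    ("customer-2", "login"),
    ("customer-1", "add-to-cart"),
    ("customer-2", "logout"),
    ("customer-1", "checkout"),
]

per_key = group_by_key(stream)
# Within a key, order is preserved; across keys, no order is implied.
print(per_key["customer-1"])  # ['login', 'add-to-cart', 'checkout']
print(per_key["customer-2"])  # ['login', 'logout']
```

Because the two per-key sequences are independent, a consumer could process them in parallel without violating the ordering guarantee described above.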

Tables are best for storing mutable data. For instance, after processing an event, one may want to record a running summary of all the events so far. This is easily stored within the Grain, and the state is updated each time a new event comes in. Tables within Grainite are wide-column, i.e., each Grain (nominally, a Table row) may contain arbitrarily large amounts of data. For instance, a Grain for Customer ID 1 might contain a summary status of the customer's account. It may also contain all of their customer support interactions over the last 5 years, all the products they have browsed or liked, and all the purchases or transactions they have made. This large amount of data stored within the Grain is easily accessible to Grain handlers at low latency and with a high degree of parallelism.
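The split between immutable events and mutable state can be sketched as follows. This is an illustration in plain Python under assumed names (`apply_event`, the `amount` field), not Grainite code: the events are never modified, while the summary is replaced after each one.

```python
def apply_event(summary, event):
    """Return the updated running summary after one event; the event is not modified."""
    return {
        "count": summary["count"] + 1,
        "total": summary["total"] + event["amount"],
    }


# Immutable inputs (what a Topic would hold).
events = [{"amount": 20}, {"amount": 5}, {"amount": 17}]

# Mutable state (what a Grain would hold), updated once per event.
summary = {"count": 0, "total": 0}
for event in events:
    summary = apply_event(summary, event)

print(summary)  # {'count': 3, 'total': 42}
```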

What data can one store in a Grain (within a Table)?

Like most modern distributed data platforms, Grainite provides a flexible schema for storage. Each row in a Grainite Table contains a Grain. The Grain has the following elements in which data may be stored:

  1. Key: The key is the identifier for the Grain. An example key could be a phone number, or a Customer or Product ID.

  2. Value: The value is an array of bytes that can store any serialized value for this Grain. One may store a JSON object, a serialized Java object, a serialized Protocol Buffer object, or any other array of bytes in the value field.

  3. Multiple Sorted Maps: In addition to the Key and the Value on each Grain, the Grain provides several numbered sorted maps. These sorted maps can be accessed with an ID (usually in the range 0-199), and contain keys and values within each map. For instance, one may use Sorted Map 0 to store all the website visits by a customer, where the key of the map is the timestamp of their visit, and the value is the duration and products visited. Sorted Map 1 may be used to store the entire purchase history of the particular customer, allowing for easy access to that data when processing events.
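The three elements above can be pictured with a small in-memory model. This is only a sketch: `GrainModel`, `map_put`, and `map_items` are hypothetical names invented for illustration and are not the Grainite client API. The timestamps and values are made-up sample data.

```python
import json
from dataclasses import dataclass, field


@dataclass
class GrainModel:
    """Illustrative stand-in for a Grain; not the Grainite client API."""
    key: str
    value: bytes = b""
    # Map ID -> {map key -> map value}.
    maps: dict = field(default_factory=dict)

    def map_put(self, map_id, k, v):
        self.maps.setdefault(map_id, {})[k] = v

    def map_items(self, map_id):
        """Entries of one map in ascending key order."""
        return sorted(self.maps.get(map_id, {}).items())


grain = GrainModel(key="customer-1")

# Value: any serialized bytes, here a JSON summary.
grain.value = json.dumps({"status": "active"}).encode("utf-8")

# Sorted Map 0: site visits keyed by timestamp (inserted out of order).
grain.map_put(0, "2024-03-02T10:00:00Z", b"7m; product-42")
grain.map_put(0, "2024-01-15T09:30:00Z", b"3m; product-7")

# Sorted Map 1: purchase history.
grain.map_put(1, "2024-03-02T10:05:00Z", b"order-1001")

visits = grain.map_items(0)
print([ts for ts, _ in visits])
# ['2024-01-15T09:30:00Z', '2024-03-02T10:00:00Z']
```

Using ISO 8601 timestamps as map keys is a design choice of this sketch: they sort chronologically as plain strings, so a key-ordered scan returns visits in time order.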

What is the high-level process for designing applications in Grainite?

  • Database design

    1. Identify the entities in your application. Each entity type maps to a Table. Each instance of the entity is stored as a grain in the table. A grain in Grainite corresponds to a row in a relational database table.

    2. Identify the attributes of the entity. All attributes of an entity are contained within it. Note: Grains are not normalized so you store multivalued attributes inside the grain itself.

    3. Identify the attribute which will be the key for the grain.

    4. Identify the layout of the grain in the schema. Decide what will be stored in the value of the grain and which attributes will be stored in the sorted maps. Summary, non-growing information about the entity is stored in the value, whereas growing, granular information is stored in the sorted maps.

    5. Decide the format of the data items. JSON & Java Serialized Objects are the most commonly used. Grainite does not impose any format and stores them as byte arrays. The interpretation is left to the application.

  • Interaction design

    1. Identify the communication between entities and between external clients and entities. Microservice-based applications are event-driven so communication between entities is in the form of asynchronous events.

    2. Grains can send events directly to other grains or to a topic from where they will be delivered to the grains that subscribe to the topic. Clients can likewise directly invoke actions on grains or send events to topics that will be delivered to grains that subscribe to the topics.

    3. An event consists of a key and a payload. The key is the ID of the entity or entities that the event is generated for. The payload of the event contains the message for the entity.

  • Action Handler design

    1. Identify the endpoints in each entity for processing the events. Each of these endpoints is an action handler. For example, it could be a method in a Java class.

    2. Create an app.yaml file that specifies the application name, tables, topics, and action handlers.

  • Generate the Grainite artifacts

    1. Run gx gen all --modify to create the Grainite and development constructs. This creates the Java classes, the pom.xml file, and a sample client.

  • Build and deploy the app

    1. mvn clean package

    2. gx load

  • Run the application

    1. Run runapp.sh. This sends an event from the client to a topic in the app.
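The design steps above (entity, key, value versus sorted map, keyed events, action handler) can be tied together in a toy simulation. This is a sketch in plain Python under stated assumptions: `InMemoryApp`, `handle_order`, and the order fields are hypothetical, and nothing here reflects the Grainite runtime, the generated Java classes, or the app.yaml format.

```python
import json


class InMemoryApp:
    """Toy model: one topic, one table, subscribed action handlers. Not the Grainite runtime."""

    def __init__(self):
        self.table = {}     # grain key -> {"value": bytes, "map0": dict}
        self.handlers = []  # action handlers subscribed to the topic

    def subscribe(self, handler):
        self.handlers.append(handler)

    def send(self, key, payload):
        """Deliver an event (key + payload bytes) to every subscribed handler."""
        grain = self.table.setdefault(key, {"value": b"", "map0": {}})
        for handler in self.handlers:
            handler(grain, key, payload)


def handle_order(grain, key, payload):
    """Action handler: record the order, then refresh the summary value."""
    order = json.loads(payload)
    # Growing, granular data goes into a map keyed by order ID.
    grain["map0"][order["id"]] = payload
    # Non-growing summary data goes into the value.
    summary = json.loads(grain["value"]) if grain["value"] else {"orders": 0, "spent": 0}
    summary["orders"] += 1
    summary["spent"] += order["amount"]
    grain["value"] = json.dumps(summary).encode("utf-8")


app = InMemoryApp()
app.subscribe(handle_order)
app.send("customer-1", json.dumps({"id": "o-1", "amount": 30}).encode("utf-8"))
app.send("customer-1", json.dumps({"id": "o-2", "amount": 12}).encode("utf-8"))

print(json.loads(app.table["customer-1"]["value"]))  # {'orders': 2, 'spent': 42}
```

The payloads are JSON encoded as bytes, matching the note above that the platform stores byte arrays and leaves interpretation to the application.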

How do I reset a Grainite cluster and wipe its data?

The Grainite deployment scripts downloaded during cluster creation include a cluster-reset command for each type of cluster environment. Below is an example for GCP, where my_grainite_cluster is the user-specified name of the cluster:

grainite/scripts/bin/gcp-grainite cluster-reset my_grainite_cluster

Since the command wipes all data from the Grainite database instance and is irreversible, a prompt will appear, requiring the name of the cluster to be typed out as confirmation.

Some new features are designated as "Early Access" or "GA." What do these terms mean?

When we ship new features in Grainite releases, we will communicate them with the following designations to indicate their level of maturity:

  • Private Preview: In this phase, we have developed a feature or capability that is included in the product for demonstration purposes and may be directly communicated to customers who have expressed interest.

  • Early Access: Features with this designation are publicly documented in release notes and include basic usage instructions. They are suitable only for developing proofs-of-concept and should be tested only in development environments, never used in production.

  • Beta: Beta features are more stable and have more thorough documentation than Early Access features. Beta features can be used in development environments and enabled via flags in staging with the expectation that they will be fully delivered with the GA designation in a future Grainite release.

  • General Availability (GA): Features and capabilities designated as GA are fully supported and certified for production use. New features released with no designation are implicitly GA upon release.
