Quickstart guide

Last updated: 2020-03-31

Today we're going to record page clicks for a website. We won't store each page's URL; instead, we'll use an integer to represent each page. In TSGrid terminology, all identifiers (pages, smart meters, IoT devices, etc.) are referred to as sensors, and the sensor id must be an Integer. We will assume you already have a relational database that stores metadata about each page, along with the sensor id used for that page in TSGrid.

Installation

TSGrid is a JVM-based application (written in Scala). For production use, we recommend a stable Linux distribution.

System requirements

  • OSX or Linux
  • JDK 8 or 11

Danger

There are known bugs in versions of OpenJDK 11 prior to 11.0.6. We therefore recommend running the latest 11.0.6+ release if using OpenJDK.

Download

Download the binary from our GitHub repository. Look for the releases page and download a TSGrid-xxx.dist.tar.gz or TSGrid-xxx.dist.zip archive.

Binary install

  1. Install a suitable JDK and set the JAVA_HOME environment variable
  2. Extract the TSGrid-xxx.dist.tar.gz or TSGrid-xxx.dist.zip archive
  3. Set the TSGRID_ROOT_DB environment variable e.g. /var/tsgrid_db
  4. Run the bin/tsgrid-server script

You should see output similar to this in your console:

[info] o.h.b.c.nio1.NIO1SocketServerGroup INFO    - Service bound to address /0:0:0:0:0:0:0:0:8080
[info] o.h.server.blaze.BlazeServerBuilder INFO    -
[info]   _   _   _        _ _
[info]  | |_| |_| |_ _ __| | | ___
[info]  | ' \  _|  _| '_ \_  _(_-<
[info]  |_||_\__|\__| .__/ |_|/__/
[info]              |_|
[info] o.h.server.blaze.BlazeServerBuilder INFO    - http4s v0.21.2 on blaze v0.14.11 started at http://[::]:8080/

Note

There are many other deployment options for TSGrid, including daemon mode, Docker, Kubernetes, etc.

Creating a database

Make a POST request to http://localhost:8080/db, passing a JSON document that describes the new database:
Request:
POST /db HTTP/1.1
Host: localhost:8080
Content-Type: application/json

{
    "name": "page-clicks",
    "readingType": "Instant",
    "dataType": "Int",
    "ttlTimestamp": "IngestionTime",
    "resolutions": {
        "Raw": {
            "ttlDays": 5
        },
        "Hourly": {
            "ttlDays": 90
        },
        "Daily": {
            "ttlDays": -1
        }
    },
    "aggregationType": "Sum",
    "aggregationTimezone": "UTC"
}
Response:
Content-Type: application/json
Content-Length: 20

{
  "status": "success"
}
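
The same request can be sent from a script. The following is a minimal Python sketch using only the standard library. The helper names build_create_db_request and send, and the base_url default, are illustrative assumptions rather than part of TSGrid; only the endpoint and the JSON fields come from the request shown above.

```python
import json
import urllib.request


def build_create_db_request(base_url="http://localhost:8080"):
    """Build (but do not send) the POST /db request from this guide.

    Hypothetical helper: the function name and base_url default are assumptions.
    """
    payload = {
        "name": "page-clicks",
        "readingType": "Instant",
        "dataType": "Int",
        "ttlTimestamp": "IngestionTime",
        "resolutions": {
            "Raw": {"ttlDays": 5},
            "Hourly": {"ttlDays": 90},
            "Daily": {"ttlDays": -1},
        },
        "aggregationType": "Sum",
        "aggregationTimezone": "UTC",
    }
    return urllib.request.Request(
        url=f"{base_url}/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a TSGrid server listening on localhost:8080, calling send(build_create_db_request()) should return the parsed status document shown in the response above.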

Let's look at what these fields mean:

  • name: The database name. It is used as an identifier in subsequent REST calls, so it should be short, meaningful and composed only of alphanumeric characters, hyphens (-) and underscores (_).
  • readingType: We use a reading type of Instant, which means each data point has a single timestamp.
  • dataType: We use an Int to store a count of page clicks.
  • ttlTimestamp: Each reading (including aggregated readings) can have a time to live associated with it. The time to live can be measured from either EventTime (the timestamp when the measurement was taken) or IngestionTime (the time the reading was inserted into TSGrid). Here we use IngestionTime.
  • resolutions: The raw data (the data you ingest) will be purged after 5 days. TSGrid will automatically aggregate the raw data into hourly and daily "buckets". The hourly aggregations will be purged after 90 days, and the daily aggregations will never be purged (a ttlDays of -1).
  • aggregationType: TSGrid will Sum the values during the aggregation process, so the daily aggregation will represent the total clicks during the day.
  • aggregationTimezone: We use a value of UTC, so the system will convert all timestamps into UTC. If we query for a "day" of data, we will get the total number of page clicks from midnight UTC (inclusive) to the following midnight UTC (exclusive).
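
To make the naming rule and the time-to-live settings concrete, here is a small Python sketch. It rests on two assumptions that the guide does not confirm: that "alphanumeric" means ASCII letters and digits, and that a reading becomes eligible for purging exactly ttlDays after its TTL timestamp. Both function names are hypothetical and exist only for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: "alphanumeric" means ASCII letters and digits; "-" and "_" are also allowed.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_db_name(name):
    """Check a candidate database name against the rule described above."""
    return _NAME_PATTERN.fullmatch(name) is not None


def purge_time(ttl_timestamp, ttl_days):
    """Return when a reading would become eligible for purging.

    ttl_timestamp is the event or ingestion time, depending on ttlTimestamp.
    A ttl_days of -1 means the reading is never purged, so None is returned.
    Assumption: eligibility is simply ttl_timestamp + ttl_days.
    """
    if ttl_days == -1:
        return None
    return ttl_timestamp + timedelta(days=ttl_days)


ingested_at = datetime(2020, 1, 1, tzinfo=timezone.utc)
raw_expiry = purge_time(ingested_at, 5)     # Raw resolution: 5 days
daily_expiry = purge_time(ingested_at, -1)  # Daily resolution: never purged
```

Under these assumptions, a raw reading ingested at midnight on 2020-01-01 would become eligible for purging at midnight on 2020-01-06, while a daily aggregate has no expiry.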

Inserting some data

We will insert two readings for each page. Data is inserted into TSGrid by making a POST request to /db/{db_name}/data:
Request:
POST /db/page-clicks/data HTTP/1.1
Host: localhost:8080
Content-Type: application/json

[
  {
    "sensorId": 1,
    "time": "2020-01-01T00:00:00Z",
    "value": 1
  },
  {
    "sensorId": 1,
    "time": "2020-01-01T00:01:00Z",
    "value": 1
  },
  {
    "sensorId": 2,
    "time": "2020-01-01T00:00:00Z",
    "value": 1
  },
  {
    "sensorId": 2,
    "time": "2020-01-01T00:01:00Z",
    "value": 1
  }
]
Response:
Content-Type: application/json
Content-Length: 20

{
    "status": "success",
    "numReadingsInserted": 4
}
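
In practice the readings are usually generated by a program rather than written by hand. The sketch below builds the same four readings and the matching request in Python. The helper names make_readings and build_insert_request are hypothetical, and the code assumes the start time is already in UTC so that the "Z" suffix is correct.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def make_readings(sensor_ids, start, count, step=timedelta(minutes=1)):
    """Create one click (value 1) per sensor at each of `count` timestamps.

    Assumption: `start` is a UTC datetime, so formatting with a literal "Z" is valid.
    """
    return [
        {
            "sensorId": sensor_id,
            "time": (start + i * step).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": 1,
        }
        for sensor_id in sensor_ids
        for i in range(count)
    ]


def build_insert_request(db_name, readings, base_url="http://localhost:8080"):
    """Build (but do not send) the POST /db/{db_name}/data request."""
    return urllib.request.Request(
        url=f"{base_url}/db/{db_name}/data",
        data=json.dumps(readings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


readings = make_readings([1, 2], datetime(2020, 1, 1, tzinfo=timezone.utc), count=2)
request = build_insert_request("page-clicks", readings)
```

Passing request to urllib.request.urlopen would send the four readings to a running server. The request is only built here so that the payload can be inspected first.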

Pre-aggregating data

TSGrid automatically pre-aggregates data into different resolutions in the background. The schedule is configurable, but by default data is aggregated every hour. We don't want to wait an hour, so we will force the aggregation to happen by making a POST request to /admin/{db_name}/aggregate.

Note

TSGrid can also aggregate on the sensor axis - e.g. taking all the 09:00 readings and rolling them into a single 09:00 reading. This aggregation happens on the fly during the query call.
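
As an illustration of what aggregating on the sensor axis means, the following Python sketch rolls readings from several sensors into one reading per timestamp. It is not TSGrid's implementation: the function name, the use of Sum as the combining operation, and the shape of the output are all assumptions made for this example.

```python
from collections import defaultdict


def sum_across_sensors(readings):
    """Combine readings that share a timestamp into a single summed reading.

    Hypothetical illustration only; the output shape is an assumption.
    """
    totals = defaultdict(int)
    for reading in readings:
        totals[reading["time"]] += reading["value"]
    # ISO 8601 UTC timestamps in the same format sort chronologically as strings.
    return [{"time": time, "value": total} for time, total in sorted(totals.items())]


# The four readings inserted earlier in this guide.
raw = [
    {"sensorId": 1, "time": "2020-01-01T00:00:00Z", "value": 1},
    {"sensorId": 1, "time": "2020-01-01T00:01:00Z", "value": 1},
    {"sensorId": 2, "time": "2020-01-01T00:00:00Z", "value": 1},
    {"sensorId": 2, "time": "2020-01-01T00:01:00Z", "value": 1},
]
combined = sum_across_sensors(raw)
```

Here the two 00:00 readings collapse into one reading with a value of 2, and likewise for the two 00:01 readings.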

Request:
POST /admin/page-clicks/aggregate HTTP/1.1
Host: localhost:8080
Content-Type: application/json
Response:
Content-Type: application/json
Content-Length: 20

{
    "status": "success",
    "numReadingsAggregated": 4
}

Warning

/admin/{db_name}/aggregate is a blocking call. This makes it really useful for testing, but it should not be used when aggregating large volumes of data, as the HTTP call will most likely time out.
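
Because the call blocks until aggregation finishes, a test script may want to set an explicit client-side timeout. A minimal Python sketch follows; the helper names and the 30 second default are assumptions chosen for illustration, not TSGrid recommendations.

```python
import json
import urllib.request


def build_aggregate_request(db_name, base_url="http://localhost:8080"):
    """Build (but do not send) the POST /admin/{db_name}/aggregate request.

    The request has no body, matching the example above.
    """
    return urllib.request.Request(
        url=f"{base_url}/admin/{db_name}/aggregate",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def force_aggregation(db_name, timeout_seconds=30.0):
    """Trigger aggregation and wait for the result.

    urlopen raises an exception if the server does not respond within
    timeout_seconds, which is the failure mode the warning above describes.
    """
    request = build_aggregate_request(db_name)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return json.load(response)
```

Calling force_aggregation("page-clicks") against a running server should return the status document shown above.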

Querying

Let's fetch our data. Data is queried by making a GET request to /db/{db_name}/query, passing a JSON query payload in the body of the request. TSGrid will return a stream of readings for each sensor, using HTTP chunked transfer encoding.

Raw data

Raw data can be queried by passing a resolution of Raw

Request:
GET /db/page-clicks/query HTTP/1.1
Host: localhost:8080
Content-Type: application/json

{
  "sensorIds": [1,2],
  "from": "2020-01-01T00:00:00Z",
  "until": "2020-01-03T00:00:00Z",
  "resolution": "Raw",
  "sensorAggregation": "None"
}

Let's take a look at the query parameters to understand what they mean:

  • sensorIds: Similar to a SQL IN clause. We want to fetch data for both pages.
  • from: Return page clicks that happened on or after 2020-01-01T00:00:00Z (midnight UTC).
  • until: Return page clicks that happened before 2020-01-03T00:00:00Z (midnight UTC, exclusive).
  • resolution: We want the raw data we ingested.
  • sensorAggregation: TSGrid could aggregate the metrics for both pages together to produce a single combined series. In this case we want per-page metrics, so we set this field to None.
Response:
Content-Type: application/json
Transfer-Encoding: chunked

[
    {
        "sensorId": 1,
        "time": "2020-01-01T00:00:00Z",
        "value": 1
    },
    {
        "sensorId": 1,
        "time": "2020-01-01T00:01:00Z",
        "value": 1
    },
    {
        "sensorId": 2,
        "time": "2020-01-01T00:00:00Z",
        "value": 1
    },
    {
        "sensorId": 2,
        "time": "2020-01-01T00:01:00Z",
        "value": 1
    }
]
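
Sending a JSON body with a GET request is unusual, and some HTTP clients need to be told to do it explicitly. The Python sketch below shows one way with the standard library; the helper names build_query_request and run_query are hypothetical. Note that urllib switches to POST when a body is supplied unless the method is set, so it is set to GET here.

```python
import json
import urllib.request


def build_query_request(db_name, sensor_ids, from_time, until_time,
                        resolution="Raw", sensor_aggregation="None",
                        base_url="http://localhost:8080"):
    """Build (but do not send) the GET /db/{db_name}/query request."""
    query = {
        "sensorIds": list(sensor_ids),
        "from": from_time,
        "until": until_time,
        "resolution": resolution,
        "sensorAggregation": sensor_aggregation,
    }
    return urllib.request.Request(
        url=f"{base_url}/db/{db_name}/query",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",  # explicit: urllib would default to POST because data is set
    )


def run_query(request):
    """Send the query and parse the complete response as JSON.

    The standard library decodes chunked transfer encoding transparently.
    This helper reads the whole body before parsing rather than streaming it.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)


raw_query = build_query_request(
    "page-clicks", [1, 2], "2020-01-01T00:00:00Z", "2020-01-03T00:00:00Z"
)
```

For large result sets a streaming JSON parser would make better use of the chunked response; reading everything at once is only reasonable for small queries like this one.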

Pre-aggregated data

Pre-aggregated data can be queried by passing a resolution other than Raw. Aggregated readings are always returned with from and until timestamps.

Note

This should not be confused with aggregation on the sensor axis which is specified by the sensorAggregation field in the query (see below). In this case we pass a sensorAggregation property of None and are therefore expecting one daily reading for each sensor

Request:
GET /db/page-clicks/query HTTP/1.1
Host: localhost:8080
Content-Type: application/json

{
  "sensorIds": [1,2],
  "from": "2020-01-01T00:00:00Z",
  "until": "2020-01-03T00:00:00Z",
  "resolution": "Daily",
  "sensorAggregation": "None"
}
Response:
Content-Type: application/json
Transfer-Encoding: chunked

[
    {
        "sensorId": 1,
        "from": "2020-01-01T00:00:00Z",
        "until": "2020-01-02T00:00:00Z",
        "value": 2
    },
    {
        "sensorId": 2,
        "from": "2020-01-01T00:00:00Z",
        "until": "2020-01-02T00:00:00Z",
        "value": 2
    }
]
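
To see where the values in this response come from, the daily Sum aggregation can be reproduced locally. The Python sketch below groups the four raw readings into UTC day buckets per sensor and sums them. It is an illustration of the arithmetic only, not TSGrid's implementation, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _parse(timestamp):
    """Parse a UTC timestamp such as 2020-01-01T00:01:00Z."""
    return datetime.strptime(timestamp, _FORMAT).replace(tzinfo=timezone.utc)


def daily_sums(readings):
    """Sum values per sensor per UTC day, with [from, until) bucket bounds."""
    totals = defaultdict(int)
    for reading in readings:
        # Truncate the timestamp to midnight UTC to find the start of its bucket.
        day_start = _parse(reading["time"]).replace(hour=0, minute=0, second=0)
        totals[(reading["sensorId"], day_start)] += reading["value"]
    return [
        {
            "sensorId": sensor_id,
            "from": day_start.strftime(_FORMAT),
            "until": (day_start + timedelta(days=1)).strftime(_FORMAT),
            "value": total,
        }
        for (sensor_id, day_start), total in sorted(totals.items())
    ]


raw = [
    {"sensorId": 1, "time": "2020-01-01T00:00:00Z", "value": 1},
    {"sensorId": 1, "time": "2020-01-01T00:01:00Z", "value": 1},
    {"sensorId": 2, "time": "2020-01-01T00:00:00Z", "value": 1},
    {"sensorId": 2, "time": "2020-01-01T00:01:00Z", "value": 1},
]
daily = daily_sums(raw)
```

Each sensor has two clicks on 2020-01-01, so daily holds one reading per sensor with a value of 2, matching the response above. No bucket appears for 2020-01-02 because no readings fall on that day.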

Next steps

Check out the full documentation to learn more about types of readings, aggregation modes, timezone support, etc.