
A hypertable is a table that is partitioned by time. SchemaHero supports creating and managing hypertables.

Creating a Hypertable

To create a hypertable, add a hypertable field to the spec of the Table resource.

apiVersion: schemas.schemahero.io/v1alpha4
kind: Table
metadata:
  name: my-table
spec:
  database: my-database
  hypertable:
    timeColumnName: when
  columns:
    - name: when
      type: timestamp
    - name: value
      type: integer

When creating a hypertable, you need to specify the timeColumnName field. All other fields are optional.

| Field Name | Description |
| --- | --- |
| timeColumnName | Specifies the name of the timestamp column to use for time-based partitioning. This column serves as the primary dimension for organizing data in a time-series format. |
| partitioningColumn | Specifies an optional secondary column to partition by, creating space partitions within the hypertable. This is useful for multi-dimensional partitioning, such as by env_id alongside time. |
| numberPartitions | Defines the number of partitions for the secondary partitioning column, if specified. For example, setting this to 4 will create 4 space partitions across the partitioningColumn. |
| chunkTimeInterval | Sets the time interval covered by each chunk of data in the hypertable (e.g., 1 day, 1 hour). This interval defines how data is organized and distributed over time, allowing TimescaleDB to optimize storage and query performance for recent and older data. |
| createDefaultIndexes | A boolean flag that, when true, automatically creates default indexes on the time and partitioning columns. These indexes improve performance for queries filtering by time and partitioning values. |
| ifNotExists | When set to true, prevents errors if the hypertable already exists. This option is useful for idempotent operations or schema migrations. |
| partitioningFunc | Specifies a hash or custom partitioning function for the secondary partitioning column. This is useful for custom partitioning logic, such as distributing data based on a specific hash function or application-defined criteria. |
| associatedSchemaName | The schema in which any associated tables (e.g., indexes, compressed tables) will be created. If not specified, the default schema is used. |
| associatedTablePrefix | Sets a prefix for associated tables (e.g., compressed tables or materialized views). This helps distinguish tables generated from this hypertable, which makes the schema easier to organize and its dependencies easier to manage. |
| migrateData | A boolean option to migrate existing data from a standard PostgreSQL table to a TimescaleDB hypertable. If set to true, TimescaleDB will transform the existing table into a hypertable, retaining all data. |
| timePartitioningFunc | Specifies a function applied to the values of timeColumnName before partitioning. Use it when the time column is stored in a type or format that must first be converted to a supported time value. If omitted, the column is partitioned on its own values, so it must already have a supported time type such as TIMESTAMPTZ. |
| replicationFactor | Defines the replication factor for data within TimescaleDB’s distributed setup. Setting this replicates each chunk across multiple data nodes, increasing redundancy and availability. |
| dataNodes | Lists the specific data nodes for distributed TimescaleDB setups. This field allows fine-grained control over the distribution of data across nodes, optimizing for specific requirements in availability or locality. |
| compression | Enables native TimescaleDB compression for this hypertable. Specify settings such as the compression interval to manage data size and optimize long-term storage. Compression applies to chunks once they reach the specified interval, reducing storage needs for older data. |
| retention | Sets a retention policy for automatically dropping older data from the hypertable. Define a time interval (e.g., 30 days) to specify how long data should be retained. Retention policies are critical for managing storage costs in high-volume time-series databases. |
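
As a rough illustration of how several of the optional fields might combine, the fragment below extends the hypertable block from the earlier example with space partitioning. The env_id column is borrowed from the partitioningColumn description and is hypothetical. The value formats (a string such as 1 day for chunkTimeInterval, an integer for numberPartitions) are assumptions based on the field descriptions, not a confirmed schema.

```yaml
# Sketch only: value formats are assumed from the field descriptions above.
hypertable:
  timeColumnName: when           # the only required field
  partitioningColumn: env_id     # hypothetical column; would also need an entry under columns
  numberPartitions: 4            # 4 space partitions across env_id
  chunkTimeInterval: 1 day       # assumed to accept an interval string
  createDefaultIndexes: true     # index the time and partitioning columns
  ifNotExists: true              # do not fail if the hypertable already exists
```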

The compression and retention fields are objects that contain the following subfields:

Compression

| Field Name | Description |
| --- | --- |
| segmentBy | Defines the column(s) to “segment by” during compression. This means that TimescaleDB will group data by the specified column(s) within each chunk before compressing it. This is particularly useful for time-series data with multiple dimensions (e.g., id) where quick retrieval of a subset of data is required. |
| interval | Defines how old a chunk must be before it is compressed (e.g., 1 month, 3 months). This controls how soon data moves to compressed storage, which is useful when different tables need different compression schedules. For example, you might set a shorter interval for high-volume tables and a longer interval for less frequently queried tables. |
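
A compression block might look like the following sketch. It assumes that segmentBy accepts a single column name and that interval accepts an interval string; neither format is confirmed here, and env_id is a hypothetical column.

```yaml
# Sketch only: the value formats for segmentBy and interval are assumptions.
hypertable:
  timeColumnName: when
  compression:
    segmentBy: env_id    # hypothetical column used to group rows before compressing
    interval: 1 month    # assumed interval string
```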

Retention

| Field Name | Description |
| --- | --- |
| interval | Specifies the time interval for retaining data (e.g., 30 days, 1 year). Data older than this interval will be automatically deleted based on the defined retention policy. This setting ensures that only data within the specified time window remains in the hypertable, while older data is removed to free up storage. |
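
A minimal retention block, assuming interval accepts an interval string such as the 30 days used in the description, could be sketched as:

```yaml
# Sketch only: the interval value format is an assumption.
hypertable:
  timeColumnName: when
  retention:
    interval: 30 days    # data older than this is dropped automatically
```

When compression is also configured, its interval would normally be shorter than the retention interval. Otherwise chunks would be dropped before they are ever compressed.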