Glossary

Heartbeat Requests

ZooKeeper sends heartbeat requests to determine whether the Kafka broker is alive.

K Factor

Specifies the number of replicas of the data that should exist in the database. K-safety sets the fault tolerance for the Vertica cluster, where K can be set to 0 or 1. Choose a value based on the number of nodes in the cluster: a K factor of K requires at least 2K+1 nodes. For example, a K factor of 0 is ideal for a single-node cluster, while a K factor of 1 requires a minimum of 3 nodes. Vertica recommends a K-safety value of 1.

The K factor also determines the number of identical instances of segmented projections that are created for fail-safety. Vertica stores table data in projections, which are collections of table columns arranged to optimize query execution. Projection data can be divided into multiple segments or maintained as a single unsegmented unit. To ensure high availability of data, unsegmented projections are replicated on all nodes, and each segmented projection should have K+1 identical instances (buddies). For example, if the K factor is 1, then 2 identical instances of each segment are maintained.
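
The arithmetic in this entry can be sketched in a few lines of Python. This is only an illustration of the 2K+1 and K+1 rules stated above; the function names min_nodes and buddy_count are hypothetical and are not part of Vertica.

```python
def min_nodes(k: int) -> int:
    """Minimum cluster size for a given K factor, using the 2K+1 rule."""
    if k < 0:
        raise ValueError("K factor must be non-negative")
    return 2 * k + 1


def buddy_count(k: int) -> int:
    """Identical instances (buddies) kept for each segmented projection."""
    if k < 0:
        raise ValueError("K factor must be non-negative")
    return k + 1


# The two K values named in this entry.
for k in (0, 1):
    print(f"K={k}: at least {min_nodes(k)} node(s), "
          f"{buddy_count(k)} instance(s) per segment")
```

With K set to 1, the sketch reports a minimum of 3 nodes and 2 instances per segment, matching the examples in the entry.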

Kafka Brokers

The servers that make up a Kafka cluster are called brokers. They typically act as message brokers, mediating between two systems and ensuring that messages are delivered to the correct systems.

Kafka Cluster

A cluster is a group of servers or nodes that are connected to each other to achieve a common objective. Each server or node in the cluster runs one instance of the Kafka broker.

Kafka Connect

Kafka Connect is a tool included with Kafka that imports data into Kafka and exports data from it. It is an extensible tool that runs connectors, which implement the custom logic for interacting with an external system. For example, a connector to an RDBMS captures every change made to the data in a table.

Kafka Consumers

The processes that consume messages from Kafka topics are called consumers. In this case, the consumer is Vertica.

Kafka Follower

A node that replicates the leader is called a follower. If the leader fails, one of the followers automatically becomes the leader.

Kafka Leader

A node that handles all read and write requests for a partition is called a leader.

Kafka Partitions

Kafka topics are further broken down into partitions, which enable topics to be replicated across multiple brokers for fault tolerance.

Kafka Producers

The processes that publish messages to one or more Kafka topics are called producers. In this case, the producer is the RDBMS.

Kafka Replicas

A replica is a copy of a partition that is maintained to prevent loss of data.

Kafka Schema Registry

Kafka producers write data to Kafka topics and Kafka consumers read data from Kafka topics. Schema Registry helps ensure that producers write data with a schema that can be read by consumers, even as producers and consumers evolve their schemas.

Kafka Topics

Kafka stores streams of records in categories called Topics. These topics are further broken down into partitions that can be stored across multiple servers and disks. Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.

Offset

A Kafka topic receives messages across a set of partitions, and each partition maintains its messages in sequential order. The position of a message within that sequence is identified by its offset.
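
To make the idea concrete, the following Python sketch models a single partition as an append-only list in which the offset is simply the position of a message. It is a simplified illustration and does not use any Kafka client library; the Partition class and its methods are hypothetical names.

```python
class Partition:
    """Toy model of one partition: an ordered, append-only message log."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return the offset assigned to it."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read_from(self, offset):
        """Return every message at or after the given offset, in order."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        return self._messages[offset:]


partition = Partition()
first = partition.append("order created")   # assigned offset 0
second = partition.append("order shipped")  # assigned offset 1
print(first, second, partition.read_from(second))
```

A consumer that remembers the last offset it processed can resume reading from the next position without rereading earlier messages.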

Tick Time

The basic unit of time used by ZooKeeper, expressed in milliseconds. All time-dependent operations in ZooKeeper are based on tick time.
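
As a rough illustration, ZooKeeper settings such as initLimit and syncLimit are expressed as a number of ticks, so their duration depends on the tick time. The Python sketch below converts ticks to milliseconds; the values 2000, 10, and 5 are assumed example values, not settings taken from this document, and ticks_to_ms is a hypothetical helper.

```python
TICK_TIME_MS = 2000  # assumed example tick time, in milliseconds


def ticks_to_ms(ticks: int, tick_time_ms: int = TICK_TIME_MS) -> int:
    """Convert a limit expressed in ticks into milliseconds."""
    if ticks < 0 or tick_time_ms <= 0:
        raise ValueError("ticks must be >= 0 and tick time must be > 0")
    return ticks * tick_time_ms


# Assumed example limits, each given as a number of ticks.
limits_in_ticks = {"initLimit": 10, "syncLimit": 5}
for name, ticks in limits_in_ticks.items():
    print(f"{name}: {ticks} ticks = {ticks_to_ms(ticks)} ms")
```

Changing the tick time therefore rescales every limit that is defined in ticks.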