1. Overview
In this tutorial, we'll show some of the different data types of the Apache Cassandra database. Apache Cassandra supports a rich set of data types, including collection types, native types, tuple types, and user-defined types.
The Cassandra Query Language (CQL) is Cassandra's counterpart to Structured Query Language (SQL). It is a declarative language for communicating with the database. As with SQL, we use CQL to work with data that Cassandra stores in tables and organizes into rows and columns.
2. Cassandra Database Configuration
Let's start a Cassandra instance from a Docker image and connect to it using cqlsh. Next, we should create a keyspace:
CREATE KEYSPACE baeldung WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 1};
For the purposes of this tutorial, we created a keyspace with a replication factor of 1, so Cassandra keeps only one copy of the data. Now, let's connect the client session to the keyspace:
USE baeldung;
3. Built-in Data Types
CQL supports a rich set of native data types. These data types come pre-defined, and we can directly refer to any of them.
3.1. Numeric Types
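Numeric types are similar to the standard types in Java and other languages, such as integers and floating-point numbers with different ranges:
- int is a 32-bit signed integer
- bigint is a 64-bit signed integer
- smallint is a 16-bit signed integer
- tinyint is an 8-bit signed integer
- varint is an arbitrary-precision integer
- float is a 32-bit IEEE-754 floating-point number
- double is a 64-bit IEEE-754 floating-point number
- decimal is a variable-precision decimal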
Let's create a table with all these data types:
CREATE TABLE numeric_types
(
type1 int PRIMARY KEY,
type2 bigint,
type3 smallint,
type4 tinyint,
type5 varint,
type6 float,
type7 double,
type8 decimal
);
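To make the ranges more concrete, here is a minimal sketch that inserts one row into numeric_types. The literal values are only illustrative assumptions, picked to sit at or near the upper bound of each integer type:

```cql
-- Illustrative values only: the upper bounds of int, bigint, smallint and tinyint,
-- a varint that exceeds 64 bits, and arbitrary floating-point and decimal values
INSERT INTO numeric_types (type1, type2, type3, type4, type5, type6, type7, type8)
VALUES (2147483647, 9223372036854775807, 32767, 127,
        123456789012345678901234567890, 3.14, 2.718281828459045, 12345.6789);

-- Read the row back by its primary key
SELECT * FROM numeric_types WHERE type1 = 2147483647;
```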
3.2. Text Types
CQL provides two kinds of text data. We can use text or varchar, which are synonyms, to create a UTF-8 character string. UTF-8 is the more recent and widely used text standard and supports internationalization.
There is also the ascii type to create an ASCII character string. The ascii type is most useful if we are dealing with legacy data that is in ASCII format. The size of a text value is limited by the maximum size of a column value: a single column value can be up to 2 GB, but keeping values under 1 MB is recommended.
Let's create a table with all these data types:
CREATE TABLE text_types
(
primaryKey int PRIMARY KEY,
type2 text,
type3 varchar,
type4 ascii
);
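As a quick illustration, we could insert a row like the following. The strings are made-up sample data; the point is that the text and varchar columns can hold non-ASCII characters, while the ascii column gets a plain ASCII value:

```cql
-- Sample data only: UTF-8 strings for text and varchar, plain ASCII for ascii
INSERT INTO text_types (primaryKey, type2, type3, type4)
VALUES (1, 'Grüße aus Köln', 'naïve café', 'plain ascii');

-- A single quote inside a string literal is escaped by doubling it
INSERT INTO text_types (primaryKey, type2)
VALUES (2, 'It''s a text value');
```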
3.3. Date Types
Now, let's talk about date types. Cassandra provides several types that prove quite useful for defining unique partition keys or ordinary columns:
- timestamp is a date and time with millisecond precision
- date is a date without a time of day
- time is a time of day without a date, with nanosecond precision
- timeuuid is a version 1 UUID, which embeds a timestamp
- duration is a length of time with nanosecond precision
We can input timestamp, time, and date values as either an integer or a string. Values of the duration type are encoded as three signed integers: the first represents the number of months, the second the number of days, and the third the number of nanoseconds.
Let's see an example of a CREATE TABLE command:
CREATE TABLE date_types
(
primaryKey int PRIMARY KEY,
type1 timestamp,
type2 time,
type3 date,
type4 timeuuid,
type5 duration
);
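Here is a small sketch of how values for these columns might be written. The dates and times are arbitrary sample values, now() is the built-in function that generates a timeuuid, and the duration literal 1mo2d3h maps onto the three integers described above:

```cql
-- String input for timestamp, time and date; now() generates a timeuuid.
-- The duration 1mo2d3h is encoded as months = 1, days = 2,
-- nanoseconds = 10800000000000 (3 hours expressed in nanoseconds)
INSERT INTO date_types (primaryKey, type1, type2, type3, type4, type5)
VALUES (1, '2021-05-17T09:30:00.000+0000', '09:30:00.123456789', '2021-05-17', now(), 1mo2d3h);

-- Integer input for timestamp: milliseconds since the Unix epoch
-- (the same instant as the string used above)
INSERT INTO date_types (primaryKey, type1)
VALUES (2, 1621243800000);
```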
3.4. Counter Type
The counter type is used to define counter columns. A counter column is a column whose value is a 64-bit signed integer. We can only perform two operations on the counter column – incrementing and decrementing.
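Therefore, we can't set a counter to a specific value. We can use counters for tracking statistics such as numbers of page views, tweets, log messages, and so on. We can't mix the counter type with other types: apart from the primary key columns, a table that contains a counter column can contain only counter columns.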
Let's see an example:
CREATE TABLE counter_type
(
primaryKey uuid PRIMARY KEY,
type1 counter
);
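Since counters only support increments and decrements, we modify them with UPDATE rather than INSERT. The following sketch assumes a hypothetical row key; any UUID would do:

```cql
-- The UUID below is a made-up key used only for illustration.
-- Increment the counter by 1, then by 5, then decrement it by 2
UPDATE counter_type SET type1 = type1 + 1 WHERE primaryKey = 123e4567-e89b-12d3-a456-426614174000;
UPDATE counter_type SET type1 = type1 + 5 WHERE primaryKey = 123e4567-e89b-12d3-a456-426614174000;
UPDATE counter_type SET type1 = type1 - 2 WHERE primaryKey = 123e4567-e89b-12d3-a456-426614174000;

-- Read the current value of the counter
SELECT type1 FROM counter_type WHERE primaryKey = 123e4567-e89b-12d3-a456-426614174000;
```

The first UPDATE creates the row if it doesn't exist yet, so we never need to initialize a counter explicitly.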
3.5. Other Data Types
- boolean is a simple true/false value
- uuid is a universally unique identifier. The column accepts any UUID version, and the built-in uuid() function generates a Type 4 UUID, which is based entirely on random numbers. We can input UUIDs by using dash-separated sequences of hex digits
- blob, short for binary large object, is a colloquial computing term for an arbitrary array of bytes. The CQL blob type stores media or other binary file types. The maximum blob size is 2 GB, but less than 1 MB is recommended
- inet is the type that represents IPv4 or IPv6 Internet addresses
Again, let's create a table with these types:
CREATE TABLE other_types
(
primaryKey int PRIMARY KEY,
type1 boolean,
type2 uuid,
type3 blob,
type4 inet
);
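To see the literal syntax for these types, we could insert rows such as the following. All values are arbitrary examples: uuid() generates a random UUID, blobs are written as hexadecimal literals prefixed with 0x, and inet values are written as strings:

```cql
-- 0x68656c6c6f is the byte sequence for the ASCII string "hello"
INSERT INTO other_types (primaryKey, type1, type2, type3, type4)
VALUES (1, true, uuid(), 0x68656c6c6f, '192.168.0.1');

-- A UUID written as a dash-separated literal, and an IPv6 address
INSERT INTO other_types (primaryKey, type1, type2, type4)
VALUES (2, false, 123e4567-e89b-12d3-a456-426614174000, '::1');
```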
4. Collection Data Types
Sometimes we want to store several values of the same type without adding new columns. Collections let a single column store multiple values. CQL provides three collection types: lists, sets, and maps.
For instance, we can create a table having a list of textual elements, a list of integers, or a list of some other element types.
4.1. Set
We can store multiple unique values using the set data type. As with Java's HashSet, the elements aren't kept in insertion order; Cassandra returns them sorted by value.
Let's create a set:
CREATE TABLE collection_types
(
primaryKey int PRIMARY KEY,
email set<text>
);
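As a brief sketch of how a set column behaves, we can insert a set literal and then add and remove elements. The e-mail addresses are placeholder values:

```cql
-- Set literals use curly braces; duplicate elements would be collapsed
INSERT INTO collection_types (primaryKey, email)
VALUES (1, {'b@example.com', 'a@example.com'});

-- Add an element to the set
UPDATE collection_types SET email = email + {'c@example.com'} WHERE primaryKey = 1;

-- Remove an element from the set
UPDATE collection_types SET email = email - {'b@example.com'} WHERE primaryKey = 1;

-- The set now contains a@example.com and c@example.com
SELECT email FROM collection_types WHERE primaryKey = 1;
```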
4.2. List
A list stores its values in the order in which we add them. Each element gets an index that reflects its position, and we can use these indexes to update or delete individual elements.
Unlike sets, lists can store duplicate values. Let's add a list to our table:
ALTER TABLE collection_types
ADD scores list<text>;
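The following sketch shows a few typical list operations on the scores column. The values are sample data, written as strings because the column is a list<text>:

```cql
-- List literals use square brackets
UPDATE collection_types SET scores = ['85', '90'] WHERE primaryKey = 1;

-- Append a value; duplicates are allowed
UPDATE collection_types SET scores = scores + ['90'] WHERE primaryKey = 1;

-- Prepend a value
UPDATE collection_types SET scores = ['70'] + scores WHERE primaryKey = 1;

-- Replace the element at index 0 (the index must already exist)
UPDATE collection_types SET scores[0] = '75' WHERE primaryKey = 1;
```

Setting an element by index makes Cassandra read the list before writing, so appending and prepending are generally the cheaper operations.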
4.3. Map
Using Cassandra, we can store key-value pairs using the map data type. Keys are unique, and Cassandra keeps the entries of a map sorted by their keys.
Let's add another column to our table:
ALTER TABLE collection_types
ADD address map<uuid, text>;
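As an illustration, here is how we might write to the address map. The UUID keys and the street names are invented for this example:

```cql
-- Map literals use curly braces with key: value pairs
UPDATE collection_types
SET address = {123e4567-e89b-12d3-a456-426614174000: '12 Example Street'}
WHERE primaryKey = 1;

-- Add or overwrite a single entry by key
UPDATE collection_types
SET address[123e4567-e89b-12d3-a456-426614174001] = '34 Sample Avenue'
WHERE primaryKey = 1;

-- Remove a single entry by key
DELETE address[123e4567-e89b-12d3-a456-426614174000]
FROM collection_types
WHERE primaryKey = 1;
```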
5. Tuples
A tuple groups a fixed number of elements, which can be of different types:
CREATE TABLE tuple_type
(
primaryKey int PRIMARY KEY,
type1 tuple<int, text, float>
);
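A tuple value is written in parentheses, with one element per declared type. Here is a minimal example with arbitrary values:

```cql
-- The elements must match the declared types in order: int, text, float
INSERT INTO tuple_type (primaryKey, type1)
VALUES (1, (42, 'answer', 3.14));

SELECT type1 FROM tuple_type WHERE primaryKey = 1;
```

Tuples are always frozen, so we replace the whole value rather than updating a single element.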
6. User-Defined Data Types
Cassandra lets us create our own data types. We can create, modify, and remove these data types. First, let's create our own type:
CREATE TYPE user_defined_type
(
type1 timestamp,
type2 text,
type3 text,
type4 text
);
So, now we can create a table with our type:
CREATE TABLE user_type
(
primaryKey int PRIMARY KEY,
our_type user_defined_type
);
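To round this off, here is a sketch of how we might write and read a value of our type, and how we could modify the type afterward. The field values and the added field type5 are hypothetical and serve only as an illustration:

```cql
-- A user-defined type literal looks like a map literal with field names as keys
INSERT INTO user_type (primaryKey, our_type)
VALUES (1, {type1: '2021-05-17T09:30:00.000+0000', type2: 'first', type3: 'second', type4: 'third'});

-- Select a single field using dot notation
SELECT our_type.type2 FROM user_type WHERE primaryKey = 1;

-- Modify the type by adding a new field (type5 is a made-up name);
-- existing rows simply have no value for it
ALTER TYPE user_defined_type ADD type5 int;
```

We can remove a type with DROP TYPE, but only once no table or other type uses it.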
7. Conclusion
In this quick tutorial, we explored the basic CQL data types. We talked about what kind of data each of them can store and created tables that use them.
As always, the full source code of the article is available over on GitHub.
The post CQL Data Types first appeared on Baeldung.