1. Overview
Mapping data collections in an SQL table field is a common approach when we want to store non-relational data within our entity. In Hibernate 6, there are changes to the default mapping mechanisms that make storing such data more efficient on the database side.
In this article, we’ll review those changes. Additionally, we’ll discuss possible approaches to migrating data persisted using Hibernate 5.
2. New Basic Array/Collection Mapping in Hibernate 6.x
Before Hibernate 6, basic collections were always mapped with the SqlTypes.VARBINARY type code, and Hibernate serialized their contents with Java serialization under the hood. Now, thanks to changes in the mapping, we can map collections as native arrays, JSON, or XML.
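To make the legacy format more concrete, here is a minimal sketch of what storing a collection through Java serialization amounts to. It uses only the JDK. The class LegacyCollectionCodec and its methods are hypothetical names chosen for illustration and are not part of Hibernate:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hibernate class): shows the kind of opaque byte
// payload a VARBINARY column holds when a list is stored via Java serialization.
final class LegacyCollectionCodec {

    // Serializes the list into a byte array using standard Java serialization.
    static byte[] toBytes(List<String> tags) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new ArrayList<>(tags)); // ArrayList is Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores the list from the serialized payload.
    @SuppressWarnings("unchecked")
    static List<String> fromBytes(byte[] payload) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return (List<String>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Cannot deserialize payload", e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = toBytes(List.of("admin", "beta"));
        System.out.println("payload size in bytes: " + payload.length);
        System.out.println("restored: " + fromBytes(payload)); // prints "restored: [admin, beta]"
    }
}
```

The payload is readable only by Java code that knows the serialized classes, so the database can't query or index the individual elements. Native array and JSON columns avoid this limitation.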
Let’s review a few popular SQL dialects and see how they map collection-type fields. First of all, let’s add the latest Spring Data JPA dependency, which already brings in Hibernate 6.x under the hood:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
Also, let’s add the H2 database dependency. Its compatibility modes let us switch dialects and check how different databases behave:
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
</dependency>
Now let’s create the entity we’ll use in all the cases:
@Entity
@Table(name = "users")
public class User {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    Long id;

    List<String> tags;

    // getters and setters
}
We’ve created a User entity with the id and a list of user tags.
2.1. PostgreSQL Dialect
PostgreSQLDialect overrides the supportsStandardArrays() method, so this dialect uses a native array implementation for collections.
To review this behavior, let’s configure our database:
spring:
  datasource:
    url: jdbc:h2:mem:mydb;MODE=PostgreSQL
    username: sa
    password: password
    driverClassName: org.h2.Driver
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    show-sql: true
We’ve configured H2 in PostgreSQL mode. Also, we’ve specified a PostgreSQLDialect class as a database platform. We’ve enabled SQL script logging to see the type definitions for the table columns.
Now let’s get a mapping for our User entity and check the SQL type of the tags field:
static int ARRAY_TYPE_CODE = 2003;

@PersistenceContext
EntityManager entityManager;

@Test
void givenPostgresDialect_whenGetUserEntityFieldsTypes_thenExpectedTypeShouldBePresent() {
    MappingMetamodelImpl mapping = (MappingMetamodelImpl) entityManager.getMetamodel();

    EntityMappingType entityMappingType = mapping
      .getEntityDescriptor(User.class.getName())
      .getEntityMappingType();

    entityMappingType.getAttributeMappings()
      .forEach(attributeMapping -> {
          if (attributeMapping.getAttributeName().equals("tags")) {
              JdbcType jdbcType = attributeMapping.getSingleJdbcMapping().getJdbcType();
              assertEquals(ARRAY_TYPE_CODE, jdbcType.getJdbcTypeCode());
          }
      });
}
We’ve got the JDBC mapping and checked that the tags field has an array JDBC type code. Besides that, if we check the logs, we’ll see that a varchar array was chosen as the SQL type for this column:
Hibernate:
create table users (
id bigint not null,
tags varchar(255) array,
primary key (id)
)
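The literal 2003 used for ARRAY_TYPE_CODE in the test matches the standard JDBC constant java.sql.Types.ARRAY. The following JDK-only sketch prints the standard codes relevant to this article. The class JdbcTypeCodes is a hypothetical helper used only for illustration:

```java
import java.sql.Types;

// Hypothetical helper: gives readable names to the JDBC type codes discussed here.
final class JdbcTypeCodes {

    // Returns a readable name for the standard codes this article deals with.
    static String describe(int code) {
        switch (code) {
            case Types.ARRAY:
                return "ARRAY";
            case Types.VARBINARY:
                return "VARBINARY";
            default:
                // Codes outside java.sql.Types (for example, vendor-specific ones) end up here.
                return "OTHER(" + code + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(Types.ARRAY + " -> " + describe(Types.ARRAY));         // prints "2003 -> ARRAY"
        System.out.println(Types.VARBINARY + " -> " + describe(Types.VARBINARY)); // prints "-3 -> VARBINARY"
    }
}
```

In test code, referencing the named constant instead of the literal is arguably more readable, since it documents where the number comes from.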
2.2. Oracle Dialect
OracleDialect doesn’t override the supportsStandardArrays() method. Despite this, its getPreferredSqlTypeCodeForArray() method unconditionally returns the array type for collections.
Let’s configure our database to test Oracle behavior:
spring:
  datasource:
    url: jdbc:h2:mem:mydb;MODE=Oracle
    username: sa
    password: password
    driverClassName: org.h2.Driver
  jpa:
    database-platform: org.hibernate.dialect.OracleDialect
    show-sql: true
We switched our database to the Oracle mode and specified the OracleDialect. Now, let’s run the type checking for our User entity:
@Test
void givenOracleDialect_whenGetUserEntityFieldsTypes_thenExpectedTypeShouldBePresent() {
    MappingMetamodelImpl mapping = (MappingMetamodelImpl) entityManager.getMetamodel();

    EntityMappingType entityMappingType = mapping
      .getEntityDescriptor(User.class.getName())
      .getEntityMappingType();

    entityMappingType.getAttributeMappings()
      .forEach(attributeMapping -> {
          if (attributeMapping.getAttributeName().equals("tags")) {
              JdbcType jdbcType = attributeMapping.getSingleJdbcMapping().getJdbcType();
              assertEquals(ARRAY_TYPE_CODE, jdbcType.getJdbcTypeCode());
          }
      });
}
As expected, we have an array JDBC type code in the tags field. Let’s take a look at what is shown in the logs:
Hibernate:
create table users (
id number(19,0) not null,
tags StringArray,
primary key (id)
)
As we can see, the StringArray SQL type is used for the tags column.
2.3. Custom Dialect
By default, there are no dialects that map collections as JSON or XML. Let’s create a custom dialect that uses JSON as the default type for collection-typed fields:
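import static org.hibernate.type.SqlTypes.ARRAY;
import static org.hibernate.type.SqlTypes.JSON;

import org.hibernate.boot.model.TypeContributions;
import org.hibernate.dialect.Dialect;
import org.hibernate.service.ServiceRegistry;
import org.hibernate.type.descriptor.sql.internal.DdlTypeImpl;
import org.hibernate.type.descriptor.sql.spi.DdlTypeRegistry;

public class CustomDialect extends Dialect {

    // Prefer native arrays when the dialect supports them, otherwise fall back to JSON
    @Override
    public int getPreferredSqlTypeCodeForArray() {
        return supportsStandardArrays() ? ARRAY : JSON;
    }

    // Register the DDL column type used for the JSON type code
    @Override
    protected void registerColumnTypes(TypeContributions typeContributions, ServiceRegistry serviceRegistry) {
        super.registerColumnTypes(typeContributions, serviceRegistry);
        final DdlTypeRegistry ddlTypeRegistry =
          typeContributions.getTypeConfiguration().getDdlTypeRegistry();
        ddlTypeRegistry.addDescriptor(new DdlTypeImpl(JSON, "jsonb", this));
    }
}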
We’ve registered support for the JSON type and added it as a default type for collections mapping. Now let’s configure our database:
spring:
  datasource:
    url: jdbc:h2:mem:mydb;MODE=PostgreSQL
    username: sa
    password: password
    driverClassName: org.h2.Driver
  jpa:
    database-platform: com.baeldung.arrayscollections.dialects.CustomDialect
    show-sql: true
We’ve switched our database to the PostgreSQL mode since it supports a jsonb type. Also, we started using our CustomDialect class.
Now we’ll check the type mapping again:
static int JSON_TYPE_CODE = 3001;

@Test
void givenCustomDialect_whenGetUserEntityFieldsTypes_thenExpectedTypeShouldBePresent() {
    MappingMetamodelImpl mapping = (MappingMetamodelImpl) entityManager.getMetamodel();

    EntityMappingType entityMappingType = mapping
      .getEntityDescriptor(User.class.getName())
      .getEntityMappingType();

    entityMappingType.getAttributeMappings()
      .forEach(attributeMapping -> {
          if (attributeMapping.getAttributeName().equals("tags")) {
              JdbcType jdbcType = attributeMapping.getSingleJdbcMapping().getJdbcType();
              assertEquals(JSON_TYPE_CODE, jdbcType.getJdbcTypeCode());
          }
      });
}
We can see the tags field was mapped as a JSON type. Let’s check the logs:
Hibernate:
create table users (
id bigint not null,
tags jsonb,
primary key (id)
)
As expected, the jsonb column type was used for tags.
3. Migration From Hibernate 5.x to Hibernate 6.x
Different default types are used for collection mappings in Hibernate 5.x and Hibernate 6.x. To migrate to native array or JSON/XML types, we have to read our existing data through the Java serialization mechanism. Then, we need to write it back through the respective JDBC method for the type.
To demonstrate it, let’s create the entity that is expected to be migrated:
@Entity
@Table(name = "migrating_users")
public class MigratingUser {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    @JdbcTypeCode(SqlTypes.VARBINARY)
    private List<String> tags;

    private List<String> newTags;

    // getters, setters
}
We created a newTags field, which is mapped as an array type by default, and explicitly used SqlTypes.VARBINARY for the existing tags field, as it’s the default mapping type in Hibernate version 5.x.
Now, let’s create a repository for our entity:
public interface MigratingUserRepository extends JpaRepository<MigratingUser, Long> {
}
Finally, let’s execute the migration logic:
@Autowired
MigratingUserRepository migratingUserRepository;

@Test
void givenMigratingUserRepository_whenMigrateTheUsers_thenAllTheUsersShouldBeSavedInDatabase() {
    prepareData();

    migratingUserRepository
      .findAll()
      .forEach(u -> {
          // copy the legacy serialized value into the natively mapped field
          u.setNewTags(u.getTags());
          migratingUserRepository.save(u);
      });
}
We’ve read all the items from the database, copied their values to the new field, and saved them back into the database. To control memory consumption, we can read the data page by page. To improve persistence speed, we can use the batching mechanism. After the migration, we can remove the old column from the table.
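The pagination and batching idea can be sketched in plain Java, independent of any persistence framework. Everything in this block is hypothetical: LegacyUser stands in for the entity, and the pageLoader and batchSaver parameters stand in for repository calls. In a Spring Data setup, they might delegate to findAll(Pageable) and saveAll(), but that wiring is an assumption and isn't shown here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

// Hypothetical stand-in for the migrated entity.
final class LegacyUser {
    final long id;
    final List<String> tags;
    List<String> newTags;

    LegacyUser(long id, List<String> tags) {
        this.id = id;
        this.tags = tags;
    }
}

// Hypothetical sketch of a page-by-page migration loop.
final class PagedMigration {

    // Loads pages until an empty one is returned, copies tags into newTags,
    // and hands each page to the saver as one batch. Returns the number of migrated users.
    static int migrate(IntFunction<List<LegacyUser>> pageLoader, Consumer<List<LegacyUser>> batchSaver) {
        int migrated = 0;
        for (int page = 0; ; page++) {
            List<LegacyUser> users = pageLoader.apply(page);
            if (users.isEmpty()) {
                break; // no more data to migrate
            }
            for (LegacyUser user : users) {
                user.newTags = new ArrayList<>(user.tags);
            }
            batchSaver.accept(users);
            migrated += users.size();
        }
        return migrated;
    }

    // In-memory page loader used only for the demonstration below.
    static List<LegacyUser> slice(List<LegacyUser> all, int page, int size) {
        int from = Math.min(page * size, all.size());
        int to = Math.min(from + size, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<LegacyUser> all = new ArrayList<>();
        for (long i = 1; i <= 5; i++) {
            all.add(new LegacyUser(i, List.of("tag" + i)));
        }
        List<Integer> batchSizes = new ArrayList<>();
        int migrated = migrate(page -> slice(all, page, 2), batch -> batchSizes.add(batch.size()));
        System.out.println(migrated + " users migrated in batches of " + batchSizes);
        // prints "5 users migrated in batches of [2, 2, 1]"
    }
}
```

Offset-based paging stays stable here because the update touches neither the sort order nor the set of rows being read. If the migration filtered on a column that it also modifies, a keyset approach (continuing after the last processed id) would be a safer choice.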
4. Conclusion
In this tutorial, we reviewed the new collection mapping in Hibernate 6.x. We explored how to use native array types and JSON fields to store collections. Additionally, we implemented a migration to the new database schema. With the new mapping, we no longer need to write custom mapping code to get a more efficient data type for our collections.
As always, the code is available over on GitHub.