1. Overview
In this article, we’ll externalize the setup data of an application using CSV files, instead of hardcoding it.
2. A CSV Library
Let’s start by introducing a simple library to work with CSV – the Jackson CSV extension:
<dependency>
    <groupId>com.fasterxml.jackson.dataformat</groupId>
    <artifactId>jackson-dataformat-csv</artifactId>
    <version>2.5.3</version>
</dependency>
There are of course a host of available libraries to work with CSVs in the Java ecosystem.
We’re going with Jackson here because it’s likely already in use in the application, and the processing we need to read the data is fairly straightforward.
3. The Setup Data
Different projects will need to set up different data.
In this tutorial, we’re going to be setting up User data – basically preparing the system with a few default users.
Here’s the simple CSV file containing the users:
id,username,password,accessToken
1,john,123,token
2,tom,456,test
Note how the first row of the file is the header row – listing out the names of the fields in each row of data.
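To make the role of the header row concrete, here is a minimal plain-Java sketch that pairs each value with the column name above it. The class HeaderRowSketch and its parse() method are hypothetical names used only for illustration, and the sketch assumes simple comma-separated values with no quoting or escaping; it is not how Jackson works internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the article's code base.
class HeaderRowSketch {

    // Turns CSV lines into one map per data row, keyed by the header names.
    // Assumes plain comma-separated values with no quoting or escaping.
    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        // The first line supplies the field names for every later line.
        String[] header = lines.get(0).split(",");
        for (String line : lines.subList(1, lines.size())) {
            String[] values = line.split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < header.length && i < values.length; i++) {
                row.put(header[i].trim(), values[i].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "id,username,password,accessToken",
            "1,john,123,token",
            "2,tom,456,test");
        for (Map<String, String> row : parse(lines)) {
            System.out.println(row);
        }
    }
}
```

Each data row becomes a set of name/value pairs, which is the information a CSV library needs in order to bind a row to an object.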
3. CSV Data Loader
Let’s start by creating a simple data loader to read data from the CSV files into working memory.
3.1. Load a List of Objects
We’ll implement the loadObjectList() method to load a typed List of objects of a given class from the file:
public <T> List<T> loadObjectList(Class<T> type, String fileName) {
    try {
        CsvSchema bootstrapSchema = CsvSchema.emptySchema().withHeader();
        CsvMapper mapper = new CsvMapper();
        File file = new ClassPathResource(fileName).getFile();
        MappingIterator<T> readValues =
            mapper.reader(type).with(bootstrapSchema).readValues(file);
        return readValues.readAll();
    } catch (Exception e) {
        logger.error("Error occurred while loading object list from file " + fileName, e);
        return Collections.emptyList();
    }
}
Notes:
- We created the CsvSchema based on the first “header” row.
- The implementation is generic enough to handle any type of object.
- If any error occurs, an empty list will be returned.
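The article doesn’t show the User class, so here is a minimal sketch of what a type loaded this way could look like. It assumes the property names match the header columns of the users file (id, username, password, accessToken), since with a header-based schema Jackson binds each column to the bean property of the same name. The exact shape of the real entity is an assumption.

```java
// Hypothetical sketch of the User type; the article does not show it.
// Declared without `public` only so the example fits in a single file.
class User {
    // One property per column of the users CSV header row.
    private long id;
    private String username;
    private String password;
    private String accessToken;

    // A no-argument constructor lets the mapper instantiate the bean.
    public User() {
    }

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    public String getUsername() {
        return username;
    }

    public void setUsername(String username) {
        this.username = username;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    public String getAccessToken() {
        return accessToken;
    }

    public void setAccessToken(String accessToken) {
        this.accessToken = accessToken;
    }
}
```

Any property that has no column in the file, such as a set of roles, is simply left unset by the loader and has to be filled in separately.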
3.2. Handle Many to Many Relationship
Nested objects are not well supported in Jackson CSV – we’ll need to use an indirect way to load Many to Many relationships.
We’ll represent these similar to simple Join Tables – so naturally we’ll load from disk as a list of arrays:
public List<long[]> loadManyToManyRelationship(String fileName) {
    try {
        CsvMapper mapper = new CsvMapper();
        CsvSchema bootstrapSchema = CsvSchema.emptySchema().withSkipFirstDataRow(true);
        mapper.enable(CsvParser.Feature.WRAP_AS_ARRAY);
        File file = new ClassPathResource(fileName).getFile();
        MappingIterator<long[]> readValues =
            mapper.reader(long[].class).with(bootstrapSchema).readValues(file);
        return readValues.readAll();
    } catch (Exception e) {
        logger.error(
            "Error occurred while loading many to many relationship from file = " + fileName, e);
        return Collections.emptyList();
    }
}
Here’s how one of these relationships – Roles <-> Privileges – is represented in a simple CSV file:
role_id,privilege_id
1,1
1,2
2,4
3,3
Note how we’re ignoring the header in this implementation, as we don’t really need that information.
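To illustrate the shape of the data this method returns, here is a plain-Java sketch that reads the same join-table format into a list of long arrays and skips the header line. JoinTableSketch and parseJoinRows() are hypothetical names, and the sketch deliberately avoids Jackson so it can run on its own; it assumes every data row holds only numeric ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; the article's loader uses Jackson instead.
class JoinTableSketch {

    // Converts "left_id,right_id" lines into long[] pairs, ignoring the header.
    static List<long[]> parseJoinRows(List<String> lines) {
        List<long[]> rows = new ArrayList<>();
        // Start at index 1 so the header row is never parsed as data.
        for (int i = 1; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] cells = line.split(",");
            long[] ids = new long[cells.length];
            for (int j = 0; j < cells.length; j++) {
                ids[j] = Long.parseLong(cells[j].trim());
            }
            rows.add(ids);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "role_id,privilege_id", "1,1", "1,2", "2,4", "3,3");
        for (long[] row : parseJoinRows(lines)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each array holds the two ids of one relationship: index 0 is the left side (the role) and index 1 is the right side (the privilege).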
4. Setup Data
Now, we’ll use a simple Setup bean to do all the work of setting up privileges, roles and users from CSV files:
@Component
public class Setup {
    ...

    @PostConstruct
    private void setupData() {
        setupRolesAndPrivileges();
        setupUsers();
    }

    ...
}
4.1. Setup Roles and Privileges
First, let’s load roles and privileges from disk into working memory, and then persist them as part of the setup process:
public List<Privilege> getPrivileges() {
    return csvDataLoader.loadObjectList(Privilege.class, PRIVILEGES_FILE);
}

public List<Role> getRoles() {
    List<Privilege> allPrivileges = getPrivileges();
    List<Role> roles = csvDataLoader.loadObjectList(Role.class, ROLES_FILE);
    List<long[]> rolesPrivileges =
        csvDataLoader.loadManyToManyRelationship(SetupData.ROLES_PRIVILEGES_FILE);

    for (long[] rolePrivilege : rolesPrivileges) {
        Role role = findById(roles, rolePrivilege[0]);
        Set<Privilege> privileges = role.getPrivileges();
        if (privileges == null) {
            privileges = new HashSet<Privilege>();
        }
        privileges.add(findById(allPrivileges, rolePrivilege[1]));
        role.setPrivileges(privileges);
    }
    return roles;
}

private <T extends IEntity> T findById(List<T> list, long id) {
    return list.stream().filter(item -> item.getId() == id).findFirst().get();
}
Then we’ll do the persist work here:
private void setupRolesAndPrivileges() {
    privilegeRepository.save(setupData.getPrivileges());
    roleRepository.save(setupData.getRoles());
}
Note how, after we load both Roles and Privileges into working memory, we load their relationships one by one.
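One thing to keep in mind is that findById() calls get() on an Optional, so a join row that points at a missing id fails with a bare NoSuchElementException. As a possible variation, not the article’s implementation, the sketch below indexes the entities by id first and reports which id was missing. All names here (RelationshipWiringSketch, indexById(), require(), link()) and the simplified Role and Privilege classes are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical variation on the wiring step; simplified stand-in entities.
class RelationshipWiringSketch {

    static class Privilege {
        final long id;
        Privilege(long id) { this.id = id; }
    }

    static class Role {
        final long id;
        final Set<Privilege> privileges = new HashSet<>();
        Role(long id) { this.id = id; }
    }

    // Builds an id -> entity lookup so each join row is resolved in constant time.
    static <T> Map<Long, T> indexById(List<T> items, Function<T, Long> idOf) {
        Map<Long, T> index = new HashMap<>();
        for (T item : items) {
            index.put(idOf.apply(item), item);
        }
        return index;
    }

    // Returns the entity for the id, or fails with a message naming the missing id.
    static <T> T require(Map<Long, T> index, long id, String kind) {
        T found = index.get(id);
        if (found == null) {
            throw new IllegalStateException("No " + kind + " with id " + id);
        }
        return found;
    }

    // Applies each {roleId, privilegeId} pair to the in-memory objects.
    static void link(List<Role> roles, List<Privilege> privileges, List<long[]> joinRows) {
        Map<Long, Role> rolesById = indexById(roles, r -> r.id);
        Map<Long, Privilege> privilegesById = indexById(privileges, p -> p.id);
        for (long[] row : joinRows) {
            Role role = require(rolesById, row[0], "role");
            role.privileges.add(require(privilegesById, row[1], "privilege"));
        }
    }
}
```

For a handful of setup rows the linear search in findById() is perfectly adequate; the main benefit of this variation is the clearer error when a CSV file references an id that doesn’t exist.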
4.2. Setup Initial Users
Next – let’s load the users into memory and persist them:
public List<User> getUsers() {
    List<Role> allRoles = getRoles();
    List<User> users = csvDataLoader.loadObjectList(User.class, SetupData.USERS_FILE);
    List<long[]> usersRoles =
        csvDataLoader.loadManyToManyRelationship(SetupData.USERS_ROLES_FILE);

    for (long[] userRole : usersRoles) {
        User user = findById(users, userRole[0]);
        Set<Role> roles = user.getRoles();
        if (roles == null) {
            roles = new HashSet<Role>();
        }
        roles.add(findById(allRoles, userRole[1]));
        user.setRoles(roles);
    }
    return users;
}
Next, let’s focus on persisting the users:
private void setupUsers() {
    List<User> users = setupData.getUsers();
    for (User user : users) {
        setupService.setupUser(user);
    }
}
And here is our SetupService:
@Transactional
public void setupUser(User user) {
    try {
        setupUserInternal(user);
    } catch (Exception e) {
        logger.error("Error occurred while saving user " + user.toString(), e);
    }
}

private void setupUserInternal(User user) {
    user.setPassword(passwordEncoder.encode(user.getPassword()));
    user.setPreference(createSimplePreference(user));
    userRepository.save(user);
}
And here is the createSimplePreference() method:
private Preference createSimplePreference(User user) {
    Preference pref = new Preference();
    pref.setId(user.getId());
    pref.setTimezone(TimeZone.getDefault().getID());
    pref.setEmail(user.getUsername() + "@test.com");
    return preferenceRepository.save(pref);
}
Note how, before we save a user, we create a simple Preference entity for it and persist that first.
5. Test CSV Data Loader
Next, let’s perform a simple unit test on our CsvDataLoader.
We’ll test loading the lists of Users, Roles and Privileges:
@Test
public void whenLoadingUsersFromCsvFile_thenLoaded() {
    List<User> users = csvDataLoader.loadObjectList(User.class, CsvDataLoader.USERS_FILE);
    assertFalse(users.isEmpty());
}

@Test
public void whenLoadingRolesFromCsvFile_thenLoaded() {
    List<Role> roles = csvDataLoader.loadObjectList(Role.class, CsvDataLoader.ROLES_FILE);
    assertFalse(roles.isEmpty());
}

@Test
public void whenLoadingPrivilegesFromCsvFile_thenLoaded() {
    List<Privilege> privileges =
        csvDataLoader.loadObjectList(Privilege.class, CsvDataLoader.PRIVILEGES_FILE);
    assertFalse(privileges.isEmpty());
}
Next, let’s test loading some Many to Many relationships via the data loader:
@Test
public void whenLoadingUsersRolesRelationFromCsvFile_thenLoaded() {
    List<long[]> usersRoles =
        csvDataLoader.loadManyToManyRelationship(CsvDataLoader.USERS_ROLES_FILE);
    assertFalse(usersRoles.isEmpty());
}

@Test
public void whenLoadingRolesPrivilegesRelationFromCsvFile_thenLoaded() {
    List<long[]> rolesPrivileges =
        csvDataLoader.loadManyToManyRelationship(CsvDataLoader.ROLES_PRIVILEGES_FILE);
    assertFalse(rolesPrivileges.isEmpty());
}
6. Test Setup Data
Finally, let’s perform a simple unit test on our SetupData bean:
@Test
public void whenGettingUsersFromCsvFile_thenCorrect() {
    List<User> users = setupData.getUsers();
    assertFalse(users.isEmpty());
    for (User user : users) {
        assertFalse(user.getRoles().isEmpty());
    }
}

@Test
public void whenGettingRolesFromCsvFile_thenCorrect() {
    List<Role> roles = setupData.getRoles();
    assertFalse(roles.isEmpty());
    for (Role role : roles) {
        assertFalse(role.getPrivileges().isEmpty());
    }
}

@Test
public void whenGettingPrivilegesFromCsvFile_thenCorrect() {
    List<Privilege> privileges = setupData.getPrivileges();
    assertFalse(privileges.isEmpty());
}
7. Conclusion
In this quick article, we explored an alternative setup method for the initial data that usually needs to be loaded into a system on startup. This is of course just a simple Proof of Concept and a good base to build upon, not a production-ready solution.
We’re also going to use this solution in the Reddit web application tracked by this ongoing case study.