Introduction
In this article, we'll look at how to read and write CSV files in Kotlin using the Apache Commons CSV library.
Apache Commons Dependency
Since we're working with an external library, let's go ahead and add it to our Kotlin project. If you're using Maven, include the commons-csv dependency:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.5</version>
</dependency>
Or, if you're using Gradle:
implementation 'org.apache.commons:commons-csv:1.5'
Finally, with the library added to our project, let's define the CSV file we're going to read, students.csv:
101,John,Smith,90
203,Mary,Jane,88
309,John,Wayne,96
It'll be located under /resources/students.csv.
Also, since we'll be reading these records into custom objects, let's make a data class:
data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)
Reading a CSV File in Kotlin
Let's first open this file with a BufferedReader. The BufferedReader constructor doesn't accept a Path directly, so we'll create the reader through Files.newBufferedReader(), which takes a Path to the resource we'd like to read:
val bufferedReader = Files.newBufferedReader(Paths.get("/resources/students.csv"))
Then, with the reader in place, we can use it to initialize a CSVParser instance:
val csvParser = CSVParser(bufferedReader, CSVFormat.DEFAULT)
CSV isn't a strictly standardized format, so to remove the guesswork, you have to specify the CSVFormat when initializing the parser. A parser initialized this way interprets its input according to that format only.
Since we're following the textbook example of the CSV format, and we're using the default separator, a comma (,), we'll pass in CSVFormat.DEFAULT as the second argument.
Now, the CSVParser is an Iterable that contains CSVRecord instances. Each line is a CSV record. Naturally, we can then iterate over the csvParser instance and extract records from it:
for (csvRecord in csvParser) {
    // get() returns each cell as a String, so numeric columns are converted explicitly
    val studentId = csvRecord.get(0).toInt()
    val studentName = csvRecord.get(1)
    val studentLastName = csvRecord.get(2)
    val studentScore = csvRecord.get(3).toInt()
    println(Student(studentId, studentName, studentLastName, studentScore))
}
For each CSVRecord, you can get its respective cells using the get() method, passing in the index of the cell, starting at 0. Since get() returns a String, we convert the numeric cells with toInt() before using the values in the constructor of our Student data class.
This code results in:
Student(studentId=101, firstName=John, lastName=Smith, score=90)
Student(studentId=203, firstName=Mary, lastName=Jane, score=88)
Student(studentId=309, firstName=John, lastName=Wayne, score=96)
This approach isn't great, though. We need to know the order of the columns, as well as how many columns there are, to use the get() method, and changing anything in the CSV file's structure breaks our code.
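The fragility described above can be sketched with a small, self-contained program. It parses CSV text from an in-memory string (an assumption made so the example runs without a file on disk), and parseStudentsByIndex is a hypothetical helper name, not part of Commons CSV. Moving the score column is enough to make the index-based lookup fail:

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.StringReader

data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

// Hypothetical helper: parses header-less CSV text into Student objects by column index.
fun parseStudentsByIndex(csv: String): List<Student> =
    CSVParser(StringReader(csv), CSVFormat.DEFAULT).use { parser ->
        parser.map { record ->
            Student(
                studentId = record.get(0).toInt(),
                firstName = record.get(1),
                lastName = record.get(2),
                score = record.get(3).toInt()
            )
        }
    }

fun main() {
    // Columns in the order the code expects.
    println(parseStudentsByIndex("101,John,Smith,90\n"))

    // Same data with the score moved to the second column:
    // index 3 now holds "Smith", so toInt() throws a NumberFormatException.
    val failure = runCatching { parseStudentsByIndex("101,90,John,Smith\n") }
    println(failure.exceptionOrNull() is NumberFormatException) // prints "true"
}
```

The use block closes the parser (and the underlying reader) once the records have been mapped, so the helper doesn't leak resources even when a conversion fails.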
Reading a CSV File with Headers in Kotlin
It's reasonable to expect that we know which columns exist, but less so that we know the order they're in.
Usually, CSV files have a header line that specifies the names of the columns, such as StudentID, FirstName, etc. When constructing the CSVParser instance, we can specify in the CSVFormat, following the Builder Design Pattern, whether the file we're reading has a header row or not.
By default, the CSVFormat assumes that the file doesn't have a header. Let's first add a header row to our CSV file:
StudentID,FirstName,LastName,Score
101,John,Smith,90
203,Mary,Jane,88
309,John,Wayne,96
Now, let's initialize the CSVParser instance, and set a few options in the CSVFormat along the way:
val bufferedReader = Files.newBufferedReader(Paths.get("/resources/students.csv"))
val csvParser = CSVParser(bufferedReader, CSVFormat.DEFAULT
    .withFirstRecordAsHeader()
    .withIgnoreHeaderCase()
    .withTrim())
This way, the first record (row) in the file will be treated as the header row, and the values in that row will be used as the column names.
We've also specified that the case of the header names doesn't matter to us, so column names are matched case-insensitively.
Finally, we've also told the parser to trim the records, which removes leading and trailing whitespace from values, if there is any. Some of the other options that you can fiddle around with are:
CSVFormat.DEFAULT
.withDelimiter(',')
.withQuote('"')
.withRecordSeparator("\r\n")
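These are used if you'd like to change the default behavior: setting a different delimiter, specifying the quote character (quotes can oftentimes break the parsing logic), and specifying the record separator, which is present at the end of each record. Note that in Commons CSV, the record separator setting takes effect when printing records, not when parsing them.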
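As a rough illustration of changing the delimiter, the sketch below parses semicolon-separated text held in memory. The helper name parseSemicolonSeparated and the sample data are made up for this example; only withDelimiter() comes from the options listed above.

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.StringReader

// Hypothetical helper: splits semicolon-separated text into rows of cell values.
fun parseSemicolonSeparated(text: String): List<List<String>> {
    // Start from the default format and swap only the delimiter.
    val format = CSVFormat.DEFAULT.withDelimiter(';')
    return CSVParser(StringReader(text), format).use { parser ->
        // Copy the cells of each record into a plain list, index by index.
        parser.map { record -> (0 until record.size()).map { record.get(it) } }
    }
}

fun main() {
    val text = "101;John;Smith;90\n203;Mary;Jane;88\n"
    parseSemicolonSeparated(text).forEach(::println)
}
```

With the semicolon as the delimiter, a comma inside a value is treated as ordinary text, which is often the reason for choosing a different delimiter in the first place.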
Finally, once the parser has been set up with the header options, you can retrieve CSVRecords as previously seen, this time by column name:
for (csvRecord in csvParser) {
    // Cells are looked up by header name; numeric columns are converted explicitly
    val studentId = csvRecord.get("StudentID").toInt()
    val studentName = csvRecord.get("FirstName")
    val studentLastName = csvRecord.get("LastName")
    val studentScore = csvRecord.get("Score").toInt()
    println(Student(studentId, studentName, studentLastName, studentScore))
}
This is a much more forgiving approach, since we don't need to know the order of the columns themselves. Even if the order changes at some point, the CSVParser has us covered.
Running this code also results in:
Student(studentId=101, firstName=John, lastName=Smith, score=90)
Student(studentId=203, firstName=Mary, lastName=Jane, score=88)
Student(studentId=309, firstName=John, lastName=Wayne, score=96)
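To illustrate why header-based access is more forgiving, the following sketch parses two in-memory CSV strings that hold the same record with the columns in a different order. parseStudentsByHeader is a hypothetical helper, and the in-memory input is an assumption made to keep the example runnable without a file. It relies on the same with...() methods used above, as available in version 1.5:

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVParser
import java.io.StringReader

data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

// Hypothetical helper: parses CSV text with a header row into Student objects by column name.
fun parseStudentsByHeader(csv: String): List<Student> {
    val format = CSVFormat.DEFAULT
        .withFirstRecordAsHeader()
        .withIgnoreHeaderCase()
        .withTrim()
    return CSVParser(StringReader(csv), format).use { parser ->
        parser.map { record ->
            Student(
                studentId = record.get("StudentID").toInt(),
                firstName = record.get("FirstName"),
                lastName = record.get("LastName"),
                score = record.get("Score").toInt()
            )
        }
    }
}

fun main() {
    val original = "StudentID,FirstName,LastName,Score\n101,John,Smith,90\n"
    val reordered = "Score,LastName,FirstName,StudentID\n90,Smith,John,101\n"

    // Both inputs produce the same Student, regardless of column order.
    println(parseStudentsByHeader(original) == parseStudentsByHeader(reordered)) // prints "true"
}
```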
Writing a CSV File in Kotlin
Similar to reading files, we can also write CSV files using Apache Commons. This time around, we'll be using the CSVPrinter.
Just as the CSVParser accepts a BufferedReader, the CSVPrinter accepts a BufferedWriter and the CSVFormat we'd like it to use while writing the file.
Let's create a BufferedWriter through Files.newBufferedWriter(), and instantiate a CSVPrinter instance:
val writer = Files.newBufferedWriter(Paths.get("/resources/students.csv"))
val csvPrinter = CSVPrinter(writer, CSVFormat.DEFAULT
    .withHeader("StudentID", "FirstName", "LastName", "Score"))
The printRecord() method of the CSVPrinter instance is used to write out records. It accepts all the values for a record and prints them on a new line. Calling the method repeatedly allows us to write many records. You can either pass each value as a separate argument, or pass in a single collection of values.
There's no need to use the printRecord() method for the header row itself, since we've already specified it with the withHeader() method of the CSVFormat. Without specifying the header there, we would've had to print out the first row manually.
In general, you can use the csvPrinter like this:
csvPrinter.printRecord("123", "Jane", "Maggie", "100")
csvPrinter.flush()
csvPrinter.close()
Don't forget to flush() and close() the printer after use.
Since we're working with a list of Student objects here, we can't pass them to printRecord() directly. Instead, we'll loop through the student list, put each student's info into a new list, and print that list of data using the printRecord() method:
val students = listOf(
    Student(101, "John", "Smith", 90),
    Student(203, "Mary", "Jane", 88),
    Student(309, "John", "Wayne", 96)
)
for (student in students) {
    // Collect the fields of each student in column order
    val studentData = listOf(
        student.studentId,
        student.firstName,
        student.lastName,
        student.score)
    csvPrinter.printRecord(studentData)
}
csvPrinter.flush()
csvPrinter.close()
This results in a CSV file that contains:
StudentID,FirstName,LastName,Score
101,John,Smith,90
203,Mary,Jane,88
309,John,Wayne,96
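As an alternative to calling flush() and close() by hand, Kotlin's use function can close the printer automatically, even if writing throws. The sketch below assumes a StringWriter as the target so that it runs without touching the file system, and writeStudents is a hypothetical helper name:

```kotlin
import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVPrinter
import java.io.StringWriter

data class Student(
    val studentId: Int,
    val firstName: String,
    val lastName: String,
    val score: Int
)

// Hypothetical helper: renders students as CSV text, including the header row.
fun writeStudents(students: List<Student>): String {
    val out = StringWriter()
    val format = CSVFormat.DEFAULT
        .withHeader("StudentID", "FirstName", "LastName", "Score")
    // use() closes the printer when the block finishes, whether or not it throws.
    CSVPrinter(out, format).use { printer ->
        for (student in students) {
            printer.printRecord(
                student.studentId,
                student.firstName,
                student.lastName,
                student.score
            )
        }
    }
    return out.toString()
}

fun main() {
    val students = listOf(
        Student(101, "John", "Smith", 90),
        Student(203, "Mary", "Jane", 88)
    )
    print(writeStudents(students))
}
```

Closing the printer also closes the writer it wraps. When that writer is a BufferedWriter, as in the file-based example, closing it flushes any buffered data first.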
Conclusion
In this tutorial, we've gone over how to read and write CSV files in Kotlin, using the Apache Commons CSV library.