Introduction
In an increasingly connected ecosystem of software systems, communication between them has become paramount. As a result, several technologies have been developed to package the data transferred or shared between these many different systems.
The eXtensible Markup Language, popularly known as XML, is one of the ways to package data to be transferred. XML is a document formatting language that was developed in the 1990s because HTML does not allow the definition of new text elements, i.e. it is not extensible. In addition to being extensible, data in XML is self-describing, which makes it human-readable and easy to comprehend.
In this post, we will explore XML manipulation in Java using the Jackson library.
Advantages and Disadvantages of XML
XML is still in use in many systems because of its advantages, although newer technologies have emerged to address some of its shortcomings.
Some of the advantages of XML include:
- XML is not tied to a single platform or programming language and can be used on many different systems easily. This makes it suitable for facilitating communication between systems with different hardware and software configurations.
- The data contained in an XML document can be validated using a document type definition (DTD) or an XML schema. A DTD is a set of markup declarations that define the building blocks of an XML document.
- Through its support for Unicode, XML can contain information written in any language or format without losing any information or contents in the process.
- Through its compatibility with HTML, it is easy to read and display data contained in an XML document by using HTML.
- The information stored in an XML document can be modified at any given time without affecting the presentation of the data through other mediums such as HTML.
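To make the validation point above more concrete, here is a minimal sketch that checks a document against an XML schema using only the JDK's `javax.xml.validation` package. The schema, the element names, and the `SchemaValidationSketch` class are assumptions made up for illustration and are not part of this tutorial's project:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaValidationSketch {
    // Hypothetical schema: a <phone> element holding a name and a numeric display size.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"phone\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"name\" type=\"xs:string\"/>"
      + "<xs:element name=\"displaySize\" type=\"xs:decimal\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true when the XML string conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // the document is malformed or violates the schema
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // a decimal display size should pass, a word in its place should not
        System.out.println(isValid("<phone><name>OnePlus</name><displaySize>6.4</displaySize></phone>"));
        System.out.println(isValid("<phone><name>OnePlus</name><displaySize>large</displaySize></phone>"));
    }
}
```

The second document is well-formed XML, but it should be rejected because `large` is not a decimal number. This kind of check is what a schema adds on top of plain parsing.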
Some of the shortcomings of XML that have been resolved in new technologies include:
- The syntax is quite redundant and verbose as compared to other formats, such as JSON, which is short and straight to the point.
- Due to its syntax and verbose nature, XML documents are usually large, which may result in extra storage and transportation costs.
- It has no native support for arrays; a list has to be expressed as a series of repeated elements.
XML Libraries
Manipulating XML in Java can be a tedious process, so to ease the process and speed up development there are various libraries we can use. They include:
- Eaxy, a small and simple library for building, manipulating, parsing and searching XML.
- Java Architecture for XML Binding (JAXB), a framework for mapping Java classes to XML representations by marshalling Java objects into XML and unmarshalling XML into Java objects. It was bundled with the Java SE platform up to Java 10 and has had to be added as a separate dependency since Java 11.
- Jackson, a library for handling JSON in Java systems that has also supported XML since version 2.
- DOM4J, a memory-efficient library for working with XML, XPath, and XSLT (eXtensible Stylesheet Language Transformations).
- JDOM, an XML parsing library with support for XPath and XSLT.
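To give a sense of the tedium mentioned above, here is a rough sketch of building a two-element document with nothing but the DOM and transformer APIs that ship with the JDK. The `PlainJdkXmlSketch` class and its `buildPhoneXml` method are hypothetical names used only for this illustration:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class PlainJdkXmlSketch {
    // Builds <PhoneDetails> with two child elements, one DOM node at a time.
    static String buildPhoneXml(String name, String displaySize) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("PhoneDetails");
            doc.appendChild(root);

            Element nameElement = doc.createElement("name");
            nameElement.setTextContent(name);
            root.appendChild(nameElement);

            Element sizeElement = doc.createElement("displaySize");
            sizeElement.setTextContent(displaySize);
            root.appendChild(sizeElement);

            // turn the in-memory tree into a string, leaving out the <?xml ...?> declaration
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the XML document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildPhoneXml("OnePlus", "6.4"));
    }
}
```

Every extra field means three more lines of node handling, which is the kind of boilerplate the libraries listed above are meant to remove.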
What is Jackson?
The Jackson project is a collection of data processing tools for the Java language and the JVM platform. It supports a wide range of data formats such as CSV, Java Properties, XML, and YAML through extension components that each support a specific format.
The Jackson XML component is meant for reading and writing XML data by emulating how JAXB works, although it does not replicate JAXB completely.
In this article we will use the Jackson library to serialize Java objects into XML and deserialize them back into Java objects.
Project Setup
First, let us set up a fresh Maven project:
$ mvn archetype:generate -DgroupId=com.stackabuse -DartifactId=xmltutorial -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
With our project generated, let us add the Jackson dependency to our pom.xml file. Delete the existing dependencies section and replace it with:
<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>
    <!-- Jackson dependency for XML manipulation -->
    <dependency>
        <groupId>com.fasterxml.jackson.dataformat</groupId>
        <artifactId>jackson-dataformat-xml</artifactId>
        <version>2.9.0</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <!--
        This plugin configuration will enable Maven to include the project dependencies
        in the produced jar file.
        It also enables us to run the jar file using the `java -jar` command
        -->
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.0</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <transformers>
                            <transformer
                                implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                <mainClass>com.stackabuse.App</mainClass>
                            </transformer>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
We can now test the project we have set up by running the following commands:
$ mvn package
$ java -jar target/xmltutorial-1.0.jar
The output should be Hello World! printed on our terminal, showing that our project is ready for the next step.
Java Object Serialization into XML
Java objects have attributes and methods to manipulate these attributes. In relation to an XML document, the elements in the document can be mapped to attributes of a Java object.
In the serialization process, an object's attributes are converted into XML elements and stored in an XML document.
We will use a PhoneDetails class that will define information about a particular phone model, such as its name, display size, and internal storage capacity. In our class, these will be attributes, but in our XML document, these details will be contained in tags or elements.
Let us start by defining the PhoneDetails class that will be used to generate our objects:
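public class PhoneDetails {
    private String name;
    private String displaySize;
    private String memory;

    // no-argument constructor (Jackson uses it when deserializing)
    // all-arguments constructor (we use it to build the objects we serialize)
    // getters and setters
}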
With our object set, let us modify our App.java and add a function to handle the serialization to XML:
// imports needed at the top of App.java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

/**
 * This function serializes the Java object into XML and writes it
 * into an XML file.
 */
public static void serializeToXML() {
    try {
        XmlMapper xmlMapper = new XmlMapper();
        // serialize our Object into XML string
        String xmlString = xmlMapper.writeValueAsString(new PhoneDetails("OnePlus", "6.4", "6/64 GB"));
        // write to the console
        System.out.println(xmlString);
        // write XML string to file; try-with-resources closes the writer even if writing fails
        File xmlOutput = new File("serialized.xml");
        try (FileWriter fileWriter = new FileWriter(xmlOutput)) {
            fileWriter.write(xmlString);
        }
    } catch (JsonProcessingException e) {
        // handle exception
    } catch (IOException e) {
        // handle exception
    }
}

public static void main(String[] args) {
    System.out.println("Serializing to XML...");
    serializeToXML();
}
Let us package and run our project once again:
$ mvn package
$ java -jar target/xmltutorial-1.0.jar
The output on the terminal is:
<PhoneDetails><name>OnePlus</name><displaySize>6.4</displaySize><memory>6/64 GB</memory></PhoneDetails>
In the root folder of our project, the serialized.xml file is created containing this information. We have successfully serialized our Java object into XML and written it into an XML file.
In our serializeToXML() function, we create an XmlMapper object, which is a subclass of the ObjectMapper class used in JSON serialization. This class converts our Java object into XML output that we can now write to a file.
Deserialization from XML
Jackson also allows us to read the contents of an XML file and deserialize the XML String back into a Java object. In our example, we will read an XML document containing details about a phone, and use Jackson to extract this data and use it to create Java objects containing the same information.
First, let us create an XML document matching our class to read from. Create to_deserialize.xml with the following contents:
<PhoneDetails>
    <name>iPhone</name>
    <displaySize>6.2</displaySize>
    <memory>3/64 GB</memory>
</PhoneDetails>
Let us add a deserializeFromXML() function to deserialize the XML file above into a Java object:
// also requires imports for java.nio.charset.StandardCharsets, java.nio.file.Files and java.nio.file.Paths
public static void deserializeFromXML() {
    try {
        XmlMapper xmlMapper = new XmlMapper();
        // read file and put contents into the string, decoding the bytes as UTF-8
        String readContent = new String(Files.readAllBytes(Paths.get("to_deserialize.xml")), StandardCharsets.UTF_8);
        // deserialize from the XML into a PhoneDetails object
        PhoneDetails deserializedData = xmlMapper.readValue(readContent, PhoneDetails.class);
        // Print object details
        System.out.println("Deserialized data: ");
        System.out.println("\tName: " + deserializedData.getName());
        System.out.println("\tMemory: " + deserializedData.getMemory());
        System.out.println("\tDisplay Size: " + deserializedData.getDisplaySize());
    } catch (IOException e) {
        // handle the exception
    }
}

public static void main(String[] args) {
    System.out.println("Deserializing from XML...");
    deserializeFromXML();
}
We package and run our project as usual and the output is:
Deserializing from XML...
Deserialized data:
    Name: iPhone
    Memory: 3/64 GB
    Display Size: 6.2
Our XML file has been successfully deserialized and all the data has been extracted through the help of the Jackson library.
Jackson Annotations
Annotations are used to add metadata to our Java code and have no direct effect on the execution of the code they are attached to. Instead, they provide information to the compiler at compile time, or to tools and libraries that read them at runtime.
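As a rough illustration of how a library can read annotations at runtime, the sketch below defines a made-up `@XmlName` annotation and looks it up through reflection. The annotation, the `Phone` class, and the `tagFor` helper are all hypothetical. The sketch only shows the general mechanism and is not how Jackson is implemented internally:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

class AnnotationSketch {
    // Hypothetical annotation; RUNTIME retention keeps it visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface XmlName {
        String value();
    }

    static class Phone {
        @XmlName("phone_name")
        String name = "OnePlus";
        String memory = "6/64 GB"; // not annotated, so the field name is used
    }

    // Returns the annotated tag name for a field, or the field name when there is no annotation.
    static String tagFor(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            XmlName annotation = field.getAnnotation(XmlName.class);
            return annotation != null ? annotation.value() : field.getName();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("Unknown field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tagFor(Phone.class, "name"));   // prints phone_name
        System.out.println(tagFor(Phone.class, "memory")); // prints memory
    }
}
```

The annotation itself does nothing; the code that inspects it decides what it means.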
Jackson uses annotations for various functions such as defining whether we are mapping to XML or JSON, defining the order of attributes and fields in our output or their names.
These annotations are usually applied in our Java POJOs (Plain Old Java Objects). For instance, we can annotate our PhoneDetails class as follows:
public class PhoneDetails {
    @JsonProperty("phone_name")
    private String name;
    @JsonProperty("display_size")
    private String displaySize;
    @JsonProperty("internal_memory")
    private String memory;
    // rest of the code remains as is
}
The @JsonProperty annotation helps define the name of the fields in our XML file. With this annotation added, the tags in our XML output and input files will have to match the strings in the annotation, as follows:
<PhoneDetails>
    <phone_name>OnePlus</phone_name>
    <display_size>6.4</display_size>
    <internal_memory>6/64 GB</internal_memory>
</PhoneDetails>
Another notable annotation is @JacksonXmlText, which indicates that a property's value should be written as plain text inside its parent element, without a tag of its own wrapping it.
The @JacksonXmlProperty annotation can be used to control the details of the attribute or element being displayed. Such details can include the namespace of the element. Namespaces are a way of assigning elements to a particular group.
One main use for namespaces is to avoid conflicts when using similar tags in the document. They help isolate tags by group to remove any ambiguity that may arise as XML documents scale.
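To see the ambiguity that namespaces remove, consider the sketch below, where a document holds two different name tags. It uses the JDK's namespace-aware DOM parser rather than Jackson, and the namespace URIs, prefixes, and `NamespaceSketch` class are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class NamespaceSketch {
    // Two <name> tags that would clash without namespaces; both URIs are hypothetical.
    static final String XML =
        "<phone xmlns:p=\"http://example.com/phone\" xmlns:m=\"http://example.com/maker\">"
      + "<p:name>OnePlus 6T</p:name><m:name>OnePlus</m:name></phone>";

    // Returns the text of the first <name> element that belongs to the given namespace.
    static String nameIn(String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(namespaceUri, "name").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameIn("http://example.com/phone")); // the phone's name
        System.out.println(nameIn("http://example.com/maker")); // the manufacturer's name
    }
}
```

Both elements share the local name `name`, yet each lookup finds only the one in the requested namespace.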
The order of the properties can also be specified using a @JsonPropertyOrder annotation. For instance, to reverse the order of the elements in the XML document output, the annotation is used as follows:
@JsonPropertyOrder({ "internal_memory", "display_size", "phone_name" })
public class PhoneDetails {
    @JsonProperty("phone_name")
    private String name;
    @JsonProperty("display_size")
    private String displaySize;
    @JsonProperty("internal_memory")
    private String memory;
    ...
The output of serialization to XML will now be:
<PhoneDetails>
    <internal_memory>6/64 GB</internal_memory>
    <display_size>6.4</display_size>
    <phone_name>OnePlus</phone_name>
</PhoneDetails>
If there are fields in Java objects that we do not wish to be serialized, we can use the @JsonIgnore annotation and the fields will be omitted during serialization and deserialization.
Jackson annotations are useful in defining and controlling the process of serialization and deserialization across various formats such as XML, JSON, and YAML. Some annotations work for all formats and some are tied to a specific type of file.
More Jackson annotations and their uses can be found in this official wiki on GitHub.
Manipulating Nested Elements and Lists in XML
Having learned about annotations, let us enhance our XML file to add nested elements and lists, and modify our code to serialize and deserialize the following updated structure:
<PhoneDetails>
    <internal_memory>3/64 GB</internal_memory>
    <display_size>6.2</display_size>
    <phone_name>iPhone X</phone_name>
    <manufacturer>
        <manufacturer_name>Apple</manufacturer_name>
        <country>USA</country>
        <other_phones>
            <phone>iPhone 8</phone>
            <phone>iPhone 7</phone>
            <phone>iPhone 6</phone>
        </other_phones>
    </manufacturer>
</PhoneDetails>
In this new structure, we have introduced a nested manufacturer element which also includes a list of elements. With our current code, we cannot extract or create the new nested section.
To fix this, we need a new class to handle the nested element. This is part of our new Manufacturer class:
// define the order of elements
@JsonPropertyOrder({ "manufacturer_name", "country", "other_phones" })
public class Manufacturer {
    @JsonProperty("manufacturer_name")
    private String name;
    @JsonProperty("country")
    private String country;
    // new annotation
    @JacksonXmlElementWrapper(localName="other_phones")
    private List<String> phone;
    ...
It is quite similar to our PhoneDetails class but we have now introduced a new annotation: @JacksonXmlElementWrapper. The purpose of this annotation is to define whether a collection of elements uses or does not use a wrapper element, and it can be used to dictate the wrapper element's local name and namespace.
In our example, we use the annotation to define the element that contains a list of elements and the tag to be used for that element. This will be used when serializing and deserializing our XML files.
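The difference between a wrapped and an unwrapped list is easiest to see side by side. The sketch below does not use Jackson at all. It is a hand-rolled illustration, with a hypothetical `WrapperLayoutSketch` class, of the two XML layouts that the annotation's `useWrapping` attribute chooses between:

```java
import java.util.Arrays;
import java.util.List;

class WrapperLayoutSketch {
    // Renders the phone list with or without an <other_phones> wrapper element.
    // Illustration only: values are not escaped, so this is not a real serializer.
    static String render(List<String> phones, boolean useWrapping) {
        StringBuilder xml = new StringBuilder("<manufacturer>");
        if (useWrapping) {
            xml.append("<other_phones>");
        }
        for (String phone : phones) {
            xml.append("<phone>").append(phone).append("</phone>");
        }
        if (useWrapping) {
            xml.append("</other_phones>");
        }
        return xml.append("</manufacturer>").toString();
    }

    public static void main(String[] args) {
        List<String> phones = Arrays.asList("iPhone 8", "iPhone 7");
        // wrapped: the <phone> items sit inside a single <other_phones> element
        System.out.println(render(phones, true));
        // unwrapped: the <phone> items are direct children of <manufacturer>
        System.out.println(render(phones, false));
    }
}
```

Our to_deserialize.xml file uses the wrapped layout, which is why the Manufacturer class names other_phones as the wrapper.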
This change in our XML structure and the introduction of this class require us to modify our PhoneDetails class to reflect them:
// existing code remains
public class PhoneDetails {
    // existing code remains
    @JsonProperty("manufacturer")
    private Manufacturer manufacturer;
    // standard getters and setters for the new element
    ...
Our PhoneDetails object will now be able to include information about a phone's manufacturer.
Next, we update our serializeToXML() method:
// also requires imports for java.util.Arrays and java.util.List
public static void serializeToXML() {
    try {
        XmlMapper xmlMapper = new XmlMapper();
        // create a list of other phones
        List<String> otherPhones = Arrays.asList("OnePlus 6T", "OnePlus 5T", "OnePlus 5");
        // create the manufacturer object
        Manufacturer manufacturer = new Manufacturer("OnePlus", "China", otherPhones);
        // serialize our new Object into XML string
        String xmlString = xmlMapper
            .writeValueAsString(new PhoneDetails("OnePlus", "6.4", "6/64 GB", manufacturer));
        // write to the console
        System.out.println(xmlString);
        // write XML string to file; try-with-resources closes the writer even if writing fails
        File xmlOutput = new File("serialized.xml");
        try (FileWriter fileWriter = new FileWriter(xmlOutput)) {
            fileWriter.write(xmlString);
        }
    } catch (JsonProcessingException e) {
        // handle the exception
    } catch (IOException e) {
        // handle the exception
    }
}
The result of serializing the new PhoneDetails object with the Manufacturer information is:
Serializing to XML...
<PhoneDetails><internal_memory>6/64 GB</internal_memory><display_size>6.4</display_size><phone_name>OnePlus</phone_name><manufacturer><manufacturer_name>OnePlus</manufacturer_name><country>China</country><other_phones><phone>OnePlus 6T</phone><phone>OnePlus 5T</phone><phone>OnePlus 5</phone></other_phones></manufacturer></PhoneDetails>
It works! Our deserializeFromXML() function, on the other hand, does not need a major update since the PhoneDetails class, when deserialized, will also include manufacturer information.
Let us add the following code to print out the manufacturer's details just to be sure:
// existing code remains
// Print object details
System.out.println("Deserialized data: ");
System.out.println("\tName: " + deserializedData.getName());
System.out.println("\tMemory: " + deserializedData.getMemory());
System.out.println("\tDisplay Size: " + deserializedData.getDisplaySize());
System.out.println("\tManufacturer Name: " + deserializedData.getManufacturer().getName());
System.out.println("\tManufacturer Country: " + deserializedData.getManufacturer().getCountry());
System.out.println("\tManufacturer Other Phones: " + deserializedData.getManufacturer().getPhone().toString());
// existing code remains
The output:
Deserializing from XML...
Deserialized data:
    Name: iPhone X
    Memory: 3/64 GB
    Display Size: 6.2
    Manufacturer Name: Apple
    Manufacturer Country: USA
    Manufacturer Other Phones: [iPhone 8, iPhone 7, iPhone 6]
The deserialization process is seamless and the new manufacturer details have been extracted from our updated XML file.
Conclusion
In this post, we have learned about XML, how to serialize data into XML documents, and how to deserialize XML documents to extract their data.
We have also learned about annotations and how Jackson uses annotations in the serialization and deserialization process.
XML is still widely used in various systems that we may interact with from time to time, so we will occasionally need to serialize and deserialize XML documents. We can also consume XML APIs in our Java projects while exposing REST endpoints, and use Jackson to convert XML input to JSON output.
The source code for this post is available on GitHub for reference.