In my previous articles, I covered how to read and write JSON in Java as well as in Spring Boot. In this article, you'll learn how to read and write XML using different Java APIs.

Let us first look at an XML document and how it is structured.

XML Document

An XML document consists of elements (also known as tags), similar to HTML. Each element has an opening and a closing tag along with the content. Every XML document must have exactly one root element: a single tag that wraps all the remaining tags. Tag names are case-sensitive, which means XML differentiates between uppercase and lowercase letters. Each element can have any number of nested child elements.

Unlike HTML, XML doesn't have a pre-defined set of tags. This gives developers complete freedom to define any tag they want to use in the document. A well-formed XML document follows the syntax rules described above. A valid XML document is well-formed and also conforms to the DTD or XML schema it references.
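To make these rules concrete, here is a minimal sketch that checks whether a piece of text is well-formed XML by handing it to the parser that ships with the JDK. The WellFormedCheck class and its isWellFormed() method are hypothetical names chosen for illustration. The sketch only checks well-formedness; validating a document against a schema is a separate step that is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the article's original code
class WellFormedCheck {

    // returns true if the given text parses as well-formed XML
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of leaving them
            // to the parser's built-in error reporting
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException ex) {
            // the parser rejected the document
            return false;
        } catch (ParserConfigurationException | IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // one root element, matching tags
        System.out.println(isWellFormed("<user><name>John Doe</name></user>")); // true
        // closing tag differs in case from the opening tag
        System.out.println(isWellFormed("<user><name>John Doe</Name></user>")); // false
        // two root elements
        System.out.println(isWellFormed("<name>John</name><email>j@example.com</email>")); // false
    }
}
```

The second and third calls fail for the two reasons mentioned above: tag names are case-sensitive, and a document may have only one root element.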

Let us look at the below XML document that contains user information:

user.xml

<?xml version="1.0" encoding="UTF-8" ?>
<user id="1">
    <name>John Doe</name>
    <email>john.doe@example.com</email>
    <roles>
        <role>Member</role>
        <role>Admin</role>
    </roles>
    <admin>true</admin>
</user>

As you can see above, the user.xml file starts with the <?xml ... ?> declaration, known as the XML prolog. Another important thing to notice is that each value is wrapped in its own tag, e.g. <name>John Doe</name>. Since roles is an array, we have to specify each array element using a nested role tag.

Read and Write XML with JAXB

JAXB stands for Java Architecture for XML Binding and provides a convenient way of manipulating XML in Java. It is a Java standard that defines an API for reading and writing Java objects to and from XML documents.

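JAXB was bundled with the Java Development Kit (JDK) from Java 6 through Java 10, so projects on those versions don't need any 3rd-party dependency to use it. It was deprecated in Java 9 and removed from the JDK in Java 11. On Java 11 and higher, you have to add the JAXB API and a runtime implementation to your project as explicit dependencies.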

In the following sections, you'll learn how to use JAXB to do the following:

  1. Marshalling — Convert a Java Object into XML.
  2. Unmarshalling — Convert XML content into a Java Object.

Before we discuss marshalling and unmarshalling in detail, let us first create a simple Java class named User.java that represents a user described in the above user.xml file:

User.java

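import java.util.Arrays;

// JAXB 2.x packages; JAXB 3.0 and later use jakarta.xml.bind.annotation instead
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlElementWrapper;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement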
public class User {

    private int id;
    private String name;
    private String email;
    private String[] roles;
    private boolean admin;

    public User() {
    }

    public User(int id, String name, String email, String[] roles, boolean admin) {
        this.id = id;
        this.name = name;
        this.email = email;
        this.roles = roles;
        this.admin = admin;
    }

    public int getId() {
        return id;
    }

    @XmlAttribute
    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    @XmlElement
    public void setName(String name) {
        this.name = name;
    }

    public String getEmail() {
        return email;
    }

    @XmlElement
    public void setEmail(String email) {
        this.email = email;
    }

    public String[] getRoles() {
        return roles;
    }

    @XmlElementWrapper(name = "roles")
    @XmlElement(name = "role")
    public void setRoles(String[] roles) {
        this.roles = roles;
    }

    public boolean isAdmin() {
        return admin;
    }

    @XmlElement
    public void setAdmin(boolean admin) {
        this.admin = admin;
    }

    @Override
    public String toString() {
        return "User{" +
                "id=" + id +
                ", name='" + name + '\'' +
                ", email='" + email + '\'' +
                ", roles=" + Arrays.toString(roles) +
                ", admin=" + admin +
                '}';
    }
}

As you can see above, we have annotated the class and its setter methods with different JAXB annotations. Each annotation serves a specific purpose while converting a Java object to and from XML.

  • @XmlRootElement — This annotation is used to specify the root element of the XML document. It maps a class or an enum type to an XML element. By default, it uses the class name or enum as the root element name. However, you can customize the name by explicitly setting the name attribute i.e. @XmlRootElement(name = "person").
  • @XmlAttribute — This annotation maps a Java object property to an XML attribute derived from the property name. To specify a different XML attribute name, you can pass the name parameter to the annotation declaration.
  • @XmlElement — This annotation maps a Java object property to an XML element derived from the property name. The name of the XML element mapped can be customized by using the name parameter.
  • @XmlElementWrapper — This annotation generates a wrapper element around the XML representation, an array of String in our case. You must explicitly specify elements of the collection by using the @XmlElement annotation.

Marshalling — Convert Java Object to XML

Marshalling in JAXB refers to converting a Java object to an XML document. JAXB provides the Marshaller class for this purpose.

All you need to do is create a new instance of JAXBContext by calling the newInstance() static method with a reference to the User class. You can then call the createMarshaller() method to create an instance of Marshaller. Marshaller provides several overloaded marshal() methods to write a Java object to a file, to an output stream, or directly to the console.

Here is an example that demonstrates how to convert a User object into an XML document called user2.xml:

try {
    // create XML file
    File file = new File("user2.xml");

    // create an instance of `JAXBContext`
    JAXBContext context = JAXBContext.newInstance(User.class);

    // create an instance of `Marshaller`
    Marshaller marshaller = context.createMarshaller();

    // enable pretty-print XML output
    marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);

    // create user object
    User user = new User(2, "Tom Deo", "tom.doe@example.com",
            new String[]{"Member", "Moderator"}, false);

    // convert user object to XML file
    marshaller.marshal(user, file);

} catch (JAXBException ex) {
    ex.printStackTrace();
}

Now if you run the above code, you should see an XML file called user2.xml created in the working directory (usually the project root) with the following contents:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<user id="2">
    <admin>false</admin>
    <email>tom.doe@example.com</email>
    <name>Tom Deo</name>
    <roles>
        <role>Member</role>
        <role>Moderator</role>
    </roles>
</user>

The Marshaller class also provides an overloaded method to print the generated XML document to the console, as shown below:

// print XML to console
marshaller.marshal(user, System.out);

Unmarshalling — Convert XML to Java Object

Unmarshalling is very similar to the marshalling process we discussed above, except that this time we use the Unmarshaller class to convert an XML document to a Java object.

The following example demonstrates how JAXB reads the above user.xml file to create a User object:

try {
    // XML file path
    File file = new File("user.xml");

    // create an instance of `JAXBContext`
    JAXBContext context = JAXBContext.newInstance(User.class);

    // create an instance of `Unmarshaller`
    Unmarshaller unmarshaller = context.createUnmarshaller();

    // convert XML file to user object
    User user = (User) unmarshaller.unmarshal(file);

    // print user object
    System.out.println(user);

} catch (JAXBException ex) {
    ex.printStackTrace();
}

The above code will output the following:

User{id=1, name='John Doe', email='john.doe@example.com', roles=[Member, Admin], admin=true}

By default, the unmarshal() method returns an Object, so we have to explicitly cast it to the correct type (User in our case). Unmarshaller provides several other overloaded unmarshal() methods that you can use to read an XML document from different sources like a URL, a reader, or an input stream.

Read and Write XML using DOM Parser

The DOM (Document Object Model) XML parser is another way of reading and writing XML in Java. It is an older API that defines an interface for accessing and updating the style, structure, and contents of XML documents. XML parsers that support DOM implement this interface.

The DOM parser parses the XML document to create a tree-like structure. Everything in the DOM of an XML document is a node. So you have to traverse node by node to retrieve the required values.

The DOM defines several Java interfaces to represent an XML document. Here are the most commonly used interfaces:

  • Node — The base datatype of the DOM.
  • Element — Represents an individual element in the DOM.
  • Attr — Represents an attribute of an element.
  • Text — The actual content of an Element or Attr.
  • Document — Represents the entire XML document. A Document object is often referred to as a DOM tree.
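As a rough illustration of how these interfaces show up in practice, the sketch below parses a small XML string and walks the resulting tree node by node, recording whether each node is an element, an attribute, or text. The DomWalk class and its describe() method are hypothetical names used only for this example, and whitespace-only text nodes are skipped to keep the output short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the article's original code
class DomWalk {

    // parses the XML text and returns one description per node visited
    static List<String> describe(String xml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            walk(dom.getDocumentElement(), out);
            return out;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    // depth-first traversal starting at the given node
    private static void walk(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add("Element: " + node.getNodeName());
            // attributes are not child nodes, so they are read separately
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.add("Attr: " + attr.getNodeName() + "=" + attr.getNodeValue());
            }
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.add("Text: " + text);
            }
        }
        // visit the children of this node, if any
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    public static void main(String[] args) {
        String xml = "<user id=\"1\"><name>John Doe</name></user>";
        for (String line : describe(xml)) {
            System.out.println(line);
        }
    }
}
```

For the sample document, the walk visits the user element, its id attribute, the name element, and finally the text inside name.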

Write XML to File using DOM Parser

To create an XML file using the DOM parser, you first create an instance of the Document class using DocumentBuilder. Then define all the XML content — elements, attributes, values — with Element and Attr classes.

In the end, you use the Transformer class to output the entire XML document to an output stream, usually a file or a string.

Here is an example that creates a simple XML file using the DOM parser:

try {
    // create new `Document`
    DocumentBuilder builder = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder();
    Document dom = builder.newDocument();

    // first create root element
    Element root = dom.createElement("user");
    dom.appendChild(root);

    // set `id` attribute to root element
    Attr attr = dom.createAttribute("id");
    attr.setValue("1");
    root.setAttributeNode(attr);

    // now create child elements (name, email, phone)
    Element name = dom.createElement("name");
    name.setTextContent("John Deo");
    Element email = dom.createElement("email");
    email.setTextContent("john.doe@example.com");
    Element phone = dom.createElement("phone");
    phone.setTextContent("800 456-4578");

    // add child nodes to root node
    root.appendChild(name);
    root.appendChild(email);
    root.appendChild(phone);

    // write DOM to XML file
    Transformer tr = TransformerFactory.newInstance().newTransformer();
    tr.setOutputProperty(OutputKeys.INDENT, "yes");
    tr.transform(new DOMSource(dom), new StreamResult(new File("file.xml")));

} catch (Exception ex) {
    ex.printStackTrace();
}

Now, if you execute the above code, you should see the following file.xml file created with the default UTF-8 encoding:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<user id="1">
<name>John Deo</name>
<email>john.doe@example.com</email>
<phone>800 456-4578</phone>
</user>

If you want to output the XML document to the console, just pass StreamResult with System.out as an argument, as shown below:

// output XML document to console
tr.transform(new DOMSource(dom), new StreamResult(System.out));
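The same approach works for producing a string: StreamResult also accepts a Writer, so you can pass a StringWriter and read the XML back from it. Here is a minimal, self-contained sketch of that idea. The DomToString class and the buildUserXml() method are hypothetical names for this example, and the XML declaration is omitted only to keep the returned string short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the article's original code
class DomToString {

    // builds a small <user> document and returns it as a string
    static String buildUserXml(String id, String name) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();

            // root element with an `id` attribute
            Element root = dom.createElement("user");
            root.setAttribute("id", id);
            dom.appendChild(root);

            // single child element holding the name
            Element nameElement = dom.createElement("name");
            nameElement.setTextContent(name);
            root.appendChild(nameElement);

            // write the DOM into a StringWriter instead of a file
            Transformer tr = TransformerFactory.newInstance().newTransformer();
            tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            tr.transform(new DOMSource(dom), new StreamResult(writer));
            return writer.toString();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUserXml("1", "John Doe"));
    }
}
```

Because the text is set through setTextContent(), special characters such as & are escaped by the serializer when the document is written out.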

Read XML from File using DOM Parser

A DOM parser can also read and parse an XML file in Java. By default, the DOM parser reads the entire XML file into memory and then parses it to create a tree structure for easy traversal or manipulation.

Let us look at the example below, which uses the DOM parser to read and parse the XML file we have just created:

try {
    // parse XML file to build DOM
    DocumentBuilder builder = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder();
    Document dom = builder.parse(new File("file.xml"));

    // normalize XML structure
    dom.normalizeDocument();

    // get root element
    Element root = dom.getDocumentElement();

    // print attributes
    System.out.println("ID: " + root.getAttribute("id"));

    // print elements
    System.out.println("Name: " + root.getElementsByTagName("name").item(0).getTextContent());
    System.out.println("Email: " + root.getElementsByTagName("email").item(0).getTextContent());
    System.out.println("Phone: " + root.getElementsByTagName("phone").item(0).getTextContent());

} catch (Exception ex) {
    ex.printStackTrace();
}

Here is the output of the above program:

ID: 1
Name: John Deo
Email: john.doe@example.com
Phone: 800 456-4578
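The example above reads elements that occur only once. For repeated elements, such as the role tags in the user.xml file from the beginning of this article, you can loop over the NodeList returned by getElementsByTagName(). The following is a small sketch of that pattern. The DomRoles class and the readRoles() method are hypothetical names, and the XML is parsed from a string here only so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the article's original code
class DomRoles {

    // returns the text of every <role> element in document order
    static List<String> readRoles(String xml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // all <role> elements, regardless of nesting depth
            NodeList nodes = dom.getElementsByTagName("role");
            List<String> roles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                roles.add(nodes.item(i).getTextContent());
            }
            return roles;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<user id=\"1\">"
                + "<name>John Doe</name>"
                + "<roles><role>Member</role><role>Admin</role></roles>"
                + "</user>";
        System.out.println("Roles: " + readRoles(xml));
    }
}
```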

Note: The DOM parser is best for reading and parsing small XML files as it loads the whole file into memory. For larger XML files that contain a lot of data, you should consider using the SAX (Simple API for XML) parser. SAX reads the document as a stream of events instead of loading the entire file into memory, which keeps its memory usage low and usually makes it faster for large files.
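To give a rough idea of what the SAX approach looks like, here is a minimal sketch that reads the same fields as the DOM example above using the SAX parser bundled with the JDK. The SaxUserReader class and the readFields() method are hypothetical names for this example. It assumes a flat document shaped like file.xml and parses from a string to stay self-contained.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the article's original code
class SaxUserReader {

    // collects the id attribute and the name, email, and phone elements
    static Map<String, String> readFields(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();

        DefaultHandler handler = new DefaultHandler() {
            // text of the element currently being read
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                text.setLength(0);
                if ("user".equals(qName)) {
                    fields.put("id", attrs.getValue("id"));
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // may be called several times for one element
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName) || "email".equals(qName)
                        || "phone".equals(qName)) {
                    fields.put(qName, text.toString().trim());
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<user id=\"1\"><name>John Deo</name>"
                + "<email>john.doe@example.com</email>"
                + "<phone>800 456-4578</phone></user>";
        System.out.println(readFields(xml));
    }
}
```

Unlike DOM, the handler only sees one event at a time, so the code has to keep track of whatever state it needs (here, the text of the current element) by itself.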

Summary

Although XML is no longer the dominant data exchange format in modern systems, many legacy web services still use it as their primary exchange format. Many file formats also store their data as XML.

Java provides multiple ways to read and write XML files. In this article, we looked at JAXB and the DOM parser for reading and writing XML data to and from a file.

JAXB is a higher-level alternative to low-level XML parsers like DOM and SAX. Instead of walking nodes or handling parser events, it provides methods to read and write Java objects to and from a file. We can easily define the relationship between XML elements and object properties using JAXB annotations.

If you want to read and write JSON files, check out the how to read and write JSON in Java guide for examples.
