Formatting Strings in Java

Introduction

There are multiple ways of formatting Strings in Java. Some of them are old-school and borrowed directly from old classics (such as printf from C) while others are more in the spirit of object-oriented programming, such as the MessageFormat class.

In this article, we'll go over several of these approaches. We'll show how each technique is used and in which circumstances it fits best. With this knowledge, you'll be able to decide how to approach formatting Strings and which technique to pick.

System.out.printf()

Let's start with the old classic, printf(). As mentioned before, printf() comes from the C programming language and stands for print formatted. Under the hood, printf() uses java.util.Formatter, which we'll talk about later.

The way printf() works can be explained by its arguments. The most common way of using printf() is as follows:

System.out.printf(String format, Object... arguments);

We can see that the method expects a format and a vararg arguments. The format argument defines the way you want the String to be formatted - a template for the final result.

For example, you might want to print a decimal number with precisely seven decimal places or a number in hexadecimal representation. Or, you might have a predefined message for greeting users, but would like to format it to include the username.

The arguments vararg conveniently expects the arguments (i.e. values) for the template String. For example, if the template has placeholders for two numbers, the printf() method will also be expecting two numbers as arguments:

System.out.printf("%d %d", 42, 23);

We've put two %d symbols in the template String. Each of them is a placeholder for a certain type of value: %d stands for a decimal integer. Since we have two of them, we have to pass two integer arguments, such as 42 and 23.

Running this code will yield:

42 23
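The match between placeholders and arguments is checked at runtime, not by the compiler. As a small sketch of what that means, the following uses String.format(), which accepts the same template syntax as printf(). The class name PlaceholderCheck and the helper tryFormat() are made up for this illustration:

```java
import java.util.IllegalFormatConversionException;

public class PlaceholderCheck {

    // Returns the formatted text, or the exception name if an argument doesn't fit its placeholder.
    static String tryFormat(String template, Object... arguments) {
        try {
            return String.format(template, arguments);
        } catch (IllegalFormatConversionException e) {
            // Thrown when an argument's type is incompatible with its specifier.
            return "IllegalFormatConversionException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryFormat("%d %d", 42, 23));        // both arguments are integers
        System.out.println(tryFormat("%d %d", 42, "twenty"));  // a String can't fill a %d placeholder
    }
}
```

The first call prints 42 23, while the second one reports the exception, since a String was passed where %d expects an integer.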

Format Specifiers

With printf(), you can print values such as numbers, Strings, dates, etc. To let the method know what exactly you're trying to print, you need to provide a format specifier for each of the values. Let's take a look at an example:

System.out.printf("Hello, %s!", "reader");

If executed, this code will print Hello, reader! to the console. The %s symbol represents a format specifier for Strings, similar to how %d represents a format specifier for decimal integers.

There are many format specifiers we can use. Here are some common ones:

  • %c - Character
  • %d - Decimal integer (base 10)
  • %e - Floating-point number in scientific (exponential) notation
  • %f - Floating-point number
  • %o - Octal number (base 8)
  • %s - String
  • %x - Hexadecimal number (base 16)
  • %t - Date/time (used as a prefix, e.g. %tT)
  • %n - Newline

Note that, unlike C's printf(), Java has no %i or %u specifiers. Using either of them throws an UnknownFormatConversionException at runtime.

If we want to print, for example, a character and an octal number, we would use %c and %o specifiers, respectively. You might notice something unusual: the newline specifier. If you're not used to printf()'s behavior from C, it might seem a bit weird to have to specify things like this.

Well, printf() doesn't write a newline by default. In fact, it does almost nothing by default. Basically, if you want something to happen, you have to make it happen yourself.

That is to say - if we have multiple printf() statements without a newline specifier:

System.out.printf("Hello, %s!", "Michael Scott");
System.out.printf("Hello, %s!", "Jim");
System.out.printf("Hello, %s!", "Dwight");

The result would be:

Hello, Michael Scott!Hello, Jim!Hello, Dwight!

Though, if we include the newline character:

System.out.printf("Hello, %s!%n", "Michael Scott");
System.out.printf("Hello, %s!%n", "Jim");
System.out.printf("Hello, %s!%n", "Dwight");

Then the result would be:

Hello, Michael Scott!
Hello, Jim!
Hello, Dwight!

Note: %n is a platform-independent line separator. The Formatter replaces it with whatever the current platform uses, which is \r\n on Windows and \n on Unix-like systems. \n is the line feed character, while \r is the carriage return character. Inside a format String, %n is generally the preferred choice, since \n always emits a bare line feed regardless of the platform.
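To see several specifiers side by side, here is a minimal sketch. The class SpecifierDemo and the method describe() are hypothetical names, and Locale.ROOT is passed only to keep the output independent of the machine's regional settings:

```java
import java.util.Locale;

public class SpecifierDemo {

    // Formats the same number with several specifiers from the list above.
    static String describe(char letter, int number) {
        return String.format(Locale.ROOT,
                "char=%c dec=%d oct=%o hex=%x sci=%e%n",
                letter, number, number, number, (double) number);
    }

    public static void main(String[] args) {
        // describe() already ends with %n, so print() is used instead of println().
        System.out.print(describe('J', 255));

        // %n expands to the platform's line separator.
        System.out.println(String.format("%n").equals(System.lineSeparator()));
    }
}
```

For the values 'J' and 255, the first line reads char=J dec=255 oct=377 hex=ff sci=2.550000e+02, and the comparison on the second line prints true.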

Escape Characters

In addition to the format specifiers outlined above, there is another type of formatting symbols: Escape Characters.

Let's imagine that we want to print a " symbol using printf(). We can try something like:

System.out.printf(""");

If you try running this, the code won't even compile and the compiler will report an error. A syntax highlighter will typically even show ); as part of a String, and not as the closing bracket of the method call.

What happened was that we tried printing a symbol that has a special, reserved meaning. The quotation mark is used for denoting the beginning and end of a String.

We've started and ended a String "", after which we've opened another one " but haven't closed it. This makes printing reserved characters like this impossible, using this approach.

The way to bypass this is by escaping. To print special characters (such as ") directly we need to escape its effects first, and in Java that means prefixing it with a backslash (\). To legally print a quotation mark in Java we would do the following:

System.out.printf("\"");

The combination of \ and " specifically tells the compiler that we'd like to insert the " character in that place and that it should treat the " as a concrete value, not a reserved symbol.

Applying the escape character \ has different effects depending on the character that follows it. Only a fixed set of combinations is defined. Following \ with any other character (for example, \q) is a compile-time error.

The defined combinations (called escape sequences) have a special meaning to the compiler:

  • \b - Insert backspace
  • \f - Insert form feed
  • \n - Insert newline
  • \r - Insert carriage return
  • \t - Insert tab
  • \\ - Insert backslash
  • %% - Insert percentage sign

Thus, you would use \n to print a line feed to the console, effectively starting any new content from the beginning of the next line. Similarly, to add tabs you would use the \t escape sequence.

You might have noticed %% as the last combination.

Why is this? Why isn't \% simply used?

The % character is already an escape character specifically for the printf() method. When it is followed by characters like d, f, s, etc., the formatter knows at runtime how to treat the corresponding values.

The \ character, however, is meant for the compiler. It tells it where and what to insert. The \% combination simply isn't defined, so we use the % escape character itself to cancel the special meaning of the % that follows it.

To the compiler, the % isn't a special character, but \ is. Also, it's convention that special characters escape themselves. \ escapes \ and % escapes %.
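The split between the compiler and the formatter can be sketched as follows. The class EscapeDemo and its methods are illustrative names, not part of any library:

```java
public class EscapeDemo {

    static String quoted(String word) {
        // \" is resolved by the compiler; the Formatter only ever sees a plain quotation mark.
        return String.format("\"%s\"", word);
    }

    static String percent(int value) {
        // %% is resolved by the Formatter at runtime and yields a single percent sign.
        return String.format("%d%%", value);
    }

    static String windowsPath() {
        // Each \\ in the source code is a single backslash in the resulting String.
        return "C:\\temp\\notes.txt";
    }

    public static void main(String[] args) {
        System.out.println(quoted("stack"));   // "stack"
        System.out.println(percent(75));       // 75%
        System.out.println(windowsPath());     // C:\temp\notes.txt
    }
}
```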

Basic Usage

Let's format a String with multiple arguments of different types:

System.out.printf("The quick brown %s jumps %d times over the lazy %s.\n", "fox", 2, "dog");

The output will be:

The quick brown fox jumps 2 times over the lazy dog.

Float and Double Precision

With printf(), we can define custom precision for floating point numbers:

double a = 35.55845;
double b = 40.1245414;

System.out.printf("a = %.2f b = %.4f", a, b);

The %f specifier works for both float and double values. By adding .n between % and f, where n is the number of decimal places, we define custom precision. The value is rounded to that many places.

Running this code yields:

a = 35.56 b = 40.1245

Format Padding

We can also pad the passed String to a minimum width:

System.out.printf("%10s\n", "stack");

Here, after the % character, we've passed a number and a format specifier. Specifically, we want a String that is at least 10 characters wide, followed by a newline. Since stack only contains 5 characters, 5 spaces are added on the left as padding to "fill up" the String to the target width:

     stack

You can also add right-padding instead, by putting the - flag in front of the width:

System.out.printf("%-10s\n", "stack");
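Width, the - flag and precision can be combined in a single template, which is handy for aligning columns. The following is a small sketch. PaddingDemo and row() are made-up names, and Locale.ROOT is used only so that the decimal separator is always a dot:

```java
import java.util.Locale;

public class PaddingDemo {

    static String row(String item, int quantity, double price) {
        // %-10s : left-justified text in a 10-character column
        // %5d   : right-justified integer in a 5-character column
        // %8.2f : right-justified number, 8 characters wide, 2 decimal places
        return String.format(Locale.ROOT, "%-10s|%5d|%8.2f", item, quantity, price);
    }

    public static void main(String[] args) {
        System.out.println(row("stack", 3, 35.55845));
        System.out.println(row("overflow", 12, 40.1245414));
    }
}
```

Running this prints two rows whose columns line up:

```
stack     |    3|   35.56
overflow  |   12|   40.12
```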

Locale

We can also pass a Locale as the first argument, formatting the String according to it:

System.out.printf(Locale.US, "%,d\n", 5000);
System.out.printf(Locale.ITALY, "%,d\n", 5000);

This would produce two differently-formatted integers:

5,000
5.000

Argument Index

If no argument index is provided, the arguments will simply follow the order of presence in the method call:

System.out.printf("First argument is %d, second argument is %d", 2, 1);

This would result in:

First argument is 2, second argument is 1

However, after the % escape character and before the format specifier, we can add another command. n$, where n is a number, specifies the argument index:

System.out.printf("First argument is %2$d, second argument is %1$d", 2, 1);

Here, 2$ is located between % and d. 2$ specifies that we'd like to attach the second argument from the list of arguments to this specifier. Similarly, the 1$ specifies that we'd like to attach the first argument from the list to the other specifier.

Running this code results in:

First argument is 1, second argument is 2

You can also point both specifiers to the same argument. In our case, that means only one of the supplied arguments actually ends up in the output. That's perfectly fine, as long as every index used in the template String refers to an argument that exists. Here, 2$ requires at least two arguments to be passed:

System.out.printf("First argument is %2$d, second argument is %2$d", 2, 1);

This will result in:

First argument is 1, second argument is 1
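Reusing one argument is especially practical with date/time specifiers, where several parts are taken from the same value. Here is a minimal sketch, with ArgumentIndexDemo and its methods being hypothetical names. It also shows what happens when an index points past the supplied arguments:

```java
import java.time.LocalDate;
import java.util.Locale;
import java.util.MissingFormatArgumentException;

public class ArgumentIndexDemo {

    static String isoDate(LocalDate date) {
        // 1$ points every specifier at the same (first) argument:
        // %tY is the four-digit year, %tm the two-digit month, %td the two-digit day.
        return String.format(Locale.ROOT, "%1$tY-%1$tm-%1$td", date);
    }

    static boolean failsWithoutSecondArgument() {
        try {
            // The template refers to argument 2, but only one argument is supplied.
            String.format("%2$d", 1);
            return false;
        } catch (MissingFormatArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isoDate(LocalDate.of(2024, 3, 5)));   // 2024-03-05
        System.out.println(failsWithoutSecondArgument());        // true
    }
}
```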

System.out.format()

Before talking about System.out.format(), let's briefly focus on System.out.

On UNIX-like systems, every process has three standard streams: standard input (stdin), standard output (stdout) and standard error (stderr). The out field corresponds to stdout and is of PrintStream type.

This class has many different methods for printing formatted text-based representations to a stream, some of which are format() and printf().

According to the documentation, they both behave in exactly the same way. This means that there is no difference between the two, and either can be used to get the same result. All that we've said so far about printf() also holds for format().

Both System.out.printf() and System.out.format() print to stdout, which is typically connected to the console/terminal.
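One way to convince ourselves that the two methods are interchangeable is to capture their output and compare it. In this sketch, an in-memory PrintStream stands in for System.out. The class name PrintfVersusFormat and the capture() helper are made up for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class PrintfVersusFormat {

    // Writes the same greeting with either printf() or format() and returns what was written.
    static String capture(boolean usePrintf) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer)) {
            if (usePrintf) {
                out.printf("Hello, %s!", "Jim");
            } else {
                out.format("Hello, %s!", "Jim");
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(capture(true).equals(capture(false)));   // true

        // Both methods return the stream they were called on, so calls can be chained.
        System.out.printf("Hello, %s!", "Dwight").format("%n");
    }
}
```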

String.format()

Another way of formatting Strings is the String.format() method, which also uses java.util.Formatter internally. We'll explore Formatter in the next section.

The main advantage of String.format() over printf() is what it gives back: a String. printf() writes the result to the standard output stream and returns the PrintStream it was called on, whereas String.format() returns the formatted String itself, which can be used or reused later:

String formattedString = String.format("Local time: %tT", Calendar.getInstance());

You can now do whatever you'd like with formattedString. You can print it, save it to a file, alter it or persist it to a database. Printing it would result in:

Local time: 16:01:42

The String.format() method uses the exact same underlying principle as the printf() method. Both internally use the Formatter class to actually format the Strings. Thus, everything said for printf() also applies to the String.format() method.

Using printf(), String.format() or Formatter is essentially the same thing. What differs is where the result goes: printf() writes it to the standard output stream (typically your console), while String.format() returns it as a formatted String.

That being said, String.format() is more versatile as you can actually use the result in more than just one way.

The Formatter class

Since all of the methods above inherently call the Formatter, knowing just one means you know all of them.

The usage of Formatter is quite similar to other techniques shown before. The biggest difference is that to use it, one needs to instantiate a Formatter object:

Formatter f = new Formatter();
f.format("There are %d planets in the Solar System. Sorry, Pluto", 8);
System.out.println(f);

This raises the question:

Why wouldn't I always just use the previous methods, since they're more concise?

There is one more important distinction that makes the Formatter class quite flexible:

StringBuilder sb = new StringBuilder();
Formatter formatter = new Formatter(sb);

formatter.format("%d, %d, %d...\n", 1, 2, 3);

Instead of working only with Strings, Formatter can also work with StringBuilder which makes it possible to (re)use both classes efficiently.

In fact, Formatter is able to work with any class that implements the Appendable interface. One such example is the aforementioned StringBuilder, but other examples include classes such as BufferedWriter, FileWriter, PrintStream, PrintWriter, StringBuffer, etc. The full list can be found in the documentation.

Finally, all format specifiers, escape characters, etc. are also valid for the Formatter class as this is the main logic for formatting Strings in all three cases: String.format(), printf(), and Formatter.
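As a minimal sketch of that flexibility, the following appends several formatted pieces to one StringBuilder through a single Formatter. The class FormatterDemo and the countdown() method are illustrative names:

```java
import java.util.Formatter;
import java.util.Locale;

public class FormatterDemo {

    static String countdown(int from) {
        StringBuilder sb = new StringBuilder("Countdown: ");
        // Formatter is Closeable, so try-with-resources closes it when we're done.
        try (Formatter formatter = new Formatter(sb, Locale.ROOT)) {
            for (int i = from; i > 0; i--) {
                // Each call appends to the same StringBuilder instead of creating a new String.
                formatter.format("%d, ", i);
            }
            formatter.format("%s!", "liftoff");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(countdown(3));   // Countdown: 3, 2, 1, liftoff!
    }
}
```

Text that was already in the StringBuilder is kept, and every format() call simply adds to the end of it.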

MessageFormat

Lastly, let's look at a formatting technique that doesn't use Formatter under the hood.

MessageFormat was made to produce concatenated messages in a language-neutral way. "Language" here refers to human languages, not programming languages: the message pattern can be swapped out per locale (for example, loaded from a resource bundle), while the Java code that supplies the arguments stays the same.

MessageFormat extends the abstract Format class, just as DateFormat and NumberFormat do. The Format class is meant to format locale-sensitive objects into Strings.

Let's see a nice example, courtesy of MessageFormat's documentation.

int planet = 7;
String event = "a disturbance in the Force";

String result = MessageFormat.format(
	"At {1, time} on {1, date}, there was {2} on planet {0, number, integer}.",
	planet, new Date(), event
);

Code credit: Oracle Docs

The output is:

At 11:52 PM on May 4, 2174, there was a disturbance in the Force on planet 7.

Instead of the percentage specifiers we've seen so far, here we're using curly brackets for each placeholder. Let's take the first one, {1, time}. The number 1 is the index of the argument that should be used in its place. Indices start at 0, so with the arguments planet, new Date(), and event, index 1 refers to new Date().

The second part, time, refers to the type of the value. Top-level format types are number, date, time, and choice. For each of the values, a more specific selection can be made, such as with {0, number, integer}, which says that the value should be treated not only as a number, but also as an integer.

The complete set of format types and subtypes can be found in the documentation.
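The choice type deserves a quick illustration, since it picks a different piece of text depending on a numeric argument. The sketch below is modeled on the file-count example from the MessageFormat documentation. The class name ChoiceDemo and the method describe() are made up, and Locale.US is passed explicitly so that the thousands separator is predictable:

```java
import java.text.MessageFormat;
import java.util.Locale;

public class ChoiceDemo {

    // 0# applies to 0, 1# applies to exactly 1, and 1< applies to anything greater than 1.
    private static final String PATTERN =
            "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    static String describe(int fileCount) {
        MessageFormat format = new MessageFormat(PATTERN, Locale.US);
        // The arguments are passed as an array; index 0 is the file count.
        return format.format(new Object[] { fileCount });
    }

    public static void main(String[] args) {
        System.out.println(describe(0));      // There are no files.
        System.out.println(describe(1));      // There is one file.
        System.out.println(describe(1273));   // There are 1,273 files.
    }
}
```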

Conclusion

In this article, we've gone over a fair number of ways to format Strings in core Java.

Each of the techniques that we've shown has its own reason for existence. printf(), for example, is reminiscent of the old-school C function of the same name.

Other approaches, such as Formatter or MessageFormat, offer a more modern approach that exploits some benefits of object-oriented programming.

Each technique has specific use-cases, so hopefully, you'll be able to know when to use each in the future.

About Luka Čupić
Croatia