Strip Non-Numeric Characters from a String in JavaScript

Introduction

Let's say you're taking input from a user and you're expecting them to submit a number. You'd be wise to expect them to enter something other than a number. After all, users aren't very good at following instructions. So what do you do with the input? One option is to strip all non-numeric characters, such as spaces, newlines, commas, and periods, from the string. In this Byte, we'll see a few ways in which you can do that.

Why Remove Non-Numeric Characters?

You might be wondering why one would need to remove non-numeric characters from a string. Well, in programming and data analysis, clean and consistent data is king. Non-numeric characters mixed with numeric data will almost certainly cause issues or inconsistencies in your processing. For example, if you're working on a project that requires numerical input, and a user enters "123abc", your app likely won't be able to process it, at least not correctly.
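To make the "123abc" problem concrete, here is a small sketch of how two built-in conversions treat that input. The variable names are illustrative only and are not part of the original article.

```javascript
// "123abc" is the example input from the paragraph above.
const raw = "123abc";

// Number() requires the whole string to be numeric, so this yields NaN.
const asNumber = Number(raw);

// parseInt() stops at the first non-digit and silently returns what it has.
const asInt = parseInt(raw, 10);

console.log(asNumber); // NaN
console.log(asInt);    // 123
```

Neither result is obviously "right": one rejects the input outright and the other quietly discards part of it, which is why cleaning the string explicitly is often preferable.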

How to Remove Non-Numeric Characters from a String

In JavaScript, there are several ways to remove non-numeric characters from a string. We'll explore a few of these methods in the following sections.

Using Split, Filter, and Join

I'll start by saying that this is not the recommended way to do this (see the next section for that), but it does serve as an interesting exercise for new programmers.

One way to remove all non-numeric characters from a string is to filter them out manually. We can do this with the following steps:

  1. Turn our string into an array using split('')
  2. Filter out any characters that aren't digits by checking whether each one falls between the strings '0' and '9' using filter(...)
  3. Join the resulting array back into a string with join('')

Together, the code would look like this:

let str = "123abc456def";
let newStr = str
             .split('')  // ['1','2','3', ... 'd','e','f']
             .filter(s => s >= '0' && s <= '9') // ['1','2','3','4','5','6']
             .join('');  // "123456"

console.log(newStr);  // Output: "123456"

As you can see, array methods can get the job done. However, this approach is more verbose and error-prone than it needs to be. The next section covers a simpler, more intuitive method.

Using the Replace Method

The replace() method in JavaScript is used to return a new string with some or all matches of a pattern replaced by a given string. We can use this method to replace all non-numeric characters in a string with an empty string, i.e. "". Here's how it's done:

let str = "123abc456def";
let newStr = str.replace(/\D/g, "");

console.log(newStr);  // Output: "123456"

In this code snippet, we use the \D character class in a regular expression (regex) to match any character that's not a digit. The g flag ("global") is used to match all occurrences, as opposed to just the first one.

Alternatively, you could also use the [^0-9] pattern to match any character that's not a digit. The ^ symbol inside the square brackets negates the pattern, meaning it will match anything that's not in the range 0-9.
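As a quick check that the two patterns behave the same on ASCII input, the sketch below runs both against the same string and then wraps the [^0-9] form in a small helper. The helper name digitsOnly and the phone-number input are hypothetical, chosen only for illustration.

```javascript
const input = "123abc456def";

// Both patterns remove everything that is not one of the digits 0-9.
const withShorthand = input.replace(/\D/g, "");
const withRange = input.replace(/[^0-9]/g, "");

console.log(withShorthand);               // "123456"
console.log(withShorthand === withRange); // true

// Hypothetical helper: coerce the value to a string, then strip non-digits.
function digitsOnly(value) {
  return String(value).replace(/[^0-9]/g, "");
}

console.log(digitsOnly("(555) 010-9999")); // "5550109999"
```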

Removing Specific Non-Numeric Characters

Sometimes we might not want to remove all non-numeric characters from a string, but just certain non-numeric characters. In this case, we can modify our regular expression to only target the specific characters we want to remove.

For example, let's say we want to remove all occurrences of the characters 'a', 'b', and 'c' from our string. Here's how we can do that:

let str = "123abc456def";
let newStr = str.replace(/[abc]/g, "");
console.log(newStr);  // "123456def"

In this example, the regular expression /[abc]/g matches any occurrence of 'a', 'b', or 'c', and the replace() method replaces these characters with an empty string, effectively removing them.
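If the characters to remove are only known at runtime, one option is to build the character class from a string. The sketch below assumes a hypothetical helper called removeChars; it escapes the few characters that have special meaning inside square brackets before constructing the regular expression.

```javascript
// Hypothetical helper: remove every character listed in `chars` from `str`.
function removeChars(str, chars) {
  // Inside [...], the characters \ ] ^ and - can be special, so escape them.
  const escaped = chars.replace(/[\\\]^-]/g, "\\$&");

  // Nothing to remove: return the input unchanged.
  if (escaped === "") return str;

  return str.replace(new RegExp("[" + escaped + "]", "g"), "");
}

console.log(removeChars("123abc456def", "abc")); // "123456def"
console.log(removeChars("1.5+2", ".+"));         // "152"
```

Building the pattern with new RegExp(...) rather than a regex literal is what allows the character list to be supplied as a variable.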

Removing Non-Numeric Characters Except Certain Symbols

In another variation, we may want to keep certain non-numeric characters while removing all others. This is common when working with strings that represent things like phone numbers or monetary values, where symbols like '+' or '$' are important to the formatting of the number.

For example, let's say we want to remove all non-numeric characters from a string, except for '+', '-', and '.'. Keeping these characters would preserve the sign and decimal place of the number.

We can do this with the following code:

let str = "+123.(456789)";
let newStr = str.replace(/[^0-9\+\-\.]/g, "");
console.log(newStr);  // "+123.456789"

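In this case, the regular expression /[^0-9\+\-\.]/g matches any character that is not a digit, '+', '-', or '.'. Inside a character class, '+' and '.' don't strictly need to be escaped, though the backslashes are harmless here. Again, the replace() method then replaces these characters with an empty string.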
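Keeping '+', '-', and '.' does not guarantee that the result is a valid number, so it can help to convert and check it afterwards. The following is a minimal sketch, assuming a hypothetical helper named toNumberOrNaN; it is one possible approach, not the only one.

```javascript
// Hypothetical helper: strip unwanted characters, then try to convert.
function toNumberOrNaN(input) {
  // Same idea as the pattern above, written without the optional escapes.
  const cleaned = String(input).replace(/[^0-9+.-]/g, "");

  // Number("") is 0, so treat an empty result as "no number found".
  return cleaned === "" ? NaN : Number(cleaned);
}

console.log(toNumberOrNaN("+123.(456789)")); // 123.456789
console.log(toNumberOrNaN("$1,234.50"));     // 1234.5
console.log(toNumberOrNaN("1.2.3"));         // NaN (two decimal points)
console.log(toNumberOrNaN("abc"));           // NaN (nothing left to parse)
```

A caller can then use Number.isNaN() on the result to decide whether to accept the input or ask the user to try again.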

Unicode and Non-ASCII Characters

When dealing with strings in JavaScript, you need to remember that not all characters are created equal. JavaScript uses Unicode, a standard that includes a much wider range of characters than ASCII. This includes non-numeric characters like emojis, accented letters, and characters from non-Latin scripts.

If your string includes these types of characters and you want to keep some of them, you'll need to change your regular expression to handle them. For example, to remove all non-numeric characters except for emoji, you could use the following code:

let str = "123🙂456🙃789";
let newStr = str.replace(/[^\d\p{So}]/gu, "");
console.log(newStr);  // "123🙂456🙃789"

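In this case, the regular expression /[^\d\p{So}]/gu matches any character that is neither a digit nor a member of the Unicode "Other Symbol" category, which \p{So} refers to and which includes many emoji. The u flag is required for \p{...} escapes to work. Because the sample string contains only digits and emoji, it comes back unchanged.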
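A related point is that \d in JavaScript only matches the ASCII digits 0-9, so digits from other scripts are stripped by \D. If those should be kept, the Unicode "decimal number" category \p{Nd} is one way to do it. The sketch below uses Arabic-Indic digits, written as escape sequences, purely as an illustrative input.

```javascript
// "\u0661\u0662\u0663" are the Arabic-Indic digits for 1, 2, and 3.
const mixed = "\u0661\u0662\u0663abc456";

// \D treats the Arabic-Indic digits as non-digits and removes them.
const asciiOnly = mixed.replace(/\D/g, "");

// \P{Nd} (capital P) matches anything that is NOT a Unicode decimal digit.
const anyDigit = mixed.replace(/\P{Nd}/gu, "");

console.log(asciiOnly); // "456"
console.log(anyDigit);  // the three Arabic-Indic digits followed by "456"
```

Keep in mind that Number() and parseInt() only understand ASCII digits, so a string cleaned this way may still need further conversion before it can be used in arithmetic.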

Note: Unicode property escapes are a relatively new feature in JavaScript and might not be supported in all environments. Be sure to check compatibility before using them in production code.
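One way to guard against missing support is to test for it at runtime by constructing the pattern inside a try/catch. The sketch below assumes two hypothetical helpers; the fallback behavior (keeping digits only) is an arbitrary choice made for illustration.

```javascript
// Hypothetical helper: returns true if the engine accepts \p{...} with the u flag.
function supportsUnicodePropertyEscapes() {
  try {
    // Built from a string so that older engines don't fail at parse time.
    new RegExp("\\p{So}", "u");
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: keep digits and "Other Symbol" characters when possible.
function keepDigitsAndSymbols(value) {
  if (supportsUnicodePropertyEscapes()) {
    return value.replace(new RegExp("[^\\d\\p{So}]", "gu"), "");
  }
  // Fallback (an assumption): keep ASCII digits only, which also drops emoji.
  return value.replace(/\D/g, "");
}

console.log(supportsUnicodePropertyEscapes());
console.log(keepDigitsAndSymbols("12a\u{1F642}3"));
```

Using the RegExp constructor instead of a regex literal matters here: a literal containing \p{So} with the u flag would cause a syntax error for the whole script in an engine that doesn't support it.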

Conclusion

In this Byte, we used regular expressions in JavaScript to show how they can be used to remove non-numeric characters from strings. We've seen how to remove specific characters, how to keep certain symbols, and even how to handle Unicode and non-ASCII characters.

Last Updated: September 12th, 2023
© 2013-2024 Stack Abuse. All rights reserved.
