Working with JavaScript's Built-In String Functions

Introduction

When working with any programming language, you'll probably need some functionality that isn't built into the language itself. You'll then either implement it yourself or turn to modules and libraries.

External dependencies directly affect the efficiency of your application (more memory usage, more HTTP requests, etc.). To reduce that cost, language designers build functions for common tasks into the language itself, so you don't need an external library for them.

Getting acquainted with these built-in functions is considered fundamental knowledge of a language, and you can still get pretty far with just the built-in functions. Of course, you'll most likely end up using some modules/libraries for certain tasks.

In this beginner-oriented guide, we'll take a look at the built-in functions of JavaScript pertaining to Strings.

JavaScript's Data Types, Structures and Objects with Built-in Functions

In JavaScript, there are eight data types:

  1. String
  2. Number
  3. Boolean
  4. Null
  5. Undefined
  6. Symbol
  7. BigInt
  8. Object

However, not every data type has built-in functions: null and undefined have none. The other primitive types, including String, Number and Boolean, expose them through their wrapper objects.
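
As a rough illustration of the point above, the sketch below calls a method on a few primitive values and shows that null offers none. The variable names are made up for this example.

```javascript
// Strings, numbers and booleans are temporarily wrapped in an object
// whenever a method is called on them.
const fromString = "abc".toUpperCase(); // "ABC"
const fromNumber = (255).toString(16);  // "ff"
const fromBoolean = true.toString();    // "true"

// null (like undefined) has no wrapper object, so a method call throws.
let nullError = null;
try {
  null.toString();
} catch (err) {
  nullError = err; // a TypeError
}

console.log(fromString, fromNumber, fromBoolean); // ABC ff true
console.log(nullError instanceof TypeError);      // true
```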

When it comes to Data Structures in JavaScript, the seven most used structures are:

  1. Array
  2. Stack
  3. Queue
  4. Linked List
  5. Tree
  6. Graph
  7. Hashtable

Of these seven, only Array is provided by the language with built-in functions of its own (Map and Set, which are also built in, cover many hashtable use cases). Finally, built-in objects such as Date, RegExp and Math also come with functions.

In this guide, we'll be focusing on strings specifically.

Built-in String Functions in JavaScript

A string is, as previously mentioned, one of eight data types in JavaScript. It is, in essence, a sequence (string) of characters.

Additionally, it's worth noting that strings are immutable - once a string object is created, it can't be changed. Any string-changing functions will create a new string object and return it, instead of modifying the original one.

Because strings are array-like, you can retrieve individual characters through the string[index] notation, just as you would with an array.
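
A minimal sketch of both properties described above, index access and immutability. The variable names are only for illustration.

```javascript
const greeting = "hello";

// Characters can be read by position, and length works as it does on arrays.
console.log(greeting[0]);     // h
console.log(greeting.length); // 5

// Methods never modify the original; they hand back a new string.
const shouted = greeting.toUpperCase();
console.log(shouted);  // HELLO
console.log(greeting); // hello

// Array methods such as map() are not available on a string directly.
// Spreading the string into a real array makes them usable.
const letters = [...greeting];
console.log(Array.isArray(greeting), Array.isArray(letters)); // false true
```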

That being said, let's start out with the built-in functions pertaining to strings.

toString()

toString() is one of the most commonly used functions pertaining to strings. It is available on all objects and returns a string representation of the object, effectively converting a value of any type into its string representation:

let x = 100;
console.log(x.toString()); // Output: 100

toString() will behave differently with each object, depending on its implementation of the function - what it means to represent that object as a string. Additionally, take note that if either operand of the + operator is a string, JavaScript treats the operation as concatenation rather than addition:

let x = 100;
let y = 200;
   
let z1 = x+y;
let z2 = x.toString() + y;
   
console.log(z1); // Output: 300 
console.log(z2); // Output: 100200

Here, z1 is of type Number since we're adding two variables of type Number together, and z2 is of type String since the first operand is a String, so y is internally converted to a String and appended to it. If you'd like to convert an arithmetic result into a string, make sure to perform the conversion at the end.
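
To make the earlier remark that toString() behaves differently with each object more concrete, here is a short sketch. The Point class is a hypothetical example, not part of JavaScript.

```javascript
// Arrays join their elements with commas.
console.log([1, 2, 3].toString()); // 1,2,3

// Plain objects fall back to a generic tag.
console.log({ a: 1 }.toString()); // [object Object]

// Numbers accept an optional radix.
console.log((255).toString(2)); // 11111111

// A class can define what its own string representation looks like.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  toString() {
    return `(${this.x}, ${this.y})`;
  }
}

const p = new Point(1, 2);
console.log(p.toString());  // (1, 2)
console.log("Point: " + p); // Point: (1, 2)
```

The last line shows that concatenation with + ends up calling the object's toString() as well.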

concat()

concat() adds two strings together and returns a new string:

let x = "some ";
let y = "string";
   
console.log(x.concat(y)); // Output: some string

It essentially performs the same operation as:

let x = "some ";
let y = "string";
   
console.log(x+y); // Output: some string
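
As a small extension of the comparison above, concat() also accepts several arguments and converts non-string arguments to strings. The variable names below are illustrative only.

```javascript
const first = "some";
const second = "string";

// concat() accepts any number of arguments.
const viaConcat = first.concat(" ", second, "!");

// The + operator and template literals build the same value.
const viaPlus = first + " " + second + "!";
const viaTemplate = `${first} ${second}!`;

console.log(viaConcat); // some string!
console.log(viaConcat === viaPlus && viaPlus === viaTemplate); // true

// Non-string arguments are converted to strings before being appended.
console.log("".concat(1, 2)); // 12
```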

You'll sometimes see concat() recommended over the operators for performance reasons. You won't gain much from a single concatenation; the claim is that the difference shows when joining large numbers of strings. Let's benchmark it real quick:

console.time('Concatenating with Operator');
concatWithOperator();
console.timeEnd('Concatenating with Operator');

console.time('Concatenating with Function');
concatWithFunction();
console.timeEnd('Concatenating with Function');

function concatWithOperator() {
    let result = "";
    for (let i = 0; i < 10000; i++) {
      result += i;
    }
}

function concatWithFunction() {
    let result = "";
    for (let i = 0; i < 10000; i++) {
      result = result.concat(i);
    }
}

This results in:

Concatenating with Operator: 3.232ms
Concatenating with Function: 1.509ms

The function is around two times faster on this code. It's also worth noting the official statement from MDN, regarding the performance benefits:

It is strongly recommended that the assignment operators (+, +=) are used instead of the concat() method.

Which might seem odd, given the fact that concat() outperforms the operators in the tests. What gives? Well, benchmarking code like this isn't as easy as simply running it and observing the results.

Your browser, its version, as well as the optimizer it uses may vary from machine to machine, and properties like those really impact the performance. For instance, we've used different strings in the concatenation, the ones generated from iteration. If we were to use the same string, an optimizer such as Google's V8 would further optimize the usage of the string.

As a rule of thumb - test and verify your own code instead of taking advice at face value.
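
If you want to repeat the measurement yourself, a slightly more careful harness might look like the sketch below. It is only one possible approach: medianTime() is a hypothetical helper, the run count is arbitrary, and it assumes a runtime with a global performance object (modern browsers and recent Node.js versions). The numbers it prints will differ between machines and engines.

```javascript
// Build the same string two ways and return it, so the work stays observable.
function buildWithOperator(count) {
  let result = "";
  for (let i = 0; i < count; i++) {
    result += i;
  }
  return result;
}

function buildWithConcat(count) {
  let result = "";
  for (let i = 0; i < count; i++) {
    result = result.concat(i);
  }
  return result;
}

// Hypothetical helper: run fn several times and report the median duration
// in milliseconds, which is less sensitive to warm-up and outliers than a
// single run.
function medianTime(fn, runs = 21) {
  const samples = [];
  let totalLength = 0;
  for (let r = 0; r < runs; r++) {
    const start = performance.now();
    totalLength += fn().length; // use the result so it is not dead code
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return { median: samples[Math.floor(samples.length / 2)], totalLength };
}

const operatorRun = medianTime(() => buildWithOperator(10000));
const concatRun = medianTime(() => buildWithConcat(10000));

console.log(`operator: ${operatorRun.median.toFixed(3)}ms`);
console.log(`concat(): ${concatRun.median.toFixed(3)}ms`);
```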

toLocaleUpperCase() and toUpperCase()

toLocaleUpperCase() converts the given string to uppercase, following the case-mapping rules of the host environment's current locale. Additionally, you can specify the locale via a string argument:

let word = "Straße";

console.log(word.toUpperCase()); // STRASSE
console.log(word.toLocaleUpperCase('de-DE')); // STRASSE

In most cases, toUpperCase() and toLocaleUpperCase() will return the same result, even if you don't supply the locale specifier. It's only for locales whose case mappings differ from the default Unicode ones, such as Turkish, that toLocaleUpperCase() will produce different results.
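
The Turkish case can be sketched as follows. This assumes a runtime with locale-aware case mapping (current browsers and standard Node.js builds); without it, the locale-specific calls may simply behave like their locale-agnostic counterparts.

```javascript
const letter = "i";

// Locale-agnostic mapping: i becomes the ordinary capital I.
console.log(letter.toUpperCase()); // I

// Turkish distinguishes dotted and dotless i, so the result is İ (U+0130).
console.log(letter.toLocaleUpperCase("tr-TR")); // İ

// Going the other way, capital I lowers to dotless ı (U+0131) in Turkish.
console.log("I".toLocaleLowerCase("tr-TR")); // ı
```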

toLocaleLowerCase() and toLowerCase()

toLocaleLowerCase() performs much the same as toLocaleUpperCase(), but converts the string to a lowercase one. Similarly, toLowerCase() is locale-agnostic. Though, be aware that certain information is lost when converting between uppercase and lowercase.

For instance, if we convert 'Straße' to uppercase and then back to lowercase, the ß is not restored:

let word = "Straße";

let upperCase = word.toLocaleUpperCase('de-DE');

console.log(upperCase); // STRASSE
console.log(upperCase.toLocaleLowerCase('de-DE')); // strasse

The ß is lost because 'SS' simply lowercases to 'ss'; nothing in the uppercase string records that it came from a single ß. German follows the standard Unicode mapping here, so toLocaleLowerCase() produces the same result as toLowerCase(), which just changes each character to its lowercase counterpart.

substring()

substring(start, end) returns a string containing the characters of the original string from the start index up to, but not including, the end index.

let x = "this is some string";
   
console.log(x.substring(3, 7)); // Output: s is

As you can see, the end index is not inclusive, so the string outputted is from start to end-1.

Additionally, this of course returns a new string, so you can either capture it by assigning it to a new reference variable, or just use it as input to a new function. The original string stays unchanged:

let x = "this is some string";
let y = x.substring(3, 7);
   
console.log(x); // Output: this is some string
console.log(y); // Output: s is

If you call substring() with an end beyond the length of the string, you'll simply get all the existing characters up to the end:

let x = "this is some string";
console.log(x.substring(10, 25)); // Output: me string

This function is particularly useful for truncating output or input, or even checking whether a string starts with or contains a given substring.
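
The two uses mentioned above could be sketched like this. truncate() is a hypothetical helper written for this example, and the "..." suffix is an arbitrary choice.

```javascript
// Hypothetical helper: cut text down to maxLength characters and
// append "..." when something was removed.
function truncate(text, maxLength) {
  if (text.length <= maxLength) {
    return text;
  }
  return text.substring(0, maxLength) + "...";
}

console.log(truncate("this is some string", 7)); // this is...
console.log(truncate("short", 7));               // short

// Checking a prefix by comparing the leading characters.
const x = "this is some string";
console.log(x.substring(0, 4) === "this"); // true

// substring() treats negative values as 0 and swaps the arguments
// when start is greater than end.
console.log(x.substring(7, 3));  // s is
console.log(x.substring(-5, 4)); // this
```

For plain prefix or containment checks, the built-in startsWith() and includes() functions usually read more clearly than a manual comparison.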

substr(start, length)

Similar to substring(), the substr() function returns part of the original string. Here we specify the start index and the size of the desired substring, which is length, instead of a concrete end-point:

let x = "this is some string";
   
console.log(x.substr(3, 4)); // Output: s is

If the length is beyond the scope of a string, you simply substring until the end:

let x = "hello";
console.log(x.substr(3, 10)); // Output: lo
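
Note that substr() is specified as a legacy feature in the ECMAScript standard, so newer code often reaches for slice() or substring() instead. The sketch below, offered only as a comparison, shows how the same spans can be expressed with slice():

```javascript
const x = "this is some string";

// substr(start, length) and slice(start, end) describe the same span differently.
console.log(x.substr(3, 4)); // s is
console.log(x.slice(3, 7));  // s is

// Both accept a negative start, counted back from the end of the string.
console.log(x.substr(-6, 3)); // str
console.log(x.slice(-6));     // string
```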

split()

The split(separator, limit) function splits a string into an array of strings at each occurrence of the separator provided, returning at most limit parts. The limit is optional; without it, every part is returned.

let x = "this is some string";
   
console.log(x.split(" ", 4)); // Output: ['this', 'is', 'some', 'string']
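
A few more calls, as a quick sketch of how the separator and limit arguments interact:

```javascript
const x = "this is some string";

// Without a limit, every piece is returned.
console.log(x.split(" "));    // ['this', 'is', 'some', 'string']

// limit caps the number of pieces; the remainder is discarded, not appended.
console.log(x.split(" ", 2)); // ['this', 'is']

// An empty separator splits between every UTF-16 code unit.
console.log("abc".split("")); // ['a', 'b', 'c']

// A separator that never occurs yields a one-element array.
console.log(x.split(","));    // ['this is some string']
```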

This can be useful for parsing CSV lines if you're not using any external libraries, as comma-separated values are easily split via the split() function. However, when dealing with CSV files, you'll want to validate the input in case it isn't properly formatted.

Typically, you'll be using libraries for this as they make things much easier.

If you'd like to read more about reading CSV files in JavaScript, read our guides on Reading and Writing CSV with node-csv and Reading and Writing CSV with Node.js.
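
For simple, well-behaved input, the split-and-validate idea might be sketched as below. parseCsvLine() and its expectedColumns parameter are hypothetical, and this naive version deliberately ignores quoted fields that contain commas, which is one reason a library is usually the safer choice.

```javascript
// Hypothetical helper: parse one CSV line and check the column count.
// It does not handle quoted fields containing commas.
function parseCsvLine(line, expectedColumns) {
  const fields = line.split(",").map((field) => field.trim());
  if (fields.length !== expectedColumns) {
    throw new Error(
      `Expected ${expectedColumns} columns but found ${fields.length}`
    );
  }
  return fields;
}

console.log(parseCsvLine("Jane, Doe, 42", 3)); // ['Jane', 'Doe', '42']

// A malformed line is reported instead of being silently accepted.
try {
  parseCsvLine("Jane, Doe", 3);
} catch (err) {
  console.log(err.message); // Expected 3 columns but found 2
}
```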

charAt() and string[index]

The charAt(index) function returns the character at the specified index.

let x = "abc123";
   
console.log(x.charAt(2)); // Output: c

You can use this to iterate through a string and retrieve its contents, for instance:

let x = "some string";

for (let i = 0; i < x.length; i++) {
    console.log(x.charAt(i));
}

Which results in:

s
o
m
e
 
s
t
r
i
n
g

What's the difference between x.charAt(y) and x[y]?

There are a couple of reasons why you might prefer charAt() over the array notation:

let x = "some string";

// There is no element 5.7
console.log(x[5.7]);

// 5.7 gets rounded down to 5
console.log(x.charAt(5.7));

// The array notation makes it appear as if we can just assign
// new values to elements, even though strings are immutable
x[5] = 'g';
console.log(x);

// With charAt(), it's much more obvious that
// this line doesn't make sense and will throw an exception
x.charAt(5) = 'g';

However, a double-edged sword is hidden in the implementation of the charAt() function - it evaluates the given index and processes it.

That's why 5.7 was rounded down to 5. It will also do this processing step for inputs that might not actually be valid, but will give the illusion of code that runs just fine:

let x = "some string";

console.log(x.charAt(true));
console.log(x.charAt(NaN));
console.log(x.charAt(undefined));
console.log(x.charAt([]));
console.log(x.charAt(""));

true is converted to 1, while false would be converted to 0. NaN, undefined, an empty array and an empty string are also converted into 0, so this runs just fine, even though it intuitively shouldn't:

o
s
s
s
s

On the other hand, using the more modern array notation:

console.log(x[true]);
console.log(x[NaN]);
console.log(x[undefined]);
console.log(x[[]]);
console.log(x[""]);

These produce a more intuitive result, denoting a failure of input:

undefined
undefined
undefined
undefined
undefined
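
The two notations also differ for positions outside the string. The sketch below additionally uses at(), which was added in ES2022 and is assumed to be available in your runtime.

```javascript
const x = "some string";

// Out-of-range positions: charAt() returns an empty string,
// while the index notation returns undefined.
console.log(x.charAt(100) === ""); // true
console.log(x[100]);               // undefined

// Neither form accepts negative positions to count from the end.
console.log(x.charAt(-1) === "");  // true
console.log(x[-1]);                // undefined

// at() does accept negative positions.
console.log(x.at(-1));             // g
```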

indexOf()

indexOf(character) returns the index of the first occurrence of the specified character (or longer substring):

let x = "aaabbb";
   
console.log(x.indexOf("b")); // Output: 3

If the character doesn't exist, -1 is returned:

let x = "some string";

console.log(x.indexOf('h')); // Output: -1

You can optionally also skip the first n characters by specifying a fromIndex as the second argument:

let x = "aaabbb";
   
console.log(x.indexOf("b", 4)); // Output: 4

Here, we skip the first 4 characters (indices 0 to 3) and start searching at index 4. The character at index 4 happens to be the 'b' we're searching for, so that index is returned.
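
The fromIndex argument is what makes it possible to walk through every match rather than just the first. A minimal sketch, with findAllIndexes() being a hypothetical helper:

```javascript
// Hypothetical helper: collect the index of every occurrence of needle.
function findAllIndexes(text, needle) {
  const indexes = [];
  if (needle === "") {
    return indexes; // an empty needle would match at every position
  }
  let position = text.indexOf(needle);
  while (position !== -1) {
    indexes.push(position);
    // Resume the search just past the match that was found.
    position = text.indexOf(needle, position + 1);
  }
  return indexes;
}

console.log(findAllIndexes("aaabbb", "b"));  // [3, 4, 5]
console.log(findAllIndexes("banana", "an")); // [1, 3]
```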

lastIndexOf()

lastIndexOf(character) returns the index value of the last occurrence of the specified character:

let x = "aaabbb";
    
console.log(x.lastIndexOf("b")); // Output: 5

Much the same rules apply as for the indexOf() function:

let x = "aaabbb";
   
console.log(x.lastIndexOf("b")); // Output: 5
console.log(x.lastIndexOf("b", 3)); // Output: 3
console.log(x.lastIndexOf("g")); // Output: -1

The method searches backwards from the end of the string. If we supply a fromIndex argument, that index is still counted from the left, and the backwards search starts there. In our case:

//       012345
let x = "aaabbb";
//          ↑ lastIndexOf() start

So lastIndexOf() searches from index 3 back towards 0, as we've set fromIndex to 3.
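
A common use of lastIndexOf() is picking out whatever follows the final separator. The sketch below uses a hypothetical getExtension() helper and makes the simplifying assumption that a leading dot (as in ".gitignore") does not start an extension.

```javascript
// Hypothetical helper: return the extension of a file name, or "" if none.
function getExtension(fileName) {
  const dot = fileName.lastIndexOf(".");
  // -1 means there is no dot at all; 0 means a dot-file such as ".gitignore".
  if (dot <= 0) {
    return "";
  }
  return fileName.substring(dot + 1);
}

console.log(getExtension("archive.tar.gz"));    // gz
console.log(getExtension("README") === "");     // true
console.log(getExtension(".gitignore") === ""); // true
```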

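search()

The search(string) function searches for a match and, if one is found, returns the index at which it begins; otherwise it returns -1. Note that the argument is treated as a regular expression, so a plain string is converted to one first: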

let x = "JavaScript, often abbreviated as JS, is a programming language that conforms to the ECMAScript specification. JavaScript is high-level, often just-in-time compiled, and multi-paradigm.";
    
console.log(x.search("programming")); // Output: 42

In case of multiple strings fitting the search keyword, such as 'JavaScript', only the starting index of the first matching case is returned:

let x = "JavaScript, often abbreviated as JS, is a programming language that conforms to the ECMAScript specification. JavaScript is high-level, often just-in-time compiled, and multi-paradigm.";
    
console.log(x.search("JavaScript")); // Output: 0
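
Since search() works with regular expressions, it can do things indexOf() can't, but it also interprets special characters differently. A short sketch, using a shortened sentence for readability:

```javascript
const sentence = "JavaScript conforms to the ECMAScript specification.";

// A regular expression allows case-insensitive or pattern-based searches.
console.log(sentence.search(/ecmascript/i)); // 27
console.log(sentence.search(/\d/));          // -1

// A string argument is converted to a regular expression, so "." means
// "any character" here, while indexOf() looks for a literal dot.
console.log("a.b".search("."));  // 0
console.log("a.b".indexOf(".")); // 1
```

If you only need to find a literal substring, indexOf() avoids this kind of surprise.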

Conclusion

JavaScript is a widespread language, prevalent on the web, and getting acquainted with its fundamental built-in functions will help you avoid unnecessary external libraries when you can achieve the same result in vanilla JS.

In this guide, we've taken a look at the built-in functions of strings - one of the most common data types available in JavaScript.

Last Updated: September 12th, 2021

© 2013-2024 Stack Abuse. All rights reserved.