Extracting Meta Tag Information Using JavaScript

Introduction

When building, analyzing, or scraping web pages, it's often necessary to extract meta tag information. These tags provide data about the HTML document, like descriptions, keywords, author information, and more.

In this Byte, we'll explain how to extract this data using JavaScript.

Retrieving Meta Tag Data

To retrieve meta tag data, we can use the querySelector() method in JavaScript. This method returns the first Element within the document that matches the specified selector or group of selectors, or null if nothing matches.

Here's an example:

let metaDescription = document.querySelector("meta[name='description']")
                      .getAttribute("content");
console.log(metaDescription);

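In this code, we're querying for a meta tag with the name 'description' and then getting the 'content' attribute of that tag. The console will output the description of the page. If the page has no description tag, querySelector() returns null and the chained getAttribute() call throws a TypeError.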
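
To avoid a TypeError when the tag is absent, the lookup can be wrapped in a small helper. The sketch below is one way to do it, not the only one: getMetaContent and its doc parameter are hypothetical names introduced here for illustration, and the helper assumes the tag name contains no double quotes.

```javascript
// Hypothetical helper (not a built-in): returns the content of
// <meta name="..."> or null when the tag or attribute is missing.
// `doc` defaults to the browser's global document; passing it explicitly
// lets the helper work with any Document-like object.
// Assumes `name` contains no double quotes.
function getMetaContent(name, doc = document) {
    const tag = doc.querySelector(`meta[name="${name}"]`);
    return tag ? tag.getAttribute("content") : null;
}

// Only run the demo where a DOM is available (i.e. in a browser).
if (typeof document !== "undefined") {
    console.log(getMetaContent("description") ?? "No description tag found");
}
```

Returning null instead of throwing lets the caller decide on a fallback, as the demo line does with the ?? operator.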

Working with Open Graph (OG) Meta Tags

Open Graph meta tags are used to enrich the "preview" of a webpage on social media or in messaging apps. They allow you to specify the title, description, and image that will be used when your page is shared.

Open Graph tags carry their key in the "property" attribute rather than "name", so the selector changes accordingly. To fetch the Open Graph title of a page, you can use the following code:

let ogTitle = document.querySelector("meta[property='og:title']")
              .getAttribute("content");
console.log(ogTitle);

This code fetches the Open Graph title of the page and prints it to the console.
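
A page usually carries several Open Graph tags, so it can be handy to gather them in one pass. The following is a minimal sketch under a few assumptions: getOpenGraphData is a hypothetical helper name, the doc parameter exists only so the function can be given any Document-like object, and when a property appears more than once the last value wins.

```javascript
// Hypothetical helper: gathers every <meta property="og:..."> tag into a
// plain object such as { title: "...", image: "..." }.
// If a property is repeated, the last occurrence overwrites earlier ones.
function getOpenGraphData(doc = document) {
    const data = {};
    // The ^= operator matches attribute values that start with the prefix.
    for (const tag of doc.querySelectorAll('meta[property^="og:"]')) {
        const key = tag.getAttribute("property").slice("og:".length);
        data[key] = tag.getAttribute("content");
    }
    return data;
}

// Only run the demo where a DOM is available (i.e. in a browser).
if (typeof document !== "undefined") {
    console.log(getOpenGraphData());
}
```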

Fetching Data from All Document Meta Tags

If you want to fetch data from all the meta tags in a document, you can use the getElementsByTagName() method, which returns a live HTMLCollection of elements with the given tag name.

Here's how you can do it:

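const metaTags = document.getElementsByTagName("meta");

for (let i = 0; i < metaTags.length; i++) {
    const tag = metaTags[i];
    console.log(tag.getAttribute("name") + " : " + tag.getAttribute("content"));
}

This code outputs the "name" and "content" attributes of every meta tag in the document. Tags that lack one of those attributes, such as <meta charset="utf-8"> or Open Graph tags that use "property" instead of "name", print null in its place.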
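
When the goal is to reuse the data rather than log it, the same loop can build a list instead. This is a sketch, assuming we want to key each tag by "name" and fall back to "property"; collectMetaTags and its doc parameter are hypothetical names, not part of the DOM API.

```javascript
// Hypothetical helper: returns one { key, content } entry per meta tag
// that has a "name" or "property" attribute, and skips the rest.
function collectMetaTags(doc = document) {
    const entries = [];
    // Array.from copies the live HTMLCollection into a static array.
    for (const tag of Array.from(doc.getElementsByTagName("meta"))) {
        const key = tag.getAttribute("name") ?? tag.getAttribute("property");
        if (key === null) continue; // e.g. <meta charset="utf-8">
        entries.push({ key, content: tag.getAttribute("content") });
    }
    return entries;
}

// Only run the demo where a DOM is available (i.e. in a browser).
if (typeof document !== "undefined") {
    console.table(collectMetaTags());
}
```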

Retrieving Meta Tags Using Node.js

Up until this point we've seen how to extract the meta tag data using JS in-browser. We know this because all examples have used the document object, which is only available in browser environments. Let's now see how you can do this from a different JS runtime, like Node.

Assuming you have Node and npm on your machine, install the axios and cheerio libraries:

$ npm install axios cheerio

Link: To learn more about how to use the Axios library, read our article, Making Asynchronous HTTP Requests in JavaScript with Axios.

To learn more about Cheerio.js, see our guide, Build a Web-Scraped API with Express and Cheerio.

Load the libraries into your script using the require command:

const axios = require('axios');
const cheerio = require('cheerio');

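Now we'll use Axios to fetch the web page we're interested in. axios.get() returns a promise, so handle it with async/await or a .then() block. Because await isn't allowed at the top level of a CommonJS script (the kind that uses require), we wrap the call in an async function:

async function main() {
    try {
        const response = await axios.get('https://example.com');

        // Extract the page data here...
    } catch (error) {
        console.error(error);
    }
}

main();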
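
The same request can also be handled with .then() instead of async/await. The sketch below is illustrative only: fetchHtml and its client parameter are hypothetical, with client standing for any object that has an axios-style get(url) method resolving to a response whose body is in data, so the axios module itself can be passed in.

```javascript
// Hypothetical helper: resolves to the page's HTML, or to null if the
// request fails. `client` is assumed to expose an axios-style get(url).
function fetchHtml(url, client) {
    return client.get(url)
        .then((response) => response.data)
        .catch((error) => {
            console.error(`Request to ${url} failed:`, error.message);
            return null;
        });
}

// Possible usage with the axios module loaded earlier:
// fetchHtml('https://example.com', axios).then((html) => console.log(html));
```

Taking the client as a parameter is a design choice for this sketch: it keeps the function free of a hard dependency and makes it easy to exercise with a stub.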

Now we're going to use Cheerio.js to extract the meta tags from the HTML we've fetched. If you've ever worked with jQuery, you'll notice how similar Cheerio.js is.

async function main() {
    try {
        const response = await axios.get('https://example.com');
        // Parse the returned HTML so it can be queried with selectors
        const $ = cheerio.load(response.data);
        const metaTags = $('meta');

        metaTags.each((i, tag) => {
            const name = $(tag).attr('name');
            const content = $(tag).attr('content');
            console.log(`Meta name: ${name}, content: ${content}`);
        });
    } catch (error) {
        console.error(error);
    }
}

main();

What we've done here is load the HTML response into Cheerio and then grab all the meta tags. We loop through each one and print its "name" and "content" attributes. For tags that have no "name" attribute, such as Open Graph tags, attr('name') returns undefined. You could easily modify this code to capture other attributes or structure the data as needed.

Conclusion

In this Byte, we've explored how to extract meta tag information from a webpage using JavaScript. We covered how to retrieve specific meta tag data, work with Open Graph tags, and fetch data from all meta tags in a document.

We also saw how to extract meta tag information in other JavaScript runtimes, like Node.js, using the Axios and Cheerio.js libraries.

Last Updated: August 18th, 2023
© 2013-2024 Stack Abuse. All rights reserved.
