Data Visualization in Python with Matplotlib and Pandas - Getting Started with Pandas

Getting Started with Pandas

David Landup

Pandas is an open-source Python package that provides numerous tools for data analysis. The package comes with several data structures that can be used for many different data manipulation tasks. It also has a variety of methods that can be invoked for data analysis, which come in handy when working on Data Science and Machine Learning problems.

Pandas presents data in an intuitive form that is well suited to analysis, via its Series and DataFrame data structures. The DataFrame is the key data structure in the framework, and you'll spend a lot of time working with it.

Additionally, Pandas handles many kinds of I/O operations seamlessly. It can read data from a variety of formats, such as CSV, XLSX, JSON, etc.
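
As a minimal sketch of that I/O support, the snippet below parses a small CSV with `pd.read_csv()`. The CSV text and its column names are made up for illustration and held in memory so the example needs no file on disk; in practice you would typically pass a file path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content (illustrative only, not from a real dataset)
csv_text = "name,score\nAda,90\nLinus,85\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names taken from the header row
```

Other formats follow the same pattern through sibling functions such as `pd.read_json()` and `pd.read_excel()`, though the latter needs an extra Excel engine installed.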

Pandas Data Structures

Pandas has two main data structures for data storage:

  1. Series
  2. DataFrame

Let's go over those two first.

Series

A Series is similar to a one-dimensional array and can store data of any type. The values of a Pandas Series are mutable, but its size is immutable.
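
A short sketch of what this looks like in practice, using made-up values: assigning to an existing position changes the Series in place and leaves its length alone, while an operation such as `pd.concat()` that produces a longer result returns a new Series rather than resizing the original.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# Values can be changed in place
s.iloc[1] = 99

# The length is unchanged by the assignment
length_after = len(s)

# pd.concat returns a new, longer Series; the original keeps its length
longer = pd.concat([s, pd.Series([40])], ignore_index=True)

print(s.tolist())
print(length_after)
print(longer.tolist())
```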

By default, the first element in the series is assigned the index 0, and the last element is at index N-1, where N is the total number of elements in the series.
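
To illustrate, here is a small sketch with arbitrary example values. No index is passed, so Pandas assigns the default integer labels:

```python
import pandas as pd

# No index given, so the default labels 0..N-1 are assigned
s = pd.Series(["a", "b", "c", "d"])

n = len(s)

print(list(s.index))  # [0, 1, 2, 3]
print(s[0])           # first element: a
print(s[n - 1])       # last element: d
```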

© 2013-2022 Stack Abuse. All rights reserved.
