Assume you have a CSV (Comma-Separated Values) file containing a dataset and would like to load it into memory for data manipulation with Python and Pandas. Importing a CSV file is straightforward.
Although Python has a built-in csv module for reading CSV files, chances are you will need to perform additional analysis on the imported data. That's why it's advised that you use Pandas to read CSV files into a DataFrame in your code.
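To illustrate why a DataFrame is more convenient for analysis, here is a minimal sketch that reads the same data with both approaches. The sample data and the column names (name, score) are made up for illustration and are not part of any real dataset:

```python
import csv
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of a CSV file.
raw = "name,score\nAda,90\nLin,85\n"

# Built-in csv module: every value is read as a string,
# so numeric columns must be converted by hand.
rows = list(csv.DictReader(io.StringIO(raw)))
total_csv = sum(int(row["score"]) for row in rows)

# Pandas: column types are inferred while reading,
# so numeric operations work right away.
df = pd.read_csv(io.StringIO(raw))
total_pd = df["score"].sum()

print(total_csv, total_pd)
```

Both approaches yield the same total, but with Pandas the score column is already numeric and ready for further analysis.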
To use Pandas, you first need to import the library:
import pandas as pd
Assuming the CSV file called dataset-file.csv is located in the same directory as your Python script, and you run the script from that directory, you can import it without any additional hassle:
df = pd.read_csv('dataset-file.csv')
And that's it! The CSV is now stored in the DataFrame object df, and you can proceed to further data analysis. Just make sure you've imported the right file by previewing it, for example with df.head(4).
This returns the first four rows of the dataset you've imported (df.head() without an argument returns the first five), so you can make sure it isn't corrupted in any way.
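The check described above can be sketched as follows. To keep the example runnable anywhere, it first writes a small, made-up dataset to a temporary file; the columns (id, city, temp) and their values are assumptions for illustration only:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data, written to a temporary file
# so the sketch does not depend on an existing CSV.
sample = (
    "id,city,temp\n"
    "1,Oslo,4.5\n"
    "2,Lima,19.0\n"
    "3,Cairo,28.1\n"
    "4,Pune,31.2\n"
    "5,Riga,6.3\n"
    "6,Kyiv,8.0\n"
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "dataset-file.csv"
    csv_path.write_text(sample)
    df = pd.read_csv(csv_path)

# Preview the first four rows to confirm the right file was loaded.
preview = df.head(4)
print(preview)

# The shape (rows, columns) and inferred column types are
# two more quick sanity checks.
print(df.shape)
print(df.dtypes)
```

If the preview shows unexpected column names or the shape does not match what you expect, the wrong file or a wrong delimiter is a likely cause.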
If your CSV file is not in the same directory as the Python file you're working with, just replace the file name, dataset-file.csv, with the relative or the absolute path to the file.
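As a rough sketch of both options, the example below creates a made-up file in a temporary data subdirectory and loads it once through an absolute path and once through a path built from a base directory. The directory name data and the file contents are assumptions for illustration. Keep in mind that a plain relative path is resolved against the current working directory, which is not necessarily the directory of your script:

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: <base_dir>/data/dataset-file.csv
    base_dir = Path(tmp)
    data_dir = base_dir / "data"
    data_dir.mkdir()
    (data_dir / "dataset-file.csv").write_text("a,b\n1,2\n3,4\n")

    # Absolute path: works regardless of the current working directory.
    absolute_path = (data_dir / "dataset-file.csv").resolve()
    df_abs = pd.read_csv(absolute_path)

    # Relative path, joined to a known base directory. In a script,
    # Path(__file__).resolve().parent is a common choice for base_dir.
    relative_path = Path("data") / "dataset-file.csv"
    df_rel = pd.read_csv(base_dir / relative_path)

print(df_abs.equals(df_rel))
```

Anchoring relative paths to a known base directory avoids surprises when the script is launched from a different folder.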
Advice: For an in-depth guide on reading and writing CSV files, including handling NaNs on loading, setting column names, and skipping rows, read our "Reading and Writing CSV Files in Python with Pandas"!