Pandas brings data frames and series to Python and is an essential part of using Python for data analysis. A data frame is essentially a table with rows and columns. Each column is a series that represents a variable, and each row is an observation (a single entry).

By default, both data frames and series are indexed with integers starting at 0. You can change this by passing the “index=[your_desired_index]” parameter when you create the data frame or series.
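As a minimal sketch of the default and custom index, the example below builds a series twice and a small data frame once. The values and labels are made up for illustration.

```python
import pandas as pd

# Default index: integer positions starting at 0
s_default = pd.Series([10, 20, 30])
print(list(s_default.index))   # [0, 1, 2]

# Custom index passed through the index parameter (labels are hypothetical)
s_custom = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s_custom["b"])           # 20

# The same parameter works when creating a data frame
df = pd.DataFrame({"score": [88, 92]}, index=["alice", "bob"])
print(df.loc["bob", "score"])  # 92
```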

To use anything from Pandas, you first need to import it. Importing pandas as “pd” is the common convention. It means you don’t have to type out “pandas” every time and can just type “pd” when you want to use a Pandas data structure, method, or function.

import pandas as pd
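With Pandas imported, the table-like structure described earlier can be sketched as follows. The column names and values are hypothetical example data, not part of any real dataset.

```python
import pandas as pd

# Hypothetical data: each column is a variable, each row is an observation
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cal"],
    "age": [31, 25, 40],
})

# Selecting a single column gives back a series
ages = df["age"]
print(type(ages).__name__)  # Series

# shape is (number of rows, number of columns)
print(df.shape)             # (3, 2)

# Selecting a single row by position gives one observation
first_row = df.iloc[0]
print(first_row["name"])    # Ann
```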

Now that we know to always import Pandas, check our pages on Pandas data structures.

Common Methods and Operations

For all details about Pandas, check out the official documentation on the official website.

For the full list of attributes and methods available to be used with series, see the official Pandas documentation which can be found here.

For the full list of attributes and methods available to be used with data frames, see the official Pandas documentation which can be found here.

Method/Function            Description
.sum()                     Returns the sum of the values
.mean()                    Returns the average value
.std()                     Returns the standard deviation
.min()                     Returns the minimum value
.max()                     Returns the maximum value
len()                      Returns the length. This is Python's built-in function, called as len(your_data), not a Pandas method.
.abs()                     Returns the absolute value of each element
.corr()                    Returns a correlation matrix with r values
.value_counts()            Returns the frequencies of unique values
.groupby("your_series")    Groups the data frame by the unique values of a series
.describe()                Returns the sample size, mean, standard deviation, minimum value, 25th percentile value, 50th percentile value, 75th percentile value, and the maximum value
.apply()                   Applies a function to the data
pd.concat()                Combines data frames by stacking. This is a Pandas function that takes a list of data frames.
.merge()                   Combines data frames by merging on a key (a column the data frames have in common)
.head()                    Shows the first 5 rows. Pass a number between “()” for more or fewer rows.
.tail()                    Shows the last 5 rows. Pass a number between “()” for more or fewer rows.
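
Several of the methods in the table can be tried on a small data frame. The sketch below uses made-up data with hypothetical column names (“group”, “x”, “y”, “label”). Note that the length comes from Python's built-in len() and that stacking is done with the pd.concat() function rather than a method on the data frame.

```python
import pandas as pd

# Made-up example data; y is always twice x
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x": [1, 2, 3, 4],
    "y": [2, 4, 6, 8],
})

# Summary statistics on one series
total = df["x"].sum()      # 10
average = df["x"].mean()   # 2.5
n_rows = len(df)           # 4 (built-in len, not a method)

# Frequencies of unique values, and a grouped mean
counts = df["group"].value_counts()
group_means = df.groupby("group")["x"].mean()

# Apply a function to every value in a series
doubled = df["x"].apply(lambda v: v * 2)

# Correlation matrix; x and y are perfectly linearly related here
r_matrix = df[["x", "y"]].corr()

# Stack two data frames on top of each other
stacked = pd.concat([df, df], ignore_index=True)

# Merge with a second data frame on the shared "group" column
labels = pd.DataFrame({"group": ["a", "b"], "label": ["first", "second"]})
merged = df.merge(labels, on="group")

# Peek at the first two and last two rows
print(df.head(2))
print(df.tail(2))
```

Calling df.describe() on the same data frame returns the count, mean, standard deviation, minimum, quartiles, and maximum for each numeric column in one step.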