Data Selection with loc[], iloc[] and at[] for DataFrames in Pandas
How to select data in a DataFrame in Pandas with loc, iloc and at
The loc, iloc and at indexers provide flexible ways to access and modify data in a DataFrame and are useful for data exploration, cleaning, and manipulation in data analysis and machine learning workflows. A fourth indexer, ix, was deprecated in pandas 0.20 and removed in pandas 1.0; it is covered at the end for readers maintaining older code.
Below you will find a summary of how to use these indexers, with short explanations and code examples.
For all examples we use the same, simple DataFrame:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])
#    A  B  C
# a  1  4  7
# b  2  5  8
# c  3  6  9
loc
The loc indexer selects data by label ("label-based"): you specify the row labels and, optionally, the column labels of the data you want.
# Select a single row by label (returns a Series)
df.loc['a']
# A    1
# B    4
# C    7
# Select multiple rows by label
df.loc[['a', 'b']]
#    A  B  C
# a  1  4  7
# b  2  5  8
# Select a single value by row and column label
df.loc['a', 'A']
# 1
# Select multiple values by row and column labels
df.loc[['a', 'b'], ['A', 'B']]
#    A  B
# a  1  4
# b  2  5
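Beyond single labels and lists of labels, loc also accepts label slices and boolean masks. The following is a minimal sketch using the same example DataFrame; the variable names rows_a_to_b and filtered are only illustrative. Note that, unlike ordinary Python slices, a label slice includes its end label.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])

# Label slices include BOTH endpoints: rows 'a' through 'b', columns 'A' through 'B'
rows_a_to_b = df.loc['a':'b', 'A':'B']
print(rows_a_to_b)
#    A  B
# a  1  4
# b  2  5

# Boolean mask: keep rows where column A is greater than 1, and only columns A and C
filtered = df.loc[df['A'] > 1, ['A', 'C']]
print(filtered)
#    A  C
# b  2  8
# c  3  9
```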
iloc
The iloc indexer selects data by integer position ("integer-based"): you specify rows and columns by their zero-based positions, regardless of the index labels.
# Select a single row by integer position (returns a Series)
df.iloc[0]
# A    1
# B    4
# C    7
# Select multiple rows by integer position
df.iloc[[0, 1]]
#    A  B  C
# a  1  4  7
# b  2  5  8
# Select a single value by row and column position
df.iloc[0, 0]
# 1
# Select multiple values by row and column positions
df.iloc[[0, 1], [0, 1]]
#    A  B
# a  1  4
# b  2  5
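iloc follows the usual Python slicing rules, so it also accepts integer slices and negative positions. A short sketch on the same example DataFrame (the names first_two and last_row are only illustrative); here the end of the slice is excluded, in contrast to label slices with loc.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])

# Integer slices exclude the end position: rows 0 and 1, columns 0 and 1
first_two = df.iloc[0:2, 0:2]
print(first_two)
#    A  B
# a  1  4
# b  2  5

# Negative positions count from the end: -1 is the last row
last_row = df.iloc[-1]
print(last_row.tolist())
# [3, 6, 9]
```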
at
The at indexer accesses a single value at a given row and column label. For scalar access it is typically faster than loc, because it does not have to handle lists, slices, or masks.
# Access a single value by row and column label
df.at['a', 'A']
# 1
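at can also be used to set a single value, and pandas offers iat as its position-based counterpart. A minimal sketch on the same example DataFrame; the new value 10 and the variable name value are arbitrary choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])

# Overwrite a single cell, addressed by row and column label
df.at['a', 'A'] = 10

# Read the same cell back by integer position with iat
value = df.iat[0, 0]
print(value)
# 10
```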
ix
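The ix indexer combined the functionality of loc and iloc, accepting both labels and integer positions. It was deprecated in pandas 0.20 and removed in pandas 1.0, so calling df.ix on a current installation raises an AttributeError. The examples below show the loc and iloc equivalents of the old ix calls.
# Select a single row by position (formerly df.ix[0])
print(df.iloc[0])
# Select multiple rows by label (formerly df.ix[['a', 'b']])
print(df.loc[['a', 'b']])
# Access a single value by row position and column label (formerly df.ix[0, 'A'])
print(df.loc[df.index[0], 'A'])
# Access multiple values by row and column labels (formerly df.ix[['a', 'b'], ['A', 'B']])
print(df.loc[['a', 'b'], ['A', 'B']])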
Summary
- If you want to select data by row and column labels, use loc.
- If you want to select data by integer row and column positions, use iloc.
- If you want to access a single value by row and column label, use at.
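As a quick side-by-side check of the three indexers, the sketch below reads the same cell (row 'b', column 'B') from the example DataFrame in all three ways. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}, index=['a', 'b', 'c'])

by_label = df.loc['b', 'B']     # label-based
by_position = df.iloc[1, 1]     # position-based (second row, second column)
by_at = df.at['b', 'B']         # label-based, single value only

print(by_label, by_position, by_at)
# 5 5 5
```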