How to Use Aggregate Functions in Pandas
Adding a column in Pandas
import pandas as pd
food_order = pd.DataFrame([
    [1, "pizza", 12, 10],
    [2, "pasta", 13, 8],
    [3, "lunch box", 12.30, 12],
    [4, "salad", 14, 8],
    [5, "soup", 13, 6],
    [6, "burger", 15, 9],
    [7, "dessert", 16, 7]
], columns=["id", "Food", "Time", "Price"])
The food_order DataFrame is used in the examples below.
Adding a column to an existing DataFrame:
You can add a column to an existing Pandas DataFrame by assigning a list of values to a new column name. The list must contain one value per row.
food_order["Amount"] = [1, 3, 2, 2, 2, 3, 1]
print(food_order)
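A new column can also be computed from existing columns. The following is a minimal sketch on a trimmed-down copy of the data; the "Total" column name is an assumption made for illustration and is not part of the original example:

```python
import pandas as pd

# Trimmed-down copy of the food_order data, kept short for illustration
food_order = pd.DataFrame([
    [1, "pizza", 10],
    [2, "pasta", 8],
    [3, "lunch box", 12],
], columns=["id", "Food", "Price"])

# Assign a list with one value per row
food_order["Amount"] = [1, 3, 2]

# "Total" is a hypothetical column name; the multiplication is element-wise
food_order["Total"] = food_order["Price"] * food_order["Amount"]
print(food_order)
```

If the assigned list has a different length than the DataFrame, Pandas raises a ValueError instead of adding the column.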
Lambda Functions in Pandas
You can use lambda functions to add a new column:
food_order["Order"] = food_order.apply(lambda row: "Budget" if row["Price"] < 10 else "Expensive", axis=1)
print(food_order)
The code above adds a new column called "Order". Each row is labeled "Budget" if its price is below 10 and "Expensive" otherwise. The axis=1 argument makes apply() pass one row at a time to the lambda function.
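Because the label depends only on the "Price" column, the lambda can also be applied to that single column rather than to whole rows. This is a sketch of that variation on a shortened copy of the data, not a replacement for the approach above:

```python
import pandas as pd

# Shortened copy of the food_order data
food_order = pd.DataFrame([
    [1, "pizza", 12, 10],
    [2, "pasta", 13, 8],
    [3, "lunch box", 12.30, 12],
], columns=["id", "Food", "Time", "Price"])

# Series.apply calls the lambda once per price value, so no axis argument is needed
food_order["Order"] = food_order["Price"].apply(
    lambda price: "Budget" if price < 10 else "Expensive"
)
print(food_order[["Food", "Price", "Order"]])
```

The row-wise form with axis=1 is still the one to use when the lambda needs values from more than one column.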
.groupby() method
Let's redefine the DataFrame so that some foods appear more than once, then test the groupby method:
import pandas as pd
food_order = pd.DataFrame([
    [1, "pizza", 12, 10],
    [2, "pasta", 13, 8],
    [3, "lunch box", 12, 12],
    [4, "pizza", 14, 8],
    [5, "pasta", 13, 6],
    [6, "lunch box", 15, 9],
    [7, "dessert", 16, 7]
], columns=["id", "Food", "Time", "Price"])
new_table = food_order.groupby(["Food"])["Price"].mean().reset_index()
print(new_table)
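In the example above, groupby() groups the rows by Food, and mean() calculates the average Price within each group. reset_index() turns the group labels back into a regular column, so new_table has two columns: Food and Price. For this data, the average prices are 7.0 for dessert, 10.5 for lunch box, 7.0 for pasta, and 9.0 for pizza.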
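As a variation, the as_index=False argument of groupby() keeps the grouping column as a regular column, so the reset_index() call is not needed. A minimal sketch on the same data:

```python
import pandas as pd

food_order = pd.DataFrame([
    [1, "pizza", 12, 10],
    [2, "pasta", 13, 8],
    [3, "lunch box", 12, 12],
    [4, "pizza", 14, 8],
    [5, "pasta", 13, 6],
    [6, "lunch box", 15, 9],
    [7, "dessert", 16, 7]
], columns=["id", "Food", "Time", "Price"])

# as_index=False keeps "Food" as a column instead of moving it to the index
new_table = food_order.groupby("Food", as_index=False)["Price"].mean()
print(new_table)
```

By default the groups are sorted by their key, so the rows come out in alphabetical order of Food.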
.agg() method
Alternatively, you can use the .agg() method to apply one or more aggregation functions to your grouped data. This lets you calculate several statistics for each column in a single step instead of running a separate groupby for each one.
grouped = food_order.groupby('Food').agg({
    'Price': ['sum', 'mean']
})
print(grouped)
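In the example above, the agg() method calculates the total (sum) and average (mean) price for each food category. The result is indexed by Food and has a two-level column header, with "sum" and "mean" under "Price".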
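If flat column names are easier to work with, Pandas also supports named aggregation, where each keyword argument maps a new column name to a (column, function) pair. In the sketch below, total_price and avg_price are hypothetical names chosen for illustration:

```python
import pandas as pd

food_order = pd.DataFrame([
    [1, "pizza", 12, 10],
    [2, "pasta", 13, 8],
    [3, "lunch box", 12, 12],
    [4, "pizza", 14, 8],
    [5, "pasta", 13, 6],
    [6, "lunch box", 15, 9],
    [7, "dessert", 16, 7]
], columns=["id", "Food", "Time", "Price"])

# Named aggregation: new_column_name=(source_column, aggregation_function)
# total_price and avg_price are hypothetical names, not from the original example
summary = food_order.groupby("Food").agg(
    total_price=("Price", "sum"),
    avg_price=("Price", "mean"),
).reset_index()
print(summary)
```

This computes the same statistics as the dictionary form, but the output has single-level column names.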