Merging, Joining and Concatenating
Last Updated: 31st August 2025
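- Merging is the process of combining two or more datasets based on the values of a common key column (or columns). In pandas this is done with merge().
- Joining is the process of combining two datasets based on their index (or, optionally, a key column). In pandas this is done with join().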
- Types of Join: Inner, Left, Right, Outer
  - inner → keep only rows whose key matches in both DataFrames
  - left → keep all rows from the left DataFrame
  - right → keep all rows from the right DataFrame
  - outer → keep all rows from both, fill missing values with NaN
- Concatenating is the process of combining two or more datasets vertically or horizontally.
Suppose the data looks like this:
import pandas as pd
employees = pd.DataFrame({
"emp_id": [1, 2, 3, 4],
"name": ["Amit", "Raj", "Sachin", "Suraj"],
"dept_id": [101, 102, 101, 103]
})
departments = pd.DataFrame({
"dept_id": [101, 102, 104],
"dept_name": ["IT", "HR", "Finance"]
})
merge()
# INNER JOIN
result = pd.merge(employees, departments, on="dept_id", how="inner")
print(result)
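The other join types can be compared on the same sample data. The following is a minimal sketch: the names joins and audit are only illustrative, and the row counts in the comments follow from the sample data above (dept_id 103 has no department, dept_id 104 has no employee).

```python
import pandas as pd

employees = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "name": ["Amit", "Raj", "Sachin", "Suraj"],
    "dept_id": [101, 102, 101, 103],
})
departments = pd.DataFrame({
    "dept_id": [101, 102, 104],
    "dept_name": ["IT", "HR", "Finance"],
})

# Run the same merge with each join type and compare the row counts.
# inner: 3 rows, left: 4 rows, right: 4 rows, outer: 5 rows
joins = {}
for how in ["inner", "left", "right", "outer"]:
    joins[how] = pd.merge(employees, departments, on="dept_id", how=how)
    print(f"how={how!r}: {len(joins[how])} rows")
    print(joins[how])

# indicator=True adds a "_merge" column that says where each row came from:
# "both", "left_only" or "right_only".
audit = pd.merge(employees, departments, on="dept_id", how="outer", indicator=True)
print(audit)
```

The indicator column is a handy way to spot unmatched keys before deciding which join type to use.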
join(): index-based joins
df1 = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "z"])
df2 = pd.DataFrame({"B": [100, 200, 300]}, index=["x", "y", "a"])
print(df1.join(df2, how="outer"))
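When how is not given, join() performs a left join on the index. The sketch below uses the same two frames to show the default and the inner variant, and shows that the same result can be written with merge() using left_index and right_index. The variable names are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "z"])
df2 = pd.DataFrame({"B": [100, 200, 300]}, index=["x", "y", "a"])

# Default is how="left": keeps x, y, z from df1; B is NaN for z.
left_joined = df1.join(df2)
print(left_joined)

# how="inner": keeps only the index labels present in both frames (x, y).
inner_joined = df1.join(df2, how="inner")
print(inner_joined)

# The equivalent index-based merge.
via_merge = pd.merge(df1, df2, left_index=True, right_index=True, how="left")
print(via_merge)
```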
concat()
# Row wise
df1 = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
df2 = pd.DataFrame({"id": [3, 4], "name": ["C", "D"]})
print(pd.concat([df1, df2]))
# Column wise
df3 = pd.DataFrame({"score": [85, 90]})
print(pd.concat([df1, df3], axis=1))
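Two details of concat() are easy to miss: row-wise concatenation keeps each frame's original index labels, and column-wise concatenation aligns rows by index rather than by position. The sketch below illustrates both. The shifted index on df3 is an assumption made here purely to show the alignment behaviour.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
df2 = pd.DataFrame({"id": [3, 4], "name": ["C", "D"]})

# Row-wise: the original labels are kept, so the index is 0, 1, 0, 1.
stacked = pd.concat([df1, df2])
print(stacked)

# ignore_index=True renumbers the result 0..3.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered)

# Column-wise: rows are matched on index labels. df3 is deliberately given
# the index [1, 2] here, so labels 0 and 2 each get NaN on one side.
df3 = pd.DataFrame({"score": [85, 90]}, index=[1, 2])
side_by_side = pd.concat([df1, df3], axis=1)
print(side_by_side)
```

If the frames are meant to line up by position, resetting their indexes first (for example with reset_index(drop=True)) avoids these unintended NaN rows.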