Merging, Joining and Concatenating

Last Updated: 31st August 2025


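  • Merging is the process of combining two or more datasets based on a common key or column. In pandas this is done with merge().
  • Joining is the process of combining two datasets based on a common key. In pandas, join() matches rows on the index by default.
    • Types of join: inner, left, right, outer
    • inner → keep only the rows whose key appears in both DataFrames
    • left → keep all rows from the left DataFrame
    • right → keep all rows from the right DataFrame
    • outer → keep all rows from both DataFrames and fill missing values with NaN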
  • Concatenating is the process of combining two or more datasets vertically or horizontally.

Suppose the data looks like this:

import pandas as pd

employees = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "name": ["Amit", "Raj", "Sachin", "Suraj"],
    "dept_id": [101, 102, 101, 103]
})

departments = pd.DataFrame({
    "dept_id": [101, 102, 104],
    "dept_name": ["IT", "HR", "Finance"]
})

merge()

# INNER JOIN
result = pd.merge(employees, departments, on="dept_id", how="inner")
print(result)
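The other join types follow the same pattern, with only the how argument changing. The sketch below repeats the employees and departments data so that it runs on its own. The result variable names and the indicator=True argument are additions for illustration and are not part of the example above.

```python
import pandas as pd

employees = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "name": ["Amit", "Raj", "Sachin", "Suraj"],
    "dept_id": [101, 102, 101, 103]
})

departments = pd.DataFrame({
    "dept_id": [101, 102, 104],
    "dept_name": ["IT", "HR", "Finance"]
})

# LEFT JOIN: every employee is kept; Suraj (dept_id 103) has no
# matching department, so his dept_name is NaN
left_result = pd.merge(employees, departments, on="dept_id", how="left")

# RIGHT JOIN: every department is kept; Finance (dept_id 104) has no
# employee, so emp_id and name are NaN in that row
right_result = pd.merge(employees, departments, on="dept_id", how="right")

# OUTER JOIN: all rows from both sides; indicator=True adds a "_merge"
# column saying whether each row came from the left, the right, or both
outer_result = pd.merge(employees, departments, on="dept_id",
                        how="outer", indicator=True)

print(left_result)
print(right_result)
print(outer_result)
```

With this data the left and right results have four rows each and the outer result has five. Note that emp_id becomes a float column wherever a NaN is introduced.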

join(): index-based joins

df1 = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "z"])
df2 = pd.DataFrame({"B": [100, 200, 300]}, index=["x", "y", "a"])

print(df1.join(df2, how="outer"))
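As a rough sketch of how join() relates to merge(): an index-based join should give the same result as merge() with left_index=True and right_index=True, and join() can also match a column of the calling DataFrame against the other DataFrame's index through its on parameter. The variable names and the small employees and departments frames below are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "z"])
df2 = pd.DataFrame({"B": [100, 200, 300]}, index=["x", "y", "a"])

# join() defaults to how="left": only the labels of df1 (x, y, z) are kept,
# and B is NaN for "z" because df2 has no such label
left_joined = df1.join(df2)

# The same outer join on the index, written with join() and with merge()
outer_joined = df1.join(df2, how="outer")
outer_merged = pd.merge(df1, df2, left_index=True, right_index=True, how="outer")
print(outer_joined.equals(outer_merged))

# Match a key column of the caller against the index of the other frame
employees = pd.DataFrame({"name": ["Amit", "Raj"], "dept_id": [101, 102]})
departments = pd.DataFrame({"dept_name": ["IT", "HR"]}, index=[101, 102])
by_column = employees.join(departments, on="dept_id")
print(by_column)
```

In practice, join() is convenient when the key is already the index, while merge() is the more general choice when the keys are ordinary columns.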

concat()

# Row-wise: stack the rows of df2 below df1
df1 = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
df2 = pd.DataFrame({"id": [3, 4], "name": ["C", "D"]})

print(pd.concat([df1, df2]))

# Column-wise: place columns side by side, aligned on the index
df3 = pd.DataFrame({"score": [85, 90]})
print(pd.concat([df1, df3], axis=1))
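Two details of concat() are worth sketching. Row-wise concatenation keeps each DataFrame's original index labels, so labels repeat unless ignore_index=True is passed. Column-wise concatenation aligns rows on the index rather than by position. The df4 frame and its index are hypothetical, added only to show what happens when the indexes differ.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
df2 = pd.DataFrame({"id": [3, 4], "name": ["C", "D"]})

# Row-wise: the original labels are kept, so the index is 0, 1, 0, 1
stacked = pd.concat([df1, df2])

# ignore_index=True discards the old labels and renumbers from 0
renumbered = pd.concat([df1, df2], ignore_index=True)

# Hypothetical frame whose index (1, 2) only partly overlaps df1's (0, 1)
df4 = pd.DataFrame({"score": [85, 90]}, index=[1, 2])

# Column-wise with the default join="outer": union of labels, NaN where missing
union_cols = pd.concat([df1, df4], axis=1)

# join="inner" keeps only the labels present in both frames (here, label 1)
inner_cols = pd.concat([df1, df4], axis=1, join="inner")

print(stacked)
print(renumbered)
print(union_cols)
print(inner_cols)
```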