Reading Data in Pandas
Last Updated: 28th August 2025
What you'll learn
- How to load data from CSV, Excel, JSON, SQL
- Power parameters: index_col, usecols, dtype, nrows, skiprows, na_values
Tip 🗣: Getting data into Pandas is always the first step. Read it in correctly first, then clean and analyze.
📥 CSV — pd.read_csv()
import pandas as pd
# Basic read
df = pd.read_csv("data.csv")
print(df)
import pandas as pd
# With useful parameters
df = pd.read_csv(
    "data.csv",
    index_col="ID",                  # make 'ID' the index
    usecols=["ID", "Name", "Age"],   # only these columns
    nrows=1000,                      # only first 1000 rows
    # plain integer dtypes raise an error if the column contains missing values
    dtype={"ID": "int64", "Age": "int16"},
    skiprows=2,                      # skip first 2 lines (e.g., notes)
    na_values=["NA", "N/A", "-"],    # treat these as NaN
)
print(df)
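To see how these parameters interact, here is a minimal sketch that reads from an in-memory string instead of a real file. The file contents, names, and cities are made up for illustration. It also assumes the nullable "Int16" dtype, which is swapped in for the plain "int16" used above so that the Age column can hold missing values.

```python
import io

import pandas as pd

# Hypothetical file contents: two note lines, then the header and data rows.
raw = io.StringIO(
    "Exported by school system\n"
    "Do not edit by hand\n"
    "ID,Name,Age,City\n"
    "1,Asha,21,Pune\n"
    "2,Ravi,NA,Delhi\n"
    "3,Meera,-,Mumbai\n"
)

df = pd.read_csv(
    raw,
    skiprows=2,                     # drop the two note lines; the next line becomes the header
    index_col="ID",                 # use 'ID' as the index
    usecols=["ID", "Name", "Age"],  # 'City' is never loaded
    dtype={"Age": "Int16"},         # nullable integer, so missing ages are allowed
    na_values=["NA", "N/A", "-"],   # these strings become missing values
)

print(df)
print("Missing ages:", df["Age"].isna().sum())
```

Both "NA" and "-" end up as missing values in Age, and City does not appear because usecols left it out.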
Note: Use usecols and nrows in read_csv() if you don't need all columns/rows, and use chunksize for large files, for example:
import pandas as pd
# With chunksize, read_csv returns an iterator of DataFrames, not a single DataFrame
chunks = pd.read_csv("data.csv", chunksize=1000)
for chunk in chunks:
    print(chunk)
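Printing every chunk is rarely the goal. More often you aggregate as you go, so the whole file never sits in memory at once. A small sketch of that pattern follows, using made-up in-memory data and a deliberately tiny chunksize.

```python
import io

import pandas as pd

# Hypothetical data: 10 rows with marks 10, 20, ..., 100.
lines = ["ID,Marks"] + [f"{i},{i * 10}" for i in range(1, 11)]
raw = io.StringIO("\n".join(lines) + "\n")

total_marks = 0
row_count = 0

# Each chunk is an ordinary DataFrame with at most 4 rows.
for chunk in pd.read_csv(raw, chunksize=4):
    total_marks += chunk["Marks"].sum()
    row_count += len(chunk)

print("Rows:", row_count)
print("Average marks:", total_marks / row_count)
```

Only the running totals are kept between iterations, so memory use depends on the chunk size rather than the file size.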
📊 Excel — pd.read_excel()
import pandas as pd
# Basic read
df = pd.read_excel("data.xlsx")
print(df)
import pandas as pd
df = pd.read_excel(
    "data.xlsx",
    sheet_name="Sheet1",
    index_col=0,
    usecols="A:D",           # Excel-style range OR list of names
    dtype={"Age": "int16"},
    na_values=["NA", ""],
)
print(df)
🧾 JSON — pd.read_json()
import pandas as pd
# For records-oriented JSON: [{...}, {...}]
df = pd.read_json("data.json")
# If your JSON is line-delimited (one JSON per line)
df = pd.read_json("data_lines.json", lines=True)
print(df)
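The difference between the two JSON layouts is easier to see side by side. This sketch uses made-up records held in strings (wrapped in io.StringIO) rather than the files named above.

```python
import io

import pandas as pd

# Records-oriented JSON: one array containing all the objects.
records_json = '[{"name": "Asha", "marks": 91}, {"name": "Ravi", "marks": 78}]'

# Line-delimited JSON: one object per line, no surrounding array.
lines_json = '{"name": "Asha", "marks": 91}\n{"name": "Ravi", "marks": 78}\n'

df_records = pd.read_json(io.StringIO(records_json))
df_lines = pd.read_json(io.StringIO(lines_json), lines=True)

print(df_records)
print(df_lines)
```

Both calls produce a two-row DataFrame with name and marks columns. Only the layout of the input text differs.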
🗂 SQL — pd.read_sql()
import sqlite3
import pandas as pd
conn = sqlite3.connect("mydb.sqlite")
# Option 1: read a full table
df_table = pd.read_sql("SELECT * FROM students", conn)
# Option 2: custom query with WHERE
df_query = pd.read_sql(
    "SELECT id, name, marks FROM students WHERE marks >= ?",
    conn,
    params=(80,),
)
conn.close()
print(df_query)
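If you don't have mydb.sqlite handy, the same parameterized query can be tried against an in-memory SQLite database. The table contents below are invented for illustration; the index_col argument is an optional extra that makes id the index.

```python
import sqlite3

import pandas as pd

# ":memory:" creates a temporary database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students (id, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 91), (2, "Ravi", 78), (3, "Meera", 85)],
)
conn.commit()

# The ? placeholder is filled from params, so the value is never pasted into the SQL string.
df_top = pd.read_sql(
    "SELECT id, name, marks FROM students WHERE marks >= ? ORDER BY id",
    conn,
    params=(80,),
    index_col="id",
)
conn.close()

print(df_top)
```

Passing values through params instead of formatting them into the query string is the safer habit, since it avoids SQL injection.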
⚙️ Important Parameters
- index_col: Column to use as index.
- usecols: Columns to keep.
- dtype: Data type for specific columns.
- nrows: Number of rows to read.
- skiprows: Number of rows to skip.
- na_values: Values to consider as NaN.
- header: Row number to use as header.
- sheet_name: Name of sheet to read.
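The header parameter is the only one in this list without an example so far. Here is a short sketch for a file that has no header row at all. The data is made up, and the names argument (not covered above) is used to supply column labels.

```python
import io

import pandas as pd

# Hypothetical file with data only, no header line.
raw = io.StringIO("1,Asha,21\n2,Ravi,19\n")

df_no_header = pd.read_csv(
    raw,
    header=None,                  # do not treat the first line as column names
    names=["ID", "Name", "Age"],  # supply the column names ourselves
)

print(df_no_header)
```

Without header=None, the first data row would be consumed as column names and lost from the data.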
Tip 🗣: usecols saves both time and memory by loading only the columns you need, and index_col sets the index while reading, so no separate step is needed afterwards!