📃 Date & Time Manipulation
Last Updated : 31th August 2025
1. Converting to Date & Time
Let Data is like this
import pandas as pd
data = {
"Date": ["2025-01-01", "2025-02-15", "2025-03-20", "2025-04-05", "2025-04-08"],
"timeStamp":["2025-08-29 14:30:45","2025-08-29 07:15:45","2025-08-29 10:00:00",
"2025-08-30 16:45:30","2025-08-30 09:00:00"],
"mixed": ["2025-08-29","29/08/2025 14:30","2025/08/29 07:15:45","05-04-2025 17:00","invalid_date"]
}
df = pd.DataFrame()
df["Date"] = pd.to_datetime(df["Date"])
df["timeStamp"] = pd.to_datetime(df["timeStamp"])
df["mixed"] = pd.to_datetime(df["mixed"], errors='coerce')
df
2. Extracting Date and Time Components (.dt accessor)
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Day"] = df["Date"].dt.day
df["Weekday"] = df["Date"].dt.day_name() # Monday, Tuesday...
df["Quarter"] = df["Date"].dt.quarter
df["Format"] = df["Date"].dt.strftime("%d-%m-%Y")
df["Hour"] = df["timeStamp"].dt.hour
df["Minute"] = df["timeStamp"].dt.minute
df["Second"] = df["timeStamp"].dt.second
df
3. Time Difference
df["next_time"] = df["timeStamp"] + pd.Timedelta(hours=5)
df["time_diff"] = df["next_time"] - df["timeStamp"]
df[["timeStamp", "next_time", "time_diff"]]
4. Working with Time Zones
# Localize to UTC
df["timestamp_utc"] = df["timeStamp"].dt.tz_localize("UTC")
# Convert to IST (Asia/Kolkata)
df["timestamp_ist"] = df["timestamp_utc"].dt.tz_convert("Asia/Kolkata")
# Parse automatically (dayfirst=True) and invalid dates become NaT
df["mixed"] = pd.to_datetime(df["timeStamp"], dayfirst=True, errors="coerce")
df
5. Filtering by Date Range
df_filtered = df[(df["Date"] >= "2025-01-01") & (df["Date"] <= "2025-03-20")]
df_filtered
6. Generating Date Ranges
# freq="D" → daily, "ME" → month-end, "H" → hourly, "W" → weekly
date_range_daily = pd.date_range(start="2025-01-01", end="2025-12-31", freq="D")
date_range_daily[:5] # first 5 dates
Let Consider Another Data is like this
data_sales = {
"date": ["2025-01-01", "2025-01-02", "2025-01-05", "2025-01-07", "2025-01-10"], # missing 3,4,6,8,9
"sales": [100, 120, 90, 150, 200]
}
df_sales = pd.DataFrame(data_sales)
df_sales["date"] = pd.to_datetime(df_sales["date"])
df_sales.set_index("date", inplace=True)
df_sales
7. Resampling Data
# Weekly average sales
weekly_avg = df_sales["sales"].resample("W").mean()
print("Weekly Average:\n", weekly_avg)
# Monthly total
monthly_total = df_sales["sales"].resample("ME").sum()
print("Monthly Total:\n", monthly_total)
8. asfreq() Function use to fill/check missing dates.
# Force daily frequency (missing days become NaN)
df_asfreq = df_sales.asfreq("D")
print("After asfreq('D'):\n", df_asfreq)
# Fill missing dates (optional)
df_filled = df_asfreq.ffill()
print("After forward fill:\n", df_filled)
9. Shifting and Percentage Change
# Shift sales down by 1 (previous day)
df_sales["prev_day_sales"] = df_sales["sales"].shift(1)
# Difference from yesterday
df_sales["change"] = df_sales["sales"] - df_sales["prev_day_sales"]
# Percentage change from yesterday
df_sales["pct_change"] = df_sales["sales"].pct_change() * 100
df_sales