📃 Date & Time Manipulation

Last Updated : 31th August 2025


1. Converting to Date & Time

Let Data is like this

import pandas as pd

data = {
    "Date": ["2025-01-01", "2025-02-15", "2025-03-20", "2025-04-05", "2025-04-08"],
    "timeStamp":["2025-08-29 14:30:45","2025-08-29 07:15:45","2025-08-29 10:00:00",
                 "2025-08-30 16:45:30","2025-08-30 09:00:00"],
    "mixed": ["2025-08-29","29/08/2025 14:30","2025/08/29 07:15:45","05-04-2025 17:00","invalid_date"]
}

df = pd.DataFrame()

df["Date"] = pd.to_datetime(df["Date"])
df["timeStamp"] = pd.to_datetime(df["timeStamp"])
df["mixed"] = pd.to_datetime(df["mixed"], errors='coerce')

df

2. Extracting Date and Time Components (.dt accessor)

df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Day"] = df["Date"].dt.day
df["Weekday"] = df["Date"].dt.day_name()  # Monday, Tuesday...
df["Quarter"] = df["Date"].dt.quarter
df["Format"] = df["Date"].dt.strftime("%d-%m-%Y")

df["Hour"] = df["timeStamp"].dt.hour
df["Minute"] = df["timeStamp"].dt.minute
df["Second"] = df["timeStamp"].dt.second

df

3. Time Difference

df["next_time"] = df["timeStamp"] + pd.Timedelta(hours=5)
df["time_diff"] = df["next_time"] - df["timeStamp"]

df[["timeStamp", "next_time", "time_diff"]]

4. Working with Time Zones

# Localize to UTC
df["timestamp_utc"] = df["timeStamp"].dt.tz_localize("UTC")

# Convert to IST (Asia/Kolkata)
df["timestamp_ist"] = df["timestamp_utc"].dt.tz_convert("Asia/Kolkata")

# Parse automatically (dayfirst=True) and invalid dates become NaT
df["mixed"] = pd.to_datetime(df["timeStamp"], dayfirst=True, errors="coerce")

df

5. Filtering by Date Range

df_filtered = df[(df["Date"] >= "2025-01-01") & (df["Date"] <= "2025-03-20")]

df_filtered

6. Generating Date Ranges

# freq="D" → daily, "ME" → month-end, "H" → hourly, "W" → weekly
date_range_daily = pd.date_range(start="2025-01-01", end="2025-12-31", freq="D")

date_range_daily[:5]  # first 5 dates

Let Consider Another Data is like this

data_sales = {
    "date": ["2025-01-01", "2025-01-02", "2025-01-05", "2025-01-07", "2025-01-10"], # missing 3,4,6,8,9
    "sales": [100, 120, 90, 150, 200]
}
df_sales = pd.DataFrame(data_sales)
df_sales["date"] = pd.to_datetime(df_sales["date"])
df_sales.set_index("date", inplace=True)

df_sales

7. Resampling Data

# Weekly average sales
weekly_avg = df_sales["sales"].resample("W").mean()
print("Weekly Average:\n", weekly_avg)

# Monthly total
monthly_total = df_sales["sales"].resample("ME").sum()
print("Monthly Total:\n", monthly_total)

8. asfreq() Function use to fill/check missing dates.

# Force daily frequency (missing days become NaN)
df_asfreq = df_sales.asfreq("D")
print("After asfreq('D'):\n", df_asfreq)

# Fill missing dates (optional)
df_filled = df_asfreq.ffill()
print("After forward fill:\n", df_filled)

9. Shifting and Percentage Change

# Shift sales down by 1 (previous day)
df_sales["prev_day_sales"] = df_sales["sales"].shift(1)

# Difference from yesterday
df_sales["change"] = df_sales["sales"] - df_sales["prev_day_sales"]

# Percentage change from yesterday
df_sales["pct_change"] = df_sales["sales"].pct_change() * 100

df_sales