NumPy Statistical Functions
Last Updated: 09 Nov 2025
Statistical functions in NumPy are used to analyze datasets. They compute the mean, median, standard deviation, variance, percentiles, correlation, and more.
Minimum, Maximum, Argmin, Argmax
import numpy as np
arr = np.array([15, 30, 10, 25, 40])
print("Min:", np.min(arr))
print("Max:", np.max(arr))
print("Index of Min:", np.argmin(arr))
print("Index of Max:", np.argmax(arr))
Tip:
argmin/argmax return the index at which the min or max value was found, not the value itself.
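To illustrate what the returned index means on a multi-dimensional array, here is a minimal sketch. The `grid` array is a hypothetical example, not from the text above. Without an `axis` argument, `np.argmin` returns an index into the flattened array, which `np.unravel_index` can convert back to a row and column.

```python
import numpy as np

# Hypothetical 2D array, used only for illustration
grid = np.array([[15, 30, 10],
                 [25, 40, 10]])

flat_idx = np.argmin(grid)  # index into the flattened array
row, col = np.unravel_index(flat_idx, grid.shape)

print("Flat index of min:", flat_idx)   # 2 (first occurrence of 10)
print("Row, column of min:", row, col)  # 0 2

# With an axis, argmax returns one index per column (axis=0)
col_argmax = np.argmax(grid, axis=0)
print("Argmax per column:", col_argmax)  # [1 1 0]
```

When the minimum or maximum occurs more than once, the index of the first occurrence is returned.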
Sum, Cumulative Sum, Product, and Cumulative Product
print("Total Sum:", np.sum(arr))
# 2D example
arr_2d = np.array([[1, 2], [3, 4]])
print("Row-wise Sum:", np.sum(arr_2d, axis=1)) # [3 7]
print("Column-wise Sum:", np.sum(arr_2d, axis=0)) # [4 6]
print("Product:", np.prod(arr))
print("Cumulative Sum:", np.cumsum(arr))
print("Cumulative Product:", np.cumprod(arr))
Tip:
- axis=0 → column-wise
- axis=1 → row-wise
- Cumulative = a running total, added (or multiplied) step by step
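The axis rule applies to cumulative functions as well. The sketch below reuses the 2x2 array from the example above; the variable names are illustrative only.

```python
import numpy as np

arr_2d = np.array([[1, 2], [3, 4]])

# axis=0: running total down each column
down = np.cumsum(arr_2d, axis=0)
print(down)    # [[1 2]
               #  [4 6]]

# axis=1: running total across each row
across = np.cumsum(arr_2d, axis=1)
print(across)  # [[1 3]
               #  [3 7]]

# No axis: the array is flattened first
flat = np.cumsum(arr_2d)
print(flat)    # [ 1  3  6 10]
```

The last entry along each axis of a cumulative sum equals the plain `np.sum` along that axis.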
Mean, Median, and Average
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
print("Mean:", np.mean(arr))
print("Median:", np.median(arr))
print("Average:", np.average(arr))
Difference:
- mean() → simple average
- median() → middle value
- average() → supports weights
values = np.array([10, 20, 30])
weights = np.array([1, 2, 3])
print("Weighted Avg:", np.average(values, weights=weights))
Tip: Use a weighted average when some values matter more than others.
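As a quick check of what the weights do, the weighted average can be computed by hand as sum(value × weight) / sum(weights). This is a minimal sketch using the same values and weights as above.

```python
import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, 2, 3])

# (10*1 + 20*2 + 30*3) / (1 + 2 + 3) = 140 / 6
manual = np.sum(values * weights) / np.sum(weights)
builtin = np.average(values, weights=weights)

print("Manual:", manual)
print("np.average:", builtin)
print("Match:", np.isclose(manual, builtin))  # True
```

Because the largest value carries the largest weight, the result (about 23.33) is pulled above the plain mean of 20.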
Standard Deviation and Variance
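Standard deviation and variance measure the spread of the data: how far the values typically lie from the mean. By default, np.std and np.var compute the population versions (ddof=0).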
arr = np.array([1, 2, 3, 4, 5])
print("Standard Deviation:", np.std(arr))
print("Variance:", np.var(arr))
Tip: A larger standard deviation means the data is more scattered around the mean.
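One detail worth knowing: `np.std` and `np.var` divide by N by default (population statistics). Passing `ddof=1` divides by N - 1 instead, which gives the sample versions. A minimal sketch on the same array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

pop_var = np.var(arr)             # divides by N
sample_var = np.var(arr, ddof=1)  # divides by N - 1

print("Population variance:", pop_var)     # 2.0
print("Sample variance:", sample_var)      # 2.5
print("Population std:", np.std(arr))
print("Sample std:", np.std(arr, ddof=1))

# Standard deviation is the square root of variance
print(np.isclose(np.std(arr), np.sqrt(pop_var)))  # True
```

Which version is appropriate depends on whether the array is the whole population or a sample drawn from it.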
Percentile & Quantile
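A percentile is the value below which a given percentage of the data falls. For example, roughly 25% of the values lie at or below the 25th percentile.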
arr = np.array([10, 20, 30, 40, 50, 60])
print("25th Percentile:", np.percentile(arr, 25))
print("50th Percentile (Median):", np.percentile(arr, 50))
print("75th Percentile:", np.percentile(arr, 75))
# Quantile uses 0–1 instead of 0–100
print("Q1:", np.quantile(arr, 0.25))
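The results above are not always values from the array, because NumPy's default method interpolates linearly between neighboring sorted values. The sketch below reproduces the 25th percentile by hand. It assumes the array is already sorted and that the percentile is below 100, so a right-hand neighbor exists.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])  # already sorted

q = 0.25
pos = q * (len(arr) - 1)    # fractional position: 1.25
lower = int(np.floor(pos))  # index of the left neighbor: 1
frac = pos - lower          # distance past the left neighbor: 0.25

# Move 25% of the way from arr[1] = 20 to arr[2] = 30
manual = arr[lower] + frac * (arr[lower + 1] - arr[lower])

print("Manual:", manual)                        # 22.5
print("np.percentile:", np.percentile(arr, 25)) # 22.5
```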
Correlation & Covariance
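- Used to measure the relationship between two datasets.
- Correlation → direction and strength, scaled to the range -1 to 1 (1 = perfect positive, -1 = perfect negative)
- Covariance → direction of the relationship; its magnitude depends on the units of the data, so it is hard to compare across datasets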
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
print("Correlation Coefficient:\n", np.corrcoef(x, y))
print("\nCovariance Matrix:\n", np.cov(x, y))
Tip: A strong positive correlation means both arrays increase together.
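The two measures are directly related: the correlation coefficient is the covariance divided by the product of the two standard deviations. A minimal sketch using the same `x` and `y`:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

cov = np.cov(x, y)  # 2x2 matrix: variances on the diagonal
cov_xy = cov[0, 1]  # covariance between x and y
var_x, var_y = cov[0, 0], cov[1, 1]

# r = cov(x, y) / (std_x * std_y)
r_manual = cov_xy / np.sqrt(var_x * var_y)
r_numpy = np.corrcoef(x, y)[0, 1]

print("Covariance:", cov_xy)
print("Manual r:", r_manual)
print("Match:", np.isclose(r_manual, r_numpy))  # True
```

Since `y` is exactly `2 * x`, the correlation is 1 (up to floating-point rounding), while the covariance value itself would change if the data were rescaled.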
Axis-Based Statistics
arr = np.array([[10, 20, 30],
                [40, 50, 60]])
print("Column-wise Mean:", np.mean(arr, axis=0)) # [25. 35. 45.]
print("Row-wise Mean:", np.mean(arr, axis=1)) # [20. 50.]
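A common follow-up is to subtract each row's mean from that row. The sketch below uses `keepdims=True` so the result of `np.mean` keeps a shape that broadcasts against the original array; the variable names are illustrative only.

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

# keepdims=True gives shape (2, 1) instead of (2,)
row_means = np.mean(arr, axis=1, keepdims=True)
print(row_means.shape)  # (2, 1)

# Broadcasting subtracts each row's own mean from that row
centered = arr - row_means
print(centered)
# [[-10.   0.  10.]
#  [-10.   0.  10.]]
```

After centering, every row has a mean of zero.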
Use Cases
# Find hottest city from sensor data
temps = np.array([28, 32, 30, 35, 29])
hottest_idx = np.argmax(temps)
print(f"Hottest city: {hottest_idx} → {temps[hottest_idx]}°C")
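The example above prints a position rather than a name. If the city names were stored in a parallel array, the same index could look them up. The sketch below assumes a hypothetical `cities` array with placeholder names, which is not part of the original data.

```python
import numpy as np

# Placeholder names (hypothetical), aligned with temps by position
cities = np.array(["City A", "City B", "City C", "City D", "City E"])
temps = np.array([28, 32, 30, 35, 29])

hottest_idx = np.argmax(temps)
print(f"Hottest city: {cities[hottest_idx]} → {temps[hottest_idx]}°C")

# argsort sorts ascending; reverse it and take the first two
top2_idx = np.argsort(temps)[::-1][:2]
print("Two hottest:", cities[top2_idx])  # ['City D' 'City B']
```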