NumPy Statistical Functions

Last Updated: 09 Nov 2025


Statistical functions in NumPy are used to analyze datasets: they compute the mean, median, standard deviation, variance, percentiles, correlation, and more.


Minimum, Maximum, Argmin, Argmax

import numpy as np

arr = np.array([15, 30, 10, 25, 40])

print("Min:", np.min(arr))
print("Max:", np.max(arr))
print("Index of Min:", np.argmin(arr))
print("Index of Max:", np.argmax(arr))

Tip: argmin/argmax return the index at which the minimum or maximum value occurs (the first occurrence if there are ties).
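
On a 2D array, argmin/argmax without an axis return an index into the flattened array, which can be surprising. The sketch below uses a made-up array (`grid` is illustrative, not from the text above) and `np.unravel_index` to convert that flat index back into a (row, column) pair.

```python
import numpy as np

# Hypothetical 2D data, used only for illustration
grid = np.array([[15, 30, 10],
                 [25, 40,  5]])

flat_idx = np.argmax(grid)                          # index into the flattened array
row, col = np.unravel_index(flat_idx, grid.shape)   # convert to (row, col)
col_argmin = np.argmin(grid, axis=0)                # index of the min in each column

print("Flat index of max:", flat_idx)        # 4
print("Row/col of max:", row, col)           # 1 1
print("Argmin per column:", col_argmin)      # [0 0 1]
```

Passing axis=0 or axis=1 gives one index per column or per row instead of a single flat index.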


Sum, Cumulative Sum, Product, and Cumulative Product

print("Total Sum:", np.sum(arr))

# 2D example
arr_2d = np.array([[1, 2], [3, 4]])
print("Row-wise Sum:", np.sum(arr_2d, axis=1))     # [3 7]
print("Column-wise Sum:", np.sum(arr_2d, axis=0))  # [4 6]

print("Product:", np.prod(arr))
print("Cumulative Sum:", np.cumsum(arr))
print("Cumulative Product:", np.cumprod(arr))

Tip:

  • axis=0 → column-wise (rows are collapsed)
  • axis=1 → row-wise (columns are collapsed)
  • Cumulative = running total (or running product), built up step by step
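
The axis argument works the same way for cumulative functions. As a small sketch on the same 2x2 array (the variable names here are only illustrative), cumsum can run down the columns or across the rows, and keepdims=True keeps the reduced axis so the result stays 2D.

```python
import numpy as np

arr_2d = np.array([[1, 2], [3, 4]])

running_down = np.cumsum(arr_2d, axis=0)    # accumulate down each column
running_across = np.cumsum(arr_2d, axis=1)  # accumulate across each row
col_sums = np.sum(arr_2d, axis=0, keepdims=True)  # shape (1, 2) instead of (2,)

print(running_down)      # [[1 2] [4 6]]
print(running_across)    # [[1 3] [3 7]]
print(col_sums.shape)    # (1, 2)
```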

Mean, Median, and Average

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print("Mean:", np.mean(arr))
print("Median:", np.median(arr))
print("Average:", np.average(arr))

Difference:

  • mean() → simple average
  • median() → middle value
  • average() → supports optional weights (equals mean() when none are given)

values = np.array([10, 20, 30])
weights = np.array([1, 2, 3])
print("Weighted Avg:", np.average(values, weights=weights))

Tip: A weighted average is useful when some values matter more than others.
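
To see what the weights do, the weighted average can be reproduced by hand as sum(values * weights) / sum(weights). This is a quick check on the same numbers, not a replacement for np.average.

```python
import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, 2, 3])

# (10*1 + 20*2 + 30*3) / (1 + 2 + 3) = 140 / 6
manual = np.sum(values * weights) / np.sum(weights)
builtin = np.average(values, weights=weights)

print("Manual:", manual)
print("np.average:", builtin)
print("Same result:", np.isclose(manual, builtin))   # True
```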


Standard Deviation and Variance

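Standard deviation and variance measure the spread of the data, i.e. how far values typically lie from the mean. By default, np.std and np.var compute the population versions (ddof=0).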

arr = np.array([1, 2, 3, 4, 5])

print("Standard Deviation:", np.std(arr))
print("Variance:", np.var(arr))

Tip: Std tells you how spread out the data is; a higher std means the data is more scattered.
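
One detail worth checking in practice: np.std and np.var divide by N by default (population statistics). If the data is a sample and you want the N-1 divisor, pass ddof=1. A minimal comparison on the same array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

pop_var = np.var(arr)             # divides by N      -> 10 / 5
sample_var = np.var(arr, ddof=1)  # divides by N - 1  -> 10 / 4
pop_std = np.std(arr)
sample_std = np.std(arr, ddof=1)

print("Population variance:", pop_var)     # 2.0
print("Sample variance:", sample_var)      # 2.5
print("Population std:", pop_std)
print("Sample std:", sample_std)
```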


Percentile & Quantile

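A percentile gives the value below which a given percentage of the data falls; for example, the 25th percentile is the value that roughly 25% of the data lies below.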

arr = np.array([10, 20, 30, 40, 50, 60])

print("25th Percentile:", np.percentile(arr, 25))
print("50th Percentile (Median):", np.percentile(arr, 50))
print("75th Percentile:", np.percentile(arr, 75))

# Quantile uses 0–1 instead of 0–100
print("Q1:", np.quantile(arr, 0.25))
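
When the requested percentile falls between two data points, NumPy interpolates. The sketch below reproduces the 25th percentile by hand, assuming NumPy's default linear interpolation method; the helper variables (pos, lower, frac) are only for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])

data = np.sort(arr)
pos = (len(data) - 1) * 0.25      # fractional position: 5 * 0.25 = 1.25
lower = int(np.floor(pos))        # index of the value just below: 1
frac = pos - lower                # how far toward the next value: 0.25
manual_q1 = data[lower] + frac * (data[lower + 1] - data[lower])

print("Manual Q1:", manual_q1)                    # 22.5
print("np.percentile:", np.percentile(arr, 25))   # 22.5
print("np.quantile:", np.quantile(arr, 0.25))     # 22.5
```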

Correlation & Covariance

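  • Used to measure the relationship between two datasets.
  • Correlation → direction and strength, normalized to [-1, 1] (1 = perfect positive, -1 = perfect negative)
  • Covariance → direction of the relationship; its magnitude depends on the units of the data

x = np.array([1, 2, 3, 4, 5])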
y = np.array([2, 4, 6, 8, 10])

print("Correlation Coefficient:\n", np.corrcoef(x, y))
print("\nCovariance Matrix:\n", np.cov(x, y))

Tip: A strong positive correlation means both arrays increase together.
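
Both functions return a 2x2 matrix, and the off-diagonal entry is the value for the (x, y) pair. As a rough check of how the two relate, the correlation coefficient is the covariance divided by the product of the two standard deviations. Note that np.cov uses the N-1 divisor by default, so ddof=1 is used for the standard deviations here.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

cov_xy = np.cov(x, y)[0, 1]          # off-diagonal entry = cov(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]    # off-diagonal entry = corr(x, y)

# Correlation = covariance scaled by both standard deviations
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print("Covariance:", cov_xy)          # 5.0
print("Correlation (NumPy):", r_numpy)
print("Correlation (manual):", r_manual)
```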


Axis-Based Statistics

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

print("Column-wise Mean:", np.mean(arr, axis=0))  # [25. 35. 45.]
print("Row-wise Mean:", np.mean(arr, axis=1))     # [20. 50.]
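
A common follow-up to axis-based means is centering the data, i.e. subtracting each column's (or row's) mean. A small sketch, assuming keepdims=True is used so the means broadcast cleanly against the original array:

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

col_means = np.mean(arr, axis=0, keepdims=True)  # shape (1, 3)
row_means = np.mean(arr, axis=1, keepdims=True)  # shape (2, 1)

col_centered = arr - col_means   # each column now has mean 0
row_centered = arr - row_means   # each row now has mean 0

print(col_centered)   # [[-15. -15. -15.] [ 15.  15.  15.]]
print(row_centered)   # [[-10.   0.  10.] [-10.   0.  10.]]
```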

Use Cases

# Find hottest city from sensor data
temps = np.array([28, 32, 30, 35, 29])
hottest_idx = np.argmax(temps)
print(f"Hottest city index: {hottest_idx} → {temps[hottest_idx]}°C")
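
The index from argmax becomes more useful when paired with a parallel array of labels. The city names below are placeholders invented for this sketch; np.argsort is used to rank the readings from hottest to coolest.

```python
import numpy as np

temps = np.array([28, 32, 30, 35, 29])
# Hypothetical labels, one per reading (not from the original data)
cities = np.array(["City A", "City B", "City C", "City D", "City E"])

hottest_city = cities[np.argmax(temps)]

# argsort sorts ascending, so reverse it and take the first two
top2_idx = np.argsort(temps)[::-1][:2]
top2_cities = cities[top2_idx]

print("Hottest city:", hottest_city)        # City D
print("Top 2 hottest:", list(top2_cities))
```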