NumPy Statistical Functions

Last Updated: 09 Nov 2025

NumPy's statistical functions summarize datasets: they compute the mean, median, standard deviation, variance, percentiles, correlation, and more.

Minimum, Maximum, Argmin, Argmax

import numpy as np

arr = np.array([15, 30, 10, 25, 40])

print("Min:", np.min(arr))
print("Max:", np.max(arr))
print("Index of Min:", np.argmin(arr))
print("Index of Max:", np.argmax(arr))

Tip: argmin/argmax return the index where the minimum or maximum value occurs (the first such index, if there are ties).
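On a 2D array, np.argmax() without an axis returns an index into the flattened array, which can be surprising. The sketch below shows one way to turn that into a (row, column) pair with np.unravel_index(); the array grid and the variable names are illustrative only and not part of the original example.

```python
import numpy as np

# Illustrative 2D array (not from the original example)
grid = np.array([[15, 30, 10],
                 [25, 40, 5]])

# Without an axis, argmax indexes the flattened array
flat_idx = np.argmax(grid)

# Convert the flat index back to (row, column) coordinates
row, col = np.unravel_index(flat_idx, grid.shape)

# With axis=0, argmax returns the row index of the max in each column
col_idx = np.argmax(grid, axis=0)

print("Flat index of max:", flat_idx)         # 4
print("Row, column of max:", (row, col))
print("Row of max in each column:", col_idx)  # [1 1 0]
```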

Sum, Cumulative Sum, Product, and Cumulative Product

print("Total Sum:", np.sum(arr))

# 2D example
arr_2d = np.array([[1, 2], [3, 4]])
print("Row-wise Sum:", np.sum(arr_2d, axis=1))     # [3 7]
print("Column-wise Sum:", np.sum(arr_2d, axis=0))  # [4 6]

print("Product:", np.prod(arr))
print("Cumulative Sum:", np.cumsum(arr))
print("Cumulative Product:", np.cumprod(arr))

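axis=0 → column-wise (collapses the rows, one result per column)
axis=1 → row-wise (collapses the columns, one result per row)
Cumulative = running total (or running product), built up one element at a time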
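The axis argument also applies to the cumulative functions. As a rough sketch, reusing the same 2x2 values as above (the variable names are illustrative), this shows how the running total changes with the axis:

```python
import numpy as np

arr_2d = np.array([[1, 2], [3, 4]])

# No axis: the array is flattened first, then accumulated
flat_total = np.cumsum(arr_2d)             # [ 1  3  6 10]

# axis=0: running total down each column
down_columns = np.cumsum(arr_2d, axis=0)   # [[1 2], [4 6]]

# axis=1: running total across each row
across_rows = np.cumsum(arr_2d, axis=1)    # [[1 3], [3 7]]

# The same idea with multiplication
flat_product = np.cumprod(arr_2d)          # [ 1  2  6 24]

print(flat_total)
print(down_columns)
print(across_rows)
print(flat_product)
```

The last element of the flattened cumulative sum equals np.sum(arr_2d), which is a quick sanity check.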

Mean, Median, and Average

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print("Mean:", np.mean(arr))
print("Median:", np.median(arr))
print("Average:", np.average(arr))

Difference:

mean() → simple average
median() → middle value
average() → supports weights

values = np.array([10, 20, 30])
weights = np.array([1, 2, 3])
print("Weighted Avg:", np.average(values, weights=weights))

Tip: Use a weighted average when some values matter more than others.
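To see what the weights actually do, the weighted average can be worked out by hand as sum(value * weight) / sum(weights). This is a minimal sketch using the same values and weights as above; the name manual_avg is illustrative.

```python
import numpy as np

values = np.array([10, 20, 30])
weights = np.array([1, 2, 3])

# Manual calculation: (10*1 + 20*2 + 30*3) / (1 + 2 + 3) = 140 / 6
manual_avg = np.sum(values * weights) / np.sum(weights)

# NumPy's built-in weighted average
numpy_avg = np.average(values, weights=weights)

print("Manual:", manual_avg)
print("NumPy: ", numpy_avg)

# Without weights, average() gives the same result as mean()
print("Unweighted:", np.average(values), np.mean(values))  # 20.0 20.0
```

Because the largest value has the largest weight, the weighted result (about 23.33) sits above the plain mean of 20.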

Standard Deviation and Variance

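Both measure the spread of the data: how far the values typically lie from the mean. Variance is the average squared deviation from the mean, and standard deviation is its square root.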

arr = np.array([1, 2, 3, 4, 5])

print("Standard Deviation:", np.std(arr))
print("Variance:", np.var(arr))

Tip: The standard deviation tells you how spread out the data is; a larger value means the data is more scattered.
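One detail worth knowing: by default np.std() and np.var() divide by N (the population formulas). Passing ddof=1 divides by N - 1 instead, which is the usual choice for a sample. A short sketch with the same array, using illustrative variable names:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Default (ddof=0): population variance, divides by N
pop_var = np.var(arr)             # 2.0
pop_std = np.std(arr)

# ddof=1: sample variance, divides by N - 1
sample_var = np.var(arr, ddof=1)  # 2.5
sample_std = np.std(arr, ddof=1)

print("Population variance / std:", pop_var, pop_std)
print("Sample variance / std:    ", sample_var, sample_std)

# Standard deviation is the square root of the variance
print(np.isclose(pop_std, np.sqrt(pop_var)))  # True
```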

Percentile & Quantile

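A percentile describes relative position: the p-th percentile is the value below which roughly p% of the data falls. When that position lands between two data points, NumPy interpolates linearly by default.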

arr = np.array([10, 20, 30, 40, 50, 60])

print("25th Percentile:", np.percentile(arr, 25))
print("50th Percentile (Median):", np.percentile(arr, 50))
print("75th Percentile:", np.percentile(arr, 75))

# Quantile uses 0–1 instead of 0–100
print("Q1:", np.quantile(arr, 0.25))
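Both functions accept a list of positions, so several cut points can be computed in one call. The sketch below, with illustrative variable names, computes the three quartiles and the interquartile range (IQR) and checks that np.quantile() and np.percentile() agree once the scale is adjusted.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60])

# Several quantiles in one call (default linear interpolation)
quartiles = np.quantile(arr, [0.25, 0.5, 0.75])   # [22.5 35.  47.5]

# The same cut points on the 0-100 percentile scale
percentiles = np.percentile(arr, [25, 50, 75])

# Interquartile range: spread of the middle 50% of the data
iqr = quartiles[2] - quartiles[0]                 # 25.0

print("Quartiles:", quartiles)
print("Same as percentiles:", np.allclose(quartiles, percentiles))
print("IQR:", iqr)
```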

Correlation & Covariance

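Used to measure the relationship between two datasets.
Correlation → direction and strength on a fixed scale (1 = perfect positive, -1 = perfect negative, 0 = no linear relationship)
Covariance → direction of the relationship; its magnitude depends on the units of the data, so it is hard to compare across datasets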

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

print("Correlation Coefficient:\n", np.corrcoef(x, y))
print("\nCovariance Matrix:\n", np.cov(x, y))

Tip: A strong positive correlation means both arrays increase together.
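Both functions return a 2x2 matrix, so the number of interest is the off-diagonal entry. As a sketch of how the two relate, the correlation coefficient can be recovered by dividing the covariance by the product of the two standard deviations. The variable names are illustrative; note that np.cov() uses the sample formula (ddof=1) by default, so the standard deviations here use ddof=1 to match.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

cov_matrix = np.cov(x, y)
# Diagonal: variance of x and of y; off-diagonal: covariance of x with y
cov_xy = cov_matrix[0, 1]          # 5.0

# Normalize by the sample standard deviations (ddof=1, matching np.cov)
r_manual = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Off-diagonal entry of the correlation matrix
r_numpy = np.corrcoef(x, y)[0, 1]

print("Covariance of x and y:", cov_xy)
print("Correlation (manual):", r_manual)
print("Correlation (NumPy): ", r_numpy)
```

Since y is exactly 2 * x, the correlation comes out at 1 (up to floating-point rounding), while the covariance of 5.0 would change if the data were rescaled.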

Axis-Based Statistics

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

print("Column-wise Mean:", np.mean(arr, axis=0))  # [25. 35. 45.]
print("Row-wise Mean:", np.mean(arr, axis=1))     # [20. 50.]

Note: Use the NaN-safe versions, such as np.nansum(), np.nanmean(), np.nanstd(), np.nanmin(), and np.nanmax(), when working with missing data.
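A small sketch of why this matters: a single NaN makes the regular functions return NaN, while the NaN-safe versions skip it. The array data and the variable names are illustrative.

```python
import numpy as np

# Illustrative data with one missing value
data = np.array([10.0, np.nan, 30.0])

# Regular functions propagate the NaN
plain_mean = np.mean(data)       # nan

# NaN-safe functions ignore the missing value
safe_mean = np.nanmean(data)     # 20.0
safe_sum = np.nansum(data)       # 40.0
safe_max = np.nanmax(data)       # 30.0

print("mean:   ", plain_mean)
print("nanmean:", safe_mean)
print("nansum: ", safe_sum)
print("nanmax: ", safe_max)
```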
