NumPy Structured Arrays

Last Updated: 09 Nov 2025


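Structured arrays let you store heterogeneous records (such as CSV rows, sensor logs, or database rows) in a single NumPy array. Each field keeps its own fixed-size dtype, so the data stays compact in memory and each field supports vectorized operations.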

Tip: A structured array keeps name, age, and salary together in one array, like the rows of an Excel sheet.


1. Define dtype with Named Fields

import numpy as np

# Define schema: name (string), age (int), salary (float)
dt = np.dtype([
    ('name', 'U10'),     # Unicode string, max 10 chars
    ('age', 'i4'),       # 32-bit integer
    ('salary', 'f8')     # 64-bit float
])

# Create array
employees = np.array([
    ('Alice', 25, 50000.0),
    ('Bob',   30, 75000.0),
    ('Raj',   28, 65000.0)
], dtype=dt)

print(employees)

Output:

[('Alice', 25, 50000.) ('Bob', 30, 75000.) ('Raj', 28, 65000.)]
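To check how a schema like the one above is laid out, you can inspect the dtype itself. The following is a minimal sketch; the `row` array and the name 'Christopher' are made up for illustration, and the byte counts assume NumPy's default packed layout (no `align=True`).

```python
import numpy as np

dt = np.dtype([
    ('name', 'U10'),     # 10 characters x 4 bytes each
    ('age', 'i4'),       # 4 bytes
    ('salary', 'f8')     # 8 bytes
])

print(dt.names)          # ('name', 'age', 'salary')
print(dt.itemsize)       # 52 bytes per record: 40 + 4 + 8
print(dt.fields['age'])  # (dtype, byte offset) of the 'age' field

# Strings longer than the declared width are truncated without a warning
row = np.array([('Christopher', 40, 90000.0)], dtype=dt)
print(row['name'][0])    # Christophe
```

Because of the silent truncation, it is worth choosing a string width that covers the longest value you expect.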

2. Access by Field Name

print("Names:", employees['name'])
print("Ages:", employees['age'])
print("Salaries:", employees['salary'])
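Field access returns a view into the original array rather than a copy, and integer indexing returns a single record. A small sketch of both behaviors, rebuilding the same employees array so it runs on its own (the variable names `first` and `ages` are illustrative):

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f8')])
employees = np.array([
    ('Alice', 25, 50000.0),
    ('Bob',   30, 75000.0),
    ('Raj',   28, 65000.0)
], dtype=dt)

# Integer index: one whole record, whose fields are read by name
first = employees[0]
print(first['name'], first['salary'])

# Field index: a view, so writing through it changes the original array
ages = employees['age']
ages[0] = 26
print(employees['age'])   # first age is now 26
```

If you need an independent array, call `.copy()` on the field, for example `employees['age'].copy()`.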

3. Vectorized Operations

# 10% bonus for all
employees['salary'] *= 1.10
print(employees['salary'])

# Filter: age > 28
print(employees[employees['age'] > 28])
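Filters on several fields can be combined with `&` and `|`, and `np.sort` accepts an `order` argument naming the field to sort by. A short sketch on the same data, rebuilt here without the bonus applied (`by_age` and `senior_high` are illustrative names):

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('salary', 'f8')])
employees = np.array([
    ('Alice', 25, 50000.0),
    ('Bob',   30, 75000.0),
    ('Raj',   28, 65000.0)
], dtype=dt)

# Sort records by the 'age' field; returns a sorted copy
by_age = np.sort(employees, order='age')
print(by_age['name'])     # ['Alice' 'Raj' 'Bob']

# Combine conditions; each comparison needs its own parentheses
senior_high = employees[(employees['age'] > 26) & (employees['salary'] > 70000)]
print(senior_high['name'])  # ['Bob']
```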

Example: Sensor Log

sensor_dt = np.dtype([
    ('timestamp', 'f8'),      # Unix time
    ('temp', 'f4'),           # Celsius
    ('humidity', 'f4'),       # %
    ('valid', '?')            # bool
])

log = np.array([
    (1739000000.0, 23.5, 45.0, True),
    (1739000060.0, 24.1, 44.5, True),
    (1739000120.0, -999.0, 0.0, False)  # invalid
], dtype=sensor_dt)

# Keep only the rows flagged as valid
valid = log[log['valid']]
print("Valid readings:", valid[['temp', 'humidity']])
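Selecting several fields still gives a structured array, which cannot be used directly in 2D numeric operations. One way to get a plain float array is `structured_to_unstructured` from `numpy.lib.recfunctions`. This sketch assumes the selected fields share a compatible numeric dtype, as `temp` and `humidity` do here; `readings` is an illustrative name.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

sensor_dt = np.dtype([
    ('timestamp', 'f8'),
    ('temp', 'f4'),
    ('humidity', 'f4'),
    ('valid', '?')
])

log = np.array([
    (1739000000.0, 23.5, 45.0, True),
    (1739000060.0, 24.1, 44.5, True),
    (1739000120.0, -999.0, 0.0, False)
], dtype=sensor_dt)

valid = log[log['valid']]

# Convert the two numeric fields into an ordinary (rows, 2) float array
readings = rfn.structured_to_unstructured(valid[['temp', 'humidity']])
print(readings.shape)         # (2, 2)
print(readings.mean(axis=0))  # column means: temp, humidity

# Per-field statistics work directly on the structured array too
print(valid['temp'].mean())
```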

4. Nested & Complex dtype

complex_dt = np.dtype([
    ('id', 'i4'),
    ('position', 'f4', (3,)),   # subarray field: 3 floats per record (3D point)
    ('active', '?')
])

point = np.array([(1, [10.0, 20.0, 30.0], True)], dtype=complex_dt)
print(point['position'])
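The example above uses a subarray field. A field can also be a structured dtype of its own, which gives a truly nested record. A minimal sketch, with a hypothetical `location` field and made-up coordinate values:

```python
import numpy as np

# 'location' is itself a structured dtype with two sub-fields
nested_dt = np.dtype([
    ('id', 'i4'),
    ('location', [('lat', 'f8'), ('lon', 'f8')])
])

sites = np.array([
    (1, (10.0, 20.0)),
    (2, (30.0, 40.0))
], dtype=nested_dt)

# Chain field names to reach the inner values
print(sites['location']['lat'])   # [10. 30.]
print(sites[1]['location']['lon'])
```

Subarray fields suit fixed-length numeric vectors, while nested dtypes are a better fit when the inner values have distinct names or types.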