Dataset: An introduction

Processing data

Hyperspectral data is stored and processed using the Dataset class.


Note that the Dataset class only accepts a numpy array with 3 or 4 dimensions, with the spectral axis last. The array should be laid out as:

(x, y, spectrum) or (x, y, z, spectrum)

Below is an example of instantiating a Dataset object with a random 4d numpy array.

import numpy as np
import hypers as hp

test_data = np.random.rand(40, 40, 4, 512)
X = hp.Dataset(test_data)
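
Before handing an array to Dataset, it can help to confirm that it matches the layout described above. The following is a minimal sketch using numpy only; the helper name check_hyperspectral_shape is hypothetical and is not part of hypers.

```python
import numpy as np


def check_hyperspectral_shape(array):
    """Hypothetical helper (not part of hypers): check the 3d/4d layout."""
    array = np.asarray(array)
    if array.ndim not in (3, 4):
        raise ValueError(f"expected a 3d or 4d array, got {array.ndim}d")
    return array


test_data = check_hyperspectral_shape(np.random.rand(40, 40, 4, 512))

# The last axis holds the spectrum; all leading axes are spatial.
spatial_shape = test_data.shape[:-1]
n_spectral = test_data.shape[-1]
```

An array that passes this check has the (x, y, spectrum) or (x, y, z, spectrum) shape that Dataset expects, although the check cannot tell whether the axes are in the right order.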

Dataset properties

The Dataset object has several useful attributes and methods for immediate analysis:

# Data properties:
X.shape                            # Shape of the hyperspectral array
X.ndim                             # Number of dimensions (3 or 4)
X.n_features                       # Number of spectral points (features)
X.n_samples                        # Total number of pixels (samples)

# To access the mean image/spectrum of the dataset:

# To access the image/spectrum in a specific pixel/spectral range:
X.spectrum[10:20, 10:20, :, :]     # Returns spectrum within chosen pixel range
X.image[..., 100:200]              # Returns image averaged over the chosen spectral bands

# To access the scree plot (as an array) that explains the variance contribution:

# To view and interact with the data:
X.view()                           # Opens a hyperspectral viewer
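
To make the quantities above concrete, here is a rough numpy-only sketch of how they relate to the underlying array. It assumes that n_samples is the product of the spatial dimensions, that n_features is the length of the last axis, and that indexing image and spectrum averages over the selected bands or pixels. These are readings of the comments above, not a description of how hypers computes them internally.

```python
import numpy as np

data = np.random.rand(40, 40, 4, 512)

# Spectral points per pixel, and total number of pixels.
n_features = data.shape[-1]
n_samples = int(np.prod(data.shape[:-1]))

# Flattened (samples, features) view, the usual layout for analysis routines.
flat = data.reshape(n_samples, n_features)

# Image averaged over a range of spectral bands (assumed meaning of image[..., 100:200]).
band_image = data[..., 100:200].mean(axis=-1)

# Spectrum averaged over a pixel region (assumed meaning of spectrum[10:20, 10:20, :, :]).
region_spectrum = data[10:20, 10:20, :, :].reshape(-1, n_features).mean(axis=0)
```

With the array above, band_image keeps the spatial shape (40, 40, 4) and region_spectrum has one value per spectral band.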

The Dataset object also supports arithmetic operations in the following manner:

import numpy as np
import hypers as hp

test_data = np.random.rand(50, 50, 512)
spectral_array = np.random.rand(512)

X = hp.Dataset(test_data)

# For arithmetic operations with a constant (int or float)
# This will be performed element-wise (on every single spectral band at every single pixel)
X *= 2
X /= 2
X += 2
X -= 2

# For arithmetic operations with a spectrum (spectral_array must have the same size as the spectra in Dataset)
# This will be performed on every single spectrum at every pixel
X *= spectral_array
X /= spectral_array
X += spectral_array
X -= spectral_array
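
The two cases above behave like numpy broadcasting along the last (spectral) axis. The sketch below illustrates those semantics on a plain numpy array; it is an analogy under that assumption, not the hypers implementation.

```python
import numpy as np

data = np.random.rand(50, 50, 512)
# Offset keeps every value away from zero so the division is safe.
spectral_array = np.random.rand(512) + 1.0

# Constant: applied to every spectral band at every pixel.
doubled = data * 2

# Spectrum: a length-512 array broadcasts against the last axis,
# so each pixel's spectrum is divided band by band.
normalised = data / spectral_array

# Spot-check one pixel against the explicit per-pixel operation.
pixel_matches = np.allclose(normalised[3, 7], data[3, 7] / spectral_array)
```

An array whose length differs from the spectral axis cannot be broadcast this way, which is why spectral_array must have the same size as the spectra in the Dataset.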

To view the full list of methods and attributes that the Dataset class contains, see Dataset.