Fitting with a neural network

This notebook is available at arjunsavel/cortecs

In the Quickstart tutorial, we worked through how to use the PCA-based approach implemented in cortecs. Now, let’s walk through how to use the neural network-based approach.

Setting up the objects

[1]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import os
import pickle
import random
import sys

sys.path.insert(0, os.path.abspath("../../src"))

import cortecs
from cortecs.opac.opac import *
from cortecs.fit.fit import *
from cortecs.eval.eval import *


# set the seed a few ways to enforce reproducibility
seed = 42
np.random.seed(seed)
tf.random.set_seed(seed)
os.environ["PYTHONHASHSEED"] = str(seed)
random.seed(seed)

keras.utils.set_random_seed(seed)

# note that setting this flag helps with reproducibility, but it may result in a performance hit.
tf.config.experimental.enable_op_determinism()

We’ll be using the same Opac object as in the Quickstart.

[2]:
T_filename = "temperatures.npy"
P_filename = "pressures.npy"
wl_filename = "wavelengths.npy"

cross_sec_filename = "absorb_coeffs_C2H4.npy"

load_kwargs = {
    "T_filename": T_filename,
    "P_filename": P_filename,
    "wl_filename": wl_filename,
}
opac_obj = Opac(cross_sec_filename, loader="platon", load_kwargs=load_kwargs)

This time, we specify that we’d like to fit the object with the neural_net method.

[3]:
fitter = Fitter(opac_obj, method="neural_net")
fitter
[3]:
<cortecs.fit.fit.Fitter at 0x7f4a2a1b85d0>

We won’t actually fit the entire opacity, though—that would take about 15 hours on a laptop. Instead, let’s just fit the opacity at a single wavelength using a lower-level API.

[4]:
%%time
res = cortecs.fit.fit_neural_net.fit_neural_net(
    fitter.opac.cross_section[:, :, -2],
    fitter.opac.T,
    fitter.opac.P,
    None,
    n_layers=3,
    n_neurons=8,
    activation="sigmoid",
    learn_rate=0.04,
    loss="mean_squared_error",
    epochs=4000,
    verbose=0,
    sequential_model=None,
    plot=True,
)
CPU times: user 1min 10s, sys: 4.22 s, total: 1min 14s
Wall time: 1min 22s
../../_images/pages_fitting_nn_9_1.png

This code shouldn’t take more than a minute or two to run.

The above plot tells us how the model performs as it iterates on its weights. Note that while setting the seeds above enforces reproducibility from run to run on a single machine, there may still be some operating system-dependent differences in the plots and results shown here.

Let’s run the fit once more, this time unpacking the result into the training history and the fitted network itself.

[5]:
res = cortecs.fit.fit_neural_net.fit_neural_net(
    fitter.opac.cross_section[:, :, -2],
    fitter.opac.T,
    fitter.opac.P,
    None,
    n_layers=3,
    n_neurons=8,
    activation="sigmoid",
    learn_rate=0.04,
    loss="mean_squared_error",
    epochs=4000,
    verbose=0,
    sequential_model=None,
    plot=True,
)
history, neural_network = res
../../_images/pages_fitting_nn_11_0.png
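Since we’ve now captured the training history, we can re-plot the loss curve ourselves. Here’s a minimal sketch, assuming fit_neural_net returns a standard Keras History object (whose .history dictionary stores the per-epoch loss):

# plot the per-epoch training loss on a log scale
plt.plot(history.history["loss"])
plt.xlabel("epoch")
plt.ylabel("loss")
plt.yscale("log")
plt.show()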
Now let’s check the quality of the fit by comparing the network’s predictions against the original cross-sections across the temperature–pressure grid.

[6]:
# unravel the temperature and pressure grids into matched flat arrays,
# forming one (T, P) input pair per grid point
P_unraveled = unravel_data(fitter.opac.P, fitter.opac.T, None, tileboth=True)
T_unraveled = unravel_data(fitter.opac.T, fitter.opac.P, None, tileboth=False)
input_array = np.column_stack([T_unraveled, P_unraveled])

npres = len(fitter.opac.P)
ntemp = len(fitter.opac.T)

predictions = neural_network.predict(input_array)
# percent error between the network's predictions and the slice we fit
plt.imshow(
    100
    * (predictions.reshape(ntemp, npres) - fitter.opac.cross_section[:, :, -2])
    / predictions.reshape(ntemp, npres)
)
plt.colorbar()
13/13 [==============================] - 0s 1ms/step
[6]:
<matplotlib.colorbar.Colorbar at 0x7f4a12d65350>
../../_images/pages_fitting_nn_12_2.png
[7]:
P_unraveled = unravel_data(fitter.opac.P, fitter.opac.T, None, tileboth=True)
T_unraveled = unravel_data(fitter.opac.T, fitter.opac.P, None, tileboth=False)
input_array = np.column_stack([T_unraveled, P_unraveled])

npres = len(fitter.opac.P)
ntemp = len(fitter.opac.T)

predictions = neural_network.predict(input_array)
plt.imshow(
    100
    * (predictions.reshape(ntemp, npres) - fitter.opac.cross_section[:, :, -1])
    / predictions.reshape(ntemp, npres)
)
plt.colorbar()
13/13 [==============================] - 0s 1ms/step
[7]:
<matplotlib.colorbar.Colorbar at 0x7f4a12c76d90>
../../_images/pages_fitting_nn_13_2.png

Awesome! It looks like the fit is accurate to better than ~5%. We can tune the architecture of the neural network to achieve greater accuracy, but keep in mind that a larger network means more memory (that is, less efficient compression).
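To see how large the fitted network actually is, we can count its parameters. A quick check, assuming neural_network is a standard Keras model (count_params is part of the Keras API):

# total number of parameters in the fitted network
n_params = neural_network.count_params()

# rough memory footprint, assuming 4 bytes (float32) per parameter
print(n_params, "parameters,", 4 * n_params, "bytes")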

We can save the weights for this individual neural network with the built-in API.

[8]:
save_neural_net("test_nn", res)

Great! We’ve saved our network to disk. Let’s make sure that evaluating it on the fly (without the neural network API — just the weights and the biases) works.

First, we load in the weights and the biases.

[9]:
with open("test_nn.pkl", "rb") as f:
    all_weights, all_biases = pickle.load(f)

Next, we call the eval_neural_net function to evaluate the neural net with those weights and biases.

[10]:
n_layers = len(all_weights)
eval_neural_net(100, 1e-4, n_layers=n_layers, weights=all_weights, biases=all_biases)
[10]:
Array([-48.58145], dtype=float32)

Great! And let’s check that this matches the prediction we made earlier with the same neural net.

[11]:
predictions[0]
[11]:
array([-48.581448], dtype=float32)
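For intuition, evaluating the network by hand is just a chain of matrix multiplications. Below is a minimal NumPy sketch of such a forward pass, assuming (as in the fit above) sigmoid activations on the hidden layers and a linear output layer; the helper manual_forward is ours for illustration and not part of cortecs:

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def manual_forward(T, P, weights, biases):
    # propagate the (temperature, pressure) input through each layer
    x = np.array([T, P], dtype=np.float32)
    for W, b in zip(weights[:-1], biases[:-1]):
        x = sigmoid(x @ W + b)  # hidden layer: affine transform + sigmoid
    return x @ weights[-1] + biases[-1]  # output layer: linear


manual_forward(100, 1e-4, all_weights, all_biases)

If the stored weights follow the usual Keras convention (one kernel matrix and one bias vector per layer), this reproduces the value above.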

Finally, let’s do a speed test.

[12]:
%%timeit
eval_neural_net(2000, 1e-4, n_layers=n_layers, weights=all_weights, biases=all_biases)
424 µs ± 1.98 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Again, this is slower than simply accessing the array. This is a trade-off you’ll have to make depending on your architecture (CPU vs. GPU), desired accuracy, and desired memory performance.
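For reference, you can time the direct lookup yourself; plain NumPy indexing typically lands well under a microsecond:

%%timeit
opac_obj.cross_section[0, 0, -2]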

This tutorial has walked through how to use the neural network-based approach to fitting opacity data. We did so using a single neural network architecture — i.e., we only used n_layers=3, n_neurons=8, etc. These model “hyperparameters” in principle have an optimal value for a given dataset, given constraints on model size. We will explore this idea further in the Optimizing Fits tutorial.
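As a teaser, a hyperparameter search can be as simple as looping over candidate settings and keeping the one with the lowest final training loss. An illustrative sketch (each fit takes about a minute, so keep the grid small; this again assumes a Keras-style history, and it is not the machinery the Optimizing Fits tutorial uses):

# hypothetical brute-force grid search over two hyperparameters
best_loss, best_config = np.inf, None
for n_layers_i in [2, 3]:
    for n_neurons_i in [8, 16]:
        hist, net = cortecs.fit.fit_neural_net.fit_neural_net(
            fitter.opac.cross_section[:, :, -2],
            fitter.opac.T,
            fitter.opac.P,
            None,
            n_layers=n_layers_i,
            n_neurons=n_neurons_i,
            activation="sigmoid",
            learn_rate=0.04,
            loss="mean_squared_error",
            epochs=4000,
            verbose=0,
            plot=False,
        )
        final_loss = hist.history["loss"][-1]
        if final_loss < best_loss:
            best_loss, best_config = final_loss, (n_layers_i, n_neurons_i)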

We can also quickly check the compression factor of our result.

[13]:
# total the bytes stored in the network's weights and biases...
nbytes = 0.0
for j in all_biases:
    nbytes += j.nbytes
for j in all_weights:
    nbytes += j.nbytes

# ...then compare against the original array, scaling by the number of
# wavelengths (each wavelength needs its own network of this size)
opac_obj.cross_section.nbytes / (nbytes * len(opac_obj.wl))
[13]:
4.406779661016949