Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I am new to TensorFlow and deep learning. I am trying to train a simple network, and I get a NaN loss on the first epoch. I inspected the weights and they had become NaN too. I tried reducing the learning rate to 1e-8, but even that doesn't help. Please let me know what I am doing wrong.

import tensorflow as tf
import numpy as np

a = tf.constant(
    np.array([
        [ 8, 51,  1, 30,  3, 30],
        [ 1,  5,  2,  1,  1,  1],
        [11, 29,  1,  1,  1,  1],
        [ 1, 43,  1, 44, 27, 45],
        [ 1,  1,  1,  1,  1, 19],
        ])
)
l = tf.constant(np.array([[2], [1], [1], [2], [3]]))
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation='softmax', input_shape=[6])
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-8)  # `lr` is a deprecated alias
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer)
print(model.summary())
history = model.fit(a,l, epochs=1, verbose=2)

Question from: https://stackoverflow.com/questions/65949504/tensoflow-keras-nan-loss-with-sparse-categorical-crossentropy


1 Answer

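The problem is your labels. sparse_categorical_crossentropy expects each label to be an integer class index, so with 3 output units the labels must range from 0 to 2, not from 1 to 3; the label 3 has no matching output unit. The corrected code also passes l as a flat vector of shape (5,) instead of shape (5, 1).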
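To see why the range matters, here is a rough pure-Python sketch of what the loss computes for a single sample. It is a hand-written illustration, not how Keras implements it: the function name and the probabilities are made up for the example, and Keras does not necessarily fail with a clean error like this (in the question the result was a NaN loss instead).

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: negative log of the probability at index `label`."""
    if not 0 <= label < len(probs):
        raise ValueError(
            f"label {label} is outside the valid range 0..{len(probs) - 1}"
        )
    return -math.log(probs[label])

# Hypothetical softmax output for one sample of a 3-class model.
probs = [0.2, 0.5, 0.3]

# Valid label: the loss is -log(probs[1]).
loss_ok = sparse_categorical_crossentropy(1, probs)

# Invalid label: a 3-unit output has no index 3, so there is nothing to look up.
try:
    sparse_categorical_crossentropy(3, probs)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True
```

The label is used as an index into the model's output, so a label equal to the number of classes points past the last output unit.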

import tensorflow as tf
import numpy as np

a = tf.constant(
    np.array([
        [ 8, 51,  1, 30,  3, 30],
        [ 1,  5,  2,  1,  1,  1],
        [11, 29,  1,  1,  1,  1],
        [ 1, 43,  1, 44, 27, 45],
        [ 1,  1,  1,  1,  1, 19],
        ])
)
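# Labels are class indices: they must lie in 0..2 for a 3-unit softmax output.
l = tf.constant(np.array([1, 0, 0, 1, 2]))
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation='softmax', input_shape=[6])
])
# `learning_rate` replaces the deprecated `lr` argument.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-8)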
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer)
print(model.summary())
history = model.fit(a, l, epochs=3, verbose=2)

Output:

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_39 (Dense)             (None, 3)                 21        
=================================================================
Total params: 21
Trainable params: 21
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/3
1/1 - 0s - loss: 0.2769
Epoch 2/3
1/1 - 0s - loss: 0.2769
Epoch 3/3
1/1 - 0s - loss: 0.2769
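
If you would rather not rewrite the labels by hand, they can be remapped in code. This is a minimal sketch using NumPy on the label array from the question; the variable names are only illustrative.

```python
import numpy as np

# Labels as given in the question: shape (5, 1), values 1..3.
raw_labels = np.array([[2], [1], [1], [2], [3]])

# With return_inverse=True, np.unique also returns, for every element, its
# index in the sorted array of distinct values. Those indices are 0-based
# class indices.
classes, labels = np.unique(raw_labels.ravel(), return_inverse=True)

num_classes = len(classes)

# Sanity check before training: every label must be a valid output index.
assert labels.min() >= 0 and labels.max() < num_classes
```

Here labels comes out as [1, 0, 0, 1, 2], the same values used in the corrected code. For contiguous 1-based labels like these, raw_labels.ravel() - 1 gives the same result; np.unique also copes with gaps in the label values.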
