I have the following toy example to experiment with the TextCategorizer CNN model in spaCy. Given the distribution of labels in my input dataset, I expect the test results to converge to 0.75, 0, and 0.5 respectively. However, this never happens; instead, each result converges to either 1.0 or 0.0, seemingly at random. Can anyone explain what is happening here, and whether I can fix my model to make it deterministic?
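To make the expectation concrete: the values I quote are simply the fraction of positive "green" labels per word. A minimal sketch of that arithmetic follows; the helper name `label_frequencies` is made up for illustration, and it assumes the numeric suffix in each text can be ignored.

```python
from collections import defaultdict

# Same toy data as in the example code of this question.
data = [
    ("apple 1", {"cats": {"green": 1.0}}),
    ("apple 2", {"cats": {"green": 1.0}}),
    ("apple 3", {"cats": {"green": 1.0}}),
    ("apple 4", {"cats": {"green": 0.0}}),
    ("banana 1", {"cats": {"green": 0.0}}),
    ("banana 2", {"cats": {"green": 0.0}}),
    ("banana 3", {"cats": {"green": 0.0}}),
    ("banana 4", {"cats": {"green": 0.0}}),
    ("dz1909 1", {"cats": {"green": 0.0}}),
    ("dz1909 2", {"cats": {"green": 0.0}}),
    ("dz1909 3", {"cats": {"green": 1.0}}),
    ("dz1909 4", {"cats": {"green": 1.0}}),
]


def label_frequencies(examples, label="green"):
    """Average the label value per leading word (hypothetical helper;
    the numeric suffix of each text is ignored by assumption)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, annot in examples:
        word = text.split()[0]
        totals[word] += annot["cats"][label]
        counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


expected = label_frequencies(data)
print(expected)  # {'apple': 0.75, 'banana': 0.0, 'dz1909': 0.5}
```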
Note: changing the architecture to "bow" (bag of words) does give the expected results, so I guess the issue is somehow related to the CNN layers.
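My intuition for why "bow" behaves as expected can be sketched with a rough stand-in: one logit per word, trained with cross-entropy by full-batch gradient descent. This is not spaCy's actual implementation; the function `fit_logit`, the learning rate, and the step count are assumptions chosen for illustration, and the numeric suffix is again ignored. For such a model the loss is minimized when the predicted probability equals the mean label of that word.

```python
import math

# Labels per word, taken from the toy data in this question (suffix ignored).
labels = {
    "apple": [1.0, 1.0, 1.0, 0.0],
    "banana": [0.0, 0.0, 0.0, 0.0],
    "dz1909": [0.0, 0.0, 1.0, 1.0],
}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(ys, learn_rate=0.5, steps=2000):
    """Fit a single logit to the labels `ys` by full-batch gradient descent
    on the mean cross-entropy (hypothetical stand-in for a linear model)."""
    z = 0.0
    mean_label = sum(ys) / len(ys)
    for _ in range(steps):
        p = sigmoid(z)
        # Gradient of the mean cross-entropy with respect to the logit.
        grad = p - mean_label
        z -= learn_rate * grad
    return sigmoid(z)


fitted = {word: fit_logit(ys) for word, ys in labels.items()}
for word, p in fitted.items():
    print(word, round(p, 3))
```

In this sketch "apple" settles at 0.75 and "dz1909" stays at 0.5, while "banana" keeps approaching 0 without ever reaching it exactly, since a sigmoid output of 0 would need an infinitely negative logit.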
Example code:
import spacy
num_epochs = 100
drop = 0.0
learn_rate = 0.01
architecture = "simple_cnn"
model = spacy.load('en_core_web_sm')
model.add_pipe(model.create_pipe("textcat", config={"architecture": architecture}))
model.get_pipe('textcat').add_label("green")
data = [('apple 1', {"cats": {"green": 1.0}}),
        ('apple 2', {"cats": {"green": 1.0}}),
        ('apple 3', {"cats": {"green": 1.0}}),
        ('apple 4', {"cats": {"green": 0.0}}),
        ('banana 1', {"cats": {"green": 0.0}}),
        ('banana 2', {"cats": {"green": 0.0}}),
        ('banana 3', {"cats": {"green": 0.0}}),
        ('banana 4', {"cats": {"green": 0.0}}),
        ('dz1909 1', {"cats": {"green": 0.0}}),
        ('dz1909 2', {"cats": {"green": 0.0}}),
        ('dz1909 3', {"cats": {"green": 1.0}}),
        ('dz1909 4', {"cats": {"green": 1.0}}),
        ]
sgd = model.begin_training()
sgd.learn_rate = learn_rate
for i in range(num_epochs):
    print("epoch: ", i)
    print("apple", model("apple").cats, "should converge to 0.75")
    print("banana", model("banana").cats, "should converge to 0")
    print("dz1909", model("dz1909").cats, "should converge to 0.5")
    for doc, annot in data:
        model.update([doc], [annot], drop=drop, sgd=sgd)
question from:https://stackoverflow.com/questions/65865385/adam-optimizer-in-spacy-giving-strange-results