I'm implementing a VAE on a dataset of binary images (pixels are black or white), where every pixel in an image has a meaning (it either belongs to a class or not). Searching online, I found that the recommended implementation is to use a sigmoid as the last activation function and binary cross-entropy as the loss function; correct me if I'm wrong.
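To make the setup concrete, here is a minimal sketch of the decoder output and reconstruction loss I have in mind, assuming Keras/TensorFlow (the question is not tied to a specific framework) and hypothetical 28x28 images and layer sizes:

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 16  # hypothetical latent size

# Decoder ends in a sigmoid, so each output pixel is a probability in (0, 1).
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(28 * 28, activation="sigmoid"),
    layers.Reshape((28, 28, 1)),
])

def reconstruction_loss(x_true, x_pred):
    # Binary cross-entropy treats each pixel as a Bernoulli variable,
    # which matches a 0/1 pixel dataset; sum over pixels, mean over the batch.
    bce = tf.keras.losses.binary_crossentropy(x_true, x_pred)
    return tf.reduce_mean(tf.reduce_sum(bce, axis=[1, 2]))
```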
When I generate an image from the latent space, either from random coordinates or from coordinates obtained by encoding an input image, I may get blurry images, which is normal, but I want only 0 and 1 as pixel values (because I want to know whether an element belongs to that class or not).
So my question is: are there standard procedures to obtain only binary images as output, or to train the model toward this result (maybe by changing the loss or something else)? Or does the model have to be implemented this way, so that the only solution is to apply a threshold (e.g. 0.5) to the pixels of the output images?
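For reference, this is the kind of thresholding I mean. It is only a sketch and assumes the hypothetical `decoder` and `latent_dim` from the snippet above, plus a standard normal prior over the latent space:

```python
import numpy as np

# Random latent coordinates drawn from the prior.
z = np.random.normal(size=(1, latent_dim)).astype("float32")

# Decoder outputs per-pixel probabilities in (0, 1), which look "blurry".
probs = decoder.predict(z)

# Hard 0/1 class membership via a 0.5 threshold.
binary_img = (probs > 0.5).astype("int32")
```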
Question from: https://stackoverflow.com/questions/65907175/image-genaration-in-variational-autoencoder-having-a-binary-images-dataset