Why does my Keras model learn to recognize the background?
I'm trying to train this Keras implementation of Deeplabv3+ on Pascal VOC2012, using the pretrained model (which was also trained on that dataset).
I got weird results with the accuracy quickly converging to 1.0:
Epoch 1/3
5/5 [==============================] - 182s 36s/step - loss: 26864.4418 - acc: 0.7669 - val_loss: 19385.8555 - val_acc: 0.4818
Epoch 2/3
5/5 [==============================] - 77s 15s/step - loss: 42117.3555 - acc: 0.9815 - val_loss: 69088.5469 - val_acc: 0.9948
Epoch 3/3
5/5 [==============================] - 78s 16s/step - loss: 45300.6992 - acc: 1.0000 - val_loss: 44569.9414 - val_acc: 1.0000
Testing the model also gives 100% accuracy.
I decided to plot predictions on the same set of random images before and after training, and found that the model is encouraged to say everything is just background (that's the 1st class in Pascal VOC2012).
I'm quite new to deep learning and would appreciate help figuring out where this could come from.
I thought that perhaps it could be my loss function, which I defined as:
import tensorflow as tf

def image_categorical_cross_entropy(y_true, y_pred):
    """
    :param y_true: tensor of shape (batch_size, height, width) representing the ground truth.
    :param y_pred: tensor of shape (batch_size, height, width) representing the prediction.
    :return: The mean cross-entropy on softmaxed tensors.
    """
    return tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=y_pred, labels=y_true))
I am a bit uncertain about whether my tensors have the right shape. I'm using TF's dataset API to load .tfrecord files, and my annotation tensor is of shape (batch_size, height, width). Would (batch_size, height, width, 21) be what's needed? Other errors from inside the model arise when I try to separate the annotation image into a tensor containing 21 images (one for each class):
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [12,512,512,21] vs. [12,512,512]
[[Node: metrics/acc/Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/device:GPU:0"](metrics/acc/ArgMax, metrics/acc/ArgMax_1)]]
[[Node: training/Adam/gradients/bilinear_upsampling_2_1/concat_grad/Slice_1/_13277 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:1", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_62151_training/Adam/gradients/bilinear_upsampling_2_1/concat_grad/Slice_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:1"]()]]
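For reference, here's a minimal sketch of the two shape conventions as I understand them, assuming the model outputs per-pixel logits of shape (batch_size, height, width, 21); the helper names are mine, not from my actual code:

import tensorflow as tf

NUM_CLASSES = 21  # 20 Pascal VOC classes + background

# Option 1: one-hot encode the integer mask so labels match the
# (batch, height, width, 21) shape expected by softmax_cross_entropy_with_logits_v2.
def loss_with_one_hot(y_true, y_pred):
    # y_true: int tensor (batch, height, width); y_pred: float logits (batch, height, width, 21)
    y_true_one_hot = tf.one_hot(tf.cast(y_true, tf.int32), depth=NUM_CLASSES)
    return tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_true_one_hot, logits=y_pred))

# Option 2: keep integer masks and use the sparse variant, which accepts
# labels of shape (batch, height, width) directly.
def loss_sparse(y_true, y_pred):
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=tf.cast(y_true, tf.int32), logits=y_pred))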
Thank you for your help!
python deep-learning keras tensorflow
asked Oct 3 '18 at 15:06 by Matt, edited Oct 5 '18 at 17:31
Quite a few items to consider here; I almost don't know where to start. (1) Are you using a sample size of 5 for training? (2) What pre-processing, if any, are you doing to your images? I have a feeling the answer lies there. (3) You'd have to provide a lot more info on your model: how many labeled samples do you have, how many possible categories, and is your training set balanced? (4) Your accuracy of 1.0 basically means nothing because your loss is super-high and increasing; your loss should decrease as your accuracy improves.
– I_Play_With_Data, Oct 3 '18 at 15:54
(1) I'm using batches of size 12, but I think that's mostly irrelevant; I only showed 3 small epochs of 5 steps each because that's how quickly it converges. (2) My preprocessing consists of some augmentation and rescaling (possibly cropping) to 512x512 for every image and its associated annotation. (3) There are about 11,500 labeled images in Pascal VOC 2012. Given that most papers reach 85%+ mIOU on this dataset, I would assume it's reasonably balanced. There are 20 categories in this dataset plus one for background or "ambiguous", for a total of 21.
– Matt, Oct 3 '18 at 16:19
I'm curious. Did you find the reason for your model's results?
– Mark.F, Dec 13 '18 at 10:49
If you shared your code, it would be possible to find the mistake.
– Dmytro Prylipko, Jan 11 at 15:03
The fact that a pre-trained model finds a way to get 100% accuracy within 3 epochs, using the same data it was originally trained on, makes me think the bug is that your training labels are wrong, perhaps all set to the label that corresponds to background. In any case, have a look at this issue thread, where people discuss their problems and solutions for fine-tuning the model. The model isn't necessarily broken, and the batch norm bug in TensorFlow can be addressed.
– n1k31t4, Mar 2 at 13:28
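A quick way to test the hypothesis in the comment above is to pull one batch of annotation masks from the input pipeline and count the label values. A sketch; `mask_batch` is a placeholder for however your pipeline exposes a batch of masks as an array:

import numpy as np

# If every pixel is 0 (the background index in Pascal VOC),
# the labels are being loaded or decoded incorrectly.
values, counts = np.unique(np.asarray(mask_batch), return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))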
1 Answer
Your model is overfitting. Each epoch covers only 5 steps' worth of images, so the model is "memorizing" the answer for those few images.
To minimize the chance of overfitting, increase the number of training images. There should be several thousand example images for each category of object.
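For instance, a minimal sketch of covering the whole training set each epoch instead of a fixed 5 steps; the `model`, `train_dataset`, and `val_dataset` handles are placeholders, not taken from the question's code:

import math

BATCH_SIZE = 12
NUM_TRAIN_IMAGES = 11500  # approximate size of the labeled Pascal VOC 2012 set

# One full pass over the dataset per epoch.
steps_per_epoch = math.ceil(NUM_TRAIN_IMAGES / BATCH_SIZE)

model.fit(train_dataset,
          steps_per_epoch=steps_per_epoch,
          epochs=30,
          validation_data=val_dataset,
          validation_steps=50)  # placeholder count of validation batches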
answered 4 hours ago by Brian Spiering