How does Sigmoid activation work in multi-class classification problems?

I know that for a problem with multiple classes we usually use softmax, but can we also use sigmoid? I have tried to implement digit classification with sigmoid at the output layer, and it works. What I don't understand is how it works.

machine-learning neural-network deep-learning multiclass-classification activation-function

asked Oct 6 '18 at 8:41 by bharath chandra · edited Oct 6 '18 at 19:56 by Media

3 Answers

Answer (score 1) – answered Oct 6 '18 at 19:01 by Preet

softmax() gives you a probability distribution, meaning all the outputs sum to 1. sigmoid(), by contrast, only makes sure that each neuron's output is between 0 and 1, independently of the other neurons.

In digit classification with sigmoid(), you will therefore have 10 output neurons, each with a value between 0 and 1. You can then take the biggest of them and classify the input as that digit.
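
To make this concrete, here is a minimal NumPy sketch (my own illustration, not part of the original answer) that applies both functions to the same vector of 10 made-up logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw scores for the 10 digit classes.
logits = np.array([1.2, -0.3, 0.8, 2.5, -1.0, 0.1, 0.4, -0.7, 1.9, 0.0])

p_soft = softmax(logits)
p_sig = sigmoid(logits)

print(p_soft.sum())                  # 1.0: a proper distribution over the digits
print(p_sig.sum())                   # generally != 1.0: each output is independent
print(np.argmax(p_soft), np.argmax(p_sig))  # both print 3
```

Because both functions are monotonically increasing, the neuron with the largest raw score also has the largest activation under either function, so argmax picks the same class either way; that is essentially why the sigmoid network in the question still classifies digits correctly.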






• So what you are saying is that both work the same way? Softmax calculates the probability of one neuron with respect to all the others and returns the neuron with the maximum probability, whereas sigmoid generates an output for each neuron independently and the neuron with the maximum output is returned. Please correct me if I am wrong. – bharath chandra, Oct 7 '18 at 3:00

Answer (score 1) – answered Oct 6 '18 at 19:55 by Media

If your task is a classification in which the labels are mutually exclusive, so that each input has exactly one label, you have to use softmax. If the inputs of your task can have multiple labels each, your classes are not mutually exclusive and you can use a sigmoid for each output. In the former case, you choose the output entry with the maximum value as the prediction. In the latter case, each class has an activation value coming from the final sigmoid; whenever an activation is greater than 0.5, you can say that the corresponding class is present in the input.
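
For illustration, here is a minimal Keras sketch of the two setups described above; the hidden-layer size, optimizer, and input shape are arbitrary assumptions for the example, not something from the original answer:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Multi-class, mutually exclusive labels: softmax output + categorical loss.
multiclass = tf.keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),
    layers.Dense(10, activation='softmax'),
])
multiclass.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Multi-label, independent labels: one sigmoid per class + binary cross-entropy.
multilabel = tf.keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),
    layers.Dense(10, activation='sigmoid'),
])
multilabel.compile(optimizer='adam', loss='binary_crossentropy')

# Decision rule at prediction time:
#   mutually exclusive labels: pred = probs.argmax(axis=1)
#   independent labels:        pred = probs > 0.5
```

Pairing sigmoid outputs with binary cross-entropy treats each class as its own independent yes/no decision, which is exactly the "not mutually exclusive" reading of the problem.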






• Yes sir, but my intention is to know how they work within the network. For example, with softmax, suppose for a training example I get the expected value 3 when the actual output is 4; these can be compared and the weights adjusted. But with sigmoid I always get outputs between 0 and 1, so how can I compare them with the actual output, which can be anything between 0 and 9? I am getting an accuracy of 98% with sigmoid and 99% with softmax, but I don't understand how sigmoid is working. – bharath chandra, Oct 6 '18 at 23:22

• I didn't understand. – Media, Oct 7 '18 at 6:56

Answer (score 0) – answered 51 secs ago by PS Nayak

@bharath chandra A softmax function will never give 3 as output; it always outputs real values between 0 and 1. A sigmoid function also gives outputs between 0 and 1. The difference is that with softmax the sum of all the outputs equals 1 (because the classes are treated as mutually exclusive), while with sigmoid the sum of all the outputs need not equal 1 (because each output is independent).
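
A short NumPy sketch (my own, with made-up logits) showing the difference in the sums, and how training compares the outputs with a label such as 4 via one-hot encoding rather than with the raw digit:

```python
import numpy as np

logits = np.array([0.5, -1.2, 0.3, 2.0, 1.1, -0.4, 0.0, 0.9, -2.1, 0.6])

softmax_out = np.exp(logits) / np.exp(logits).sum()
sigmoid_out = 1.0 / (1.0 + np.exp(-logits))

print(softmax_out.sum())    # exactly 1.0
print(sigmoid_out.sum())    # some value other than 1.0 (here roughly 5.4)

# The loss never compares the outputs with the raw label "4": the target is
# one-hot encoded, and two length-10 vectors are compared elementwise.
target = np.eye(10)[4]      # digit 4 -> [0 0 0 0 1 0 0 0 0 0]
bce = -(target * np.log(sigmoid_out)
        + (1 - target) * np.log(1 - sigmoid_out)).mean()
print(bce)                  # binary cross-entropy, applied per class to sigmoids
```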




