PyTorch: Loss function for binary classification
I'm fairly new to PyTorch and the neural-network world. Below is a code snippet from a binary classification done with a simple three-layer network:
import torch
from torch import nn

n_input_dim = X_train.shape[1]
n_hidden = 100  # number of hidden nodes
n_output = 1    # number of output nodes for a binary classifier

# Build the network
model = nn.Sequential(
    nn.Linear(n_input_dim, n_hidden),
    nn.ELU(),
    nn.Linear(n_hidden, n_output),
    nn.Sigmoid())
x_tensor = torch.from_numpy(X_train.values).float()
tensor([[ -1.0000, -1.0000, -1.0000, ..., -99.0000, -99.0000, -99.0000],
[ -1.0000, -1.0000, -1.0000, ..., 0.1538, 5.0000, 0.1538],
[ -1.0000, -1.0000, -1.0000, ..., -99.0000, 6.0000, 0.2381],
...,
[ -1.0000, -1.0000, -1.0000, ..., -99.0000, -99.0000, -99.0000],
[ -1.0000, -1.0000, -1.0000, ..., -99.0000, -99.0000, -99.0000],
[ -1.0000, -1.0000, -1.0000, ..., -99.0000, -99.0000, -99.0000]])
y_tensor = torch.from_numpy(Y_train).float()
tensor([0., 0., 1., ..., 0., 0., 0.])
# Loss computation
loss_func = nn.BCELoss()

# Optimizer
learning_rate = 0.0001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

train_loss = []
iters = 500
for i in range(iters):
    # flatten the (N, 1) model output to (N,) so it matches y_tensor's shape
    y_pred = model(x_tensor).squeeze(1)
    loss = loss_func(y_pred, y_tensor)
    print("Loss in iteration", i, ":", loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    train_loss.append(loss.item())
In the above case, what I'm not sure about is that the loss is computed between y_pred, which is a set of probabilities produced by the model on the training data, and y_tensor, which holds binary 0/1 labels.
Is this way of computing the loss fine for a classification problem in PyTorch? Shouldn't the loss ideally be computed between two sets of probabilities? And if it is fine, does the loss function, BCELoss here, scale the input in some manner?
Any insights would be highly appreciated.
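For reference, here is a stripped-down, self-contained sketch of the same loop on synthetic data (the input size, batch size, and values below are made-up stand-ins for X_train / Y_train), which runs end to end. Note that model(x_tensor) has shape (N, 1) while y_tensor has shape (N,), so the prediction is flattened before the loss is computed:
import torch
from torch import nn

torch.manual_seed(0)

# Made-up stand-ins for X_train / Y_train
x_tensor = torch.randn(32, 10)                 # 32 samples, 10 features
y_tensor = torch.randint(0, 2, (32,)).float()  # binary 0/1 labels

model = nn.Sequential(
    nn.Linear(10, 100),
    nn.ELU(),
    nn.Linear(100, 1),
    nn.Sigmoid())

loss_func = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

train_loss = []
for i in range(500):
    y_pred = model(x_tensor).squeeze(1)  # (32, 1) -> (32,) to match y_tensor
    loss = loss_func(y_pred, y_tensor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    train_loss.append(loss.item())

print(train_loss[0], train_loss[-1])  # the loss should decrease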
loss-function pytorch
1 Answer
You are right that cross-entropy is computed between two distributions. In the case of the y_tensor values, however, we know for certain which class each example actually belongs to; that is the ground truth. So you can think of each binary label as a degenerate probability distribution over the two classes, with all of its mass on the true class, in which case the loss function is absolutely correct and the way to go for this problem. Hope that helps.
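To make that concrete, here is a small check (with made-up probabilities) that nn.BCELoss applied to a predicted probability p and a hard 0/1 label y computes exactly the cross-entropy $-\big(y\log p + (1-y)\log(1-p)\big)$ between the predicted distribution (p, 1-p) and the degenerate ground-truth distribution (y, 1-y):
import torch
from torch import nn

p = torch.tensor([0.9, 0.2, 0.6])  # made-up predicted probabilities P(class = 1)
y = torch.tensor([1.0, 0.0, 1.0])  # hard 0/1 ground-truth labels

builtin = nn.BCELoss()(p, y)  # BCELoss averages over the batch by default

# Cross-entropy against the degenerate distribution (y, 1-y)
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()

print(builtin.item(), manual.item())  # identical up to floating-point precision
As for your last question: BCELoss does not scale or transform its input. It expects the input to already be a probability in [0, 1], which the final nn.Sigmoid in your model guarantees. If you ever remove the sigmoid and output raw logits instead, switch to nn.BCEWithLogitsLoss, which applies the sigmoid internally and is more numerically stable.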