In the SVM algorithm, why is the vector w orthogonal to the separating hyperplane?
I am a beginner in machine learning.
In an SVM, the separating hyperplane is defined by $y = w^T x + b$.
Why do we say that the vector $w$ is orthogonal to the separating hyperplane?
machine-learning svm
edited Jun 10 '15 at 11:45 by Nitesh
asked Jun 9 '15 at 14:39 by Chong Zheng
An answer to a similar question (for neural networks) is here. – bogatron, Jun 9 '15 at 16:39
@bogatron - I agree with you completely, but mine is just an SVM-specific answer. – untitledprogrammer, Jun 10 '15 at 19:43
Except it isn't. Your answer is correct, but there is nothing about it that is specific to SVMs (nor should there be). $w^{T}x=b$ is simply a vector equation that defines a hyperplane. – bogatron, Jun 10 '15 at 22:01
4 Answers
Geometrically, the vector $w$ is orthogonal to the hyperplane defined by $w^{T} x = b$. This can be understood as follows:
First take $b = 0$. The vectors $x$ that satisfy the equation are then exactly those whose inner product with $w$ vanishes, i.e. the vectors orthogonal to $w$. They form a hyperplane through the origin with normal $w$.
Now translate the hyperplane away from the origin by a vector $a$. The equation of the plane becomes $(x - a)^{T} w = 0$, so the offset is $b = a^{T} w$, which is the scalar projection of $a$ onto $w$ scaled by $\vert\vert w \vert\vert$.
Without loss of generality we may choose $a$ perpendicular to the plane, i.e. parallel to $w$, in which case its length is $\vert\vert a \vert\vert = \vert b \vert / \vert\vert w \vert\vert$, the shortest (orthogonal) distance between the origin and the hyperplane.
Hence the vector $w$ is said to be orthogonal to the separating hyperplane.
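As a quick numerical check of the argument above, here is a minimal Python sketch. The values of `w` and `b` are made up for illustration (they are not from the question), and the hyperplane is taken in this answer's form $w^{T} x = b$. It builds the point `a` of the hyperplane closest to the origin, then checks that `a` has length $\vert b \vert / \vert\vert w \vert\vert$ and that sliding along an in-plane direction `t` keeps a point on the hyperplane.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical example values: the line 3*x1 + 4*x2 = 10 in the plane.
w = [3.0, 4.0]
b = 10.0

norm_w = math.sqrt(dot(w, w))

# Point of the hyperplane closest to the origin: a = (b / ||w||^2) * w,
# which is parallel to w by construction.
a = [b * wi / dot(w, w) for wi in w]

# A direction lying inside the hyperplane (orthogonal to w in 2D).
t = [-w[1], w[0]]

# Another point on the hyperplane, reached by sliding from a along t.
x = [ai + 2.5 * ti for ai, ti in zip(a, t)]

print("w . a       =", dot(w, a))            # should match b
print("w . x       =", dot(w, x))            # should also match b
print("w . t       =", dot(w, t))            # should be 0
print("||a||       =", math.sqrt(dot(a, a)))
print("|b| / ||w|| =", abs(b) / norm_w)
```

The last two printed values should agree up to rounding, which is the distance formula derived above.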
edited Jun 18 '18 at 21:32 by Community♦
answered Jun 9 '15 at 15:12 by untitledprogrammer
The reason why $w$ is normal to the hyperplane is that we define it to be that way:
Suppose that we have a (hyper)plane in 3D space. Let $P_0 = (x_0, y_0, z_0)$ be a point on this plane. The vector from the origin $(0,0,0)$ to this point is then $\vec{P_0} = \langle x_0, y_0, z_0 \rangle$. Let $P = (x, y, z)$ be an arbitrary point on the plane. The vector joining $P_0$ and $P$ is then given by:
$$ \vec{P} - \vec{P_0} = \langle x-x_0, y-y_0, z-z_0 \rangle $$
Note that this vector lies in the plane.
Now let $\hat{n}$ be a normal (orthogonal) vector to the plane. Therefore:
$$ \hat{n} \bullet (\vec{P}-\vec{P_0}) = 0 $$
Therefore:
$$ \hat{n} \bullet \vec{P} - \hat{n} \bullet \vec{P_0} = 0 $$
Note that $-\hat{n} \bullet \vec{P_0}$ is just a number and is equal to $b$ in our case, whereas $\hat{n}$ is just $w$ and $\vec{P}$ is $x$. So by definition, $w$ is orthogonal to the hyperplane.
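The construction above can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the three points `p0`, `p1`, `p2` are hypothetical, and the normal is obtained with a cross product, which is one convenient way to get a vector orthogonal to two in-plane directions in 3D. The sketch then sets $b = -\hat{n} \bullet \vec{P_0}$ as in the text and evaluates $\hat{n} \bullet \vec{P} + b$ at several points of the plane.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    """Component-wise difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

def cross(u, v):
    """Cross product of two 3D vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]

# Three hypothetical points that span a plane in 3D.
p0 = [1.0, 0.0, 0.0]
p1 = [0.0, 2.0, 0.0]
p2 = [0.0, 0.0, 4.0]

# A normal vector: orthogonal to two independent in-plane directions.
n = cross(sub(p1, p0), sub(p2, p0))

# Offset as in the text: b = -n . P0
b = -dot(n, p0)

# Another point of the plane, an affine combination of p0, p1, p2
# (the weights 0.25, 0.5, 0.25 sum to 1).
p = [0.25 * c0 + 0.5 * c1 + 0.25 * c2 for c0, c1, c2 in zip(p0, p1, p2)]

print("n =", n, " b =", b)
for name, point in [("p0", p0), ("p1", p1), ("p2", p2), ("p", p)]:
    # n . P + b vanishes for every point P of the plane.
    print(name, "->", dot(n, point) + b)
```

Here `n` plays the role of $w$ and `point` the role of $x$, so every printed value is an evaluation of $w^T x + b$ on the plane.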
answered Sep 4 '18 at 14:09 by Shehryar Malik
Using the algebraic definition of a vector being orthogonal to a hyperplane (it is orthogonal to the difference of any two points on the hyperplane):
$\forall x_1, x_2$ on the separating hyperplane,
$$ w^T(x_1-x_2)=(w^Tx_1 + b)-(w^Tx_2 + b)=0-0=0 \quad \small\Box.$$
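The identity can also be checked numerically. The following is a rough sketch, not a proof: the dimension, the random `w` and `b`, and the helper `project_onto_hyperplane` are all assumptions introduced for this example. Random points are moved onto the hyperplane $w^T x + b = 0$ by orthogonal projection, and the largest value of $\vert w^T(x_1 - x_2) \vert$ over many pairs is reported.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_hyperplane(x, w, b):
    """Orthogonal projection of x onto the set {z : w . z + b = 0}."""
    scale = (dot(w, x) + b) / dot(w, w)
    return [xi - scale * wi for xi, wi in zip(x, w)]

rng = random.Random(0)   # fixed seed so the run is repeatable
dim = 5                  # hypothetical dimension
w = [rng.uniform(-1, 1) for _ in range(dim)]
b = rng.uniform(-1, 1)

max_abs = 0.0
for _ in range(1000):
    # Two arbitrary points, each moved onto the hyperplane.
    x1 = project_onto_hyperplane([rng.uniform(-10, 10) for _ in range(dim)], w, b)
    x2 = project_onto_hyperplane([rng.uniform(-10, 10) for _ in range(dim)], w, b)
    diff = [p - q for p, q in zip(x1, x2)]
    max_abs = max(max_abs, abs(dot(w, diff)))

# Expected to be zero up to floating-point rounding.
print("largest |w . (x1 - x2)| seen:", max_abs)
```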
answered Feb 17 '18 at 0:11 by Indominus
Let the decision boundary be defined as $w^Tx + b = 0$. Consider two points $x_a$ and $x_b$ that lie on the decision boundary. This gives us two equations:
\begin{equation}
\begin{aligned}
w^Tx_a + b &= 0 \\
w^Tx_b + b &= 0
\end{aligned}
\end{equation}
Subtracting the second equation from the first gives $w^T(x_a - x_b) = 0$. Note that the vector $x_a - x_b$ lies in the decision boundary and points from $x_b$ to $x_a$. Since the dot product $w^T(x_a - x_b)$ is zero, $w$ must be orthogonal to $x_a - x_b$. Because $x_a$ and $x_b$ were arbitrary points on the boundary, $w$ is orthogonal to every direction within it, and hence to the decision boundary itself.
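For a concrete instance of this argument, here is a small worked example. The numbers for $w$, $b$, $x_a$ and $x_b$ are made up purely for illustration:

$$
\begin{aligned}
w &= \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad b = -4, \\
x_a &= \begin{pmatrix} 2 \\ 0 \end{pmatrix}: \quad w^T x_a + b = 4 + 0 - 4 = 0, \\
x_b &= \begin{pmatrix} 0 \\ 4 \end{pmatrix}: \quad w^T x_b + b = 0 + 4 - 4 = 0, \\
w^T (x_a - x_b) &= \begin{pmatrix} 2 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ -4 \end{pmatrix} = 4 - 4 = 0.
\end{aligned}
$$

In this example the boundary is the line $2x_1 + x_2 = 4$, whose direction $(2, -4)$ is perpendicular to $w = (2, 1)$.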
answered 10 mins ago by adityagaydhani