Why is my training accuracy decreasing with higher degrees of polynomial features?
I am new to machine learning and started solving the Titanic survivor problem on Kaggle.
While solving the problem with logistic regression, I trained several models with polynomial features of degree $2, 3, 4, 5, 6$. In theory, the accuracy on the training set should increase with the degree, but it started decreasing after degree $2$. The graph of training accuracy against degree is shown below.
Tags: scikit-learn, logistic-regression, accuracy, classifier
asked 10 hours ago by Apoorv Jain · edited 10 mins ago by Siong Thye Goh
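For reference, here is a minimal sketch of the kind of experiment the question describes (my own illustration, not the asker's actual code): expand the features at several degrees with scikit-learn's PolynomialFeatures, fit LogisticRegression, and record the training accuracy at each degree. The column selection and imputation are assumptions about the Titanic data.

```python
# Hypothetical sketch of the experiment described in the question (not the
# asker's code). Assumes the Kaggle Titanic train.csv with its usual columns.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

df = pd.read_csv("train.csv")
features = ["Pclass", "Age", "SibSp", "Parch", "Fare"]  # assumed numeric subset
X = df[features].fillna(df[features].median())          # simple median imputation
y = df["Survived"]

for degree in [1, 2, 3, 4, 5, 6]:
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        LogisticRegression(max_iter=5000),
    )
    model.fit(X, y)
    print(degree, model.score(X, y))  # training-set accuracy per degree
```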
Welcome to the site! I disagree with the premise that "theoretically the accuracy on the training set should increase with degree". Can you provide a citation or your rationale? I don't think this is a reasonable statement.
– I_Play_With_Data, 10 hours ago

I read this in the Andrew Ng course, and logically speaking, wouldn't the boundary fit more effectively as the degree of the polynomial features increases?
– Apoorv Jain, 10 hours ago

No, not necessarily. The most common use of polynomials is when your data shows a correlation that isn't linear (an exponential curve, a parabola, etc.). You can't just randomly try new polynomials; you should be trying a particular polynomial because it is better suited to the general shape of your data.
– I_Play_With_Data, 10 hours ago

Could you please suggest some reading on this type of feature engineering?
– Apoorv Jain, 10 hours ago
1 Answer
I disagree with the assertion that "theoretically the accuracy on the training set should increase with degree". The goal of polynomial regression is not to randomly try new polynomials. The goal is to use a polynomial that better fits your data because the relationship is not linear.
Think about the end result of linear regression: it is usually something like $y = mx + b$.
If you show that to a data scientist, they will tell you it's linear regression. Show it to a math student and they will tell you it's the formula for a straight line. Either way, it's just a formula for a graph. But note that it is the formula for a straight line, and not all data is linear. So, knowing that you are just coming up with a formula, you should think about polynomial regression in the same way: what graph am I trying to draw?
If you make a scatter plot and you see a correlation, but the relationship is, say, exponential or quadratic, then you should use the corresponding polynomial; the same goes for the other variations. There is no good reason to use a polynomial whose graph will not closely align with the shape of your data.
answered 10 hours ago by I_Play_With_Data
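To illustrate the answer's point, here is a quick sketch on synthetic data (my own example, not the answerer's code): on data with a quadratic relationship, a straight-line fit underperforms a fit that uses a degree-2 polynomial feature expansion chosen to match the data's shape.

```python
# Minimal illustration on synthetic data (my example, not the answerer's):
# a straight line underfits a quadratic relationship, while a degree-2
# polynomial feature expansion captures its shape.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 1.5 * x[:, 0] ** 2 - 2.0 * x[:, 0] + rng.normal(scale=0.5, size=200)

linear = LinearRegression().fit(x, y)
quadratic = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(x, y)

print("R^2, straight line:    ", linear.score(x, y))
print("R^2, degree-2 features:", quadratic.score(x, y))
```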
Let's say my initial features were $x, y$, so I have degree 1. Now let's say we come up with polynomial features of degree 2, i.e. $x^2, y^2, xy$.
– Apoorv Jain, 10 hours ago

@ApoorvJain Don't start with the formula; start with your data, start with a scatter plot. What does that plot look like? What polynomial would you use to draw a similar graph? When you start thinking in those terms, you start to think like a data scientist :-)
– I_Play_With_Data, 10 hours ago

Let's say my initial features were $x, y$, so I have degree 1. Now let's say we come up with polynomial features of degree 2, i.e. $x^2, y^2, xy$. Then the boundary is built from $x, y, xy, x^2, y^2$ and has the form $ax + by + cxy + dx^2 + ey^2$, so we could always reconstruct the same boundary we would have had with only the degree-1 features. Since the loss function considers every possible boundary, shouldn't our error with degree 2 be $\le$ the error with degree 1?
– Apoorv Jain, 10 hours ago
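As a side note on the nesting argument in that last comment, here is a small illustrative check (my own sketch, not from the thread): scikit-learn's PolynomialFeatures at degree 2 keeps the original features $x$ and $y$ alongside the new squared and interaction terms, so any boundary expressible with the degree-1 features is also expressible with the degree-2 set.

```python
# Illustrative check (not from the thread): the degree-2 expansion of (x, y)
# includes the original features x and y, so the degree-1 hypothesis space
# is a subset of the degree-2 one.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2, include_bias=False)
poly.fit(np.array([[1.0, 2.0]]))
print(poly.get_feature_names_out(["x", "y"]))
# ['x' 'y' 'x^2' 'x y' 'y^2']
```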