How to find the most important attribute for each class
$begingroup$
I have a dataset with 28 attributes and 7 class values. I want to know if it's possible to find, for each class, the most important attribute(s) for deciding the class value.
For example, an answer could be: Attribute 2 is most important for class 1, Attribute 6 is most important for class 2, etc. An even more informative answer could be: Attribute 2 being below 0.5 is most important for class 1, Attribute 6 being above 0.75 is most important for class 2, etc.
My initial approach was to build a decision tree on the data and, for each class, find the node with the largest information gain/gain ratio; that would be the most determining factor for that class. The problem is that the decision tree implementations I have found don't expose the information gain/gain ratio per node, and as this project is time-bound I don't have time to implement my own version. My current thought is to create multiple datasets, each one class vs. the rest, and then perform attribute selection (e.g. information gain) on them to find the most important attribute. Is this the right direction to go in, or is there a better option?
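In code, the one-vs-rest approach I have in mind looks roughly like this (a sketch on random stand-in data with the shapes from the question; `mutual_info_classif` from scikit-learn is just one possible information-gain-style scorer):

```python
# One-vs-rest attribute selection: for each class, score every attribute by
# mutual information against a binary "this class vs. the rest" target.
# The data here is random and only illustrates the 28-attribute, 7-class shape.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((500, 28))          # 500 samples, 28 attributes
y = rng.integers(0, 7, size=500)   # 7 class values

for cls in np.unique(y):
    y_bin = (y == cls).astype(int)                      # one class vs. the rest
    scores = mutual_info_classif(X, y_bin, random_state=0)
    best = int(np.argmax(scores))
    print(f"class {cls}: highest-scoring attribute is {best}")
```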
machine-learning neural-network deep-learning feature-selection multiclass-classification
$endgroup$
$begingroup$
Covariance matrix for each attribute and the output.
$endgroup$
– Vaalizaadeh
Oct 16 '18 at 12:47
$begingroup$
@Media could you expand on this please? I thought a covariance matrix would only tell me the correlation between attributes. If you have a solution, you could add it as an answer.
$endgroup$
– Nate
Oct 16 '18 at 13:15
$begingroup$
Important in what way? In that it best splits your data (given that you were talking about decision trees and splits)? There are ML algorithms that return feature importances, e.g. random forests.
$endgroup$
– user2974951
Oct 16 '18 at 13:37
$begingroup$
Important in the sense that I want to know the attributes that have the biggest say in determining each class. I believe this means the best split of the data. As far as I know, random forest doesn't return a feature importance per class, or am I wrong about this?
$endgroup$
– Nate
Oct 16 '18 at 13:52
$begingroup$
@Nate, "My current thought is to create multiple datasets which are all one class vs the rest and then perform attribute selection (eg. information gain) on them to find the most important attribute." ---> if you're happy with that definition of "most important", then yes. Your job is to define what you mean by "most important", but your approach (classifying ONE vs ALL_OTHERS) is a good one.
$endgroup$
– f.g.
Oct 16 '18 at 17:33
asked Oct 16 '18 at 12:46 by Nate
edited Oct 16 '18 at 17:23 by Vaalizaadeh
4 Answers
$begingroup$
If you must split the dataset per class, then I suggest you try PCA. Principal Component Analysis is used for dimensionality reduction: it gives you the directions (combinations of attributes) that best represent the distribution of the data. You could apply it to each class separately and in that way find the attributes that most affect that class's data distribution.
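A minimal sketch of this suggestion, assuming scikit-learn's `PCA` on synthetic data, reading the absolute loadings of the first principal component as a coarse per-class ranking:

```python
# Fit PCA on each class's rows separately and rank attributes by the
# absolute loadings of the first principal component. Random data stands
# in for the real 28-attribute, 7-class dataset.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.random((500, 28))
y = rng.integers(0, 7, size=500)

for cls in np.unique(y):
    pca = PCA(n_components=1).fit(X[y == cls])
    loadings = np.abs(pca.components_[0])   # one weight per attribute
    print(f"class {cls}: top attribute by |loading| = {int(np.argmax(loadings))}")
```

Note that per-class PCA ranks attributes by within-class variance, not by how well they separate that class from the others, so treat the ranking as exploratory.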
$endgroup$
$begingroup$
Could I also 'rank' the importance of the attributes by looking at each attribute's coefficient?
$endgroup$
– Nate
Oct 16 '18 at 16:12
answered Oct 16 '18 at 14:00 by Saket Kumar Singh
$begingroup$
Or an even more informed answer could be: Attribute 2 being below 0.5
is most important for class 1, Attribute 6 being above 0.75 is most
important for class 2 etc
A way of doing this is to discretize your continuous variables into histograms. Every bin of a histogram can then be treated as a separate variable, and its importance can easily be found using standard decision tree implementations such as the one in sklearn, which provides a feature_importances_
attribute. This will give you insight into the important regions of each variable.
This is demonstrated in Figure 9 in this paper.
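A minimal sketch of the binning idea, assuming scikit-learn's `KBinsDiscretizer` for the histograms and a plain `DecisionTreeClassifier` for the importances (the bin count and tree depth are arbitrary choices, and the data is synthetic):

```python
# Discretize each attribute into 4 uniform bins, one-hot encode the bins so
# each bin becomes its own variable, then read feature_importances_ to see
# which value regions of which attributes the tree finds important.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.random((500, 28))
y = rng.integers(0, 7, size=500)

binner = KBinsDiscretizer(n_bins=4, encode="onehot-dense", strategy="uniform")
X_bins = binner.fit_transform(X)    # 28 attributes * 4 bins = 112 columns

tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_bins, y)
top = int(np.argmax(tree.feature_importances_))
print(f"most important region: attribute {top // 4}, bin {top % 4}")
```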
Welcome to the site, Nate!
$endgroup$
$begingroup$
Ah yes, good idea. The only thing is: wouldn't the feature_importances_ attribute tell me the importance of the features for the whole model, rather than the importance per class?
$endgroup$
– Nate
Oct 16 '18 at 14:23
$begingroup$
When a feature is important for classifying a sample into class A, it is also important for not classifying the same sample into class B. I mean that you shouldn't ask "which feature is important to classify the sample into class A and which one into class B" but rather "which feature is important to classify the sample correctly". That is feature importance.
$endgroup$
– pcko1
Oct 16 '18 at 14:35
answered Oct 16 '18 at 14:10 by pcko1
$begingroup$
Let's put it this way. The easiest way to ascertain the relation between the different features and the output is to use the covariance matrix. You can even visualise the data for each class.
[Scatter plot from the original answer, not preserved here: the output on the vertical axis against one feature on the horizontal axis, showing a clear trend.]
Suppose that the vertical axis is the output and the horizontal axis is one of the features. As you can see, knowing the feature informs us about changes in the output. Now consider the following illustration.
[Second scatter plot, not preserved: the same axes, but showing no trend.]
In this figure, you can see that this particular feature does not inform you about changes in the output.
Another approach is PCA, which finds the appropriate features itself: what it does is find a linear combination of the important features that is more relevant to the output.
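The covariance idea at the start of this answer can be sketched per class by correlating each attribute with a binary class indicator (synthetic data; `np.corrcoef` is used here as the linear-association measure):

```python
# Correlate each attribute with a "class k vs. the rest" indicator. A value
# near zero means the attribute alone carries little linear information
# about membership in that class (the flat-cloud picture described above).
import numpy as np

rng = np.random.default_rng(3)
X = rng.random((500, 28))
y = rng.integers(0, 7, size=500)

indicator = (y == 0).astype(float)   # class 0 vs. the rest
corrs = np.array([np.corrcoef(X[:, j], indicator)[0, 1] for j in range(28)])
print("most linearly related attribute for class 0:", int(np.argmax(np.abs(corrs))))
```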
$endgroup$
answered Oct 16 '18 at 17:22 by Vaalizaadeh
$begingroup$
It is difficult to get an answer to "which attribute is the most important" for every class; normally the distinction is made "between classes", not "for a specific class".
I use the feature importance from xgboost. It measures which features participate in more of the trees in xgboost's boosted forest. You can even plot these importances, which makes for very nice figures.
Feature importance in XGBoost
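As a library-agnostic sketch of this idea (the xgboost code itself isn't shown in the answer), scikit-learn's `GradientBoostingClassifier` is used below as a stand-in exposing the same kind of `feature_importances_` attribute; with xgboost itself you would fit `xgboost.XGBClassifier` and call `xgboost.plot_importance` for the chart the answer mentions. The data is synthetic:

```python
# Fit a small gradient-boosted forest and read one importance score per
# attribute. The scores are normalized to sum to 1; note this is a
# model-wide ranking, not a per-class one, matching the answer's caveat.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(4)
X = rng.random((400, 28))
y = rng.integers(0, 7, size=400)

model = GradientBoostingClassifier(n_estimators=10, max_depth=2).fit(X, y)
importances = model.feature_importances_
print("top attribute overall:", int(np.argmax(importances)))
```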
$endgroup$
answered 3 mins ago by Juan Esteban de la Calle (new contributor)