How to find the most important attribute for each class

I have a dataset with 28 attributes and 7 class values. I want to know if it's possible to find the most important attribute(s) for deciding the class value, for each class.



For example, an answer could be: Attribute 2 is most important for class 1, Attribute 6 is most important for class 2, etc. An even more informative answer could be: Attribute 2 being below 0.5 is most important for class 1, Attribute 6 being above 0.75 is most important for class 2, etc.



My initial approach was to build a decision tree on the data and find, for each class, the node with the largest information gain/gain ratio; that node's attribute would be the most determining factor for that class. The problem is that the decision tree implementations I have found don't expose the information gain/gain ratio for each node, and since this work is time-bound I don't have time to implement my own version. My current thought is to create multiple datasets, each one class vs. the rest, and then perform attribute selection (e.g. information gain) on them to find the most important attribute. Is this the right direction to go in, or is there a better option?










machine-learning neural-network deep-learning feature-selection multiclass-classification

asked Oct 16 '18 at 12:46 by Nate, edited Oct 16 '18 at 17:23 by Vaalizaadeh








  • Covariance matrix for each attribute and the output.
    – Vaalizaadeh, Oct 16 '18 at 12:47










  • @Media could you expand on this, please? I thought a covariance matrix would only tell me the correlation between attributes. If you have a solution, you could add it as an answer.
    – Nate, Oct 16 '18 at 13:15










  • Important in what way? That best splits your data (given that you were talking about decision trees and splits)? There are ML algorithms that return feature importances, e.g. random forests.
    – user2974951, Oct 16 '18 at 13:37










  • Important in the way that I want to know the attributes that have the biggest say in determining each class. I believe this means the best split of the data. As far as I know, random forest doesn't return feature importance for each class, or am I wrong about this?
    – Nate, Oct 16 '18 at 13:52










  • @Nate, "My current thought is to create multiple datasets which are all one class vs the rest and then perform attribute selection (eg. information gain) on them to find the most important attribute." ---> if you're happy with that definition of "most important" then yes. Your work is to define what you mean by "most important", but your approach (classification ONE vs ALL_OTHERS) is a good one.
    – f.g., Oct 16 '18 at 17:33
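
To make the one-vs-rest idea above concrete, here is a minimal sketch (not from the original post), assuming scikit-learn and using mutual information as the information-gain-style score. The dataset is a synthetic stand-in with the same shape (28 attributes, 7 classes), so the printed attribute indices are purely illustrative.

```python
# Hedged sketch (illustrative only): rank attributes per class by
# one-vs-rest mutual information, an information-gain-style score.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Toy stand-in for the real dataset: 28 attributes, 7 classes.
X, y = make_classification(n_samples=2000, n_features=28, n_informative=10,
                           n_classes=7, random_state=0)

for cls in np.unique(y):
    y_binary = (y == cls).astype(int)              # this class vs. the rest
    scores = mutual_info_classif(X, y_binary, random_state=0)
    top = np.argsort(scores)[::-1][:3]             # three highest-scoring attributes
    print(f"class {cls}: top attributes {top.tolist()}, scores {scores[top].round(3)}")
```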
4 Answers

If you must split the dataset for each class, then I suggest you try PCA. Principal Component Analysis is basically used for dimensionality reduction, in the sense that it gives you the subset of attributes that best represents the distribution of the data. You could use it for each class and in that way get the attributes that affect that class's data distribution the most.






answered Oct 16 '18 at 14:00 – Saket Kumar Singh













  • Could the importance of the attributes also be 'ranked' by looking at each attribute's coefficient?
    – Nate, Oct 16 '18 at 16:12
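
A minimal sketch of this suggestion, assuming scikit-learn: fit PCA on each class's subset (after standardising) and rank the original attributes by the size of their loadings on the first principal component, along the lines of the coefficient idea in the comment above. The data is again a toy stand-in, so the indices are illustrative.

```python
# Hedged sketch (illustrative only): per-class PCA, ranking attributes by the
# magnitude of their loadings on the first principal component.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=28, n_informative=10,
                           n_classes=7, random_state=0)
X_std = StandardScaler().fit_transform(X)          # PCA is sensitive to scale

for cls in np.unique(y):
    pca = PCA(n_components=1).fit(X_std[y == cls]) # PCA on this class's samples only
    loadings = np.abs(pca.components_[0])          # one weight per original attribute
    ranked = np.argsort(loadings)[::-1][:3]
    print(f"class {cls}: attributes with largest first-component loadings {ranked.tolist()}")
```

Note that this ranks attributes by how much they drive the variance within a class, which matches the answer's "distribution of the data" framing; it does not directly measure how well an attribute separates that class from the others.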


"An even more informative answer could be: Attribute 2 being below 0.5 is most important for class 1, Attribute 6 being above 0.75 is most important for class 2, etc."




A way of doing this is to discretize your continuous variables into histograms. Every bin of a histogram can then be treated as a separate variable, and its importance can easily be found using standard decision tree implementations such as the one in sklearn, which provides a feature_importances_ attribute. This will give you insight into the important regions of each variable.



This is demonstrated in Figure 9 in this paper.



Welcome to the site, Nate!






answered Oct 16 '18 at 14:10 – pcko1













  • Ah yes, good idea. The only thing is, wouldn't the feature_importances_ attribute tell me the importance of the features for the whole model, rather than the importance per class?
    – Nate, Oct 16 '18 at 14:23










  • When a feature is important to classify a sample into class A, it means that it is also important for not classifying the same sample in class B. I mean that you shouldn't ask "which feature is important to classify the sample in class A and which one to classify it in class B" BUT "which feature is important to classify the sample correctly". This is feature importance.
    – pcko1, Oct 16 '18 at 14:35
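
A minimal sketch of the discretization idea in this answer, assuming scikit-learn; KBinsDiscretizer stands in for the histogram step, and the bin count and tree depth are arbitrary choices for illustration.

```python
# Hedged sketch (illustrative only): discretize each attribute into bins, treat
# every bin as its own indicator variable, and read bin-level importances from
# a decision tree's feature_importances_ attribute.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=28, n_informative=10,
                           n_classes=7, random_state=0)

n_bins = 4
binner = KBinsDiscretizer(n_bins=n_bins, encode="onehot-dense", strategy="quantile")
X_binned = binner.fit_transform(X)                 # 28 attributes -> 28 * 4 bin indicators

tree = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X_binned, y)
importances = tree.feature_importances_

# Columns are ordered attribute by attribute, so divmod maps a column index
# back to (attribute, bin), i.e. to a region of that attribute.
for idx in np.argsort(importances)[::-1][:5]:
    attr, bin_id = divmod(idx, n_bins)
    print(f"attribute {attr}, bin {bin_id}: importance {importances[idx]:.3f}")
```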

Let's put it this way. The easiest way to ascertain the relation between the different features and the output is to use a covariance matrix. You can even visualise the data for each class. Take a look at the following image.



[image: scatter plot of a feature against the output]



Suppose the vertical axis is the output and the horizontal axis is one of the features. As you can see, knowing this feature informs us about changes in the output. Now consider the following illustration.



[image: scatter plot of a feature against the output showing no clear relationship]



In this figure, you can see that this feature does not inform you about changes in the output.



Another approach is to use PCA, which finds the appropriate features itself. What it does is find linear combinations of the important features that are more relevant to the output.






answered Oct 16 '18 at 17:22 – Vaalizaadeh
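
A minimal sketch of the covariance idea, under the assumption that "the output" for a given class can be represented as a one-vs-rest membership indicator; each attribute's correlation with that indicator then serves as a crude per-class relevance score. Toy data again, so the numbers are illustrative.

```python
# Hedged sketch (illustrative only): correlate each attribute with a per-class
# membership indicator, a covariance-style measure of per-class relevance.
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=28, n_informative=10,
                           n_classes=7, random_state=0)

for cls in np.unique(y):
    indicator = (y == cls).astype(float)           # 1 for this class, 0 for the rest
    corrs = np.array([np.corrcoef(X[:, j], indicator)[0, 1] for j in range(X.shape[1])])
    top = np.argsort(np.abs(corrs))[::-1][:3]
    print(f"class {cls}: attributes most correlated with membership {top.tolist()}, "
          f"r = {corrs[top].round(3)}")
```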

It is difficult to get an answer to "which one is the most important" for every class; normally the distinction is made "between classes", not "for a specific class".



I use the feature importance from XGBoost. This method measures which features participate in more trees of the boosted forest that XGBoost builds. There is even the possibility of plotting these importances, resulting in very nice visuals to show.

Feature importance in XGBoost





answered 3 mins ago – Juan Esteban de la Calle
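
A minimal sketch of the XGBoost suggestion, assuming the xgboost package and its scikit-learn-style wrapper; note that feature_importances_ here is a model-wide score rather than a per-class one, as discussed in the comments above. Toy data again.

```python
# Hedged sketch (illustrative only): model-wide feature importances from XGBoost.
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=28, n_informative=10,
                           n_classes=7, random_state=0)

model = xgb.XGBClassifier(n_estimators=100, max_depth=4)
model.fit(X, y)

importances = model.feature_importances_           # one score per attribute, whole model
print("top attributes:", np.argsort(importances)[::-1][:5].tolist())

# xgb.plot_importance(model) would draw the importance chart (requires matplotlib).
```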