How to choose metrics for evaluating classification results?
We have recently developed a Python library named PyCM that specializes in analyzing multi-class confusion matrices. Version 1.9 of this module adds a parameter recommender system that suggests the most relevant parameters given the characteristics of the input dataset and its classification problem.
This new feature raises many questions. First, I explain the assumptions and describe how the recommender works; after that, I ask some questions to help evaluate its design.
Considered characteristics:
The parameters are suggested according to the following characteristics:
- Classification problem type (binary or multi-class)
- Dataset type (balanced or imbalanced)
Note that when the dataset is imbalanced, only the imbalance is considered for the recommendation, regardless of whether the problem is binary or multi-class. The inspected states therefore fall into three main groups (a minimal sketch of this grouping follows the list):
- Balanced dataset – Binary classification
- Balanced dataset – Multi-class classification
- Imbalanced dataset
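As a minimal sketch of this grouping (illustrative only, not PyCM's internal code; the function name and signature are hypothetical):

def recommendation_group(n_classes, is_imbalanced):
    """Pick the group whose metric list is recommended.

    Illustrative sketch only. When the dataset is imbalanced, the
    binary/multi-class distinction is ignored, as described above.
    """
    if is_imbalanced:
        return "Imbalanced"
    return "Binary - Balanced" if n_classes == 2 else "Multi-class - Balanced"

print(recommendation_group(n_classes=2, is_imbalanced=False))  # Binary - Balanced
print(recommendation_group(n_classes=5, is_imbalanced=True))   # Imbalanced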
The definition of imbalance:
Deciding whether a classification problem is binary or multi-class is easy, but the boundary between a balanced and an imbalanced dataset is not clear-cut. PyCM therefore introduces its own definition for checking whether the input dataset is balanced: if the ratio of the population of the most populous class to the population of the least populous class is greater than 3, the dataset is considered imbalanced.
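In code, this definition amounts to the following check (an illustrative sketch; the function name is hypothetical, but the threshold of 3 is the one stated above):

from collections import Counter

def is_imbalanced(actual_labels, threshold=3):
    """Compare the ratio of the largest class population to the
    smallest one against the threshold (default 3)."""
    counts = Counter(actual_labels)
    return max(counts.values()) / min(counts.values()) > threshold

# 8 samples of class 0 vs. 2 samples of class 1 -> ratio 4 > 3 -> imbalanced
print(is_imbalanced([0] * 8 + [1] * 2))  # True
print(is_imbalanced([0] * 5 + [1] * 5))  # False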
Recommended parameters:
The recommendation lists were compiled from the paper in which each parameter was introduced and the capabilities claimed there. For further information, read the PyCM documentation or visit the project page. A short usage sketch follows the lists below.
Binary – Balanced recommended parameters:
ACC, TPR, PPV, AUC, AUCI, TNR, F1
Multi-class – Balanced recommended parameters:
ERR, TPR Micro, TPR Macro, PPV Micro, PPV Macro, ACC, Overall ACC, MCC, Overall MCC, BCD, Hamming Loss, Zero-one Loss
Imbalanced recommended parameters:
Kappa, SOA1(Landis & Koch), SOA2(Fleiss), SOA3(Altman), SOA4(Cicchetti), CEN, MCEN, MCC, J, Overall J, Overall MCC, Overall CEN, Overall MCEN, AUC, AUCI, G, DP, DPI, GI
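For context, computing some of these parameters with PyCM looks roughly like the sketch below (the toy vectors are taken from the PyCM documentation example; attribute names such as ACC, TPR, Kappa, and Overall_ACC follow the documentation, so double-check them against your installed version):

from pycm import ConfusionMatrix

# Toy 3-class vectors (from the PyCM documentation example)
y_actual  = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
y_predict = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]

cm = ConfusionMatrix(actual_vector=y_actual, predict_vector=y_predict)

print(cm.ACC)          # per-class accuracy (dict keyed by class label)
print(cm.TPR)          # per-class recall / sensitivity
print(cm.Kappa)        # Cohen's kappa (overall statistic)
print(cm.Overall_ACC)  # overall accuracy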
Questions:
1. Is the proposed definition of imbalance sound? Is there a more comprehensive definition of this characteristic?
2. Is it correct to recommend the same parameters for both binary and multi-class classification problems when the dataset is imbalanced?
3. Are the recommendation lists correct and complete? Are there other parameters worth recommending?
4. Are there other characteristics (beyond binary/multi-class and balanced/imbalanced) that can affect how classification results should be evaluated?
Website: http://www.pycm.ir/
Github: https://github.com/sepandhaghighi/pycm
Paper: https://www.theoj.org/joss-papers/joss.00729/10.21105.joss.00729.pdf
machine-learning python classification multiclass-classification confusion-matrix
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "557"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46509%2fhow-to-choose-metrics-for-evaluating-classification-results%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
Thanks for contributing an answer to Data Science Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46509%2fhow-to-choose-metrics-for-evaluating-classification-results%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown