Exclude observations with measurements below limit of detection?
$begingroup$
I am analysing a dataset for the relationship between an exposure variable x and a response y (in my case, these are the urinary concentration of a specific compound and a measure of cognitive function). x is measured using an analytical method which has a lower detection limit, and approximately 12% of the population have concentrations below the detection limit.
In a first analysis, I compared y between participants with x above and below the detection limit, and found a significant difference, which is not surprising.
My question is: when I conduct a regression analysis for y ~ x, should I exclude all observations with x < detection limit, or not? It does affect the results and actually reverses the association (if I include all observations, the association is positive; if I exclude them, it is negative).
regression censoring chemometrics
$endgroup$
edited 1 hour ago by cbeleites
asked 2 hours ago by Gux
1 Answer
$begingroup$
Don't exclude cases solely because they are below the LLOQ (lower limit of quantitation)!
- The LLOQ is not a magic hard threshold below which nothing can be said. It is rather a convention to mark the concentration where the relative error of the analyses falls below 10 %.
- Note that the LLOQ is often computed assuming homoscedasticity, i.e. the absolute error being independent of the concentration. That is, you don't even assume a different absolute error for cases below or above the LLOQ. From that point of view, the LLOQ is essentially just a way to express the absolute uncertainty of the analytical method in a concentration unit. (Like fuel economy in l/100 km vs. miles/gallon.)
- Even if the analytical error is concentration dependent, two cases with almost the same true concentration, one slightly below and one slightly above the LLOQ, have almost the same uncertainty.
- (Left) censoring the data, i.e. reporting cases below the LLOQ only as "< LLOQ" (dropping them altogether is strictly speaking truncation), leads to all kinds of complications in subsequent data analyses (and you'd need to use particular statistical methods that can deal with such data).
- Say thank you to your clinical lab for providing you with the full data: I've met many people who have the opposite difficulty of getting a report that just says "below LLOQ", with no possibility of recovering any further information.
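To make the first three points concrete, here is a minimal arithmetic sketch. It assumes a constant (homoscedastic) absolute error and takes the LLOQ as the concentration where the relative error reaches 10 %, i.e. 10 times the absolute standard deviation. The value of `SIGMA_ABS` and the function name are made up for illustration and do not come from any particular lab method.

```python
def relative_error(concentration, sigma_abs):
    """Relative error of a measurement whose absolute error is constant."""
    return sigma_abs / concentration


# Hypothetical constant absolute standard deviation, in concentration units.
SIGMA_ABS = 0.2
# Under the assumptions above, the LLOQ is the concentration where the
# relative error equals 10 %, which is 10 * sigma.
LLOQ = 10 * SIGMA_ABS

# The relative error changes smoothly across the LLOQ; nothing jumps there.
for factor in (0.5, 0.9, 1.0, 1.1, 2.0):
    c = factor * LLOQ
    print(f"c = {c:.2f} ({factor:.1f} x LLOQ): "
          f"relative error = {relative_error(c, SIGMA_ABS):.1%}")
```

A case at 0.9 x LLOQ has about 11 % relative error and a case at 1.1 x LLOQ about 9 %, so treating them in completely different ways is hard to justify on grounds of measurement quality.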
Bottom line: never censor your data unless you have really, really good reasons for doing so.
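If you want to see how much the exclusion changes a fit, you can put both regressions side by side. The following is only a sketch on simulated data: the linear relationship, the lognormal exposure, the noise levels, the slope of -2 and all names are assumptions chosen for illustration (picked so that roughly a tenth of the cases fall below the LLOQ), and none of it says anything about the actual data in the question.

```python
import random


def ols_slope(xs, ys):
    """Slope of the ordinary least squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


rng = random.Random(1)      # fixed seed so the sketch is reproducible
N = 2000
SIGMA_ABS = 0.1             # assumed constant absolute measurement error
LLOQ = 10 * SIGMA_ABS       # concentration with 10 % relative error
TRUE_SLOPE = -2.0           # arbitrary effect size for the simulation

# Simulated true exposure, its noisy measurement, and the response.
x_true = [rng.lognormvariate(1.0, 0.8) for _ in range(N)]
x_meas = [x + rng.gauss(0, SIGMA_ABS) for x in x_true]
y = [100 + TRUE_SLOPE * x + rng.gauss(0, 5) for x in x_true]

# Fit 1: all cases, including measurements below the LLOQ.
slope_all = ols_slope(x_meas, y)

# Fit 2: only cases with a measured concentration at or above the LLOQ.
kept = [(x, yi) for x, yi in zip(x_meas, y) if x >= LLOQ]
slope_kept = ols_slope([x for x, _ in kept], [yi for _, yi in kept])

print(f"below LLOQ: {1 - len(kept) / N:.1%} of cases")
print(f"slope using all cases:      {slope_all:.3f}")
print(f"slope excluding below LLOQ: {slope_kept:.3f}")
```

The same two-fit comparison can be run on real data by replacing the simulated lists with the measured values; the sketch only shows how to set the two fits next to each other, not what the result should be.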
$endgroup$
$begingroup$
If the analytical study is only for those with a large concentration of the compound, that is, to look at severity of effect in extreme cases, that would be a different study that would exclude all low concentrations. That might be useful, but does not appear to be the goal here. Is this correct?
$endgroup$
– James Phillips
1 hour ago
$begingroup$
@JamesPhillips: IMHO that would indeed be a totally different question. And it would require that the analyte concentration can be measured with sufficient precision that the inclusion/exclusion decision is not hampered by analytical error.
$endgroup$
– cbeleites
1 hour ago
$begingroup$
@JamesPhillips: plus, from my chemist's world-view, that makes sense only if we actually have distinct subpopulations, i.e. clusters as opposed to a continuum where a rather arbitrary threshold cuts off a tail; in the latter case a regression is more sensible. Note that if you cut between clusters of cases, you have less of a thresholding/censoring problem, whereas cutting "through the middle" of a single population leads to complications that are somewhat related to those of censoring.
$endgroup$
– cbeleites
1 hour ago
answered 1 hour ago by cbeleites