JapaneseKappa
Guest
Then give me an example of a false negative.
No, my objection was that your threshold had a lot of flaws. That was just a demonstration of one of them. There are more.
What’s the point? This thread is not about the “threshold” that you have proposed. And I have pointed out many flaws of it (you can make a trivial “false negative” from some of them, if you want it), and many reasons why it is not very useful - and couldn’t be expected to be useful. And you haven’t responded to them, thus, presumably, you agree.
Then give me an example of a false negative.
Quite right, but you were the one that demanded I create one and attacked it after I did. The threshold is fundamentally personal and it is possible to complete my exercise with an incomplete and imperfect threshold.
What’s the point? This thread is not about the “threshold” that you have proposed.
But it is critical to your contention that it is impossible to construct a set of criteria which would allow you to successfully complete the thought experiment I proposed. If the constraints find a negative result (i.e. that we should not be certain,) and we can’t think of a way that the negative might be a false negative, then the exercise is complete.
And I have pointed out many flaws of it (you can make a trivial “false negative” from some of them, if you want it), and many reasons why it is not very useful - and couldn’t be expected to be useful. And you haven’t responded to them, thus, presumably, you agree.
If you want it that much… OK, let’s take this one flaw:
I don’t believe you can make a trivial false negative (or indeed any false negative.) That is why I invited you to try. I am calling your bluff.
Let’s take a hypothetical resurrection that is “just good enough” and is supported by fingerprints (among other evidence). That would be a “true positive”. Then let’s remove fingerprints (let’s say that the human we are dealing with has lost his hands and has no fingerprints) and add iris scan results. I would say that if the previous evidence was good enough, so should be the new evidence. But it fails your threshold (since the evidence was “just good enough” previously and a piece of evidence that the threshold does consider has been replaced by a piece of evidence that it ignores). Thus it is a “false negative”.
Another example would be not mentioning any other biometric information (I’d say there is no reason to ignore iris scan, if it is available).
Actually, I gave other arguments challenging the usefulness of your approach (setting the threshold before seeing the evidence). For example, the fact that this approach is not used in criminal investigations. I am pretty sure that if it was a good and practical approach, at least some lawyers would demand it. But, for some reason, we do not hear about development of such thresholds for murder, theft or treason…
Quite right, but you were the one that demanded I create one and attacked it after I did. The threshold is fundamentally personal and it is possible to complete my exercise with an incomplete and imperfect threshold.
But it is critical to your contention that it is impossible to construct a set of criteria which would allow you to successfully complete the thought experiment I proposed. If the constraints find a negative result (i.e. that we should not be certain,) and we can’t think of a way that the negative might be a false negative, then the exercise is complete.
I’m not sure how exceeding the standards results in a rejection. I guess you are again forcing your demand for an exhaustive list of criteria onto a list I created as minimum standards.
Let’s take a hypothetical resurrection that is “just good enough” and is supported by fingerprints (among other evidence). That would be a “true positive”. Then let’s remove fingerprints (let’s say that the human we are dealing with has lost his hands and has no fingerprints) and add iris scan results. I would say that if the previous evidence was good enough, so should be the new evidence. But it fails your threshold (since the evidence was “just good enough” previously and a piece of evidence that the threshold does consider has been replaced by a piece of evidence that it ignores). Thus it is a “false negative”.
We do, after a fashion: phrases like “preponderance of the evidence” and “beyond a reasonable doubt” are precisely that. The government doesn’t go so far as to put a straitjacket on what constitutes “reasonable doubt”; it leaves that up to the individual juries. I would be uncomfortable legislating a strict definition of reasonable doubt myself, even if I did come up with a set of criteria I was personally very comfortable with. People are judged by juries of their peers for a reason.
Actually, I gave other arguments challenging the usefulness of your approach (setting the threshold before seeing the evidence). For example, the fact that this approach is not used in criminal investigations. I am pretty sure that if it was a good and practical approach, at least some lawyers would demand it. But, for some reason, we do not hear about development of such thresholds for murder, theft or treason…
As a matter of fact, I have no idea what the list was meant to do - that is, other than rejection of resurrection of Jesus. Looks like you didn’t really specify if it was meant to be used like “Threshold has been exceeded - accept, otherwise - reject.”, “Threshold has been exceeded - accept, otherwise - inconclusive.”, or “Threshold has been exceeded - inconclusive, otherwise - reject.”. Thus I get to choose that, and I choose the first case, as it is the simplest. Also, your talk about both “false positives” and “false negatives” makes less sense otherwise.
I’m not sure how exceeding the standards results in a rejection. I guess you are again forcing your demand for an exhaustive list of criteria onto a list I created as minimum standards.
And that runs into another argument I gave (and you ignored): if the threshold is flexible and informal, it is no longer a good safeguard against bias. And that was the reason you gave to use a threshold in the first place. We waste time and effort and get nothing in return.
I’m not sure how exceeding the standards results in a rejection. I guess you are again forcing your demand for an exhaustive list of criteria onto a list I created as minimum standards.
If we were not being deliberately obtuse, the process would go like this:
I propose some standards.
We compare a hypothetical to the standards.
We find that the hypothetical is identical to the minimum standards in all respects except instead of fingerprints, there is an as-yet-undiscovered method of identification which is superior to all current methods.
We compare the as-yet-undiscovered-ID-method to the methods in the standard and find that the new methods give more certainty than the old methods.
We conclude that we are certain.
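The process above can be sketched in code. This is only an illustration under loud assumptions: the `Evidence` type, its `purpose` field and the numeric `strength` scores are hypothetical and do not come from anyone's actual list in this thread. It contrasts reading the standards as an exhaustive checklist with reading them as minimum standards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    kind: str      # e.g. "fingerprints"
    purpose: str   # what the evidence is meant to establish
    strength: int  # hypothetical ordinal score; higher means more certainty


# Hypothetical minimum standard: identity must be shown at least as well
# as fingerprints would show it. The score 3 is an arbitrary placeholder.
MINIMUM_STANDARD = [Evidence("fingerprints", "identity", 3)]


def passes_exhaustive_checklist(offered):
    """Accept only if every listed kind of evidence is literally present."""
    kinds = {e.kind for e in offered}
    return all(req.kind in kinds for req in MINIMUM_STANDARD)


def passes_minimum_standard(offered):
    """Accept if each requirement is met or exceeded by any evidence
    serving the same purpose, whatever its kind."""
    return all(
        any(e.purpose == req.purpose and e.strength >= req.strength
            for e in offered)
        for req in MINIMUM_STANDARD
    )


# The case from the thread: no fingerprints, but an iris scan instead.
# Scoring the iris scan as equal to fingerprints is an assumption.
offered = [Evidence("iris scan", "identity", 3)]

print(passes_exhaustive_checklist(offered))
print(passes_minimum_standard(offered))
```

Under these assumed scores the checklist reading rejects the iris-scan case, while the minimum-standard reading accepts it. Which reading was intended is exactly what the two sides here disagree about.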
That “beyond reasonable doubt” is another example of a single value of certainty. That is completely different from your approach. After all, I did offer you some numbers and got this in return:
We do, after a fashion: phrases like “preponderance of the evidence” and “beyond a reasonable doubt” are precisely that. The government doesn’t go so far as to put a straitjacket on what constitutes “reasonable doubt”; it leaves that up to the individual juries. I would be uncomfortable legislating a strict definition of reasonable doubt myself, even if I did come up with a set of criteria I was personally very comfortable with. People are judged by juries of their peers for a reason.
en.wikipedia.org/wiki/Reasonable_doubt
As I said, the criteria people propose would be personal. What sort of evidence one person finds convincing will not necessarily be convincing to another. I don’t expect other people to agree with mine on all counts, nor would I take issue with other people’s standards. What I am asking for is for people to actually lay out what their standards are before adding up the evidence.
And criminal investigations themselves do not happen in courts (not to mention that many of the things done in courts, like “inadmissible evidence” concerned with “Fruit of the poisonous tree”, are clearly not designed with the goal of finding out the truth). Policemen and prosecutors are employed by the government. Someone would prepare strict guidelines for the amount of evidence for something like murder or theft, if it were not obvious that it is both very hard and very pointless.
You’ve missed the forest for a single tree. I didn’t ask you to ascribe a value, but to sketch out what sorts of evidence would make you certain about a modern-day resurrection.
False positive: finding certainty where the evidence does not warrant it.
As a matter of fact, I have no idea what the list was meant to do - that is, other than rejection of resurrection of Jesus. Looks like you didn’t really specify if it was meant to be used like “Threshold has been exceeded - accept, otherwise - reject.”, “Threshold has been exceeded - accept, otherwise - inconclusive.”, or “Threshold has been exceeded - inconclusive, otherwise - reject.”. Thus I get to choose that, and I choose the first case, as it is the simplest. Also, your talk about both “false positives” and “false negatives” makes less sense otherwise.
The reduction in bias comes from thinking about a more neutral topic. Instead of the loaded question of “is this core part of a religion true” we are considering a hypothetical modern-day event. From your responses, it seems that this may not even be enough distance for many people, though. I have no doubt that the fact I have no takers on my exercise is not due solely to people thinking it to be a waste of time, but rather their immediate realization that they would have much higher standards for a modern resurrection than they’ve used for Jesus’.
…that is, other than rejection of resurrection of Jesus…
And that runs into another argument I gave (and you ignored): if the threshold is flexible and informal, it is no longer a good safeguard against bias. And that was the reason you gave to use a threshold in the first place. We waste time and effort and get nothing in return.
The context was:
That “beyond reasonable doubt” is another example of a single value of certainty. That is completely different from your approach. After all, I did offer you some numbers and got this in return:
This reads like you’ve set the knob to “maximum thread derailment!” I’ve never argued that my exercise is something everyone should do with all evidence. I’ve never argued that it is used everywhere it could potentially be useful. The criminal justice system is complex and has interests that compete with judging the evidence uniformly (e.g. use of clemency to coerce cooperation) that too-strict rules would likely hamper. Moreover, such guidelines would have to deal with all possible situations; all I’ve asked for is guidelines for one particular scenario (a resurrection in rural India.)
And criminal investigations themselves do not happen in courts (not to mention that many of the things done in courts, like “inadmissible evidence” concerned with “Fruit of the poisonous tree”, are clearly not designed with the goal of finding out the truth). Policemen and prosecutors are employed by the government. Someone would prepare strict guidelines for the amount of evidence for something like murder or theft, if it were not obvious that it is both very hard and very pointless.
If you think otherwise, find such guidelines or proposals for guidelines.
If you want some examples of places that could be expected to have such guidelines (that is, if one could expect them to exist), I can offer you cps.gov.uk/publications/directors_guidance/dpp_guidance_5.html (“The Director’s Guidance On Charging 2013 - fifth edition, May 2013”, “Guidance to Police Officers and Crown Prosecutors Issued by the Director of Public Prosecution under S37A of the Police and Criminal Evidence Act 1984”) or “Criminal Investigation” by Ronald F. Becker, Aric W. Dutelle (“Google” seems to show at least some pages from this textbook).
p-values (and the related confidence levels) are good when it comes to statistics and evidentiary research. The problem with p-values is more that most people don’t really understand what they mean. For those that don’t understand: a significance threshold of .05 (usually paired with a 95% confidence level) means that, if the hypothesis being tested against (the null hypothesis) were true, evidence at least this extreme would turn up no more than 5% of the time. A threshold of .01 (paired with a 99% confidence level) means no more than 1% of the time. Neither number means that the researcher is 95% or 99% certain that the hypothesis is true. The problems with p-values, of course, are that (a) a p-value can never be 0 - that is, 100% certainty - though it can be made arbitrarily small, (b) looser thresholds (larger values) give rise to too many false positives (that is, cases in which it appears that the evidence supports the hypothesis, when it does not), and (c) stricter thresholds (smaller values) have too many false negatives (that is, the threshold is too demanding to determine that the evidence supports the hypothesis). This is one reason why most thresholds are either 0.1 (90% confidence) or 0.05 (95% confidence). And if there is a dispute between the two (evidence satisfies the 90% threshold but not the 95% threshold), more testing is recommended.
False positive: finding certainty where the evidence does not warrant it.
False negative: failure to find certainty where the evidence warrants it.
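To make the trade-off between thresholds concrete, here is a minimal sketch. Everything in it is an assumption chosen for illustration and has nothing to do with the resurrection scenario: a coin-flip experiment, 100 flips, and a biased coin that lands heads 60% of the time. It computes exactly how often a fair coin would be wrongly flagged (false positive) and how often the biased coin would be missed (false negative) at three thresholds.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_value(k, n, p0=0.5):
    """Exact two-sided p-value: the probability, if p0 is the truth, of an
    outcome no more likely than the observed one."""
    pmf = [binom_pmf(i, n, p0) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # tolerance for floating-point ties
    return sum(q for q in pmf if q <= cutoff)


def rejection_rate(n, true_p, alpha, p0=0.5):
    """Chance that p0 gets rejected at threshold alpha when true_p is real."""
    return sum(
        binom_pmf(k, n, true_p)
        for k in range(n + 1)
        if p_value(k, n, p0) <= alpha
    )


N_FLIPS = 100      # assumed sample size
BIASED_COIN = 0.6  # assumed real effect for the false-negative case

for alpha in (0.10, 0.05, 0.01):
    false_positive = rejection_rate(N_FLIPS, 0.5, alpha)
    false_negative = 1 - rejection_rate(N_FLIPS, BIASED_COIN, alpha)
    print(f"alpha={alpha:.2f}  false positives={false_positive:.3f}  "
          f"false negatives={false_negative:.3f}")
```

As the threshold gets stricter, the false positive rate can only go down and the false negative rate can only go up, which is the tension a fixed threshold has to live with.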
The original goal was to set out some criteria which, if satisfied, would make you certain that a hypothetical resurrection event in rural India had actually occurred.
The reduction in bias comes from thinking about a more neutral topic. Instead of the loaded question of “is this core part of a religion true” we are considering a hypothetical modern-day event. From your responses, it seems that this may not even be enough distance for many people, though. I have no doubt that the fact I have no takers on my exercise is not due solely to people thinking it to be a waste of time, but rather their immediate realization that they would have much higher standards for a modern resurrection than they’ve used for Jesus’.
The context was:
MPat: People don’t set out evidentiary thresholds ahead of time!
JK: They do, for example p values.
MPat: If you want to use p values you can try, but I am skeptical of their usefulness here.
JK: I didn’t ask you to use p values. I asked for your own criteria.
If you don’t know how or don’t like to use p values, then there is no way you would phrase your personal threshold in terms of p values. If you were comfortable with p-values, then you would have been welcome to give a number.
This reads like you’ve set the knob to “maximum thread derailment!” I’ve never argued that my exercise is something everyone should do with all evidence. I’ve never argued that it is used everywhere it could potentially be useful. The criminal justice system is complex and has interests that compete with judging the evidence uniformly (e.g. use of clemency to coerce cooperation) that too-strict rules would likely hamper. Moreover, such guidelines would have to deal with all possible situations; all I’ve asked for is guidelines for one particular scenario (a resurrection in rural India.)
I have no doubt that the training of policemen, prosecutors, etc, involves the use of examples (real and hypothetical) that give the policemen, prosecutors, etc a basis for comparison when they eventually face real evidence, as well as experience in thinking about the various issues with the evidence.
Really? So, your “binary classifier” tries to guess if the evidence is sufficient and not if the event has actually happened?
False positive: finding certainty where the evidence does not warrant it.
False negative: failure to find certainty where the evidence warrants it.
So, “If threshold is met - accept, otherwise - inconclusive.”?
The original goal was to set out some criteria which, if satisfied, would make you certain that a hypothetical resurrection event in rural India had actually occurred.
You mean you actually thought that someone is going to consider an obvious caricature of resurrection separately from the resurrection we are actually interested in?
The reduction in bias comes from thinking about a more neutral topic. Instead of the loaded question of “is this core part of a religion true” we are considering a hypothetical modern-day event. From your responses, it seems that this may not even be enough distance for many people, though.
Interestingly enough, you claim complete certainty here while citing no evidence at all.
I have no doubt that the fact I have no takers on my exercise is not due solely to people thinking it to be a waste of time, but rather their immediate realization that they would have much higher standards for a modern resurrection than they’ve used for Jesus’.
You might also note that I actually gave a couple of values there (0.5 and 0.05). If you can think of a way to find good use for them, feel free to do so.
The context was:
MPat: People don’t set out evidentiary thresholds ahead of time!
JK: They do, for example p values.
MPat: If you want to use p values you can try, but I am skeptical of their usefulness here.
JK: I didn’t ask you to use p values. I asked for your own criteria.
As opposed to rural Pakistan…?
This reads like you’ve set the knob to “maximum thread derailment!” I’ve never argued that my exercise is something everyone should do with all evidence. I’ve never argued that it is used everywhere it could potentially be useful. The criminal justice system is complex and has interests that compete with judging the evidence uniformly (e.g. use of clemency to coerce cooperation) that too-strict rules would likely hamper. Moreover, such guidelines would have to deal with all possible situations; all I’ve asked for is guidelines for one particular scenario (a resurrection in rural India.)
Yes, they learn to gather all the evidence they can, to generate many hypotheses and to check which of them agree with the evidence. Which is exactly the approach that is supposed to be tried out in this thread.
I have no doubt that the training of policemen, prosecutors, etc, involves the use of examples (real and hypothetical) that give the policemen, prosecutors, etc a basis for comparison when they eventually face real evidence, as well as experience in thinking about the various issues with the evidence.
Certainly, as I’ve pointed out before: in order to convince me that an actual resurrection event had taken place I would require some hefty evidence.
So, if you want to go back to the subject, let’s do so. In that case, please stop evading the need to formulate an alternative hypothesis and to test it. And no, just trying to rule much of the evidence out or to claim that the evidence is not sufficient without actually checking which hypotheses can be disproved by it is not going to work.
And yes, I am pretty sure that in that case everything else in this post is irrelevant.
We already know that it is unreasonable to expect to persuade you. There is no need to give even more evidence in support of that.
Certainly, as I’ve pointed out before: in order to convince me that an actual resurrection event had taken place I would require some hefty evidence.