How to implement float hashing with approximate equality

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}

Let's say we have the following python class (the problem exists in java just the same with equals and hashCode)

class Temperature:

    def __init__(self, degrees):

        self.degrees = degrees

where degrees is the temperature in Kelvin as a float. Now, I would like to implement equality testing and hashing for Temperature in a way that

compares floats up to an epsilon difference instead of direct equality testing,

and honors the contract that a == b implies hash(a) == hash(b).

def __eq__(self, other):

    return abs(self.degrees - other.degrees) < EPSILON



def __hash__(self):

    return # What goes here?

The python documentation talks a bit about hashing numbers to ensure that hash(2) == hash(2.0) but this is not quite the same problem.

Am I even on the right track? And if so, what is the standard way to implement hashing in this situation?

Update: Now I understand that this type of equality testing for floats eliminates the transitivity of == and equals. But how does that go together with the "common knowledge" that floats should not be compared directly? If you implement an equality operator by comparing floats, static analysis tools will complain. Are they right to do so?

edited 14 mins ago

Solomon Ucko

11717

asked 12 hours ago

CQQL

1244

New contributor

The standard hashCode/equals contract says that if two instances are considered equal, then their hash codes must match. You violate this using a float, since its hash will change. If you're willing to allow a tolerance to be "fixed" (say, rounded to the nearest thousandths digit), then the hash code of a rounded degree would match the hash code of a rounded degree in another instance.. Just an idea..

– Neil
5 hours ago

2

why is the question has Java's tag?

– Laiv
4 hours ago

4

About your update: I would say that hashing floats is generally a questionable thing. Try to avoid using floats as keys or as set elements.

– J. Fabian Meier
3 hours ago

1

@Neil: At the same time, doesn't rounding sound like integers? By that I mean: if you can round to, say, thousandths of degrees, then you could simply used a fixed-point representation -- an integer expressing the temperature in thousandths of degrees. For ease of use, you could have a getter/setter transparently converting from/to floats if you wish to...

– Matthieu M.
1 hour ago

@MatthieuM. If that's possible, that's obviously ideal. Though if you had to round to the nearest 0.32%, then that's difficult to represent properly using a mere integer.

– Neil
25 mins ago

|
show 1 more comment

Let's say we have the following python class (the problem exists in java just the same with equals and hashCode)

class Temperature:

    def __init__(self, degrees):

        self.degrees = degrees

where degrees is the temperature in Kelvin as a float. Now, I would like to implement equality testing and hashing for Temperature in a way that

compares floats up to an epsilon difference instead of direct equality testing,

and honors the contract that a == b implies hash(a) == hash(b).

def __eq__(self, other):

    return abs(self.degrees - other.degrees) < EPSILON



def __hash__(self):

    return # What goes here?

The python documentation talks a bit about hashing numbers to ensure that hash(2) == hash(2.0) but this is not quite the same problem.

Am I even on the right track? And if so, what is the standard way to implement hashing in this situation?

edited 14 mins ago

Solomon Ucko

11717

asked 12 hours ago

CQQL

1244

New contributor

The standard hashCode/equals contract says that if two instances are considered equal, then their hash codes must match. You violate this using a float, since its hash will change. If you're willing to allow a tolerance to be "fixed" (say, rounded to the nearest thousandths digit), then the hash code of a rounded degree would match the hash code of a rounded degree in another instance.. Just an idea..

– Neil
5 hours ago

2

why is the question has Java's tag?

– Laiv
4 hours ago

4

About your update: I would say that hashing floats is generally a questionable thing. Try to avoid using floats as keys or as set elements.

– J. Fabian Meier
3 hours ago

1

@Neil: At the same time, doesn't rounding sound like integers? By that I mean: if you can round to, say, thousandths of degrees, then you could simply used a fixed-point representation -- an integer expressing the temperature in thousandths of degrees. For ease of use, you could have a getter/setter transparently converting from/to floats if you wish to...

– Matthieu M.
1 hour ago

@MatthieuM. If that's possible, that's obviously ideal. Though if you had to round to the nearest 0.32%, then that's difficult to represent properly using a mere integer.

– Neil
25 mins ago

|
show 1 more comment

Let's say we have the following python class (the problem exists in java just the same with equals and hashCode)

class Temperature:

    def __init__(self, degrees):

        self.degrees = degrees

where degrees is the temperature in Kelvin as a float. Now, I would like to implement equality testing and hashing for Temperature in a way that

compares floats up to an epsilon difference instead of direct equality testing,

and honors the contract that a == b implies hash(a) == hash(b).

def __eq__(self, other):

    return abs(self.degrees - other.degrees) < EPSILON



def __hash__(self):

    return # What goes here?

The python documentation talks a bit about hashing numbers to ensure that hash(2) == hash(2.0) but this is not quite the same problem.

Am I even on the right track? And if so, what is the standard way to implement hashing in this situation?

edited 14 mins ago

Solomon Ucko

11717

asked 12 hours ago

CQQL

1244

New contributor

Let's say we have the following python class (the problem exists in java just the same with equals and hashCode)

class Temperature:

    def __init__(self, degrees):

        self.degrees = degrees

where degrees is the temperature in Kelvin as a float. Now, I would like to implement equality testing and hashing for Temperature in a way that

compares floats up to an epsilon difference instead of direct equality testing,

and honors the contract that a == b implies hash(a) == hash(b).

def __eq__(self, other):

    return abs(self.degrees - other.degrees) < EPSILON



def __hash__(self):

    return # What goes here?

The python documentation talks a bit about hashing numbers to ensure that hash(2) == hash(2.0) but this is not quite the same problem.

Am I even on the right track? And if so, what is the standard way to implement hashing in this situation?

java python hashing floating-point

edited 14 mins ago

Solomon Ucko

11717

asked 12 hours ago

CQQL

1244

New contributor

edited 14 mins ago

Solomon Ucko

11717

asked 12 hours ago

CQQL

1244

New contributor

edited 14 mins ago

Solomon Ucko

11717

edited 14 mins ago

Solomon Ucko

11717

edited 14 mins ago

Solomon Ucko

11717

asked 12 hours ago

CQQL

1244

New contributor

asked 12 hours ago

CQQL

1244

asked 12 hours ago

CQQL

1244

New contributor

CQQL is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

The standard hashCode/equals contract says that if two instances are considered equal, then their hash codes must match. You violate this using a float, since its hash will change. If you're willing to allow a tolerance to be "fixed" (say, rounded to the nearest thousandths digit), then the hash code of a rounded degree would match the hash code of a rounded degree in another instance.. Just an idea..

– Neil
5 hours ago

2

why is the question has Java's tag?

– Laiv
4 hours ago

4

About your update: I would say that hashing floats is generally a questionable thing. Try to avoid using floats as keys or as set elements.

– J. Fabian Meier
3 hours ago

1

@Neil: At the same time, doesn't rounding sound like integers? By that I mean: if you can round to, say, thousandths of degrees, then you could simply used a fixed-point representation -- an integer expressing the temperature in thousandths of degrees. For ease of use, you could have a getter/setter transparently converting from/to floats if you wish to...

– Matthieu M.
1 hour ago

@MatthieuM. If that's possible, that's obviously ideal. Though if you had to round to the nearest 0.32%, then that's difficult to represent properly using a mere integer.

– Neil
25 mins ago

|
show 1 more comment

The standard hashCode/equals contract says that if two instances are considered equal, then their hash codes must match. You violate this using a float, since its hash will change. If you're willing to allow a tolerance to be "fixed" (say, rounded to the nearest thousandths digit), then the hash code of a rounded degree would match the hash code of a rounded degree in another instance.. Just an idea..

– Neil
5 hours ago

2

why is the question has Java's tag?

– Laiv
4 hours ago

4

About your update: I would say that hashing floats is generally a questionable thing. Try to avoid using floats as keys or as set elements.

– J. Fabian Meier
3 hours ago

1

@Neil: At the same time, doesn't rounding sound like integers? By that I mean: if you can round to, say, thousandths of degrees, then you could simply used a fixed-point representation -- an integer expressing the temperature in thousandths of degrees. For ease of use, you could have a getter/setter transparently converting from/to floats if you wish to...

– Matthieu M.
1 hour ago

@MatthieuM. If that's possible, that's obviously ideal. Though if you had to round to the nearest 0.32%, then that's difficult to represent properly using a mere integer.

– Neil
25 mins ago

The standard hashCode/equals contract says that if two instances are considered equal, then their hash codes must match. You violate this using a float, since its hash will change. If you're willing to allow a tolerance to be "fixed" (say, rounded to the nearest thousandths digit), then the hash code of a rounded degree would match the hash code of a rounded degree in another instance.. Just an idea..

– Neil
5 hours ago

why is the question has Java's tag?

– Laiv
4 hours ago

About your update: I would say that hashing floats is generally a questionable thing. Try to avoid using floats as keys or as set elements.

– J. Fabian Meier
3 hours ago

@Neil: At the same time, doesn't rounding sound like integers? By that I mean: if you can round to, say, thousandths of degrees, then you could simply used a fixed-point representation -- an integer expressing the temperature in thousandths of degrees. For ease of use, you could have a getter/setter transparently converting from/to floats if you wish to...

– Matthieu M.
1 hour ago

@MatthieuM. If that's possible, that's obviously ideal. Though if you had to round to the nearest 0.32%, then that's difficult to represent properly using a mere integer.

– Neil
25 mins ago

|
show 1 more comment

2 Answers
2

active

oldest

votes

implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,

Fuzzy equality violates the requirements that Java places on the equals method, namely transitivity, i.e. that if x == y and y == z, then x == z. But if you do an fuzzy equality with, for example, an epsilon of 0.1, then 0.1 == 0.2 and 0.2 == 0.3, but 0.1 == 0.3 does not hold.

While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.

So I strongly recommend you don't do that.

Either provide exact equality and base your hash on that in the obvious way, and provide a separate method to do the fuzzy matching, or go with the equivalence class approach suggested by Kain. Though in the latter case, I recommend you fix your value to a representative member of the equivalence class in the constructor, and then go with simple exact equality and hashing for the rest; it's much easier to reason about the types this way.

(But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)

answered 5 hours ago

Sebastian Redl

11.5k63741

interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

– Christophe
5 hours ago

add a comment |

Good Luck

You are not going to be able to achieve that, without being stupid with hashes, or sacrificing the epsilon.

Example:

Assume that each point hashes to its own unique hash value.

As floating point numbers are sequential there will be up to k numbers prior to a given floating point value, and up to k numbers after a given floating point value which are within some epsilon of the given point.

For each two points within epsilon of each other that do not share the same hash value.
- Adjust the hashing scheme so that these two points hash to the same value.

Inducting for all such pairs the entire sequence of floating point numbers will collapse toward a single has value.

There are a few cases where this will not hold true:

Positive/Negative Infinity

A few De-normalised ranges that may not be linkable to the main range for a given epsilon.

perhaps a few other format specific instances

However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.

Outcome

Either >= 99% entire floating point range hashes to a single value seriously comprimising the intent of a hash value (and any device/container relying on a fairly distributed low-collision hash).

Or the epsilon is such that only exact matches are permitted.

Granular

You could of course go for a granular approach instead.

Under this approach you define exact buckets down to a particular resolution. ie:

[0.001, 0.002)

[0.002, 0.003)

[0.003, 0.004)

...

[122.999, 123.000)

...

Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.

Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.

answered 10 hours ago

Kain0_0

4,620419

1

This could become a good answer, when you remove the "not possible" nonsense from the first paragraph and extend the last paragraph by the small part which is missing: the idea of testing the neighboured buckets as well.

– Doc Brown
6 hours ago

I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

– Neil
5 hours ago

2

@DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

– J. Fabian Meier
3 hours ago

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "131"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

CQQL is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsoftwareengineering.stackexchange.com%2fquestions%2f391100%2fhow-to-implement-float-hashing-with-approximate-equality%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,

While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.

So I strongly recommend you don't do that.

(But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)

answered 5 hours ago

Sebastian Redl

11.5k63741

interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

– Christophe
5 hours ago

add a comment |

implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,

While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.

So I strongly recommend you don't do that.

(But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)

answered 5 hours ago

Sebastian Redl

11.5k63741

interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

– Christophe
5 hours ago

add a comment |

implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,

While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.

So I strongly recommend you don't do that.

(But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)

answered 5 hours ago

Sebastian Redl

11.5k63741

implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,

While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.

So I strongly recommend you don't do that.

(But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)

answered 5 hours ago

Sebastian Redl

11.5k63741

answered 5 hours ago

Sebastian Redl

11.5k63741

answered 5 hours ago

Sebastian Redl

11.5k63741

answered 5 hours ago

Sebastian Redl

11.5k63741

interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

– Christophe
5 hours ago

add a comment |

interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

– Christophe
5 hours ago

interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

– Christophe
5 hours ago

add a comment |

Good Luck

You are not going to be able to achieve that, without being stupid with hashes, or sacrificing the epsilon.

Example:

Assume that each point hashes to its own unique hash value.

For each two points within epsilon of each other that do not share the same hash value.
- Adjust the hashing scheme so that these two points hash to the same value.

Inducting for all such pairs the entire sequence of floating point numbers will collapse toward a single has value.

There are a few cases where this will not hold true:

Positive/Negative Infinity

A few De-normalised ranges that may not be linkable to the main range for a given epsilon.

perhaps a few other format specific instances

However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.

Outcome

Either >= 99% entire floating point range hashes to a single value seriously comprimising the intent of a hash value (and any device/container relying on a fairly distributed low-collision hash).

Or the epsilon is such that only exact matches are permitted.

Granular

You could of course go for a granular approach instead.

Under this approach you define exact buckets down to a particular resolution. ie:

[0.001, 0.002)

[0.002, 0.003)

[0.003, 0.004)

...

[122.999, 123.000)

...

Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.

Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.

answered 10 hours ago

Kain0_0

4,620419

1

This could become a good answer, when you remove the "not possible" nonsense from the first paragraph and extend the last paragraph by the small part which is missing: the idea of testing the neighboured buckets as well.

– Doc Brown
6 hours ago

I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

– Neil
5 hours ago

2

@DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

– J. Fabian Meier
3 hours ago

add a comment |

Good Luck

You are not going to be able to achieve that, without being stupid with hashes, or sacrificing the epsilon.

Example:

Assume that each point hashes to its own unique hash value.

For each two points within epsilon of each other that do not share the same hash value.
- Adjust the hashing scheme so that these two points hash to the same value.

Inducting for all such pairs the entire sequence of floating point numbers will collapse toward a single has value.

There are a few cases where this will not hold true:

Positive/Negative Infinity

A few De-normalised ranges that may not be linkable to the main range for a given epsilon.

perhaps a few other format specific instances

However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.

Outcome

Either >= 99% entire floating point range hashes to a single value seriously comprimising the intent of a hash value (and any device/container relying on a fairly distributed low-collision hash).

Or the epsilon is such that only exact matches are permitted.

Granular

You could of course go for a granular approach instead.

Under this approach you define exact buckets down to a particular resolution. ie:

[0.001, 0.002)

[0.002, 0.003)

[0.003, 0.004)

...

[122.999, 123.000)

...

Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.

Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.

answered 10 hours ago

Kain0_0

4,620419

1

This could become a good answer, when you remove the "not possible" nonsense from the first paragraph and extend the last paragraph by the small part which is missing: the idea of testing the neighboured buckets as well.

– Doc Brown
6 hours ago

I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

– Neil
5 hours ago

2

@DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

– J. Fabian Meier
3 hours ago

add a comment |

Good Luck

You are not going to be able to achieve that, without being stupid with hashes, or sacrificing the epsilon.

Example:

Assume that each point hashes to its own unique hash value.

For each two points within epsilon of each other that do not share the same hash value.
- Adjust the hashing scheme so that these two points hash to the same value.

Inducting for all such pairs the entire sequence of floating point numbers will collapse toward a single has value.

There are a few cases where this will not hold true:

Positive/Negative Infinity

A few De-normalised ranges that may not be linkable to the main range for a given epsilon.

perhaps a few other format specific instances

However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.

Outcome

Either >= 99% entire floating point range hashes to a single value seriously comprimising the intent of a hash value (and any device/container relying on a fairly distributed low-collision hash).

Or the epsilon is such that only exact matches are permitted.

Granular

You could of course go for a granular approach instead.

Under this approach you define exact buckets down to a particular resolution. ie:

[0.001, 0.002)

[0.002, 0.003)

[0.003, 0.004)

...

[122.999, 123.000)

...

Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.

Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.

answered 10 hours ago

Kain0_0

4,620419

Good Luck

You are not going to be able to achieve that, without being stupid with hashes, or sacrificing the epsilon.

Example:

Assume that each point hashes to its own unique hash value.

For each two points within epsilon of each other that do not share the same hash value.
- Adjust the hashing scheme so that these two points hash to the same value.

Inducting for all such pairs the entire sequence of floating point numbers will collapse toward a single has value.

There are a few cases where this will not hold true:

Positive/Negative Infinity

A few De-normalised ranges that may not be linkable to the main range for a given epsilon.

perhaps a few other format specific instances

However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.

Outcome

Either >= 99% entire floating point range hashes to a single value seriously comprimising the intent of a hash value (and any device/container relying on a fairly distributed low-collision hash).

Or the epsilon is such that only exact matches are permitted.

Granular

You could of course go for a granular approach instead.

Under this approach you define exact buckets down to a particular resolution. ie:

[0.001, 0.002)

[0.002, 0.003)

[0.003, 0.004)

...

[122.999, 123.000)

...

Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.

Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.

answered 10 hours ago

Kain0_0

4,620419

answered 10 hours ago

Kain0_0

4,620419

answered 10 hours ago

Kain0_0

4,620419

answered 10 hours ago

Kain0_0

4,620419

1

This could become a good answer, when you remove the "not possible" nonsense from the first paragraph and extend the last paragraph by the small part which is missing: the idea of testing the neighboured buckets as well.

– Doc Brown
6 hours ago

I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

– Neil
5 hours ago

2

@DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

– J. Fabian Meier
3 hours ago

add a comment |

1

This could become a good answer, when you remove the "not possible" nonsense from the first paragraph and extend the last paragraph by the small part which is missing: the idea of testing the neighboured buckets as well.

– Doc Brown
6 hours ago

I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

– Neil
5 hours ago

2

@DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

– J. Fabian Meier
3 hours ago

This could become a good answer, when you remove the "not possible" nonsense from the first paragraph and extend the last paragraph by the small part which is missing: the idea of testing the neighboured buckets as well.

– Doc Brown
6 hours ago

I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

– Neil
5 hours ago

@DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

– J. Fabian Meier
3 hours ago

add a comment |

CQQL is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

CQQL is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Software Engineering Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

YM1Wc3v64tDA8 1nzdGjA,3TNs,zyrdDsPIXrnEe2OTTPS Z3h A1

搜尋此網誌

Gfyuki