
Epoch Millis Rounding Down and Not Up #116701

Closed

john-wagster wants to merge 11 commits into elastic:main from john-wagster:roundup-epoch-milis

Conversation

@john-wagster
Contributor

Related to: #109542

Dug into the above-mentioned bug, wherein a date range check for ">" was failing because the underlying date was being parsed as off by 1 millisecond in some cases.

Operations to reproduce:
curl -XPUT --header 'Content-Type: application/json' "http://localhost:9200/test" -d '{
  "mappings": {
    "properties": {
      "mydate": {
        "type": "date",
        "format": "yyyy-MM-ddXXX"
      }
    }
  }
}'
curl -XPUT --header 'Content-Type: application/json' "http://localhost:9200/test/_doc/1?refresh" -d '{
  "mydate": "1969-12-31Z"
}'
# 1969-12-31Z as ms since epoch incorrectly returns 1 hit
curl -XGET --header 'Content-Type: application/json' "http://localhost:9200/test/_search?filter_path=hits.total.value" -d '{
  "query": {
    "range": {
      "mydate": {
        "gt": -86400000,
        "format": "epoch_millis"
      }
    }
  }
}'
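For context, the -86400000 bound in the query above is just 1969-12-31T00:00:00Z expressed as epoch millis, one day before the epoch. A quick standalone sketch (not part of this PR) to confirm the arithmetic:

```java
import java.time.Instant;

public class EpochMillisDemo {
    public static void main(String[] args) {
        // 1969-12-31T00:00:00Z is exactly one day (86_400_000 ms) before the epoch,
        // so its epoch-millis representation is negative
        long ms = Instant.parse("1969-12-31T00:00:00Z").toEpochMilli();
        System.out.println(ms); // prints -86400000
    }
}
```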

I found that this was because negative epoch milliseconds were being rounded down by 999_999 nanoseconds rather than up, contrary to the missing-date-component rounding documentation.

This happens because, when the nanoseconds are defaulted to 999_999 and we subsequently parse a negative epoch-milli value, we have to convert the negative epoch millis into a (seconds, nanos) tuple. The conversion math never accounted for the round-up intent; instead it applied the behavior intended for fractional milliseconds, pushing the time value toward negative infinity. That is correct for fractional milliseconds, but almost certainly not what was intended for rounding up.

To compensate for this I've introduced a small hack wherein we detect the round-up intent by setting the default to an invalid value of -999_999 instead of 999_999. This lets the parser distinguish a date that was intended to round up by 999_999 nanoseconds from one that should shift toward negative infinity by 999_999 nanoseconds of fractional milliseconds.
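To illustrate the two behaviors described above, here is a minimal standalone sketch (plain `java.time`, not the Elasticsearch parser itself) contrasting the documented round-up of 999_999 ns with sign-following math that pushes negative values toward negative infinity:

```java
import java.time.Instant;

public class RoundingSketch {
    static final long ROUND_UP_NANOS = 999_999L;

    // Intended round-up: always add 999_999 ns, regardless of sign.
    static Instant roundUp(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).plusNanos(ROUND_UP_NANOS);
    }

    // Buggy interpretation: treat the nanos as a fractional part that follows
    // the sign of the millis, moving negative values toward negative infinity.
    static Instant signFollowing(long epochMillis) {
        return epochMillis < 0
            ? Instant.ofEpochMilli(epochMillis).minusNanos(ROUND_UP_NANOS)
            : Instant.ofEpochMilli(epochMillis).plusNanos(ROUND_UP_NANOS);
    }

    public static void main(String[] args) {
        long dayBeforeEpoch = -86_400_000L; // 1969-12-31T00:00:00Z
        System.out.println(roundUp(dayBeforeEpoch));       // 1969-12-31T00:00:00.000999999Z
        System.out.println(signFollowing(dayBeforeEpoch)); // 1969-12-30T23:59:59.999000001Z
    }
}
```

The second result is strictly less than the -86400000 boundary, which is how a "gt" query ends up matching a document it should exclude.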

…ad were rounded down causing gt behavior to fail when off by 1 millisecond
@john-wagster john-wagster requested a review from a team as a code owner November 13, 2024 03:18
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.0.0 labels Nov 13, 2024
@john-wagster john-wagster removed the needs:triage Requires assignment of a team area label label Nov 13, 2024
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Nov 13, 2024
@john-wagster john-wagster added :Search Relevance/Search Catch all for Search Relevance :Core/Infra/Core Core issues without another label labels Nov 13, 2024
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch and removed needs:triage Requires assignment of a team area label labels Nov 13, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@john-wagster john-wagster added needs:triage Requires assignment of a team area label v8.17.0 and removed Team:Core/Infra Meta label for core/infra team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch needs:triage Requires assignment of a team area label labels Nov 13, 2024
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Nov 13, 2024
@john-wagster john-wagster requested a review from rjernst November 13, 2024 03:21
@john-wagster john-wagster added >bug auto-backport Automatically create backport pull requests when merged labels Nov 13, 2024
@elasticsearchmachine
Collaborator

Hi @john-wagster, I've created a changelog YAML for you.

…ad were rounded down causing gt behavior to fail when off by 1 millisecond
nanos -= nanosOfMilli;
if (roundUp) {
    // these are not the nanos you think they are; these are "round up nanos", not the fractional part of the input
    nanos += nanosOfMilli;
Member

I'm not sure how this leads to a different result - the nanos are negated above, but then we're also adding them rather than subtracting them here?

If we don't negate the nanos, and the nanos are negative, then nanos -= nanosOfMillis === nanos = nanos - (-nanosOfMillis) === nanos += nanosOfMillis, which is what we want?
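The identity in the comment above can be checked with hypothetical values (the numbers here are illustrative, not from the PR):

```java
public class NegationIdentity {
    public static void main(String[] args) {
        // hypothetical fractional part, already negative because the input millis were negative
        long nanosOfMilli = -250_000L;
        long nanos = 999_999L;
        // subtracting a negative value is the same as adding its magnitude
        System.out.println(nanos - nanosOfMilli == nanos + 250_000L); // prints true
    }
}
```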

Contributor Author

@john-wagster Nov 13, 2024

The nanos are negated because the negative value is used as a placeholder to indicate rounding. However, they don't have to be. I could rewrite the code to not negate the nanos and subtract them subsequently. This might be ever so slightly more performant, but I found the current implementation more readable: you can immediately see that the negated nanos, which are invalid, were intentionally made invalid as a placeholder indicating rounding should occur. I'm open to opinions here. I'm happy to replace the logic with a large comment indicating that negative nanos is not a bug but an intentional flag, instead of using a boolean to flag it. Thoughts?

Member

Having a negative value mean a different thing is really confusing. It's better to have an explicit boolean, with a comment, so we're not maintaining two different branches that actually do the same thing just because a negative value has meaning elsewhere.

Contributor Author

I agree. I'm not sure of an alternative. Do you have any suggestions?

The parser has to set some available field to indicate, during parsing, that this is a value that's missing fractional milliseconds and needs to be rounded. Utilizing the NANOS_OF_MILLIS field is the most intuitive option to me given the situation. I had originally tried piggybacking on the negative sign, but that misses an edge case.

Contributor Author

@thecoop did you have any additional suggestions here?

LongSupplier supplier = () -> 0L;
{
    Instant instant = formatter.toDateMathParser().parse("0", supplier, true, ZoneId.of("UTC"));
    assertEquals("1970-01-01T00:00:00.000999999Z", instant.toString());
Member

Why would epoch have non-zero nanos?

Contributor Author

For this bug I don't think it matters, but to follow the docs, and in case there is some other edge case with date comparison elsewhere, the date created should follow the spec outlined in the range query docs, where rounded-up nanos are 999_999.

Currently the DateFieldMapper compares at millisecond resolution, so keeping 999_999 nanos has no impact for this specific case (it can be dropped at this point). One might argue this is a separate bug, though: all (I believe all) of our date fields in mappings support nanos, so if a user were to store and compare nanos they would find our millisecond comparison resolution buggy, as below. I was planning to create a separate issue for this:

curl -XPUT --header 'Content-Type: application/json' "http://localhost:9200/test" -d '{
  "mappings": {
    "properties": {
      "mydate": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSSSSSSSSX"
      }
    }
  }
}'
curl -XPUT --header 'Content-Type: application/json' "http://localhost:9200/test/_doc/1?refresh" -d '{
  "mydate": "1969-12-31 00:00:00.000000001Z"
}'

Both of these queries should produce a result of 1 but instead produce a result of 0, because our resolution for date comparison only goes down to the millisecond level in DateFieldMapper.

# 1969-12-31Z as ms since epoch
curl -XGET --header 'Content-Type: application/json' "http://localhost:9200/test/_search?filter_path=hits.total.value" -d '{
  "query": {
    "range": {
      "mydate": {
        "gt": "-86400000.000000",
        "format": "epoch_millis"
      }
    }
  }
}'
# 1969-12-31Z as ms since epoch
curl -XGET --header 'Content-Type: application/json' "http://localhost:9200/test/_search?filter_path=hits.total.value" -d '{
  "query": {
    "range": {
      "mydate": {
        "gt": "-86400000",
        "format": "epoch_millis"
      }
    }
  }
}'
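A small sketch (assuming plain `java.time` semantics, not the DateFieldMapper code) of why truncating to millisecond resolution loses the strict "gt" match for a stored nanosecond value:

```java
import java.time.Instant;

public class ResolutionSketch {
    public static void main(String[] args) {
        Instant stored = Instant.parse("1969-12-31T00:00:00.000000001Z");
        long bound = -86_400_000L; // the epoch_millis "gt" bound from the query

        // at nanosecond resolution the stored value is strictly greater than the bound
        System.out.println(stored.isAfter(Instant.ofEpochMilli(bound))); // true

        // but truncated to millis the two are equal, so a strict "gt" no longer matches
        System.out.println(stored.toEpochMilli() > bound); // false
    }
}
```

Note that `Instant.toEpochMilli` floors toward negative infinity for pre-epoch instants, which is why the 1 ns offset disappears entirely at millisecond resolution.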
Contributor Author

@rjernst any additional questions or concerns here?

@elasticsearchmachine
Collaborator

Hi @john-wagster, I've updated the changelog YAML for you.

@john-wagster
Contributor Author

@elasticmachine please update branch

@john-wagster
Contributor Author

@elasticmachine update branch

@john-wagster
Contributor Author

@elasticmachine update branch

@john-wagster john-wagster marked this pull request as draft December 10, 2024 15:39
@john-wagster
Contributor Author

closing in favor of: #118353


Labels

auto-backport (Automatically create backport pull requests when merged), >bug, :Core/Infra/Core (Core issues without another label), :Search Relevance/Search (Catch all for Search Relevance), Team:Core/Infra (Meta label for core/infra team), Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), v8.18.0, v9.0.0

5 participants