Skip to content

Adding Destination fields#976

Merged
eyalkoren merged 14 commits intoelastic:masterfrom
eyalkoren:destination-fields
Jan 9, 2020
Merged

Adding Destination fields#976
eyalkoren merged 14 commits intoelastic:masterfrom
eyalkoren:destination-fields

Conversation

@eyalkoren
Copy link
Contributor

@eyalkoren eyalkoren commented Dec 26, 2019

Closes #934
Closes #957
Closes #969

Some technology-specific info:

  • Hibernate search is not supported as it seems a low ROI, especially since only relevant for usage with remote data stores.
  • Databases support is largely based on online research. It is not documented whether the URL we would get from the actual connection is exactly the same as the one used to configure the data source. I tested where and what possible, but since our integration tests rely on testcontainers, we cannot test all syntaxes.
    For example- the Oracle address list parenthesis syntax or MySQL's multiple host configuration. If those are required, we can add them in the future.
  • JMS doesn't offer a standard API to retrieve the broker address, so this needs to be implemented for each JMS client. Since the address and port are not required for the service map, I excluded it from this already big enough PR. We can implement that in a later time. Examples for getting the address in the two clients we have integration tests for:
    • ActiveMQ: ((ActiveMQMessageProducer) producer).session.getConnection().getTransport().getRemoteAddress() returns a String that we can use to create a URI, from which host and port can be extracted.
    • ActiveMQ Artemis:((TopologyMember)((ActiveMQMessageProducer)producer).connection.getSessionFactory().getServerLocator().getTopology().getMembers().toArray()[0]).toURI() returns a String that we can use to create a URI, from which host and port of one broker node can be extracted. If multiple destinations are configured, this is not enough to get the actual address used for a specific connection.
  • Redis (requires @felixbarny's input):
    • Jedis: it seems like the Sharded client instrumentation is redundant, as this client first finds a shard and then calls the already instrumented Jedis. The integration tests work without those. Am I missing something here?
    • Lettuce: I couldn't find an elegant way to locate the address and port. We can instrument the Client construction and keep mapping from Connections to the Client, then instrument the Connections and map created Commands to Connections and use those eventually (although there are differences between versions). Based on your experience, can you think of a better way? Unless you can think of something, we will keep Lettuce unsupported and if required create a separate issue.

Checklist

@codecov-io
Copy link

codecov-io commented Dec 26, 2019

Codecov Report

Merging #976 into master will increase coverage by 0.48%.
The diff coverage is 67.05%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #976      +/-   ##
============================================
+ Coverage     63.23%   63.72%   +0.48%     
  Complexity       85       85              
============================================
  Files           252      255       +3     
  Lines         10193    10482     +289     
  Branches       1345     1391      +46     
============================================
+ Hits           6446     6680     +234     
- Misses         3364     3393      +29     
- Partials        383      409      +26
Impacted Files Coverage Δ Complexity Δ
...astic/apm/agent/jdbc/StatementInstrumentation.java 32.5% <ø> (ø) 0 <0> (ø) ⬇️
...ent/hibernate/search/HibernateSearchConstants.java 0% <ø> (ø) 0 <0> (ø) ⬇️
.../apm/agent/okhttp/OkHttpClientInstrumentation.java 17.85% <0%> (-1.38%) 0 <0> (ø)
...client/AbstractAsyncHttpClientInstrumentation.java 42.85% <0%> (-0.9%) 0 <0> (ø)
...lastic/apm/agent/mongoclient/ConnectionAdvice.java 0% <0%> (ø) 0 <0> (ø) ⬇️
.../elastic/apm/agent/jdbc/helper/JdbcHelperImpl.java 0% <0%> (ø) 0 <0> (ø) ⬇️
...m/agent/mongoclient/ConnectionInstrumentation.java 0% <0%> (ø) 0 <0> (ø) ⬇️
...apm/agent/okhttp/OkHttp3ClientInstrumentation.java 18.51% <0%> (-1.49%) 0 <0> (ø)
...agent/okhttp/OkHttpClientAsyncInstrumentation.java 21.73% <0%> (-0.99%) 0 <0> (ø)
...nt/httpclient/HttpAsyncRequestProducerWrapper.java 0% <0%> (ø) 0 <0> (ø) ⬇️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2f6d251...338a358. Read the comment docs.

@eyalkoren eyalkoren changed the title [WIP] Adding Destination fields Dec 29, 2019
@eyalkoren eyalkoren requested a review from felixbarny December 29, 2019 14:10
@eyalkoren eyalkoren self-assigned this Dec 29, 2019
@eyalkoren eyalkoren mentioned this pull request Jan 2, 2020
12 tasks
@felixbarny
Copy link
Member

Some general questions:

  • Can this be a post-processing step done by APM Server to avoid duplication across agents?
  • Is there a spec besides the GH issues?
  • Are there common test cases/acceptance tests?
  • What is the consequence of not supporting a particular RDBMS or JMS impl?
  • When communicating with a cluster consisting of multiple nodes, like with Elasticsearch, we send different destination fields based on which node we happen to talk to. Is that intentional? What is the effect of that on the service map?
@eyalkoren
Copy link
Contributor Author

Can this be a post-processing step done by APM Server to avoid duplication across agents?

May be only relevant for HTTP URLs, but maybe there are things agents need to take care differently anyway in this sense (eg IPv6 with/without square brackets ?). For DB- those are JDBC-specific URLs AFAIU.

Is there a spec besides the GH issues?

What's missing from the GH issues?

What is the consequence of not supporting a particular RDBMS or JMS impl?

When communicating with a cluster consisting of multiple nodes, like with Elasticsearch, we send different destination fields based on which node we happen to talk to. Is that intentional? What is the effect of that on the service map?

The destination.XXX fields are mostly for SIEM atm, although I can see how they can be useful down the road for service maps to realize clusters, integrate with metrics etc.
However, for now, the service maps will rely on the destination.service fields, where this is not relevant. If you look into these issues, you will see there is a discussion we started about how to deal with clustered services, but we worked around it for the first service map iteration. In fact, data stores are probably not going to be included for now.

Copy link
Member

@felixbarny felixbarny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure we don't forget adding destination.* fields going forward, check the presence of those fields in MockReporter.reportSpan if it's an external span. For those span types where we can't determine the destination fields, like Hibernate search, maintain an exclude list. Nice to have: If the test fails the error message should link to the spec/issue to read up on how the fields should be populated.

Overall a bit more complex than what I was expecting/hoping for but seems to be inherent complexity as opposed to accidental. Some examples:

  • Some HTTP clients use square brackets for IPV6 addresses, some don't
  • JDBC parsing is quite complex and highly specific to the vendor

This is also a thing to always keep in mind when adding new integrations (no judgment, just an observation).

@eyalkoren
Copy link
Contributor Author

check the presence of those fields in MockReporter.reportSpan if it's an external span.

Good idea, will do.

Overall a bit more complex than what I was expecting/hoping for but seems to be inherent complexity as opposed to accidental.

I did expect some complexity due to the reasons you listed, as well as the fact that you need to look in different places in different frameworks. In addition, trying to avoid allocations or expensive lookups, you need to be specific on where you get the info, for example in some cases you it's cheapest to use a ready URL and in others it's best to get the host and port. But certainly more complex than I wanted it to be...

This is also a thing to always keep in mind when adding new integrations

True, but your suggestion to fail tests when not setting these fields should remind us.

@eyalkoren eyalkoren requested a review from felixbarny January 9, 2020 12:37
@eyalkoren eyalkoren merged commit f2eff5e into elastic:master Jan 9, 2020
@eyalkoren eyalkoren deleted the destination-fields branch January 9, 2020 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

4 participants