Emit more granular data points for troubleshooting #2883

Open
@csterwa

Problem Statement

Config Servers configured with Git environment repositories make multiple connections to Git servers and may be serving filtered properties to many clients. When Git server fetches time out or otherwise fail, and applications cannot get updated properties on restart, it can be difficult to locate the root cause, especially when the only visible symptom is a client failing to get its properties.

Emit more granular metrics

There are two environment repositories that we initially need more detailed metrics for:

Git

  • Git server fetch response time, errors and rate of requests
  • Processing time of fetched repo into properties

Vault

  • Vault secrets fetch time
  • Processing time of secrets into properties
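To make the metrics listed above concrete, here is a minimal sketch of what could be recorded around each fetch and processing step. It uses only the JDK so that it stands alone; `RepoMetrics`, `timed`, and the operation names such as `git.fetch` are hypothetical and not part of Config Server. In a real implementation the same counters and timings would more likely be published through the metrics library the server already uses.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: call count, error count, and total time per operation. */
final class RepoMetrics {

    /** Counters for one named operation, e.g. "git.fetch" or "vault.process". */
    static final class Stats {
        final LongAdder calls = new LongAdder();
        final LongAdder errors = new LongAdder();
        final LongAdder totalNanos = new LongAdder();

        /** Mean elapsed time per call, or zero if nothing was recorded yet. */
        Duration meanTime() {
            long n = calls.sum();
            return n == 0 ? Duration.ZERO : Duration.ofNanos(totalNanos.sum() / n);
        }
    }

    private final Map<String, Stats> stats = new ConcurrentHashMap<>();

    Stats stats(String operation) {
        return stats.computeIfAbsent(operation, k -> new Stats());
    }

    /** Runs the task, recording elapsed time and whether it threw. */
    <T> T timed(String operation, Callable<T> task) throws Exception {
        Stats s = stats(operation);
        long start = System.nanoTime();
        try {
            return task.call();
        } catch (Exception e) {
            s.errors.increment();
            throw e;
        } finally {
            // Failed calls are counted and timed too, so timeouts stay visible.
            s.calls.increment();
            s.totalNanos.add(System.nanoTime() - start);
        }
    }
}

final class RepoMetricsDemo {
    public static void main(String[] args) throws Exception {
        RepoMetrics metrics = new RepoMetrics();
        // Stand-ins for a real Git fetch and the repo-to-properties step.
        metrics.timed("git.fetch", () -> "fetched");
        metrics.timed("git.process", () -> "properties");
        RepoMetrics.Stats fetch = metrics.stats("git.fetch");
        System.out.println("git.fetch calls=" + fetch.calls.sum()
                + " errors=" + fetch.errors.sum()
                + " mean=" + fetch.meanTime());
    }
}
```

The request rate is not stored directly here: a monitoring system can derive it from the monotonically increasing `calls` counter, which is the usual way counters are turned into rates.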

Enhance health actuator

When a Config Server is unhealthy, its health actuator should provide more detail about which factors affected fetching properties from backing services (Git, Vault and environment repositories) and disseminating them to clients.
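As a rough illustration of the kind of detail meant here, the sketch below keeps the last fetch outcome per backend and turns it into a details map. It is JDK-only, and `BackendHealthTracker` and its methods are hypothetical names, not existing Config Server API. In a Spring Boot application, a map like this could presumably be exposed as details on a health indicator.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker: remembers the last fetch outcome per backend. */
final class BackendHealthTracker {

    /** Last known outcome for one backend such as "git" or "vault". */
    private static final class Outcome {
        final boolean ok;
        final Instant at;
        final String error; // null when the last fetch succeeded

        Outcome(boolean ok, Instant at, String error) {
            this.ok = ok;
            this.at = at;
            this.error = error;
        }
    }

    private final Map<String, Outcome> last = new ConcurrentHashMap<>();

    void success(String backend) {
        last.put(backend, new Outcome(true, Instant.now(), null));
    }

    void failure(String backend, Exception cause) {
        String message = cause.getClass().getSimpleName() + ": " + cause.getMessage();
        last.put(backend, new Outcome(false, Instant.now(), message));
    }

    /** Healthy only if every backend's most recent fetch succeeded. */
    boolean healthy() {
        return last.values().stream().allMatch(o -> o.ok);
    }

    /** Per-backend details, sorted by backend name for stable output. */
    Map<String, Object> details() {
        Map<String, Object> out = new TreeMap<>();
        last.forEach((backend, o) -> {
            Map<String, Object> d = new LinkedHashMap<>();
            d.put("status", o.ok ? "UP" : "DOWN");
            d.put("lastChecked", o.at.toString());
            if (o.error != null) {
                d.put("error", o.error);
            }
            out.put(backend, d);
        });
        return out;
    }
}
```

With something like this in place, an unhealthy response could say that Git is fine but the last Vault fetch failed with a connection timeout, rather than only reporting that the server is down.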

Additional context

Many organizations I talk with have dashboards showing Config Server health. They also have observability tools, but cannot get enough data points from them to troubleshoot Config Server instances.
