CPU and memory limits and requests vary with the number of targets monitored and the number of metrics each target exposes. For example, a Prometheus OpenMetrics integration that scrapes 800 targets, each exposing 1000 timeseries, with a latency of 150ms and a scrape_duration of 30 seconds, consumes 2.5 CPU and 700MB of RAM.
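To put the example figures in perspective, the following minimal sketch computes the total number of timeseries handled per scrape and the RAM used per timeseries. The function name `workload_summary` is hypothetical, and treating RAM as proportional to the number of timeseries is an assumption made only for rough comparison, not a documented sizing rule.

```python
def workload_summary(targets: int, series_per_target: int, ram_mb: float) -> dict:
    """Summarize a scrape workload from the figures quoted in the example.

    Assumption: 1 MB is counted as 1,000,000 bytes, and RAM is spread
    evenly across all timeseries (a simplification for rough estimates).
    """
    total_series = targets * series_per_target
    return {
        "total_series": total_series,
        "ram_bytes_per_series": ram_mb * 1_000_000 / total_series,
    }


# Figures from the example: 800 targets, 1000 timeseries each, 700MB of RAM.
summary = workload_summary(targets=800, series_per_target=1000, ram_mb=700)
print(summary)
```

With the example figures this gives 800,000 timeseries per scrape. Treat the per-timeseries value as an order-of-magnitude guide only, and measure real consumption in your own environment.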
Configure the integration for large environments
To estimate the size of the environment you are monitoring, run the following query to see how many targets are being scraped:
SELECT latest(nr_stats_targets) FROM Metric WHERE clusterName = 'clusterName' SINCE 30 MINUTES AGO TIMESERIES
In large environments with hundreds of targets to scrape, the latency of the /metrics endpoints must be below 1 second. Run this query to check the latency of each target. It retrieves the data exposed by the Prometheus OpenMetrics integration and shows the time required to fetch each endpoint.
SELECT average(nr_stats_integration_fetch_target_duration_seconds) FROM Metric WHERE clusterName = 'clusterName' SINCE 30 MINUTES AGO FACET target LIMIT 30
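Once you have the query results, you can check them against the 1-second limit. The sketch below assumes the results were exported as a list of dictionaries with the hypothetical keys `target` and `average_seconds`; the real result shape depends on how you export the data, and the sample rows are made up for illustration.

```python
LATENCY_LIMIT_SECONDS = 1.0  # limit stated above for /metrics endpoints


def slow_targets(rows, limit=LATENCY_LIMIT_SECONDS):
    """Return (target, seconds) pairs at or above the limit, slowest first.

    Assumption: each row is a dict with the hypothetical keys
    "target" and "average_seconds".
    """
    flagged = [
        (row["target"], row["average_seconds"])
        for row in rows
        if row["average_seconds"] >= limit
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up sample rows, only to show the expected input shape.
sample_rows = [
    {"target": "http://10.0.0.1:9100/metrics", "average_seconds": 0.15},
    {"target": "http://10.0.0.2:9100/metrics", "average_seconds": 1.8},
    {"target": "http://10.0.0.3:9100/metrics", "average_seconds": 1.2},
]

for target, seconds in slow_targets(sample_rows):
    print(f"{target}: {seconds:.2f}s")
```

Any target this reports is a candidate for investigation before you scale the integration further.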
To keep the time needed to scrape all the targets below 30 seconds, use the following configurations:
Targets | Configuration |
---|---|
Targets < 400, with 1000 metrics each | No modification is required. CPU ranges roughly between |
400 < targets < 1000, with 1000 metrics each | The number of workers should be increased to |
Targets > 1000, with 1000 metrics each | The number of workers should be increased to 10 or more. CPU is over |
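
As a rough cross-check on the worker count, the sketch below estimates a lower bound on the number of workers needed to fetch every target within the 30-second budget. It assumes each worker fetches one target at a time and that fetch time is dominated by endpoint latency. Both are simplifications, the function name `min_workers` is hypothetical, and the result does not replace the values in the table.

```python
def min_workers(targets: int, latency_ms: int, budget_ms: int = 30_000) -> int:
    """Estimate the fewest workers that can fetch all targets within the budget.

    Assumptions: each worker fetches one target at a time, every fetch takes
    latency_ms, and parsing and processing time is ignored.
    """
    if targets < 0 or latency_ms < 0 or budget_ms <= 0:
        raise ValueError("targets and latency_ms must be >= 0, budget_ms > 0")
    total_fetch_ms = targets * latency_ms
    # Integer ceiling division avoids floating-point rounding issues.
    return max(1, -(-total_fetch_ms // budget_ms))


# 800 targets at 150 ms each need 120,000 ms of sequential fetch time,
# so at least 4 workers are required to finish in 30,000 ms.
print(min_workers(targets=800, latency_ms=150))
```

Because the estimate ignores processing time, treat it as a floor and leave headroom above it.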