METRICS-REFERENCE
Below is a detailed, though not exhaustive, summary of the supported Prometheus metrics. It covers each metric’s name, description, and associated variable labels (which add context and granularity to the metrics).
Table of Contents
- Prometheus: major changes in v3.26
- Variable labels
- Common metrics: ais targets and gateways
- Target metrics
- Backend metrics
Prometheus: major changes in v3.26
- so-called default go_* counters and gauges (go_gc, go_memstats, etc.) are completely gone;
- metrics are now updated directly in real time
  - previously: periodically via prometheus.Collect interface
  - see related note in stats/prom.go
- we are no longer publishing internally computed latencies and throughputs;
  - use *.ns.total (nanoseconds) and *.size (bytes) metrics to compute latency and throughput, respectively;
    - based on the user controlled time intervals
    - for reference, see CLI performance throughput and performance latency
  - note: for Prometheus client, internal .ns.total suffix becomes _ns_total, and .size, respectively, _bytes;
- in addition to total aggregated numbers there are now separately computed per-backend latency and throughput numbers;
  - those with aws. prefix, for instance.
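Since latencies and throughputs are no longer published, they have to be derived from the cumulative counters. The following is a minimal sketch of that arithmetic, assuming two scrapes of the GET counters taken some user-chosen interval apart. The `Sample` type, the helper names, and all numbers are hypothetical, and MiB/s is an assumed unit; this is not how the AIS CLI computes its output.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One scrape of the cumulative GET counters (hypothetical container)."""
    unix_time: float  # when the scrape happened, in seconds
    count: int        # get_count
    ns_total: int     # get_ns_total, nanoseconds
    size_bytes: int   # get_bytes


def avg_latency_ms(prev: Sample, curr: Sample) -> float:
    """Average per-request latency (ms) between two scrapes."""
    requests = curr.count - prev.count
    if requests <= 0:
        return 0.0  # no new requests in the interval (or the counter was reset)
    return (curr.ns_total - prev.ns_total) / requests / 1e6


def throughput_mib_s(prev: Sample, curr: Sample) -> float:
    """Average throughput (MiB/s, an assumed unit) between two scrapes."""
    elapsed = curr.unix_time - prev.unix_time
    if elapsed <= 0:
        return 0.0
    return (curr.size_bytes - prev.size_bytes) / elapsed / (1024 * 1024)


# Made-up numbers: 100 GETs of 1 MiB each, 2 ms apiece, within a 10-second window.
prev = Sample(unix_time=1000.0, count=500, ns_total=1_000_000_000, size_bytes=0)
curr = Sample(unix_time=1010.0, count=600, ns_total=1_200_000_000,
              size_bytes=100 * 1024 * 1024)

print(avg_latency_ms(prev, curr))    # 2.0
print(throughput_mib_s(prev, curr))  # 10.0
```

The interval between the two scrapes is the "user controlled time interval": a longer window smooths the result, a shorter one reacts faster.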
Variable labels
Each AIS metric carries node_id, a static label in Prometheus terminology.
Starting with v3.26, the majority of metrics also carry variable labels:
- Variable Labels:
  - bucket: Name of the associated bucket.
  - xkind: Job kind.
  - xid: Job ID.
  - mountpath: Mountpath.
- all I/O metrics now carry the bucket name (or Cname, to be precise) as a Prometheus variable label;
- all in-cluster writes generated by xactions (jobs) now also carry xaction labels: the respective kind and ID;
- one major side effect of the above is that we will now see more PUT metrics, and not only those that result from user PUT requests;
- all GET, PUT, and DELETE errors also carry the bucket label;
- all FSHC-related errors (the so-called I/O errors) carry the mountpath (i.e., faulty disk) label.
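To illustrate what the variable labels make possible, here is a small sketch that splits PUT counts per bucket into user-issued and job-generated writes. Everything in the sample text is hypothetical: the bare metric name, the bucket names, the job kind and ID, and in particular the assumption that user PUTs carry empty xkind and xid values. The parser handles only simple label values without escaped quotes.

```python
import re
from collections import defaultdict

# Hypothetical scrape output; names and values are made up for illustration.
SAMPLE = '''\
put_count{bucket="ais://data",node_id="t1",xid="",xkind=""} 40
put_count{bucket="ais://data",node_id="t1",xid="x123",xkind="copy-bucket"} 60
put_count{bucket="s3://logs",node_id="t1",xid="",xkind=""} 5
'''

LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Yield (metric_name, labels, value) for every labeled sample line."""
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            yield m.group(1), dict(LABEL_RE.findall(m.group(2))), float(m.group(3))


def puts_by_bucket(text):
    """Sum put_count per bucket, split by origin (assumes empty xkind == user PUT)."""
    totals = defaultdict(lambda: {"user": 0.0, "xaction": 0.0})
    for name, labels, value in parse(text):
        if name != "put_count":
            continue
        origin = "xaction" if labels.get("xkind") else "user"
        totals[labels["bucket"]][origin] += value
    return dict(totals)


print(puts_by_bucket(SAMPLE))
```

Summing over the xkind and xid labels recovers the per-bucket total, which is why PUT counts may look higher than the number of user requests alone.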
Common metrics: ais targets and gateways
- Request Metrics:
  - GetCount: Total number of executed GET(object) requests.
    - Variable Labels: bucket
  - PutCount: Total number of executed PUT(object) requests.
    - Variable Labels: bucket, xkind, xid
  - HeadCount: Total number of executed HEAD(object) requests (currently only remote HEAD).
    - Variable Labels: bucket
  - AppendCount: Total number of executed APPEND(object) requests.
    - Variable Labels: bucket
  - DeleteCount: Total number of executed DELETE(object) requests.
    - Variable Labels: bucket
  - RenameCount: Total number of executed rename(object) requests.
    - Variable Labels: bucket
  - ListCount: Total number of executed list-objects requests.
    - Variable Labels: bucket
Common Error Counters
- Error Metrics:
  - ErrGetCount: Total number of GET(object) errors.
    - Variable Labels: bucket
  - ErrPutCount: Total number of PUT(object) errors.
    - Variable Labels: bucket, xkind, xid
  - ErrHeadCount: Total number of HEAD(object) errors.
    - Variable Labels: bucket
  - ErrAppendCount: Total number of APPEND(object) errors.
    - Variable Labels: bucket
  - ErrDeleteCount: Total number of DELETE(object) errors.
    - Variable Labels: bucket
  - ErrRenameCount: Total number of rename(object) errors.
    - Variable Labels: bucket
  - ErrListCount: Total number of list-objects errors.
    - Variable Labels: bucket
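Request and error counters are both cumulative, so an error ratio only makes sense over an interval. A minimal sketch follows, assuming two snapshots keyed by public metric names and assuming that the request counter also counts failed requests (if it does not, the denominator should be requests plus errors). The helper name and the numbers are hypothetical.

```python
def error_ratio(prev: dict, curr: dict, op: str) -> float:
    """Fraction of `op` requests that failed between two counter snapshots."""
    requests = curr[f"{op}_count"] - prev[f"{op}_count"]
    errors = curr[f"err_{op}_count"] - prev[f"err_{op}_count"]
    if requests <= 0:
        return 0.0  # nothing happened in the interval (or a counter was reset)
    return errors / requests


# Made-up snapshots: 200 new GETs, 5 of which failed.
prev = {"get_count": 1000, "err_get_count": 10}
curr = {"get_count": 1200, "err_get_count": 15}

print(error_ratio(prev, curr, "get"))  # 0.025
```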
Common Latencies
- Latency Metrics:
  - GetLatency: GET average time (milliseconds) over the last periodic.stats_time interval.
    - Variable Labels: bucket
  - GetLatencyTotal: GET total cumulative time (nanoseconds).
    - Variable Labels: bucket
  - ListLatency: List-objects average time (milliseconds) over the last periodic.stats_time interval.
    - Variable Labels: bucket
For convenience, we also include here a (somewhat redundant) table that summarizes common metrics.
Internal name | Public name | Internal Type | Description (Prometheus help) | Prometheus labels |
---|---|---|---|---|
get.n | get_count | counter | total number of executed GET(object) requests | default |
put.n | put_count | counter | total number of executed PUT(object) requests | default |
head.n | head_count | counter | total number of executed HEAD(object) requests | default |
append.n | append_count | counter | total number of executed APPEND(object) requests | default |
del.n | del_count | counter | total number of executed DELETE(object) requests | default |
ren.n | ren_count | counter | total number of executed rename(object) requests | default |
lst.n | lst_count | counter | total number of executed list-objects requests | default |
err.get.n | err_get_count | counter | total number of GET(object) errors | default |
err.put.n | err_put_count | counter | total number of PUT(object) errors | default |
err.head.n | err_head_count | counter | total number of HEAD(object) errors | default |
err.append.n | err_append_count | counter | total number of APPEND(object) errors | default |
err.del.n | err_del_count | counter | total number of DELETE(object) errors | default |
err.ren.n | err_ren_count | counter | total number of rename(object) errors | default |
err.lst.n | err_lst_count | counter | total number of list-objects errors | default |
err.http.write.n | err_http_write_count | counter | total number of HTTP write-response errors | default |
err.dl.n | err_dl_count | counter | downloader: number of download errors | default |
err.put.mirror.n | err_put_mirror_count | counter | number of n-way mirroring errors | default |
get.ns | get_ms | latency | GET: average time (milliseconds) over the last periodic.stats_time interval | default |
get.ns.total | get_ns_total | total | GET: total cumulative time (nanoseconds) | default |
lst.ns | lst_ms | latency | list-objects: average time (milliseconds) over the last periodic.stats_time interval | default |
kalive.ns | kalive_ms | latency | in-cluster keep-alive (heartbeat): average time (milliseconds) over the last periodic.stats_time interval | default |
up.ns.time | uptime | special | this node’s uptime since its startup (seconds) | default |
state.flags | state_flags | gauge | bitwise 64-bit value that carries enumerated node-state flags, including warnings and alerts; see https://github.com/NVIDIA/aistore/blob/main/cmn/cos/node_state.go |
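The two name columns in the summary tables follow a visible pattern: dots become underscores and a handful of suffixes are rewritten. The sketch below encodes only what can be read off the tables in this document; the function and the mapping list are hypothetical and are not taken from the AIS source.

```python
# Suffix rewrites inferred from the "Internal name" / "Public name" columns.
SUFFIXES = [
    (".ns.total", "_ns_total"),  # cumulative time, nanoseconds
    (".size", "_bytes"),         # cumulative size, bytes
    (".bps", "_mbps"),           # bandwidth, MB/s
    (".ns", "_ms"),              # averaged latency, published in milliseconds
    (".n", "_count"),            # counters
]


def public_name(internal: str) -> str:
    """Best-effort internal-to-public conversion for regular (non-special) names."""
    for old, new in SUFFIXES:
        if internal.endswith(old):
            return internal[: -len(old)].replace(".", "_") + new
    return internal.replace(".", "_")


print(public_name("get.n"))             # get_count
print(public_name("err.http.write.n"))  # err_http_write_count
print(public_name("get.ns"))            # get_ms
print(public_name("get.ns.total"))      # get_ns_total
```

The pattern does not cover every row: up.ns.time is published as uptime, and per-disk and per-backend metrics move the disk or backend name out of the metric name and into a label, so those cases would need explicit handling.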
Target metrics
- Out-of-Band Metrics:
  - VerChangeCount: Number of out-of-band updates (by a 3rd party performing remote PUTs from outside this cluster).
    - Variable Labels: bucket
  - VerChangeSize: Total cumulative size (bytes) of objects updated out-of-band across all backends combined.
    - Variable Labels: bucket
  - RemoteDeletedDelCount: Number of out-of-band deletes (by a 3rd party remote DELETE(object) from outside this cluster).
    - Variable Labels: bucket
- PUT Latency Metrics:
  - PutLatency: PUT average time (milliseconds) over the last periodic.stats_time interval.
    - Variable Labels: bucket, xkind, xid
  - PutLatencyTotal: PUT total cumulative time (nanoseconds).
    - Variable Labels: bucket, xkind, xid
- HEAD Latency Metrics:
  - HeadLatency: HEAD average time (milliseconds) over the last periodic.stats_time interval.
    - Variable Labels: bucket
  - HeadLatencyTotal: HEAD total cumulative time (nanoseconds).
    - Variable Labels: bucket
- APPEND Latency Metrics:
  - AppendLatency: APPEND average time (milliseconds) over the last periodic.stats_time interval.
    - Variable Labels: bucket
- Throughput Metrics:
  - GetThroughput: GET average throughput (MB/s) over the last periodic.stats_time interval.
    - Variable Labels: bucket
  - PutThroughput: PUT average throughput (MB/s) over the last periodic.stats_time interval.
    - Variable Labels: bucket, xkind, xid
- Size Metrics:
  - GetSize: GET total cumulative size (bytes).
    - Variable Labels: bucket
  - PutSize: PUT total cumulative size (bytes).
    - Variable Labels: bucket, xkind, xid
- Error Metrics:
  - ErrPutCksumCount: PUT number of checksum errors.
    - Variable Labels: bucket, xkind, xid
  - ErrFSHCCount: Number of times filesystem health checker (FSHC) was triggered by an I/O error or errors.
    - Variable Labels: mountpath
  - IOErrGetCount: GET number of I/O errors (excluding remote backend and network errors).
    - Variable Labels: bucket
  - IOErrDeleteCount: DELETE(object) number of I/O errors (excluding remote backend and network errors).
    - Variable Labels: bucket
For convenience, a table that summarizes target metrics follows below.
Internal name | Public name | Internal Type | Description (Prometheus help) | Prometheus labels |
---|---|---|---|---|
disk.<DISK-NAME>.read.bps | disk_read_mbps | computed-bandwidth | read bandwidth (MB/s) | map[disk:<DISK-NAME> node_id:<AIS-NODE-ID>] |
disk.<DISK-NAME>.avg.rsize | disk_avg_rsize | gauge | average read size (bytes) | map[disk:<DISK-NAME> node_id:<AIS-NODE-ID>] |
disk.<DISK-NAME>.write.bps | disk_write_mbps | computed-bandwidth | write bandwidth (MB/s) | map[disk:<DISK-NAME> node_id:<AIS-NODE-ID>] |
disk.<DISK-NAME>.avg.wsize | disk_avg_wsize | gauge | average write size (bytes) | map[disk:<DISK-NAME> node_id:<AIS-NODE-ID>] |
disk.<DISK-NAME>.util | disk_util | gauge | disk utilization (%) | map[disk:<DISK-NAME> node_id:<AIS-NODE-ID>] |
lru.evict.n | lru_evict_count | counter | number of LRU evictions | default |
lru.evict.size | lru_evict_bytes | size | total cumulative size (bytes) of LRU evictions | default |
cleanup.store.n | cleanup_store_count | counter | space cleanup: number of removed misplaced objects and old work files | default |
cleanup.store.size | cleanup_store_bytes | size | space cleanup: total size (bytes) of all removed misplaced objects and old work files (not including removed deleted objects) | default |
ver.change.n | ver_change_count | counter | number of out-of-band updates (by a 3rd party performing remote PUTs from outside this cluster) | default |
ver.change.size | ver_change_bytes | size | total cumulative size (bytes) of objects that were updated out-of-band across all backends combined | default |
remote.deleted.del.n | remote_deleted_del_count | counter | number of out-of-band deletes (by a 3rd party remote DELETE(object) from outside this cluster) | default |
put.ns | put_ms | latency | PUT: average time (milliseconds) over the last periodic.stats_time interval | default |
put.ns.total | put_ns_total | total | PUT: total cumulative time (nanoseconds) | default |
append.ns | append_ms | latency | APPEND(object): average time (milliseconds) over the last periodic.stats_time interval | default |
get.redir.ns | get_redir_ms | latency | GET: average gateway-to-target HTTP redirect latency (milliseconds) over the last periodic.stats_time interval | default |
put.redir.ns | put_redir_ms | latency | PUT: average gateway-to-target HTTP redirect latency (milliseconds) over the last periodic.stats_time interval | default |
get.bps | get_mbps | bandwidth | GET: average throughput (MB/s) over the last periodic.stats_time interval | default |
put.bps | put_mbps | bandwidth | PUT: average throughput (MB/s) over the last periodic.stats_time interval | default |
get.size | get_bytes | size | GET: total cumulative size (bytes) | default |
put.size | put_bytes | size | PUT: total cumulative size (bytes) | default |
err.cksum.n | err_cksum_count | counter | PUT: number of checksum errors | default |
err.fshc.n | err_fshc_count | counter | number of times filesystem health checker (FSHC) was triggered by an I/O error or errors | default |
err.io.get.n | err_io_get_count | counter | GET: number of I/O errors not including remote backend and network errors | default |
err.io.put.n | err_io_put_count | counter | PUT: number of I/O errors not including remote backend and network errors | default |
err.io.del.n | err_io_del_count | counter | DELETE(object): number of I/O errors not including remote backend and network errors | default |
stream.out.n | stream_out_count | counter | intra-cluster streaming communications: number of sent objects | default |
stream.out.size | stream_out_bytes | size | intra-cluster streaming communications: total cumulative size (bytes) of all transmitted objects | default |
stream.in.n | stream_in_count | counter | intra-cluster streaming communications: number of received objects | default |
stream.in.size | stream_in_bytes | size | intra-cluster streaming communications: total cumulative size (bytes) of all received objects | default |
dl.size | dl_bytes | size | total downloaded size (bytes) | default |
dl.ns.total | dl_ns_total | total | total downloading time (nanoseconds) | default |
dsort.creation.req.n | dsort_creation_req_count | counter | dsort: see https://github.com/NVIDIA/aistore/blob/main/docs/dsort.md#metrics | default |
dsort.creation.resp.n | dsort_creation_resp_count | counter | dsort: see https://github.com/NVIDIA/aistore/blob/main/docs/dsort.md#metrics | default |
dsort.creation.resp.ns | dsort_creation_resp_ms | latency | dsort: see https://github.com/NVIDIA/aistore/blob/main/docs/dsort.md#metrics | default |
dsort.extract.shard.dsk.n | dsort_extract_shard_dsk_count | counter | dsort: see https://github.com/NVIDIA/aistore/blob/main/docs/dsort.md#metrics | default |
dsort.extract.shard.mem.n | dsort_extract_shard_mem_count | counter | dsort: see https://github.com/NVIDIA/aistore/blob/main/docs/dsort.md#metrics | default |
dsort.extract.shard.size | dsort_extract_shard_bytes | size | dsort: see https://github.com/NVIDIA/aistore/blob/main/docs/dsort.md#metrics | default |
lcache.collision.n | lcache_collision_count | counter | number of LOM cache collisions (core, internal) | default |
lcache.evicted.n | lcache_evicted_count | counter | number of LOM cache evictions (core, internal) | default |
lcache.flush.cold.n | lcache_flush_cold_count | counter | number of times a LOM from cache was written to stable storage (core, internal) | default |
remais.get.n | remote_get_count | counter | GET: total number of executed remote requests (cold GETs) | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.get.ns.total | remote_get_ns_total | total | GET: total cumulative time (nanoseconds) to execute cold GETs and store new object versions in-cluster | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.e2e.get.ns.total | remote_e2e_get_ns_total | total | GET: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving request, executing cold-GET, storing new object version in-cluster, and transmitting response | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.get.size | remote_get_bytes_total | size | GET: total cumulative size (bytes) of all cold-GET transactions | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.head.n | remote_head_count | counter | HEAD: total number of executed remote requests to a given backend | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.put.n | remote_put_count | counter | PUT: total number of executed remote requests to a given backend | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.put.ns.total | remote_put_ns_total | total | PUT: total cumulative time (nanoseconds) to execute remote requests and store new object versions in-cluster | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.e2e.put.ns.total | remote_e2e_put_ns_total | total | PUT: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving PUT payload, storing it in-cluster, executing remote PUT, finalizing new in-cluster object | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.put.size | remote_e2e_put_bytes_total | size | PUT: total cumulative size (bytes) of all PUTs to a given remote backend | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.ver.change.n | remote_ver_change_count | counter | number of out-of-band updates (by a 3rd party performing remote PUTs outside this cluster) | map[backend:remais node_id:<AIS-NODE-ID>] |
remais.ver.change.size | remote_ver_change_bytes_total | size | total cumulative size of objects that were updated out-of-band | map[backend:remais node_id:<AIS-NODE-ID>] |
gcp.get.n | remote_get_count | counter | GET: total number of executed remote requests (cold GETs) | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.get.ns.total | remote_get_ns_total | total | GET: total cumulative time (nanoseconds) to execute cold GETs and store new object versions in-cluster | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.e2e.get.ns.total | remote_e2e_get_ns_total | total | GET: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving request, executing cold-GET, storing new object version in-cluster, and transmitting response | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.get.size | remote_get_bytes_total | size | GET: total cumulative size (bytes) of all cold-GET transactions | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.head.n | remote_head_count | counter | HEAD: total number of executed remote requests to a given backend | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.put.n | remote_put_count | counter | PUT: total number of executed remote requests to a given backend | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.put.ns.total | remote_put_ns_total | total | PUT: total cumulative time (nanoseconds) to execute remote requests and store new object versions in-cluster | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.e2e.put.ns.total | remote_e2e_put_ns_total | total | PUT: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving PUT payload, storing it in-cluster, executing remote PUT, finalizing new in-cluster object | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.put.size | remote_e2e_put_bytes_total | size | PUT: total cumulative size (bytes) of all PUTs to a given remote backend | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.ver.change.n | remote_ver_change_count | counter | number of out-of-band updates (by a 3rd party performing remote PUTs outside this cluster) | map[backend:gcp node_id:<AIS-NODE-ID>] |
gcp.ver.change.size | remote_ver_change_bytes_total | size | total cumulative size of objects that were updated out-of-band | map[backend:gcp node_id:<AIS-NODE-ID>] |
aws.get.n | remote_get_count | counter | GET: total number of executed remote requests (cold GETs) | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.get.ns.total | remote_get_ns_total | total | GET: total cumulative time (nanoseconds) to execute cold GETs and store new object versions in-cluster | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.e2e.get.ns.total | remote_e2e_get_ns_total | total | GET: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving request, executing cold-GET, storing new object version in-cluster, and transmitting response | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.get.size | remote_get_bytes_total | size | GET: total cumulative size (bytes) of all cold-GET transactions | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.head.n | remote_head_count | counter | HEAD: total number of executed remote requests to a given backend | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.put.n | remote_put_count | counter | PUT: total number of executed remote requests to a given backend | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.put.ns.total | remote_put_ns_total | total | PUT: total cumulative time (nanoseconds) to execute remote requests and store new object versions in-cluster | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.e2e.put.ns.total | remote_e2e_put_ns_total | total | PUT: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving PUT payload, storing it in-cluster, executing remote PUT, finalizing new in-cluster object | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.put.size | remote_e2e_put_bytes_total | size | PUT: total cumulative size (bytes) of all PUTs to a given remote backend | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.ver.change.n | remote_ver_change_count | counter | number of out-of-band updates (by a 3rd party performing remote PUTs outside this cluster) | map[backend:aws node_id:<AIS-NODE-ID>] |
aws.ver.change.size | remote_ver_change_bytes_total | size | total cumulative size of objects that were updated out-of-band | map[backend:aws node_id:<AIS-NODE-ID>] |
azure.get.n | remote_get_count | counter | GET: total number of executed remote requests (cold GETs) | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.get.ns.total | remote_get_ns_total | total | GET: total cumulative time (nanoseconds) to execute cold GETs and store new object versions in-cluster | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.e2e.get.ns.total | remote_e2e_get_ns_total | total | GET: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving request, executing cold-GET, storing new object version in-cluster, and transmitting response | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.get.size | remote_get_bytes_total | size | GET: total cumulative size (bytes) of all cold-GET transactions | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.head.n | remote_head_count | counter | HEAD: total number of executed remote requests to a given backend | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.put.n | remote_put_count | counter | PUT: total number of executed remote requests to a given backend | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.put.ns.total | remote_put_ns_total | total | PUT: total cumulative time (nanoseconds) to execute remote requests and store new object versions in-cluster | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.e2e.put.ns.total | remote_e2e_put_ns_total | total | PUT: total end-to-end time (nanoseconds) servicing remote requests; includes: receiving PUT payload, storing it in-cluster, executing remote PUT, finalizing new in-cluster object | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.put.size | remote_e2e_put_bytes_total | size | PUT: total cumulative size (bytes) of all PUTs to a given remote backend | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.ver.change.n | remote_ver_change_count | counter | number of out-of-band updates (by a 3rd party performing remote PUTs outside this cluster) | map[backend:azure node_id:<AIS-NODE-ID>] |
azure.ver.change.size | remote_ver_change_bytes_total | size | total cumulative size of objects that were updated out-of-band | map[backend:azure node_id:<AIS-NODE-ID>] |
Backend metrics
- GET Metrics:
  - remote_get_count: Total number of executed remote GET requests (cold GETs).
    - Variable Labels: bucket
  - remote_get_ns_total: Total cumulative time (nanoseconds) to execute cold GETs and store new object versions in-cluster.
    - Variable Labels: bucket
  - remote_e2e_get_ns_total: Total end-to-end time (nanoseconds) servicing remote requests (includes receiving request, executing cold-GET, storing new object version in-cluster, and transmitting response).
    - Variable Labels: bucket
  - remote_get_bytes_total: Total cumulative size (bytes) of all cold-GET transactions.
    - Variable Labels: bucket
- PUT Metrics:
  - remote_put_count: Total number of executed remote PUT requests to a given backend.
    - Variable Labels: bucket, xkind, xid
  - remote_put_ns_total: Total cumulative time (nanoseconds) to execute remote PUT requests and store new object versions in-cluster.
    - Variable Labels: bucket, xkind, xid
  - remote_e2e_put_ns_total: Total end-to-end time (nanoseconds) servicing remote PUT requests (includes receiving PUT payload, storing it in-cluster, executing remote PUT, finalizing new in-cluster object).
    - Variable Labels: bucket, xkind, xid
  - remote_e2e_put_bytes_total: Total cumulative size (bytes) of all PUTs to a given remote backend.
    - Variable Labels: bucket, xkind, xid
- HEAD Metrics:
  - remote_head_count: Total number of executed remote HEAD requests to a given backend.
    - Variable Labels: bucket
  - remote_head_ns_total: Total cumulative time (nanoseconds) to execute remote HEAD requests.
    - Variable Labels: bucket
- Out-of-Band Updates:
  - remote_ver_change_count: Number of out-of-band updates (by a 3rd party performing remote PUTs outside this cluster).
    - Variable Labels: bucket
  - remote_ver_change_bytes_total: Total cumulative size (bytes) of objects that were updated out-of-band.
    - Variable Labels: bucket
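The per-backend totals above can be combined into average cold-GET timings. The sketch below assumes the three GET counters for one backend have already been read (ideally as deltas over an interval) and splits the end-to-end average into the remote part and the remainder, which by the descriptions above roughly corresponds to receiving the request and transmitting the response. The function name and the numbers are hypothetical.

```python
def cold_get_breakdown(count: int, remote_ns_total: int, e2e_ns_total: int) -> dict:
    """Average cold-GET timings (ms) for one backend.

    count           -- remote_get_count
    remote_ns_total -- remote_get_ns_total
    e2e_ns_total    -- remote_e2e_get_ns_total
    """
    if count <= 0:
        return {"remote_ms": 0.0, "e2e_ms": 0.0, "other_ms": 0.0}
    remote_ms = remote_ns_total / count / 1e6
    e2e_ms = e2e_ns_total / count / 1e6
    return {"remote_ms": remote_ms, "e2e_ms": e2e_ms, "other_ms": e2e_ms - remote_ms}


# Made-up counters keyed by the backend label value.
counters = {
    "aws": (200, 10_000_000_000, 12_000_000_000),
    "gcp": (0, 0, 0),
}

for backend, (count, remote_ns, e2e_ns) in counters.items():
    print(backend, cold_get_breakdown(count, remote_ns, e2e_ns))
# aws {'remote_ms': 50.0, 'e2e_ms': 60.0, 'other_ms': 10.0}
# gcp {'remote_ms': 0.0, 'e2e_ms': 0.0, 'other_ms': 0.0}
```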