Dashboards reference
This document contains a complete reference on Sourcegraph’s available dashboards, as well as details on how to interpret the panels and metrics.
To learn more about Sourcegraph’s metrics and how to view these dashboards, see our metrics guide.
Frontend
Serves all end-user browser and API requests.
Frontend: Search at a glance
frontend: 99th_percentile_search_request_duration
This panel indicates 99th percentile successful search request duration over 5m.
Managed by the Sourcegraph Search team.
frontend: 90th_percentile_search_request_duration
This panel indicates 90th percentile successful search request duration over 5m.
Managed by the Sourcegraph Search team.
frontend: hard_timeout_search_responses
This panel indicates hard timeout search responses every 5m.
Managed by the Sourcegraph Search team.
frontend: hard_error_search_responses
This panel indicates hard error search responses every 5m.
Managed by the Sourcegraph Search team.
frontend: partial_timeout_search_responses
This panel indicates partial timeout search responses every 5m.
Managed by the Sourcegraph Search team.
frontend: search_alert_user_suggestions
This panel indicates search alert user suggestions shown every 5m.
Managed by the Sourcegraph Search team.
frontend: page_load_latency
This panel indicates 90th percentile page load latency over all routes over 10m.
Managed by the Sourcegraph Core application team.
frontend: blob_load_latency
This panel indicates 90th percentile blob load latency over 10m.
Managed by the Sourcegraph Core application team.
Frontend: Search-based code intelligence at a glance
frontend: 99th_percentile_search_codeintel_request_duration
This panel indicates 99th percentile code-intel successful search request duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: 90th_percentile_search_codeintel_request_duration
This panel indicates 90th percentile code-intel successful search request duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: hard_timeout_search_codeintel_responses
This panel indicates hard timeout search code-intel responses every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: hard_error_search_codeintel_responses
This panel indicates hard error search code-intel responses every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: partial_timeout_search_codeintel_responses
This panel indicates partial timeout search code-intel responses every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: search_codeintel_alert_user_suggestions
This panel indicates search code-intel alert user suggestions shown every 5m.
Managed by the Sourcegraph Code-intelligence team.
Frontend: Search API usage at a glance
frontend: 99th_percentile_search_api_request_duration
This panel indicates 99th percentile successful search API request duration over 5m.
Managed by the Sourcegraph Search team.
frontend: 90th_percentile_search_api_request_duration
This panel indicates 90th percentile successful search API request duration over 5m.
Managed by the Sourcegraph Search team.
frontend: hard_timeout_search_api_responses
This panel indicates hard timeout search API responses every 5m.
Managed by the Sourcegraph Search team.
frontend: hard_error_search_api_responses
This panel indicates hard error search API responses every 5m.
Managed by the Sourcegraph Search team.
frontend: partial_timeout_search_api_responses
This panel indicates partial timeout search API responses every 5m.
Managed by the Sourcegraph Search team.
frontend: search_api_alert_user_suggestions
This panel indicates search API alert user suggestions shown every 5m.
Managed by the Sourcegraph Search team.
Frontend: Precise code intelligence usage at a glance
frontend: codeintel_resolvers_99th_percentile_duration
This panel indicates 99th percentile successful resolver duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_resolvers_errors
This panel indicates resolver errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Frontend: Precise code intelligence stores and clients
frontend: codeintel_dbstore_99th_percentile_duration
This panel indicates 99th percentile successful database store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_dbstore_errors
This panel indicates database store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_upload_workerstore_99th_percentile_duration
This panel indicates 99th percentile successful upload worker store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_upload_workerstore_errors
This panel indicates upload worker store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_index_workerstore_99th_percentile_duration
This panel indicates 99th percentile successful index worker store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_index_workerstore_errors
This panel indicates index worker store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_lsifstore_99th_percentile_duration
This panel indicates 99th percentile successful LSIF store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_lsifstore_errors
This panel indicates LSIF store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_uploadstore_99th_percentile_duration
This panel indicates 99th percentile successful upload store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_uploadstore_errors
This panel indicates upload store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_gitserverclient_99th_percentile_duration
This panel indicates 99th percentile successful gitserver client operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_gitserverclient_errors
This panel indicates gitserver client errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Frontend: Precise code intelligence commit graph updater
frontend: codeintel_commit_graph_queue_size
This panel indicates commit graph queue size.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_commit_graph_queue_growth_rate
This panel indicates commit graph queue growth rate over 30m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_commit_graph_updater_99th_percentile_duration
This panel indicates 99th percentile successful commit graph updater operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_commit_graph_updater_errors
This panel indicates commit graph updater errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Frontend: Precise code intelligence janitor
frontend: codeintel_janitor_errors
This panel indicates janitor errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_upload_records_removed
This panel indicates upload records expired or deleted every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_index_records_removed
This panel indicates index records expired or deleted every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_lsif_data_removed
This panel indicates data for unreferenced upload records removed every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_background_upload_resets
This panel indicates upload records re-queued (due to unresponsive worker) every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_background_upload_reset_failures
This panel indicates upload records errored due to repeated reset every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_background_index_resets
This panel indicates index records re-queued (due to unresponsive indexer) every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_background_index_reset_failures
This panel indicates index records errored due to repeated reset every 5m.
Managed by the Sourcegraph Code-intelligence team.
Frontend: Auto-indexing
frontend: codeintel_indexing_99th_percentile_duration
This panel indicates 99th percentile successful indexing operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_indexing_errors
This panel indicates indexing errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_autoindex_enqueuer_99th_percentile_duration
This panel indicates 99th percentile successful index enqueuer operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
frontend: codeintel_autoindex_enqueuer_errors
This panel indicates index enqueuer errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Frontend: Out of band migrations
frontend: out_of_band_migrations_up_99th_percentile_duration
This panel indicates 99th percentile successful out-of-band up migration invocation (single batch processed) duration over 5m.
Managed by the Sourcegraph Core application team.
frontend: out_of_band_migrations_up_errors
This panel indicates out-of-band up migration errors every 5m.
Managed by the Sourcegraph Core application team.
frontend: out_of_band_migrations_down_99th_percentile_duration
This panel indicates 99th percentile successful out-of-band down migration invocation (single batch processed) duration over 5m.
Managed by the Sourcegraph Core application team.
frontend: out_of_band_migrations_down_errors
This panel indicates out-of-band down migration errors every 5m.
Managed by the Sourcegraph Core application team.
Frontend: Internal service requests
frontend: internal_indexed_search_error_responses
This panel indicates internal indexed search error responses every 5m.
Managed by the Sourcegraph Search team.
frontend: internal_unindexed_search_error_responses
This panel indicates internal unindexed search error responses every 5m.
Managed by the Sourcegraph Search team.
frontend: internal_api_error_responses
This panel indicates internal API error responses every 5m by route.
Managed by the Sourcegraph Core application team.
frontend: 99th_percentile_gitserver_duration
This panel indicates 99th percentile successful gitserver query duration over 5m.
Managed by the Sourcegraph Core application team.
frontend: gitserver_error_responses
This panel indicates gitserver error responses every 5m.
Managed by the Sourcegraph Core application team.
frontend: observability_test_alert_warning
This panel indicates warning test alert metric.
Managed by the Sourcegraph Distribution team.
frontend: observability_test_alert_critical
This panel indicates critical test alert metric.
Managed by the Sourcegraph Distribution team.
Frontend: Container monitoring (not available on server)
frontend: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Core application team.
frontend: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Core application team.
frontend: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod (frontend|sourcegraph-frontend) (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p (frontend|sourcegraph-frontend).
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' (frontend|sourcegraph-frontend) (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the (frontend|sourcegraph-frontend) container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs (frontend|sourcegraph-frontend) (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Core application team.
Frontend: Provisioning indicators (not available on server)
frontend: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Core application team.
frontend: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Core application team.
frontend: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Core application team.
frontend: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Core application team.
Frontend: Golang runtime monitoring
frontend: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
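If a leak is suspected, a goroutine dump shows where goroutines are accumulating. A minimal sketch, assuming the container exposes Go's standard net/http/pprof handlers on a debug port; the service name and port below are assumptions, so adjust them to your deployment:

    # Forward the frontend's debug port locally (Kubernetes; name/port are assumptions).
    kubectl port-forward deploy/sourcegraph-frontend 6060:6060 &

    # Summarize goroutines grouped by identical stack trace; a single stack with a
    # steadily growing count is the usual signature of a leak.
    curl -s 'http://localhost:6060/debug/pprof/goroutine?debug=1' | head -n 40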
Managed by the Sourcegraph Core application team.
frontend: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Core application team.
Frontend: Kubernetes monitoring (only available on Kubernetes)
frontend: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Core application team.
Frontend: Sentinel queries (only on sourcegraph.com)
frontend: mean_successful_sentinel_duration_5m
This panel indicates mean successful sentinel search duration over 5m.
Managed by the Sourcegraph Search team.
frontend: mean_sentinel_stream_latency_5m
This panel indicates mean sentinel stream latency over 5m.
Managed by the Sourcegraph Search team.
frontend: 90th_percentile_successful_sentinel_duration_5m
This panel indicates 90th percentile successful sentinel search duration over 5m.
Managed by the Sourcegraph Search team.
frontend: 90th_percentile_sentinel_stream_latency_5m
This panel indicates 90th percentile sentinel stream latency over 5m.
Managed by the Sourcegraph Search team.
frontend: mean_successful_sentinel_duration_by_query_5m
This panel indicates mean successful sentinel search duration by query over 5m.
- The mean search duration for sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
Managed by the Sourcegraph Search team.
frontend: mean_sentinel_stream_latency_by_query_5m
This panel indicates mean sentinel stream latency by query over 5m.
- The mean streaming search latency for sentinel queries, broken down by query. Useful for debugging whether a slowdown is limited to a specific type of query.
Managed by the Sourcegraph Search team.
frontend: unsuccessful_status_rate_5m
This panel indicates unsuccessful status rate per 5m.
- The rate of unsuccessful sentinel queries, broken down by failure type.
Managed by the Sourcegraph Search team.
Git Server
Stores, manages, and operates Git repositories.
gitserver: memory_working_set
This panel indicates memory working set.
Managed by the Sourcegraph Core application team.
gitserver: go_routines
This panel indicates goroutines.
Managed by the Sourcegraph Core application team.
gitserver: cpu_throttling_time
This panel indicates container CPU throttling time %.
Managed by the Sourcegraph Core application team.
gitserver: cpu_usage_seconds
This panel indicates cpu usage seconds.
Managed by the Sourcegraph Core application team.
gitserver: disk_space_remaining
This panel indicates disk space remaining by instance.
Managed by the Sourcegraph Core application team.
gitserver: io_reads_total
This panel indicates i/o reads total.
Managed by the Sourcegraph Core application team.
gitserver: io_writes_total
This panel indicates i/o writes total.
Managed by the Sourcegraph Core application team.
gitserver: io_reads
This panel indicates i/o reads.
Managed by the Sourcegraph Core application team.
gitserver: io_writes
This panel indicates i/o writes.
Managed by the Sourcegraph Core application team.
gitserver: io_read_througput
This panel indicates i/o read throughput.
Managed by the Sourcegraph Core application team.
gitserver: io_write_throughput
This panel indicates i/o write throughput.
Managed by the Sourcegraph Core application team.
gitserver: running_git_commands
This panel indicates git commands sent to each gitserver instance.
A high value signals load.
Managed by the Sourcegraph Core application team.
gitserver: repository_clone_queue_size
This panel indicates repository clone queue size.
Managed by the Sourcegraph Core application team.
gitserver: repository_existence_check_queue_size
This panel indicates repository existence check queue size.
Managed by the Sourcegraph Core application team.
gitserver: echo_command_duration_test
This panel indicates echo test command duration.
A high value here likely indicates a problem, especially if consistently high.
You can query for individual commands using sum by (cmd)(src_gitserver_exec_running) in Grafana (/-/debug/grafana) to see if a specific Git Server command might be spiking in frequency.
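The same expression can also be run against the bundled Prometheus HTTP API for ad-hoc inspection outside Grafana. A minimal sketch; the Prometheus service name and port in the port-forward are assumptions that depend on your deployment:

    # Reach the bundled Prometheus locally (Kubernetes; service name/port are assumptions).
    kubectl port-forward svc/prometheus 9090:9090 &

    # List the git commands currently running on each gitserver instance.
    curl -s 'http://localhost:9090/api/v1/query' \
      --data-urlencode 'query=sum by (cmd)(src_gitserver_exec_running)' | jq '.data.result'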
If this value is consistently high, consider the following:
- Single container deployments: Upgrade to a Docker Compose deployment which offers better scalability and resource isolation.
- Kubernetes and Docker Compose: Check that you are running a similar number of git server replicas and that their CPU/memory limits are allocated according to what is shown in the Sourcegraph resource estimator.
Managed by the Sourcegraph Core application team.
gitserver: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Core application team.
Git Server: Container monitoring (not available on server)
gitserver: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Core application team.
gitserver: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Core application team.
gitserver: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod gitserver (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p gitserver.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' gitserver (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the gitserver container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs gitserver (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Core application team.
gitserver: fs_io_operations
This panel indicates filesystem reads and writes rate by instance over 1h.
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with gitserver issues.
Managed by the Sourcegraph Core application team.
Git Server: Provisioning indicators (not available on server)
gitserver: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Core application team.
gitserver: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Git Server is expected to use up all the memory it is provided.
Managed by the Sourcegraph Core application team.
gitserver: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Core application team.
gitserver: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Git Server is expected to use up all the memory it is provided.
Managed by the Sourcegraph Core application team.
Git Server: Golang runtime monitoring
gitserver: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Core application team.
gitserver: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Core application team.
Git Server: Kubernetes monitoring (ignore if using Docker Compose or server)
gitserver: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Core application team.
GitHub Proxy
Proxies all requests to github.com, keeping track of and managing rate limits.
GitHub Proxy: GitHub API monitoring
github-proxy: github_proxy_waiting_requests
This panel indicates number of requests waiting on the global mutex.
Managed by the Sourcegraph Core application team.
GitHub Proxy: Container monitoring (not available on server)
github-proxy: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Core application team.
github-proxy: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Core application team.
github-proxy: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod github-proxy (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p github-proxy.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' github-proxy (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the github-proxy container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs github-proxy (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Core application team.
GitHub Proxy: Provisioning indicators (not available on server)
github-proxy: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Core application team.
github-proxy: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Core application team.
github-proxy: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Core application team.
github-proxy: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Core application team.
GitHub Proxy: Golang runtime monitoring
github-proxy: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Core application team.
github-proxy: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Core application team.
GitHub Proxy: Kubernetes monitoring (only available on Kubernetes)
github-proxy: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Core application team.
Postgres
Postgres metrics, exported from postgres_exporter (only available on Kubernetes).
postgres: connections
This panel indicates active connections.
Managed by the Sourcegraph Core application team.
postgres: transaction_durations
This panel indicates maximum transaction durations.
Managed by the Sourcegraph Core application team.
Postgres: Database and collector status
postgres: postgres_up
This panel indicates database availability.
A non-zero value indicates the database is online.
Managed by the Sourcegraph Core application team.
postgres: invalid_indexes
This panel indicates invalid indexes (unusable by the query planner).
A non-zero value indicates that Postgres failed to build an index. Expect degraded performance until the index is manually rebuilt.
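To find and rebuild an invalid index, the standard Postgres catalogs can be queried directly. A minimal sketch using psql; the database name is an assumption and the index name is a placeholder:

    # List indexes Postgres has marked invalid.
    psql -d sourcegraph -c "SELECT n.nspname AS schema, c.relname AS index
                            FROM pg_index i
                            JOIN pg_class c ON c.oid = i.indexrelid
                            JOIN pg_namespace n ON n.oid = c.relnamespace
                            WHERE NOT i.indisvalid;"

    # Rebuild a reported index without blocking writes (Postgres 12+).
    psql -d sourcegraph -c "REINDEX INDEX CONCURRENTLY public.some_invalid_index;"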
Managed by the Sourcegraph Core application team.
postgres: pg_exporter_err
This panel indicates errors scraping postgres exporter.
This value indicates issues retrieving metrics from postgres_exporter.
Managed by the Sourcegraph Core application team.
postgres: migration_in_progress
This panel indicates active schema migration.
A 0 value indicates that no migration is in progress.
Managed by the Sourcegraph Core application team.
Postgres: Object size and bloat
postgres: pg_table_size
This panel indicates table size.
Total size of this table
Managed by the Sourcegraph Core application team.
postgres: pg_table_bloat_ratio
This panel indicates table bloat ratio.
Estimated bloat ratio of this table (high bloat = high overhead)
Managed by the Sourcegraph Core application team.
postgres: pg_index_size
This panel indicates index size.
Total size of this index
Managed by the Sourcegraph Core application team.
postgres: pg_index_bloat_ratio
This panel indicates index bloat ratio.
Estimated bloat ratio of this index (high bloat = high overhead)
Managed by the Sourcegraph Core application team.
Postgres: Provisioning indicators (not available on server)
postgres: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Core application team.
postgres: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Core application team.
postgres: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Core application team.
postgres: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Core application team.
Postgres: Kubernetes monitoring (only available on Kubernetes)
postgres: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Core application team.
Precise Code Intel Worker
Handles conversion of uploaded precise code intelligence bundles.
Precise Code Intel Worker: Upload queue
precise-code-intel-worker: upload_queue_size
This panel indicates queue size.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: upload_queue_growth_rate
This panel indicates queue growth rate over 30m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: job_errors
This panel indicates job errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: active_workers
This panel indicates active workers processing uploads.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: active_jobs
This panel indicates active jobs.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Workers
precise-code-intel-worker: job_99th_percentile_duration
This panel indicates 99th percentile successful job duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Stores and clients
precise-code-intel-worker: codeintel_dbstore_99th_percentile_duration
This panel indicates 99th percentile successful database store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_dbstore_errors
This panel indicates database store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_workerstore_99th_percentile_duration
This panel indicates 99th percentile successful worker store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_workerstore_errors
This panel indicates worker store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_lsifstore_99th_percentile_duration
This panel indicates 99th percentile successful LSIF store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_lsifstore_errors
This panel indicates LSIF store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_uploadstore_99th_percentile_duration
This panel indicates 99th percentile successful upload store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_uploadstore_errors
This panel indicates upload store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_gitserverclient_99th_percentile_duration
This panel indicates 99th percentile successful gitserver client operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: codeintel_gitserverclient_errors
This panel indicates gitserver client errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Internal service requests
precise-code-intel-worker: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Container monitoring (not available on server)
precise-code-intel-worker: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod precise-code-intel-worker (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p precise-code-intel-worker.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' precise-code-intel-worker (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the precise-code-intel-worker container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs precise-code-intel-worker (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Provisioning indicators (not available on server)
precise-code-intel-worker: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Golang runtime monitoring
precise-code-intel-worker: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-worker: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Worker: Kubernetes monitoring (only available on Kubernetes)
precise-code-intel-worker: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Code-intelligence team.
Query Runner
Periodically runs saved searches and instructs the frontend to send out notifications.
query-runner: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Search team.
Query Runner: Container monitoring (not available on server)
query-runner: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Search team.
query-runner: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Search team.
query-runner: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod query-runner (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p query-runner.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' query-runner (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the query-runner container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs query-runner (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Search team.
Query Runner: Provisioning indicators (not available on server)
query-runner: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Search team.
query-runner: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Search team.
query-runner: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Search team.
query-runner: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Search team.
Query Runner: Golang runtime monitoring
query-runner: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Search team.
query-runner: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Search team.
Query Runner: Kubernetes monitoring (only available on Kubernetes)
query-runner: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Search team.
Repo Updater
Manages interaction with code hosts, instructs Gitserver to update repositories.
repo-updater: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Core application team.
Repo Updater: Repositories
repo-updater: syncer_sync_last_time
This panel indicates time since last sync.
A high value here indicates issues synchronizing repository metadata. If the value is persistently high, make sure all external services have valid tokens.
Managed by the Sourcegraph Core application team.
repo-updater: src_repoupdater_max_sync_backoff
This panel indicates time since oldest sync.
Managed by the Sourcegraph Core application team.
repo-updater: src_repoupdater_syncer_sync_errors_total
This panel indicates sync error rate.
Managed by the Sourcegraph Core application team.
repo-updater: syncer_sync_start
This panel indicates sync was started.
Managed by the Sourcegraph Core application team.
repo-updater: syncer_sync_duration
This panel indicates 95th percentile repositories sync duration.
Managed by the Sourcegraph Core application team.
repo-updater: source_duration
This panel indicates 95th percentile repositories source duration.
Managed by the Sourcegraph Core application team.
repo-updater: syncer_synced_repos
This panel indicates repositories synced.
Managed by the Sourcegraph Core application team.
repo-updater: sourced_repos
This panel indicates repositories sourced.
Managed by the Sourcegraph Core application team.
repo-updater: user_added_repos
This panel indicates total number of user added repos.
Managed by the Sourcegraph Core application team.
repo-updater: purge_failed
This panel indicates repositories purge failed.
Managed by the Sourcegraph Core application team.
repo-updater: sched_auto_fetch
This panel indicates repositories scheduled due to hitting a deadline.
Managed by the Sourcegraph Core application team.
repo-updater: sched_manual_fetch
This panel indicates repositories scheduled due to user traffic.
Check repo-updater logs if this value is persistently high. This does not indicate anything if there are no user added code hosts.
Managed by the Sourcegraph Core application team.
repo-updater: sched_known_repos
This panel indicates repositories managed by the scheduler.
Managed by the Sourcegraph Core application team.
repo-updater: sched_update_queue_length
This panel indicates rate of growth of update queue length over 5 minutes.
Managed by the Sourcegraph Core application team.
repo-updater: sched_loops
This panel indicates scheduler loops.
Managed by the Sourcegraph Core application team.
repo-updater: sched_error
This panel indicates repositories schedule error rate.
Managed by the Sourcegraph Core application team.
Repo Updater: Permissions
repo-updater: perms_syncer_perms
This panel indicates time gap between least and most up to date permissions.
Managed by the Sourcegraph Core application team.
repo-updater: perms_syncer_stale_perms
This panel indicates number of entities with stale permissions.
Managed by the Sourcegraph Core application team.
repo-updater: perms_syncer_no_perms
This panel indicates number of entities with no permissions.
Managed by the Sourcegraph Core application team.
repo-updater: perms_syncer_sync_duration
This panel indicates 95th percentile permissions sync duration.
Managed by the Sourcegraph Core application team.
repo-updater: perms_syncer_queue_size
This panel indicates permissions sync queued items.
Managed by the Sourcegraph Core application team.
repo-updater: perms_syncer_sync_errors
This panel indicates permissions sync error rate.
Managed by the Sourcegraph Core application team.
Repo Updater: External services
repo-updater: src_repoupdater_external_services_total
This panel indicates the total number of external services.
Managed by the Sourcegraph Core application team.
repo-updater: src_repoupdater_user_external_services_total
This panel indicates the total number of user added external services.
Managed by the Sourcegraph Core application team.
repo-updater: repoupdater_queued_sync_jobs_total
This panel indicates the total number of queued sync jobs.
Managed by the Sourcegraph Core application team.
repo-updater: repoupdater_completed_sync_jobs_total
This panel indicates the total number of completed sync jobs.
Managed by the Sourcegraph Core application team.
repo-updater: repoupdater_errored_sync_jobs_total
This panel indicates the total number of errored sync jobs.
Managed by the Sourcegraph Core application team.
repo-updater: github_graphql_rate_limit_remaining
This panel indicates remaining calls to GitHub graphql API before hitting the rate limit.
Managed by the Sourcegraph Core application team.
repo-updater: github_rest_rate_limit_remaining
This panel indicates remaining calls to GitHub rest API before hitting the rate limit.
Managed by the Sourcegraph Core application team.
repo-updater: github_search_rate_limit_remaining
This panel indicates remaining calls to GitHub search API before hitting the rate limit.
Managed by the Sourcegraph Core application team.
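When any of the remaining-calls panels approach zero, the quota can also be confirmed directly against GitHub's rate-limit endpoint, using the token configured on the code host connection (the $GITHUB_TOKEN variable here is a placeholder):

    # Show remaining core, search, and GraphQL quota for this token (github.com).
    curl -s -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/rate_limit \
      | jq '.resources | {core, search, graphql}'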
repo-updater: github_graphql_rate_limit_wait_duration
This panel indicates time spent waiting for the GitHub graphql API rate limiter.
Indicates how long we're waiting on the rate limit once it has been exceeded.
Managed by the Sourcegraph Core application team.
repo-updater: github_rest_rate_limit_wait_duration
This panel indicates time spent waiting for the GitHub rest API rate limiter.
Indicates how long we're waiting on the rate limit once it has been exceeded.
Managed by the Sourcegraph Core application team.
repo-updater: github_search_rate_limit_wait_duration
This panel indicates time spent waiting for the GitHub search API rate limiter.
Indicates how long we're waiting on the rate limit once it has been exceeded.
Managed by the Sourcegraph Core application team.
repo-updater: gitlab_rest_rate_limit_remaining
This panel indicates remaining calls to GitLab rest API before hitting the rate limit.
Managed by the Sourcegraph Core application team.
repo-updater: gitlab_rest_rate_limit_wait_duration
This panel indicates time spent waiting for the GitLab rest API rate limiter.
Indicates how long we're waiting on the rate limit once it has been exceeded.
Managed by the Sourcegraph Core application team.
Repo Updater: Container monitoring (not available on server)
repo-updater: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Core application team.
repo-updater: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Core application team.
repo-updater: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod repo-updater (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p repo-updater.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' repo-updater (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the repo-updater container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs repo-updater (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Core application team.
Repo Updater: Provisioning indicators (not available on server)
repo-updater: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Core application team.
repo-updater: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Core application team.
repo-updater: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Core application team.
repo-updater: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Core application team.
Repo Updater: Golang runtime monitoring
repo-updater: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Core application team.
repo-updater: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Core application team.
Repo Updater: Kubernetes monitoring (only available on Kubernetes)
repo-updater: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Core application team.
Searcher
Performs unindexed searches (diff and commit search, text search for unindexed branches).
searcher: unindexed_search_request_errors
This panel indicates unindexed search request errors every 5m by code.
Managed by the Sourcegraph Search team.
searcher: replica_traffic
This panel indicates requests per second over 10m.
Managed by the Sourcegraph Search team.
searcher: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Search team.
Searcher: Container monitoring (not available on server)
searcher: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Search team.
searcher: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Search team.
searcher: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod searcher (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p searcher.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' searcher (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the searcher container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs searcher (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Search team.
Searcher: Provisioning indicators (not available on server)
searcher: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Search team.
searcher: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Search team.
searcher: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Search team.
searcher: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Search team.
Searcher: Golang runtime monitoring
searcher: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Search team.
searcher: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Search team.
Searcher: Kubernetes monitoring (only available on Kubernetes)
searcher: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Search team.
Symbols
Handles symbol searches for unindexed branches.
symbols: store_fetch_failures
This panel indicates store fetch failures every 5m.
Managed by the Sourcegraph Code-intelligence team.
symbols: current_fetch_queue_size
This panel indicates current fetch queue size.
Managed by the Sourcegraph Code-intelligence team.
symbols: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Code-intelligence team.
Symbols: Container monitoring (not available on server)
symbols: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
symbols: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Code-intelligence team.
symbols: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod symbols (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p symbols.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' symbols (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the symbols container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs symbols (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Code-intelligence team.
Symbols: Provisioning indicators (not available on server)
symbols: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
symbols: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
symbols: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
symbols: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
Symbols: Golang runtime monitoring
symbols: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Code-intelligence team.
symbols: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Code-intelligence team.
Symbols: Kubernetes monitoring (only available on Kubernetes)
symbols: pods_available_percentage
This panel indicates percentage pods available.
Managed by the Sourcegraph Code-intelligence team.
Syntect Server
Handles syntax highlighting for code files.
syntect-server: syntax_highlighting_errors
This panel indicates syntax highlighting errors every 5m.
Managed by the Sourcegraph Core application team.
syntect-server: syntax_highlighting_timeouts
This panel indicates syntax highlighting timeouts every 5m.
Managed by the Sourcegraph Core application team.
syntect-server: syntax_highlighting_panics
This panel indicates syntax highlighting panics every 5m.
Managed by the Sourcegraph Core application team.
syntect-server: syntax_highlighting_worker_deaths
This panel indicates syntax highlighter worker deaths every 5m.
Managed by the Sourcegraph Core application team.
Syntect Server: Container monitoring (not available on server)
syntect-server: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Core application team.
syntect-server: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Core application team.
syntect-server: container_missing
This panel indicates container missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independent of deployment events (such as an upgrade), it could indicate pods are being OOM killed or terminated for some other reasons.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod syntect-server (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p syntect-server.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' syntect-server (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the syntect-server container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs syntect-server (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Core application team.
Syntect Server: Provisioning indicators (not available on server)
syntect-server: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Core application team.
syntect-server: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Core application team.
syntect-server: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Core application team.
syntect-server: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Core application team.
Syntect Server: Kubernetes monitoring (only available on Kubernetes)
syntect-server: pods_available_percentage
This panel indicates the percentage of pods available.
Managed by the Sourcegraph Core application team.
Zoekt Index Server
Indexes repositories and populates the search index.
zoekt-indexserver: average_resolve_revision_duration
This panel indicates average resolve revision duration over 5m.
Managed by the Sourcegraph Search team.
Zoekt Index Server: Container monitoring (not available on server)
zoekt-indexserver: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Search team.
zoekt-indexserver: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Search team.
zoekt-indexserver: container_missing
This panel indicates how often the container has been missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate that pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod zoekt-indexserver (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p zoekt-indexserver.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' zoekt-indexserver (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the zoekt-indexserver container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs zoekt-indexserver (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Search team.
zoekt-indexserver: fs_io_operations
This panel indicates the rate of filesystem reads and writes by instance over 1h.
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with zoekt-indexserver issues.
Managed by the Sourcegraph Core application team.
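To see which instances are driving the I/O, the container filesystem counters can be queried directly. The sketch below assumes the panel is fed by cAdvisor's container_fs_reads_total and container_fs_writes_total counters (a common setup, but verify against your monitoring stack) and that Prometheus is reachable at localhost:9090.
# Per-container read and write rates over the last hour for zoekt-indexserver.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(container_fs_reads_total{name=~".*zoekt-indexserver.*"}[1h])'
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(container_fs_writes_total{name=~".*zoekt-indexserver.*"}[1h])'
# On Kubernetes, filter on the container/pod labels instead of the Docker name label.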
Zoekt Index Server: Provisioning indicators (not available on server)
zoekt-indexserver: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Search team.
zoekt-indexserver: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Search team.
zoekt-indexserver: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Search team.
zoekt-indexserver: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Search team.
Zoekt Index Server: Kubernetes monitoring (only available on Kubernetes)
zoekt-indexserver: pods_available_percentage
This panel indicates the percentage of pods available.
Managed by the Sourcegraph Search team.
Zoekt Web Server
Serves indexed search requests using the search index.
zoekt-webserver: indexed_search_request_errors
This panel indicates indexed search request errors every 5m by code.
Managed by the Sourcegraph Search team.
Zoekt Web Server: Container monitoring (not available on server)
zoekt-webserver: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Search team.
zoekt-webserver: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Search team.
zoekt-webserver: container_missing
This panel indicates how often the container has been missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate that pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod zoekt-webserver (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p zoekt-webserver.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' zoekt-webserver (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the zoekt-webserver container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs zoekt-webserver (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Search team.
zoekt-webserver: fs_io_operations
This panel indicates the rate of filesystem reads and writes by instance over 1h.
This value indicates the number of filesystem read and write operations by containers of this service. When extremely high, this can indicate a resource usage problem, or can cause problems with the service itself, especially if high values or spikes correlate with zoekt-webserver issues.
Managed by the Sourcegraph Core application team.
Zoekt Web Server: Provisioning indicators (not available on server)
zoekt-webserver: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Search team.
zoekt-webserver: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Search team.
zoekt-webserver: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Search team.
zoekt-webserver: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Search team.
Prometheus
Sourcegraph's all-in-one Prometheus and Alertmanager service.
Prometheus: Metrics
prometheus: prometheus_rule_eval_duration
This panel indicates average prometheus rule group evaluation duration over 10m by rule group.
A high value here indicates Prometheus rule evaluation is taking longer than expected. It might indicate that certain rule groups are taking too long to evaluate, or Prometheus is underprovisioned.
Rules that Sourcegraph ships with are grouped under /sg_config_prometheus. Custom rules are grouped under /sg_prometheus_addons.
Managed by the Sourcegraph Distribution team.
prometheus: prometheus_rule_eval_failures
This panel indicates failed prometheus rule evaluations over 5m by rule group.
Rules that Sourcegraph ships with are grouped under /sg_config_prometheus. Custom rules are grouped under /sg_prometheus_addons.
Managed by the Sourcegraph Distribution team.
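If either of the rule-evaluation panels above is elevated, Prometheus' self-monitoring metrics can identify the offending rule group. A minimal sketch, assuming Prometheus is reachable at localhost:9090:
# Last evaluation duration per rule group, slowest first.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sort_desc(prometheus_rule_group_last_duration_seconds)'
# Rule groups with failed evaluations over the last 5 minutes.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(prometheus_rule_evaluation_failures_total[5m]) > 0'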
Prometheus: Alerts
prometheus: alertmanager_notification_latency
This panel indicates alertmanager notification latency over 1m by integration.
Managed by the Sourcegraph Distribution team.
prometheus: alertmanager_notification_failures
This panel indicates failed alertmanager notifications over 1m by integration.
Managed by the Sourcegraph Distribution team.
Prometheus: Internals
prometheus: prometheus_config_status
This panel indicates prometheus configuration reload status.
A value of 1 indicates Prometheus reloaded its configuration successfully.
Managed by the Sourcegraph Distribution team.
prometheus: alertmanager_config_status
This panel indicates alertmanager configuration reload status.
A value of 1 indicates Alertmanager reloaded its configuration successfully.
Managed by the Sourcegraph Distribution team.
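Both reload-status panels are most likely backed by the standard self-monitoring gauges prometheus_config_last_reload_successful and alertmanager_config_last_reload_successful (verify against the panel query if in doubt), so a failed reload can be confirmed from the command line. A minimal sketch, assuming the bundled Prometheus is reachable at localhost:9090 and scrapes its own Alertmanager:
# 1 means the last configuration reload succeeded; 0 means it failed.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=prometheus_config_last_reload_successful'
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=alertmanager_config_last_reload_successful'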
prometheus: prometheus_tsdb_op_failure
This panel indicates prometheus tsdb failures over 1m by operation.
Managed by the Sourcegraph Distribution team.
prometheus: prometheus_target_sample_exceeded
This panel indicates prometheus scrapes that exceed the sample limit over 10m.
Managed by the Sourcegraph Distribution team.
prometheus: prometheus_target_sample_duplicate
This panel indicates prometheus scrapes rejected due to duplicate timestamps over 10m.
Managed by the Sourcegraph Distribution team.
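When either of these scrape panels is non-zero, the overall counters confirm the problem and the per-target scrape_samples_scraped series helps locate oversized targets. A minimal sketch, assuming Prometheus is reachable at localhost:9090:
# Rate of scrapes rejected for exceeding the sample limit or for duplicate timestamps.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(prometheus_target_scrapes_exceeded_sample_limit_total[10m])'
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(prometheus_target_scrapes_sample_duplicate_timestamp_total[10m])'
# The ten largest targets by samples per scrape, to see which job is likely hitting the limit.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=topk(10, scrape_samples_scraped)'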
Prometheus: Container monitoring (not available on server)
prometheus: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Distribution team.
prometheus: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Distribution team.
prometheus: container_missing
This panel indicates how often the container has been missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate that pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod prometheus (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p prometheus.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' prometheus (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the prometheus container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs prometheus (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Distribution team.
Prometheus: Provisioning indicators (not available on server)
prometheus: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Distribution team.
prometheus: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Distribution team.
prometheus: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Distribution team.
prometheus: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Distribution team.
Prometheus: Kubernetes monitoring (only available on Kubernetes)
prometheus: pods_available_percentage
This panel indicates the percentage of pods available.
Managed by the Sourcegraph Distribution team.
Executor Queue
Coordinates the executor work queues.
Executor Queue: Code intelligence queue
executor-queue: codeintel_queue_size
This panel indicates queue size.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: codeintel_queue_growth_rate
This panel indicates queue growth rate over 30m.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: codeintel_job_errors
This panel indicates job errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: codeintel_active_executors
This panel indicates active executors processing codeintel jobs.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: codeintel_active_jobs
This panel indicates active jobs.
Managed by the Sourcegraph Code-intelligence team.
Executor Queue: Stores and clients
executor-queue: codeintel_workerstore_99th_percentile_duration
This panel indicates 99th percentile successful worker store operation duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: codeintel_workerstore_errors
This panel indicates worker store errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Executor Queue: Internal service requests
executor-queue: frontend_internal_api_error_responses
This panel indicates frontend-internal API error responses every 5m by route.
Managed by the Sourcegraph Code-intelligence team.
Executor Queue: Container monitoring (not available on server)
executor-queue: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: container_missing
This panel indicates how often the container has been missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate that pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod executor-queue (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p executor-queue.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' executor-queue (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the executor-queue container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs executor-queue (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Code-intelligence team.
Executor Queue: Provisioning indicators (not available on server)
executor-queue: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
Executor Queue: Golang runtime monitoring
executor-queue: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Code-intelligence team.
executor-queue: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Code-intelligence team.
Executor Queue: Kubernetes monitoring (only available on Kubernetes)
executor-queue: pods_available_percentage
This panel indicates the percentage of pods available.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer
Executes jobs from the "codeintel" work queue.
Precise Code Intel Indexer: Executor
precise-code-intel-indexer: codeintel_job_99th_percentile_duration
This panel indicates 99th percentile successful job duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: codeintel_active_handlers
This panel indicates active handlers processing jobs.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: codeintel_job_errors
This panel indicates job errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer: Stores and clients
precise-code-intel-indexer: executor_apiclient_99th_percentile_duration
This panel indicates 99th percentile successful API request duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: executor_apiclient_errors
This panel indicates API errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer: Commands
precise-code-intel-indexer: executor_setup_command_99th_percentile_duration
This panel indicates 99th percentile successful setup command duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: executor_setup_command_errors
This panel indicates setup command errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: executor_exec_command_99th_percentile_duration
This panel indicates 99th percentile successful exec command duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: executor_exec_command_errors
This panel indicates exec command errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: executor_teardown_command_99th_percentile_duration
This panel indicates 99th percentile successful teardown command duration over 5m.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: executor_teardown_command_errors
This panel indicates teardown command errors every 5m.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer: Container monitoring (not available on server)
precise-code-intel-indexer: container_cpu_usage
This panel indicates container cpu usage total (1m average) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: container_memory_usage
This panel indicates container memory usage by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: container_missing
This panel indicates how often the container has been missing.
This value is the number of times a container has not been seen for more than one minute. If you observe this value change independently of deployment events (such as an upgrade), it could indicate that pods are being OOM killed or terminated for some other reason.
- Kubernetes:
  - Determine if the pod was OOM killed using kubectl describe pod precise-code-intel-worker (look for OOMKilled: true) and, if so, consider increasing the memory limit in the relevant Deployment.yaml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using kubectl logs -p precise-code-intel-worker.
- Docker Compose:
  - Determine if the pod was OOM killed using docker inspect -f '{{json .State}}' precise-code-intel-worker (look for "OOMKilled":true) and, if so, consider increasing the memory limit of the precise-code-intel-worker container in docker-compose.yml.
  - Check the logs before the container restarted to see if there are panic: messages or similar using docker logs precise-code-intel-worker (note this will include logs from the previous and currently running container).
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer: Provisioning indicators (not available on server)
precise-code-intel-indexer: provisioning_container_cpu_usage_long_term
This panel indicates container cpu usage total (90th percentile over 1d) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: provisioning_container_memory_usage_long_term
This panel indicates container memory usage (1d maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: provisioning_container_cpu_usage_short_term
This panel indicates container cpu usage total (5m maximum) across all cores by instance.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: provisioning_container_memory_usage_short_term
This panel indicates container memory usage (5m maximum) by instance.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer: Golang runtime monitoring
precise-code-intel-indexer: go_goroutines
This panel indicates maximum active goroutines.
A high value here indicates a possible goroutine leak.
Managed by the Sourcegraph Code-intelligence team.
precise-code-intel-indexer: go_gc_duration_seconds
This panel indicates maximum go garbage collection duration.
Managed by the Sourcegraph Code-intelligence team.
Precise Code Intel Indexer: Kubernetes monitoring (only available on Kubernetes)
precise-code-intel-indexer: pods_available_percentage
This panel indicates the percentage of pods available.
Managed by the Sourcegraph Code-intelligence team.