Prometheus / sql_exporter Path

SQLMonitor's classic architecture pulls everything into SQL Server tables. For shops that already run Prometheus + Grafana — or want lower-latency, cardinality-rich metrics — there is a parallel path that publishes the same signal set as Prometheus metrics via burningalchemist/sql_exporter.

The two paths are complementary, not mutually exclusive. Most of the Grafana dashboards can switch between a SQL datasource and a Prometheus datasource while keeping the same panel layout and $__interval-based time windows.

What you get

Exporter: sql_exporter (Go binary) running on every monitored instance, listening on TCP 9399.
Scrape cadence: 15s by default (configurable in prometheus.yml).
Collectors: 7 YAML files under sql_exporter/, one per signal family.
Prometheus dashboard: SQL-Exporter-Metrics-Dashboard-External.json, a drop-in replacement for "Monitoring — Live — All Servers".
Alerting rules: sql_exporter/alert-engine/Alert-Rules-SQLMonitor.yaml, in Grafana unified alerting format.

Files in the repo

| File | Purpose |
| --- | --- |
| sql_exporter.yml | Top-level exporter config: targets, credentials, job groups. |
| mssql_dba_cached.collector.yml | Low-churn metrics cached server-side (disk space, databases, file IO). |
| mssql_dba_regular.collector.yml | The bread-and-butter volatile metrics (CPU, memory, active/blocked requests, wait stats). |
| mssql_dba_stableinfo.collector.yml | Instance- and host-level stable facts (SQL version, core count, RAM). |
| mssql_dba_aghealth.collector.yml | AG primary/secondary state, redo/log-send queue. |
| mssql_dba_whoisactive.collector.yml | Top concurrent queries; higher cardinality. |
| mssql_sqlagent_jobs.collector.yml | SQL Agent job status / outcome / duration / next-run / 24h-failure count from msdb. |
| mssql_backup_history.collector.yml | Per-(database, backup_type) last-time / size / duration / age from msdb.dbo.backupset. |
| mssql_xevent.collector.yml | 5-minute aggregates from DBA.dbo.xevent_metrics (guarded with an existence check; no-op where the XEvent collector proc isn't installed). |
| mssql_standard.collector.yml | The upstream sql_exporter stock collector. |
| windows_exporter_config.yml | Matching config for windows_exporter: host-level CPU, disk, network. |
| tessell-metrics.collector.yml | Managed-instance variant (no xp_cmdshell). |

Full metric reference: SQL-Exporter-Metrics-Documentation.md and the quick-ref.
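For orientation, every collector file follows burningalchemist/sql_exporter's collector schema: a collector_name plus a list of metrics, each mapping a T-SQL query's columns to Prometheus metric values and labels. A minimal sketch (the metric and query below are illustrative, not taken from the shipped files):

```yaml
collector_name: mssql_dba_example

metrics:
  - metric_name: mssql_user_connections
    type: gauge
    help: 'Current number of user connections.'
    values: [cntr_value]        # column exposed as the metric value
    query: |
      SELECT cntr_value
      FROM sys.dm_os_performance_counters
      WHERE counter_name = 'User Connections'
```

Columns listed under key_labels (not shown here) become Prometheus labels, which is how the per-(database, backup_type) and per-wait_type collectors fan out into series.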

Install — Windows

```powershell
# 1. Drop the binary + config in place
#    (download sql_exporter.exe from https://github.com/burningalchemist/sql_exporter/releases)
Copy-Item .\sql_exporter.exe 'C:\Program Files\sql_exporter\'
Copy-Item .\sql_exporter*.yml 'C:\Program Files\sql_exporter\'

# 2. Register as a service (inner quotes matter -- the paths contain spaces)
New-Service -Name sql_exporter `
    -BinaryPathName '"C:\Program Files\sql_exporter\sql_exporter.exe" --config.file "C:\Program Files\sql_exporter\sql_exporter.yml"' `
    -StartupType Automatic `
    -DisplayName 'SQL Exporter for Prometheus'

# 3. Set the service logon account to the sqlmonitor proxy user, then start
Start-Service sql_exporter

# 4. Open the scrape port
New-NetFirewallRule -DisplayName 'SQL Exporter (TCP/9399)' `
    -Direction Inbound -Protocol TCP -LocalPort 9399 -Action Allow

# 5. Smoke test: first 20 lines of the metrics payload
(Invoke-WebRequest http://localhost:9399/metrics -UseBasicParsing).Content -split "`n" |
    Select-Object -First 20
```

Install — macOS (launchd)

Full step-by-step (including plist and log rotation) is in README-sql_exporter.md. Summary:

```bash
brew install prometheus grafana
sudo cp com.sql_exporter.plist /Library/LaunchDaemons/
sudo launchctl load /Library/LaunchDaemons/com.sql_exporter.plist
curl http://localhost:9399/metrics | head
```
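For reference, a com.sql_exporter.plist in the usual launchd shape might look like the sketch below; the binary and config paths are assumptions, and the authoritative copy (with log rotation) is the one in README-sql_exporter.md:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.sql_exporter</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/sql_exporter</string>
        <string>--config.file</string>
        <string>/usr/local/etc/sql_exporter.yml</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
```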

Wire into Prometheus

Add a job to prometheus.yml:

```yaml
scrape_configs:
  - job_name: sqlmonitor_mssql
    scrape_interval: 15s
    static_configs:
      - targets:
          - sqlnode-01.corp:9399
          - sqlnode-02.corp:9399
          - aghost-1a.corp:9399
          - aghost-1b.corp:9399
    relabel_configs:
      - source_labels: [__address__]
        regex: '([^:]+):.*'
        target_label: sql_instance
        replacement: '$1'
```

For AGs add sqlmonitor_ag with scrape_interval: 30s pointing at the same targets — the mssql_dba_aghealth collector is slightly heavier.
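A sketch of that second job, reusing the AG hosts from the example above (target names are the same illustrative ones, not real hosts):

```yaml
  - job_name: sqlmonitor_ag
    scrape_interval: 30s   # mssql_dba_aghealth is heavier, so scrape it less often
    static_configs:
      - targets:
          - aghost-1a.corp:9399
          - aghost-1b.corp:9399
```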

Wire into Grafana

  1. Add a Prometheus datasource named Prometheus (the dashboards query it by that exact name).
  2. Import sql_exporter/SQL-Exporter-Metrics-Dashboard-External.json — this is the Prometheus counterpart of the SQL-backed distributed dashboard.
  3. Optionally import the unified-alerting YAMLs from sql_exporter/alert-engine/ — they define the same CPU / memory / wait / blocking alerts the Python engine provides, but run inside Grafana Alerting instead.

Prometheus-backed dashboard pack

The repo ships a parallel set of Grafana dashboards under sql_exporter/Prometheus-Dashboards/ that mirror the SQL-backed dashboards in Grafana-Dashboards/ but source every panel from Prometheus metrics.

Phase 1 ships 12 dashboards covering the fleet's core signal set:

| UID | Title | Data panels |
| --- | --- | --- |
| prom_core_metrics_trend | Core Metrics - Trend | 9 |
| prom_wait_stats | Wait Stats | 4 |
| prom_disk_space | Disk Space | 5 |
| prom_ag_health_state | Ag Health State | 3 |
| prom_sql_agent_jobs | SQL Agent Jobs | 6 |
| prom_backup_history | Backup History | 6 |
| prom_xevent_trend | XEvent - Trend | 4 |
| prom_database_file_io_stats | Database File IO Stats | 12 |
| prom_dba_inventory | DBA Inventory | 6 (+8 deep-links) |
| prom_monitoring_live_all_servers | Monitoring - Live - All Servers | 15 (+6 deep-links) |
| prom_monitoring_live_distributed | Monitoring - Live - Distributed | 52 (+6 deep-links) |
| prom_monitoring_perfmon_quest | Monitoring - Perfmon Counters - Quest Softwares - Distributed | 51 (+4 deep-links) |

Panels that depend on the SQLMonitor central inventory database (alert history, AG-vs-nonAG backup split, LAMA config-change deltas, dm_os_memory_clerks snapshot, tempdb/log_space cache tables, sql_server_patching) render instead as markdown deep-link tiles. Each tile jumps back to the SQL-backed dashboard, so every source section remains reachable.

Regenerating the dashboards

Each dashboard is built from a small Python spec:

```bash
cd sql_exporter/Prometheus-Dashboards
python3 generate.py           # rebuild every *.json
python3 generate.py backup    # filter: rebuild only backup_history
python3 _tools/validate.py    # structural + expr sanity check
```

High-fidelity PromQL patterns used across the specs:

  • increase(metric[$__range]) — selective-duration deltas (File IO, Wait Stats).
  • @ end() offset $__range — prior-window comparison tables (day-over-day).
  • quantile_over_time($percentile_q, (expr)[$trend_window:]) — percentile trends.
  • topk($top_n, sum by (...) (...)) — XEvent / wait-type / memory-consumer trends.
  • time() - timestamp(up == 1) — data-collection-issue detection.
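To make two of those patterns concrete, here is how they assemble into full queries; the metric name mssql_wait_time_ms and label wait_type are placeholders, not names taken from the shipped collectors:

```promql
# Top 5 wait types by wait-time delta over the dashboard's selected range
topk(5, sum by (wait_type) (increase(mssql_wait_time_ms[$__range])))

# Data-collection-issue detection: seconds since the last successful scrape,
# flagged when it exceeds a minute
time() - timestamp(up == 1) > 60
```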

See the folder's README.md for the full spec-driven workflow and Grafana API bulk-import snippet.
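The bulk-import step can be sketched in Python against Grafana's POST /api/dashboards/db endpoint; the Grafana URL, token, and folder path below are placeholders, not values from the repo:

```python
import json
import urllib.request
from pathlib import Path

GRAFANA_URL = "http://localhost:3000"  # assumption: local Grafana
API_TOKEN = "REPLACE_ME"               # assumption: a service-account token

def build_import_payload(dashboard: dict) -> dict:
    """Wrap a dashboard JSON model in the body expected by
    Grafana's POST /api/dashboards/db endpoint."""
    model = dict(dashboard)
    model["id"] = None  # null id => create-or-overwrite by uid
    return {"dashboard": model, "overwrite": True}

def import_all(folder: str) -> None:
    """POST every *.json dashboard in `folder` to Grafana."""
    for path in sorted(Path(folder).glob("*.json")):
        payload = build_import_payload(json.loads(path.read_text()))
        req = urllib.request.Request(
            f"{GRAFANA_URL}/api/dashboards/db",
            data=json.dumps(payload).encode(),
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {API_TOKEN}",
            },
        )
        with urllib.request.urlopen(req) as resp:
            print(path.name, resp.status)

# import_all("sql_exporter/Prometheus-Dashboards")
```

Setting overwrite to true makes re-imports idempotent after a generate.py rebuild: Grafana matches on the dashboard uid and replaces the existing copy.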

Choosing between the two paths

| Consideration | SQL table path (classic) | Prometheus path |
| --- | --- | --- |
| Installation on monitored hosts | SQL Agent jobs only | Requires sql_exporter + windows_exporter services |
| Retention | You decide (SQL partitioned tables, months to years) | Prometheus TSDB (weeks), or Mimir/Thanos for long-term |
| Query shape | Ad-hoc T-SQL, great for forensic deep dives | PromQL, great for fleet aggregates and alerting |
| Historical dashboards | Fast (pre-aggregated tables) | Fast (per-scrape time series) |
| WhoIsActive / Blitz | Native | Cardinality spike; run on a separate scrape job |
| Alerting | Alerting/ Python engine | Grafana Alerting / Alertmanager |

A common production topology uses both: the SQL path for long retention + forensic tooling, and Prometheus for low-latency fleet dashboards and PagerDuty-grade alert routing.

Migrating from Perfmon file-collection to sql_exporter

Historically SQLMonitor captured host Perfmon counters into BLG files and parsed them server-side. That path is still available, but shops that already run Prometheus usually switch to windows_exporter + sql_exporter instead. The migration story (mapping old counters to Prometheus metric names) is tracked in Perfmon-SQL-Exporter-Migration-Progress.md.

PromQL quick-start

A tutorial tailored to the SQLMonitor metric set lives at _README-PromQL-Querying.md; it covers the common rate(), increase(), and topk() patterns for wait stats, CPU, IOPS, and AG health.