The Prometheus monitoring system and time series database.

Prometheus

Visit prometheus.io for the full documentation, examples and guides.

Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts when specified conditions are observed.

The features that distinguish Prometheus from other metrics and monitoring systems are:

  • A multi-dimensional data model (time series defined by metric name and set of key/value dimensions)
  • PromQL, a powerful and flexible query language to leverage this dimensionality
  • No dependency on distributed storage; single server nodes are autonomous
  • An HTTP pull model for time series collection
  • Pushing time series is supported via an intermediary gateway for batch jobs
  • Targets are discovered via service discovery or static configuration
  • Multiple modes of graphing and dashboarding support
  • Support for hierarchical and horizontal federation

Architecture overview

Install

There are various ways of installing Prometheus.

Precompiled binaries

Precompiled binaries for released versions are available in the download section on prometheus.io. Using the latest production release binary is the recommended way of installing Prometheus. See the Installing chapter in the documentation for all the details.

Docker images

Docker images are available on Quay.io or Docker Hub.

You can launch a Prometheus container to try it out with:

$ docker run --name prometheus -d -p 127.0.0.1:9090:9090 prom/prometheus

Prometheus will now be reachable at http://localhost:9090/.

Building from source

To build Prometheus from source code, first ensure that you have a working Go environment with version 1.14 or greater installed. You also need Node.js and npm installed in order to build the frontend assets.

You can directly use the go tool to download and install the prometheus and promtool binaries into your GOPATH:

$ GO111MODULE=on go install github.com/prometheus/prometheus/cmd/...
$ prometheus --config.file=your_config.yml

However, when using go install to build Prometheus, Prometheus will expect to be able to read its web assets from local filesystem directories under web/ui/static and web/ui/templates. In order for these assets to be found, you will have to run Prometheus from the root of the cloned repository. Note also that these directories do not include the new experimental React UI unless it has been built explicitly using make assets or make build.

An example of the above configuration file can be found here.
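In case the linked example is unavailable, a minimal configuration along these lines is enough to try things out. This is a sketch (the scrape interval and target are placeholders), with Prometheus scraping its own /metrics endpoint:

```yaml
global:
  scrape_interval: 15s # how often to scrape targets by default

scrape_configs:
  # Scrape Prometheus itself on the port it serves on.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
```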

You can also clone the repository yourself and build using make build, which will compile in the web assets so that Prometheus can be run from anywhere:

$ mkdir -p $GOPATH/src/github.com/prometheus
$ cd $GOPATH/src/github.com/prometheus
$ git clone https://github.com/prometheus/prometheus.git
$ cd prometheus
$ make build
$ ./prometheus --config.file=your_config.yml

The Makefile provides several targets:

  • build: build the prometheus and promtool binaries (includes building and compiling in web assets)
  • test: run the tests
  • test-short: run the short tests
  • format: format the source code
  • vet: check the source code for common errors
  • assets: build the new experimental React UI

Building the Docker image

The make docker target is designed for use in our CI system. You can build a Docker image locally with the following commands:

$ make promu
$ promu crossbuild -p linux/amd64
$ make npm_licenses
$ make common-docker-amd64

NB: if you are on a Mac, you will need gnu-tar.

React UI Development

For more information on building, running, and developing on the new React-based UI, see the React app's README.md.

More information

Contributing

Refer to CONTRIBUTING.md

License

Apache License 2.0, see LICENSE.

Comments
  • TSDB data import tool for OpenMetrics format.

    Created a tool to import data formatted according to the Prometheus exposition format. The tool can be accessed via the TSDB CLI.

    closes prometheus/prometheus#535

    Signed-off-by: Dipack P Panjabi [email protected]

    (Port of https://github.com/prometheus/tsdb/pull/671)

    opened by dipack95 126
  • Add mechanism to perform bulk imports

    Currently the only way to bulk-import data is a hacky one involving client-side timestamps and scrapes with multiple samples per time series. We should offer an API for bulk import. This relies on https://github.com/prometheus/prometheus/issues/481.

    EDIT: It probably won't be an web-based API in Prometheus, but a command-line tool.

    kind/enhancement priority/P2 component/tsdb 
    opened by juliusv 112
  • Create a section ANNOTATIONS with user-defined payload and generalize RUNBOOK, DESCRIPTION, SUMMARY into fields therein.

    RUNBOOK was added in a hurry in #843 for an internal demo of one of our users, which didn't give it enough time to be fully discussed. The demo has been done, so we can reconsider this.

    I think we should revert this change, and remove RUNBOOK:

    • Our general policy is that if it can be done with labels, do it with labels
    • All notification methods in the alertmanager will need extra code to deal with this
    • In future, all alertmanager notification templates will need extra code to deal with this
    • In general, all user code touching the alertmanager will need extra code to deal with this
    • This presumes a certain workflow in that you have something called a "runbook" (and not any other name - playbook is also common) and that you have exactly one of them

    Runbooks are not a fundamental aspect of an alert, are not in use by all of our users and thus I don't believe they meet the bar for first-class support within prometheus. This is especially true considering that they don't add anything that isn't already possible with labels.

    opened by brian-brazil 102
  • Implement strategies to limit memory usage.

    Currently, Prometheus simply limits the chunks in memory to a fixed number.

    However, this number doesn't directly imply the total memory usage as many other things take memory as well.

    Prometheus could measure its own memory consumption and (optionally) evict chunks early if it needs too much memory.

    It's non-trivial to measure "actual" memory consumption in a platform independent way.
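A rough Go sketch of the self-measurement idea, using the runtime's own heap statistics; the limit and the eviction decision are hypothetical illustrations, not Prometheus code:

```go
package main

import (
	"fmt"
	"runtime"
)

// shouldEvictEarly reports whether the process's measured heap usage exceeds
// a configured soft limit. In the idea above, crossing the limit would
// trigger early chunk eviction. Note that HeapAlloc is a Go-runtime view of
// memory, not the OS-level RSS, which is part of why "actual" consumption is
// hard to measure portably.
func shouldEvictEarly(limitBytes uint64) bool {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc > limitBytes
}

func main() {
	// With an absurdly high 1 TiB limit, no eviction is needed.
	fmt.Println(shouldEvictEarly(1 << 40))
}
```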

    kind/enhancement 
    opened by beorn7 90
  • '@ <timestamp>' modifier

    This PR implements @ <timestamp> modifier as per this design doc.

    An example query:

    rate(process_cpu_seconds_total[1m]) 
      and
    topk(7, rate(process_cpu_seconds_total[1h] @ 1234))
    

    which ranks based on last 1h rate and w.r.t. unix timestamp 1234 but actually plots the 1m rate.

    Closes #7903

    This PR is to be followed up with an easier way to represent the start, end, range of a query in PromQL so that we could do @ <end>, metric[<range>] easily.

    opened by codesome 88
  • Port isolation from old TSDB PR

    The original PR was https://github.com/prometheus/tsdb/pull/306 .

    I tried to carefully adjust to the new world order, but please give this a very careful review, especially around iterator reuse (marked with a TODO).

    On the bright side, I definitely found and fixed a bug in txRing.

    prombench 
    opened by beorn7 78
  • 2.3.0 significant memory usage increase.

    Bug Report

    What did you do? Upgraded to 2.3.0

    What did you expect to see? General improvements.

    What did you see instead? Under which circumstances? Memory usage, possibly driven by queries, has increased considerably. The upgrade was at 09:27; the memory usage drops on the graph after that are from container restarts due to OOM.

    container_memory_usage_bytes


    Environment

    Prometheus in kubernetes 1.9

    • System information: Standard docker containers, on docker kubelet on linux.

    • Prometheus version: 2.3.0

    kind/bug 
    opened by tcolgate 77
  • Support for environment variable substitution in configuration file

    I think it would be a good idea to substitute environment variables in the configuration file.

    That could be done really easily by calling os.ExpandEnv on the configuration string when loading it.

    It would be much better to substitute environment variables only in configuration values. go-ini provides a valueMapper, but yaml.v2 doesn't have such a mechanism.

    opened by dopuskh3 72
  • React UI: Implement more sophisticated autocomplete

    It would be great to have more sophisticated expression field autocompletion in the new React UI.

    Currently it only autocompletes metric names, and only when the expression field doesn't contain any other sub-expressions yet.

    Things that would be nice to autocomplete:

    • metric names anywhere within an expression
    • label names
    • label values
    • function names
    • etc.

    For autocomplete functionality not to annoy users, it needs to be as performant, correct, and unobtrusive as possible. Grafana does many things right here already, but they also have a few really annoying bugs, like inserting closing parentheses in incorrect locations of an expression.

    Currently @slrtbtfs has indicated interest in building a language-server-based autocomplete implementation.

    component/ui priority/P3 kind/feature 
    opened by juliusv 69
  • Benchmark tsdb master

    DO NOT MERGE

    Benchmark 1

    Benchmark the following PRs against 2.11.1

    1. For queries: https://github.com/prometheus/tsdb/pull/642
    2. For compaction: https://github.com/prometheus/tsdb/pull/643 https://github.com/prometheus/tsdb/pull/654 https://github.com/prometheus/tsdb/pull/653
    3. Opening block: https://github.com/prometheus/tsdb/pull/645

    Results

    Did not test compaction from on-disk blocks. Could not really see the allocation optimizations in compaction; that might be because the savings are mostly in the number of allocations and not the size of allocations (size is what is shown in the dashboards). That would mean CPU is saved, but it didn't make a huge difference, apart from a slight increase in gap during compaction.

    The gains looked good in

    1. Allocations
    2. CPU (because of allocations?)
    3. RSS was also lower (up to 10 GiB lower! ~60 vs ~70).
    4. Also a small improvement in query inner_eval times.
    5. Compaction time (this should help the increase in compaction time that https://github.com/prometheus/tsdb/pull/627 is going to bring).
    6. System load.

    And bad in

    1. result_sort for the queries. Not sure why.

    Benchmark 2

    Benchmark https://github.com/prometheus/tsdb/pull/627 (which includes all the PRs from above Benchmark 1) against 2.11.1

    opened by codesome 65
  • M-map full chunks of Head from disk

    TL;DR description of the PR from @krasi-georgiev:


    When appending to the head and a chunk is full, it is flushed to disk and m-mapped (memory-mapped) to free up memory.

    Prometheus startup now happens in these stages:

    • Iterate the m-mapped chunks from disk and keep a map of series reference to its slice of m-mapped chunks.
    • Iterate the WAL as usual. Whenever we create a new series, look for its m-mapped chunks in the map created before and add them to that series.

    If a head chunk is corrupted, the corrupted one and all chunks after it are deleted, and the data after the corruption is recovered from the existing WAL, which means that a corruption in m-mapped files results in NO data loss.

    M-mapped chunks format: the main difference is that a chunk for m-mapping now also includes the series reference, because there is no index mapping series to chunks. Block chunks are accessed via the index, which includes the offsets of the chunks in the chunks file (for example, the chunks of a series have offsets 200, 500, etc. in the chunk files). For m-mapped chunks, the offsets are stored in memory and accessed from there. During WAL replay, these offsets are restored by iterating all m-mapped chunks as stated above, matching the series ID present in the chunk header with the offset of that chunk in that file.
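The series-reference-in-header idea can be sketched as a tiny encode/decode roundtrip in Go. The field choice, sizes, and ordering here are assumptions for illustration, not the actual on-disk chunk format:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// chunkHeader illustrates the point above: an m-mapped head chunk carries its
// own series reference, since there is no block index mapping series to
// chunk offsets. The layout is hypothetical.
type chunkHeader struct {
	SeriesRef uint64 // which series this chunk belongs to
	MinTime   int64  // first sample timestamp in the chunk
	MaxTime   int64  // last sample timestamp in the chunk
}

func (h chunkHeader) encode() []byte {
	var buf bytes.Buffer
	binary.Write(&buf, binary.BigEndian, h) // fixed-size struct, so this works
	return buf.Bytes()
}

func decodeChunkHeader(b []byte) (chunkHeader, error) {
	var h chunkHeader
	err := binary.Read(bytes.NewReader(b), binary.BigEndian, &h)
	return h, err
}

func main() {
	h := chunkHeader{SeriesRef: 42, MinTime: 1000, MaxTime: 2000}
	got, _ := decodeChunkHeader(h.encode())
	fmt.Printf("%+v\n", got)
}
```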

    Prombench results

    WAL Replay

    • 1h WAL replay: 30% less replay time (4m31s vs 3m36s)
    • 2h WAL replay: 20% less replay time (8m16s vs 7m)

    Memory During WAL Replay

    • High churn: 10-15% less RAM (32 GB vs 28 GB); 20% less RAM after compaction (34 GB vs 27 GB)
    • No churn: 20-30% less RAM (23 GB vs 18 GB); 40% less RAM after compaction (32.5 GB vs 20 GB)

    Screenshots are in this comment


    Prerequisite: https://github.com/prometheus/prometheus/pull/6830 (Merged)

    Closes https://github.com/prometheus/prometheus/issues/6377. More info in the linked issue and the doc in that issue and the doc inside that doc inside that issue :)

    • [x] Add tests
    • [x] Explore possible ways to get rid of new globals added in head.go
    • [x] Wait for https://github.com/prometheus/prometheus/pull/6830 to be merged
    • [x] Fix windows tests
    prombench 
    opened by codesome 64
  • histogram: Remove code replication via generics

    This is only for the sparsehistogram branch!

    I was hoping more of the iterator code could be deduplicated. Turns out, the meat is usually in the Next method, and it is actually rather different for each iterator type. I could still extract most of the At implementations, generify Bucket and the BucketIterator interface and do a few other cleanups.

    This is neither urgent nor critical, and I would not have done it if I had known the outcome wouldn't be more spectacular. Still, now it is done, and I think it helps a tiny bit to make the code more consistent.
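The generified Bucket/BucketIterator shape described above can be sketched like this; the constraint and fields are simplified for illustration and are not the actual sparsehistogram code:

```go
package main

import "fmt"

// Bucket is a generic histogram bucket whose count type varies between
// integer and float histograms, the kind of duplication generics can remove.
type Bucket[C int64 | float64] struct {
	Lower, Upper float64 // bucket boundaries
	Count        C       // observations in the bucket
}

// BucketIterator is the generified iterator interface; the type-specific
// logic stays in each Next implementation, as noted above.
type BucketIterator[C int64 | float64] interface {
	Next() bool
	At() Bucket[C]
}

// sliceIter is a trivial iterator over a pre-built slice of buckets.
type sliceIter[C int64 | float64] struct {
	buckets []Bucket[C]
	idx     int
}

func (it *sliceIter[C]) Next() bool    { it.idx++; return it.idx <= len(it.buckets) }
func (it *sliceIter[C]) At() Bucket[C] { return it.buckets[it.idx-1] }

func main() {
	var it BucketIterator[int64] = &sliceIter[int64]{buckets: []Bucket[int64]{{0, 1, 5}}}
	for it.Next() {
		fmt.Println(it.At().Count)
	}
}
```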

    opened by beorn7 2
  • RemoteWrite to aws with just role_arn doesn't work

    What did you do?

    Configured a remote write to aws

    remote_write:
      - url: "https://aps-workspaces.us-east-1.amazonaws.com/workspaces/wsID/api/v1/remote_write"
        sigv4:
          region: us-east-1
          role_arn:

    This doesn't seem to work

    What did you expect to see?

    No response

    What did you see instead? Under which circumstances?

    caller=main.go:1203 level=error msg="Failed to apply configuration" err="could not get SigV4 credentials: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"

    System information

    No response

    Prometheus version

    No response

    Prometheus configuration file

    No response

    Alertmanager version

    No response

    Alertmanager configuration file

    No response

    Logs

    No response

    opened by KavyaShree25 0
  • Prometheus in agent mode fails to send data to Thanos Receiver via remote_write way too often.

    What did you do?

    Running Prometheus in k8s with agent mode enabled, with a remote_write endpoint configured to receive metrics from Prometheus. Running Prometheus v2.36.2 with the below remote_write configuration.


    What did you expect to see?

    Prometheus to send data without any gap to remote_write endpoint.

    What did you see instead? Under which circumstances?

    Metrics are being dropped after a certain interval of time, with no log messages in Prometheus. But it recovers by itself after a short interval.

    System information

    No response

    Prometheus version

    prometheus v2.36.2
    

    Prometheus configuration file

    prometheus.yml: |-
        global:
          external_labels:
            monitor: prometheus
            replica: '${HOSTNAME}'
            pod: '${HOSTNAME}'
          scrape_interval: 15s
        remote_write:
          - url: 'http://<thanos-receive-endpoint>:19291/api/v1/receive'
            queue_config:
              capacity: 6000

              # Maximum number of shards, i.e. amount of concurrency.
              max_shards: 1500

              # Minimum number of shards, i.e. amount of concurrency.
              min_shards: 1

              # Maximum number of samples per send.
              max_samples_per_send: 2000

              # Maximum time a sample will wait in buffer.
              batch_send_deadline: 5s

              # Initial retry delay. Gets doubled for every retry.
              min_backoff: 30ms

              # Maximum retry delay.
              max_backoff: 5s
    

    Alertmanager version

    No response

    Alertmanager configuration file

    No response

    Logs

    No response

    opened by apoorva-marisomaradhya 0
  • Revisit making time zone configurable

    Proposal

    Use case. Why is this important?

    Prometheus currently logs timestamps in UTC, like ts=2022-09-20T09:00:37.982Z. There is no way to make the timestamp include a time zone offset, i.e. to replace Z with e.g. +01:00 to log in a specific local time, possibly set via the TZ variable, which follows the IANA Time Zone Database.

    This is important because some organizations with a widespread amount of microservices/applications require a consistent logging format with regard to the time zone. A predominant logging format on Unix is the syslog protocol, defined in https://www.rfc-editor.org/rfc/rfc5424.html. If organizations require logs to follow it, there is no way to achieve this in Prometheus.

    Considerations

    Both the FAQ and the previous issue https://github.com/prometheus/prometheus/issues/500 regarding this were closed without action. But this is a very valid feature request and is therefore brought up again.

    It is also understood that logging can be normalized in aggregation services when inserted into databases (such as OpenSearch or similar), but the question here is about the actual logs going to stdout/file directly, since that might still be a place where applications want to follow the syslog format, logging in a particular time zone with a time zone offset.

    Please do not close this as an invalid request as there are strong use cases in organizations wanting to follow the syslog specification.

    opened by thernstig 0
  • remote/read_handler: pool input to Marshal()

    Use a sync.Pool to reuse byte slices between calls to Marshal() in the remote read handler.
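The idea can be sketched as follows; the marshalling here is a stand-in for the remote read handler's Marshal() call, not the actual change:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool reuses buffers between marshalling calls so each request doesn't
// allocate a fresh byte slice, which is the point of the change above.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// marshalWithPool marshals into a pooled buffer and returns a private copy,
// since the buffer's memory is handed back to the pool afterwards.
func marshalWithPool(msg string) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	buf.WriteString(msg) // stand-in for marshalling the response
	out := append([]byte(nil), buf.Bytes()...)
	bufPool.Put(buf) // hand the buffer back for the next call
	return out
}

func main() {
	fmt.Println(string(marshalWithPool("remote read response")))
}
```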

    Fixes #11232.

    Signed-off-by: Giedrius Statkevičius [email protected]

    opened by GiedriusS 1
  • wrap api error on get series/labels on `returnAPIError` function

    Fix: https://github.com/prometheus/prometheus/issues/11355

    GetLabels and GetSeries currently return 422 for all errors. Instead, we should wrap the error with returnAPIError, as is done in the query API:

    See: https://github.com/prometheus/prometheus/blob/734772f82824db11344ea3c39a166449d0e7e468/web/api/v1/api.go#L416-L418

    opened by alanprot 0
Latest release: v2.37.1