Why doesn't my prometheus relabel_config work?

When configuring Prometheus scrape_configs, you may use relabel_configs to filter your metrics or rewrite labels. Your config may look like this:

scrape_configs:
        - job_name: kubernetes-service-endpoints
          sample_limit: 10000
          kubernetes_sd_configs:
          - role: endpoints
          relabel_configs:
          - action: kee...

Load balance Unix sockets to UDP destinations

Nginx is capable of forwarding a Unix socket to UDP backend servers. This is quite handy for load balancing syslog traffic.

Example nginx configuration

load_module /usr/share/nginx/modules/ngx_stream_module.so;

stream {
    upstream syslog_servers {
        server 192.0.2.10:514;
        server 192.0.2.11:514;
        server 192.0.2.12:514;
    }
    server {
        listen unix:/run/nginx/log.sock udp;
        proxy_pass syslog_servers;
    }
}

Testing the connection

echo "Hello Syslog!" | socat - /run/nginx/log.sock...

Use systemd-run as an alternative for screen

You might use screen or tmux to run a temporary command on a server which continues to run after the SSH session is closed.

Consider systemd-run as an alternative. It turns any command into a transient systemd service unit:

# Run `openssl speed` as unit run-benchmark.service
$ sudo systemd-run --unit=run-benchmark openssl speed

# Query the current status
$ systemctl status run-benchmark.service
● run-benchmark.service - /usr/bin/openssl speed
   Loaded: loaded (/run/systemd/transient/run-benchmark.serv...

Networking restart on FreeBSD

If you try to restart networking, you may encounter the problem that your network connection gets shut down but does not start again. Here is the right way to restart networking on FreeBSD:

service netif restart && service routing restart

GitLab scheduled pipeline: Don't notify owner

The owner of a scheduled CI/CD pipeline in GitLab will always be notified if the pipeline fails.

Follow these steps if you don't want this:

  1. Create a Project Access Token with api scope
  2. Create the scheduled pipeline with that token:
    curl --request POST --header "PRIVATE-TOKEN: ${TOKEN}" \
      --form description="Daily pipeline check" \
      --form ref="master" \
      --form cron="0 10 * * *" \
      --form cron_timezone="UTC" \
      --form active="true" \
      "https://${GITLAB_URL}/api/v4/projects/${PROJECT_ID}/pip...
    

HowTo: Fix nginx not reloading with long gzip_types lines

When gzip_types directives in nginx contain many or very long MIME type entries, you might not be able to reload the service and get this error message instead:

nginx: [emerg] could not build the test_types_hash, you should increase test_types_hash_bucket_size: 64
nginx: configuration file /etc/nginx/nginx.conf test failed

Option 1: Use gzip_types *;

If you don't care which MIME types get gzip handling, just tell nginx that any MIME type should be gzipped.

Especially f...

Repair broken etcd node

If one etcd node is no longer a member of the remaining etcd cluster or fails to connect you need to remove it from the cluster and then add it again:

  1. Stop etcd on the broken node: sudo stop etcd
  2. Delete the data on the broken node: sudo rm -r /var/lib/etcd/data/*
  3. Delete the WAL data on the broken node: sudo rm -r /var/lib/etcd/wal/*
  4. Follow the instructions for etcd runtime-configuration, remove the broken node from the cluster, then re-add it again and update t...

Restoring old Postgres dumps with pg_restore v11 and higher

There is an issue when restoring a PostgreSQL dump created with pg_dump older than v11 using pg_restore v11 or newer:

pg_restore: [archiver (db)] could not execute query: ERROR:  schema "public" already exists
    Command was: CREATE SCHEMA public;

Convert

If you want to restore this dump, you should convert it to the text format first and comment out the CREATE SCHEMA public; statement. For further information see linked content.
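
The conversion described above could look roughly like this. This is a sketch only: the helper name and the file names are placeholders, and the sed expression assumes the statement sits alone on its own line in the text dump.

```shell
# Hypothetical helper: comment out the "CREATE SCHEMA public;" statement
# in a plain-text SQL dump read from stdin.
comment_out_create_schema() {
  sed 's/^CREATE SCHEMA public;$/-- &/'
}

# Usage (file and database names are placeholders):
# pg_restore -f dump.sql dump.custom     # convert the archive to a text script
# comment_out_create_schema < dump.sql > dump.fixed.sql
# psql -d "$DATABASE" -f dump.fixed.sql
```

All other lines of the dump pass through unchanged, so the result can be loaded with psql as usual.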

##...

Find unmaintained packages with apt-forktracer

If you use third party APT sources you might end up with unmaintained packages after removing the external source or performing a dist-upgrade. The reason for this is how external sources overwrite official package versions.

apt-forktracer helps you to identify such packages:

APT will not warn you when newer versions of official packages (point releases, security updates) will appear in the stable release. This means you may miss some important change.

Example output

This is the output of `apt...

Disable AWS Free Tier Usage Alerts

Ever felt annoyed by AWS Free Tier limit alert emails?

Just disable them:

Billing preferences -> Cost Management Preferences -> Receive Free Tier Usage Alerts

Redis Sentinel manual failover

Hint

You're not able to control which Redis replica will be chosen for the failover.

  1. Connect to your sentinel instance:

    redis-cli -p <SENTINEL-PORT>
    
  2. Have a look at the configured masters, current master and the available replicas

    INFO sentinel
    
    SENTINEL master <master name>
    SENTINEL get-master-addr-by-name <master name> # IP and port only
    
    SENTINEL slaves <master name>
    
  3. Force a failover

    SENTINEL failover <master name> 
    

...

Terragrunt/terraform: fork/exec argument list too long

When terragrunt is relaying information to input variables it's happening via environment variables. Depending on the size of the content of the variable it might exceed your OS limits. This is independent of your shell.
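
To judge whether a value is likely to hit the limit, you can compare its size with what the OS reports. This is a minimal sketch, assuming a POSIX shell. The helper name value_size_bytes is made up for this example, and the 131072-byte per-string figure is an assumption that holds for Linux with 4 KiB pages; other systems differ.

```shell
# Total space the OS allows for arguments plus environment on exec
getconf ARG_MAX

# Hypothetical helper: size in bytes of a value you plan to pass as an input
value_size_bytes() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

# Example: build a 200000-character value and measure it.
# Assumption: on Linux with 4 KiB pages a single environment string is
# capped at 131072 bytes, so a value of this size cannot be passed via env.
big_value=$(head -c 200000 /dev/zero | tr '\0' 'x')
value_size_bytes "$big_value"
```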

A possible workaround is to use a generated file to load the input instead of the env variable, e.g.

# WORKAROUND
# the variable my_huge_input cannot be loaded as part of the inputs
generate "dependencies" {
  path      = "dependencies.auto.tfvars"
  if_exists = "overwrite_terragrunt"
  contents = <<EOF
my_input    ...

HowTo: Clone and refresh all repos in a GitLab Group

If the project you're working on has, say, 39 repositories and counting in GitLab and you need all the repos checked out for some reason, here's how to do it.

Checking out all repos

  1. Create a personal access token for GitLab that has the API permissions. In your terminal, store this key in an env variable.
  2. For each group you want to check out:
    1. Create a new directory where you want all the checkouts to live.
    2. In GitLab, navigate to the Group's overview page so you can see the Group ID.
    3. In the directory you created...

HowTo: Get postgres shell in kubernetes

If your postgres database is only accessible from inside a kubernetes cluster, e.g. if it's configured in AWS RDS and not available to the public (as it should be!), here's how to open a psql shell inside Kubernetes and connect to the database. Make sure to replace the variables appropriately.

$ kubectl run postgresql-client \
  --image=postgres      \
  --namespace=$NAMESPACE \
  --stdin=true --tty=true \
  --rm=true                \
  --env="PGPASSWORD=$PASSWORD_FOR_POSTGRES" \
  --command -- \
  psql --host=$HOSTNAME_FOR_POSTG...

HowTo: Get kubernetes secrets in plaintext

Here's a one-liner to view base64 encoded secrets in kubernetes. Make sure you have jq installed.

$ kubectl get -n $NAMESPACE secret/$SECRET_NAME -o json| jq '.data | map_values(@base64d)'
{
  "database": "secret1",
  "endpoint": "secret2",
  "username": "secret3",
  "password": "secret4"
}
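
If jq is not available, a single key can be decoded with base64 alone. This is a sketch, assuming a base64 binary that accepts --decode. The helper name and the key name password are only for illustration.

```shell
# Hypothetical helper: decode one base64 value as stored under .data
decode_secret_value() {
  printf '%s' "$1" | base64 --decode
}

# Standalone example with a literal value
decode_secret_value 'c2VjcmV0NA=='   # prints: secret4

# With kubectl (the key name "password" is illustrative):
# kubectl get -n "$NAMESPACE" "secret/$SECRET_NAME" -o jsonpath='{.data.password}' | base64 --decode
```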

ACM certificate not showing up in CloudFront

Preface

Before you continue, ensure that you've created your certificate in the region us-east-1 (N. Virginia). Otherwise the certificate is not available for CloudFront.

The issue

At some point in time you may be confronted with the following issue:

  • you've requested an SSL certificate via ACM
  • the validation was successful
  • you try to add the freshly issued ACM certificate to a CloudFront configuration via AWS console
  • the certificate is not selectable from the dropdown in the distribution configuration

Fixing the is...

Don't use puppet `exec` type without `cwd` and `user` parameter

  1. Don't use exec without user parameter

    If you use exec without user parameter, the command will get executed as root. You mostly don't want this.

  2. There is a difference in the env variables of the exec if you run puppet manually or if the daemon runs.

  3. Never ever use exec without cwd parameter

    If you use exec without cwd parameter, the command gets executed in the cwd of your puppet run. This can cause problems if you run the puppet agent manually.

    Example:

    # exec resource:
    e...
    

Bolt: Run commands from a file

There's a simple way in bolt to run commands from a file without caring about BASH escaping:

# /home/user/foo.sh
echo "$(hostname -f): $(uptime)"
echo "${USER}"
echo "${SERVERLIST}" | bolt command run @foo.sh --run-as root --targets -

Use script run to run a ruby script:

#!/usr/bin/env ruby
# /home/user/bar.rb

puts 'Hello, world!'...

Delete unresponsive rabbitmq queue

In our monitoring, RabbitMQ queues like aliveness-test may show up as unresponsive, with a ping timeout after 10 seconds. The logfile will generally read like this:

operation queue.delete caused a channel exception not_found: failed to perform operation on queue 'example' in vhost '/' due to timeout

For the aliveness-test queue, you can use this command to delete it:

rabbitmqctl eval 'rabbit_amqqueue:internal_delete({resource,<<"/">>,queue,<<"aliveness-test">>}).'

This queue is only used for monitoring if RabbitMQ...

Pay attention to trailing slashes when using rsync

When you synchronize directories with rsync, you have to pay attention to whether the source path ends with a trailing /.

Example:

# without trailing slash
$ mkdir -p a/foo/bar/baz
$ mkdir b
$ rsync -a a b
$ find b
b
b/a
b/a/foo
b/a/foo/bar
b/a/foo/bar/baz

# with trailing slash
$ mkdir -p a/foo/bar/baz
$ mkdir b
$ rsync -a a/ b/
$ find b
b
b/foo
b/foo/bar
b/foo/bar/baz
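
In scripts it can help to normalize the source path so the behaviour never depends on how the caller typed it. This is a small sketch; the function name with_trailing_slash is made up for this example.

```shell
# Hypothetical helper: make sure the path ends with a trailing slash,
# so rsync copies the directory's contents rather than the directory itself.
with_trailing_slash() {
  printf '%s/\n' "${1%/}"
}

with_trailing_slash a     # prints: a/
with_trailing_slash a/    # prints: a/

# Usage: rsync -a "$(with_trailing_slash "$SRC")" "$DEST"
```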

Replacing exported resources with puppetdb queries

Instead of using Puppet exported resources you can use the puppetdb_query feature.

This can result in more complex code but has several benefits:

  • you can use more complex puppetdb queries to get the resources you want than with the limited filtering options of exported resources
  • because you receive a data object of the resources, you can use just the part of the information you need
  • you...

Barman recovery fails with missing history file

When restoring a barman PITR backup you may encounter this error:

Copying required WAL segments.
EXCEPTION: {'ret': 2, 'err': '/bin/sh: 1: cannot open /var/lib/barman/foopostgres/wals/00000007.history: No such file\n', 'out': ''}

The reason is that the Barman backup's xlog.db file contains a history file which is no longer present in the wals directory of your backup. The most likely reason is that someone deleted this file in the past. If you do not need this file for restoring your current backup (maybe because it's very old a...

Keepalived VRRP FAQ

How can I configure virtual IPs?

There are two parameters to set up virtual IPs in Keepalived:

virtual_ipaddress

Addresses defined here are included in the VRRP packets and are therefore limited in number, especially with IPv6.

Address families cannot be mixed here.

If this contains IPv6 addresses, Keepalived will use VRRP over IPv6.

The inclusion of the addresses in the VRRP packets is for troubleshooting reasons. See RFC5798 Section 5.2.9 and [RFC3768 Secti...

Terraform: Deploying code for lambda functions

If you're deploying code for your Lambda function via Terraform, this code is usually zipped and uploaded to Amazon S3 by Terraform. The ZIP file's hash is then stored in Terraform's state. However, we have observed that zipping files can create ZIP archives with different hashes on different machines. This means that if you're collaborating with colleagues, e.g. via git, each run of Terraform may see a different hash of the code's ZIP archive and try to replace the Lambda function.

Workaround

This workaround is for single file l...