I was trying to run a simple repmgr standby clone. Pretty basic. But all I got was a “segmentation fault”, with no further log output.

I didn’t know what a segfault was, so I googled it:

A segmentation fault occurs when a program attempts to access a memory location that it is not allowed to access, or attempts to access a memory location in a way that is not allowed (for example, attempting to write to a read-only location, or to overwrite part of the operating system).

It hit me that, prior to running the standby clone, I had run rm -rf $PGDATA . So repmgr was trying to clone into a nonexistent directory. The simple fix was mkdir -p $PGDATA , and on subsequent runs, using rm -rf $PGDATA/* instead, which empties the directory without deleting it.
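The fix above can be wrapped in a small helper so the data directory always exists before a clone. This is only a sketch: prepare_pgdata is a hypothetical name, not part of repmgr or Postgres, and it assumes a find that supports -mindepth and -delete (GNU and BSD find both do). Unlike rm -rf $PGDATA/* it also removes dotfiles, and it refuses to run on an empty path.

```shell
# Hypothetical helper: make sure the data directory exists and is empty,
# without ever removing the directory itself.
prepare_pgdata() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "refusing to run with an empty path" >&2
        return 1
    fi
    # Create the directory (and any missing parents) if it is not there yet
    mkdir -p "$dir" || return 1
    # Delete everything inside it, dotfiles included, but keep the directory
    find "$dir" -mindepth 1 -delete
}
```

Calling prepare_pgdata "$PGDATA" before each clone attempt covers both the first run and any retries.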

In our production system we use repmgr to manage failover for our Postgres databases. However, in one incident a while ago, our primary (master) node went down due to a network problem, and repmgr didn’t perform the failover as expected. On reviewing the logs we saw that one of the standby (slave) instances recognized the master failure and initiated a vote, but the other nodes ignored it and no failover occurred.

Eventually we discovered the reason why: the “ignoring” nodes and the primary server were all on the same physical compute host (in an OpenStack cloud environment). This may be worth looking into!

I was reconfiguring the defaults in my Postgres configuration when I got this error:

ERROR: 4 is outside the valid range for parameter “work_mem” (64 .. 2097151)

I thought that was pretty weird, since the default value is 4MB and every guide and piece of documentation shows that you can even set it in kilobytes. So why was I getting this error?

Turns out that I had made the very simple mistake of not specifying the unit. A bare number for work_mem is read as kilobytes, so 4 meant 4kB, which is below the 64kB minimum shown in the error. I had to write 4MB instead of 4 . Changing this immediately resolved the issue.
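To see why the unit matters, here is a rough sketch of the arithmetic. work_mem is measured in kilobytes when no unit is given, and the lower bound in the error message is 64. The functions to_kb and check_work_mem are hypothetical helpers written for this post, and the parser is deliberately simplified: it only understands kB, MB and GB, which is less than Postgres itself accepts.

```shell
# Hypothetical helper: convert a memory setting to kilobytes.
# Simplified: only bare numbers and the kB, MB and GB suffixes are handled.
to_kb() {
    value="$1"
    case "$value" in
        *kB) num="${value%kB}"; factor=1 ;;
        *MB) num="${value%MB}"; factor=1024 ;;
        *GB) num="${value%GB}"; factor=1048576 ;;
        *)   num="$value";      factor=1 ;;   # bare number: treated as kB
    esac
    case "$num" in
        ''|*[!0-9]*) echo "not a recognised value: $value" >&2; return 1 ;;
    esac
    echo $(( num * factor ))
}

# Hypothetical check against the 64 kB minimum from the error message.
check_work_mem() {
    kb=$(to_kb "$1") || return 1
    [ "$kb" -ge 64 ]
}
```

With these, check_work_mem 4 fails because 4 is taken as 4kB, while check_work_mem 4MB succeeds because 4MB works out to 4096kB.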

How to keep your system reasonably safe and distro-agnostic.

Okay look, we all love Linux as our daily driver. We love to flaunt how great it is. After all, Linux powers the vast majority of all web servers in the world, so certainly it is robust…

Well, here’s the thing. Every self-respecting sysadmin has good backup policies for a reason…

The title describes the problem pretty well: I was trying to create an NFS share between 3 OpenStack VMs on a subnet, but the client machines could not see the share on the server.

The ports were opened as per the documentation, 2049 and 111 on both UDP and TCP. Telnet…

This is an annoyingly badly worded error message. It just means that the directory you are trying to install to isn’t accessible to ansible-galaxy. You should be able to fix it by using one (or more, as the case may be) of the following:

  1. Pass -p, --roles-path to specify where your roles should be installed,
  2. Edit roles_path in ansible.cfg to change the default roles location,
  3. chown -R you:you /your/role/directory
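Before reaching for chown, it can help to check which candidate directory is actually writable and hand that one to --roles-path. The snippet below is a minimal sketch: first_writable_dir is a hypothetical helper, and the candidate paths and the role name in the commented usage line are placeholders, not values from any real setup.

```shell
# Hypothetical helper: print the first existing, writable directory
# among the arguments, or fail if none qualifies.
first_writable_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example usage (paths and role name are placeholders):
# roles_dir=$(first_writable_dir /etc/ansible/roles "$HOME/.ansible/roles") &&
#     ansible-galaxy install --roles-path "$roles_dir" some_namespace.some_role
```

If the function finds nothing, that is the cue to create a directory you own or to fix the ownership as in option 3.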

Kobi Rosenstein

DevOps DBA. This blog chronicles my “gotcha” moments: each post contains an answer I would have liked to find when searching for those pesky errors I get.
