postgresql/src/include/postmaster/bgworker_internals.h
Tom Lane b99b6b9d6c Fix race condition between shutdown and unstarted background workers.
If a database shutdown (smart or fast) is commanded between the time
some process decides to request a new background worker and the time
that the postmaster can launch that worker, then nothing happens
because the postmaster won't launch any bgworkers once it's exited
PM_RUN state.  This is fine ... unless the requesting process is
waiting for that worker to finish (or even for it to start); in that
case the requestor is stuck, and only manual intervention will get us
to the point of being able to shut down.

To fix, cancel pending requests for workers when the postmaster sends
shutdown (SIGTERM) signals, and similarly cancel any new requests that
arrive after that point.  (We can optimize things slightly by only
doing the cancellation for workers that have waiters.)  To fit within
the existing bgworker APIs, the "cancel" is made to look like the
worker was started and immediately stopped, causing deregistration of
the bgworker entry.  Waiting processes would have to deal with
premature worker exit anyway, so this should introduce no bugs that
weren't there before.  We do have a side effect that registration
records for restartable bgworkers might disappear when theoretically
they should have remained in place; but since we're shutting down,
that shouldn't matter.

Back-patch to v10.  There might be value in putting this into 9.6
as well, but the management of bgworkers is a bit different there
(notably see 8ff518699) and I'm not convinced it's worth the effort
to validate the patch for that branch.

Discussion: https://postgr.es/m/661570.1608673226@sss.pgh.pa.us
2020-12-24 17:00:43 -05:00

64 lines
2.1 KiB
C

/*--------------------------------------------------------------------
* bgworker_internals.h
* POSTGRES pluggable background workers internals
*
* Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* src/include/postmaster/bgworker_internals.h
*--------------------------------------------------------------------
*/
#ifndef BGWORKER_INTERNALS_H
#define BGWORKER_INTERNALS_H
#include "datatype/timestamp.h"
#include "lib/ilist.h"
#include "postmaster/bgworker.h"
/* GUC options */
/*
* Maximum possible value of parallel workers.
*/
#define MAX_PARALLEL_WORKER_LIMIT 1024
/*
* List of background workers, private to postmaster.
*
* A worker that requests a database connection during registration will have
* rw_backend set, and will be present in BackendList. Note: do not rely on
* rw_backend being non-NULL for shmem-connected workers!
*/
typedef struct RegisteredBgWorker
{
BackgroundWorker rw_worker; /* its registry entry */
struct bkend *rw_backend; /* its BackendList entry, or NULL */
pid_t rw_pid; /* 0 if not running */
int rw_child_slot;
TimestampTz rw_crashed_at; /* if not 0, time it last crashed */
int rw_shmem_slot;
bool rw_terminate;
slist_node rw_lnode; /* list link */
} RegisteredBgWorker;
extern slist_head BackgroundWorkerList;
extern Size BackgroundWorkerShmemSize(void);
extern void BackgroundWorkerShmemInit(void);
extern void BackgroundWorkerStateChange(bool allow_new_workers);
extern void ForgetBackgroundWorker(slist_mutable_iter *cur);
extern void ReportBackgroundWorkerPID(RegisteredBgWorker *);
extern void ReportBackgroundWorkerExit(slist_mutable_iter *cur);
extern void BackgroundWorkerStopNotifications(pid_t pid);
extern void ForgetUnstartedBackgroundWorkers(void);
extern void ResetBackgroundWorkerCrashTimes(void);
/* Function to start a background worker, called from postmaster.c */
extern void StartBackgroundWorker(void) pg_attribute_noreturn();
#ifdef EXEC_BACKEND
extern BackgroundWorker *BackgroundWorkerEntry(int slotno);
#endif
#endif /* BGWORKER_INTERNALS_H */