After closedir() dirent->d_name is not valid anymore. As there alerady are a
few places relying on the limited lifetime of pg_waldump, do so here as well,
and just pg_strdup() the string.
The bug was introduced in fc49e24fa6.
Found by UBSan, run locally.
Backpatch: 11-, like fc49e24fa6 itself.
When opening a WAL file smaller than XLOG_BLCKSZ (e.g. 0 bytes long) while
determining the wal_segment_size, pg_waldump checked errno, despite errno not
being set by the short read. Resulting in a bogus error message.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220214.181847.775024684568733277.horikyota.ntt@gmail.com
Backpatch: 11-, the bug was introducedin fc49e24fa
This set of commits has some bugs with known fixes, but at this late
stage in the release cycle it seems best to revert and resubmit next
time, along with some new automated test coverage for this whole area.
Commits reverted:
dc88460c: Doc: Review for "Optionally prefetch referenced data in recovery."
1d257577: Optionally prefetch referenced data in recovery.
f003d9f8: Add circular WAL decoding buffer.
323cbe7c: Remove read_page callback from XLogReader.
Remove the new GUC group WAL_RECOVERY recently added by a55a9847, as the
corresponding section of config.sgml is now reverted.
Discussion: https://postgr.es/m/CAOuzzgrn7iKnFRsB4MHp3UisEQAGgZMbk_ViTN4HV4-Ksq8zCg%40mail.gmail.com
Teach xlogreader.c to decode its output into a circular buffer, to
support optimizations based on looking ahead.
* XLogReadRecord() works as before, consuming records one by one, and
allowing them to be examined via the traditional XLogRecGetXXX()
macros.
* An alternative new interface XLogNextRecord() is added that returns
pointers to DecodedXLogRecord structs that can be examined directly.
* XLogReadAhead() provides a second cursor that lets you see
further ahead, as long as data is available and there is enough space
in the decoding buffer. This returns DecodedXLogRecord pointers to the
caller, but also adds them to a queue of records that will later be
consumed by XLogNextRecord()/XLogReadRecord().
The buffer's size is controlled with wal_decode_buffer_size. The buffer
could potentially be placed into shared memory, for future projects.
Large records that don't fit in the circular buffer are called
"oversized" and allocated separately with palloc().
Discussion: https://postgr.es/m/CA+hUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq=AovOddfHpA@mail.gmail.com
Previously, the XLogReader module would fetch new input data using a
callback function. Redesign the interface so that it tells the caller
to insert more data with a special return value instead. This API suits
later patches for prefetching, encryption and maybe other future
projects that would otherwise require continually extending the callback
interface.
As incidental cleanup work, move global variables readOff, readLen and
readSegNo inside XlogReaderState.
Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Author: Heikki Linnakangas <hlinnaka@iki.fi> (parts of earlier version)
Reviewed-by: Antonin Houska <ah@cybertec.at>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Takashi Menjo <takashi.menjo@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20190418.210257.43726183.horiguchi.kyotaro%40lab.ntt.co.jp
pg_waldump --stats=record identifies a record by a combination
of the RmgrId and the four bits of the xl_info field of the record.
But XACT records use the first bit of those four bits for an optional
flag variable, and the following three bits for the opcode to
identify a record. So previously the same type of XACT record
could have different four bits (three bits are the same but the
first one bit is different), and which could cause
pg_waldump --stats=record to show two lines of per-record statistics
for the same XACT record. This is a bug.
This commit changes pg_waldump --stats=record so that it processes
only XACT record differently, i.e., filters the opcode out of xl_info
and uses a combination of the RmgrId and those three bits as
the identifier of a record, only for XACT record. For other records,
the four bits of the xl_info field are still used.
Back-patch to all supported branches.
Author: Kyotaro Horiguchi
Reviewed-by: Shinya Kato, Fujii Masao
Discussion: https://postgr.es/m/2020100913412132258847@highgo.ca
Reuse cautionary language from src/test/ssl/README in
src/test/kerberos/README. SLRUs have had access to six-character
segments names since commit 73c986adde,
and recovery stopped calling HeapTupleHeaderAdvanceLatestRemovedXid() in
commit 558a9165e0. The other corrections
are more self-evident.
* Have both physical and logical walsender share a 'xlogreader' state
struct for tracking state. This replaces the existing globals sendSeg
and sendCxt.
* Change WALRead not to receive XLogReaderState->seg and ->segcxt as
separate arguments anymore; just use the ones from 'state'. This is
made possible by the above change.
* have the XLogReader segment_open contract require the callbacks to
install the file descriptor in the state struct themselves instead of
returning it. xlogreader was already ignoring any possible failed
return from the callbacks, relying solely on them never returning.
(This point is not altogether excellent, as it means the callbacks
have to know more of XLogReaderState; but to really improve on that
we would have to pass back error info from the callbacks to
xlogreader. And the complexity would not be saved but instead just
transferred to the callers of WALRead, which would have to learn how
to throw errors from the open_segment callback in addition of, as
currently, from pg_pread.)
* segment_open no longer receives the 'segcxt' as a separate argument,
since it's part of the XLogReaderState argument.
Per comments from Kyotaro Horiguchi.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200511203336.GA9913@alvherre.pgsql
Code review for 0dc8ead463, prompted by a bug closed by 91c40548d5.
XLogReader's system for opening and closing segments had gotten too
complicated, with callbacks being passed at both the XLogReaderAllocate
level (read_page) as well as at the WALRead level (segment_open). This
was confusing and hard to follow, so restructure things so that these
callbacks are passed together at XLogReaderAllocate time, and add
another callback to the set (segment_close) to make it a coherent whole.
Also, ensure XLogReaderState is an argument to all the callbacks, so
that they can grab at the ->private data if necessary.
Document the whole arrangement more clearly.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200422175754.GA19858@alvherre.pgsql
The primary motivation for this change is that it will be used by the
upcoming patch to add backup manifests, but it also seems to have some
potential more general use.
Andres Freund and Robert Haas
Discussion: http://postgr.es/m/20200330020814.nspra4mvby42yoa4@alap3.anarazel.de
The signature of XLogReadRecord() required the caller to pass the starting
WAL position as argument, or InvalidXLogRecPtr to continue reading at the
end of previous record. That's slightly awkward to the callers, as most
of them don't want to randomly jump around in the WAL stream, but start
reading at one position and then read everything from that point onwards.
Remove the 'RecPtr' argument and add a new function XLogBeginRead() to
specify the starting position instead. That's more convenient for the
callers. Also, xlogreader holds state that is reset when you change the
starting position, so having a separate function for doing that feels like
a more natural fit.
This changes XLogFindNextRecord() function so that it doesn't reset the
xlogreader's state to what it was before the call anymore. Instead, it
positions the xlogreader to the found record, like XLogBeginRead().
Reviewed-by: Kyotaro Horiguchi, Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/5382a7a3-debe-be31-c860-cb810c08f366%40iki.fi
Since d6c55de1, src/port/snprintf.c is able to use %m instead of
strerror(). A couple of utilities in src/bin/ have already done the
switch, and do it now for pg_waldump as this reduces the workload for
translators.
Note that more could be done, particularly with pgbench. Thanks to
Kyotaro Horiguchi for the discussion.
Discussion: https://postgr.es/m/20191129065115.GM2505@paquier.xyz
XLogReader, walsender and pg_waldump all had their own routines to read
data from WAL files to memory, with slightly different approaches
according to the particular conditions of each environment. There's a
lot of commonality, so we can refactor that into a single routine
WALRead in XLogReader, and move the differences to a separate (simpler)
callback that just opens the next WAL-segment. This results in a
clearer (ahem) code flow.
The error reporting needs are covered by filling in a new error-info
struct, WALReadError, and it's the caller's responsibility to act on it.
The backend has WALReadRaiseError() to do so.
We no longer ever need to seek in this interface; switch to using
pg_pread().
Author: Antonin Houska, with contributions from Álvaro Herrera
Reviewed-by: Michaël Paquier, Kyotaro Horiguchi
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
There's plenty places in frontend code that could benefit from a
string buffer implementation. Some because it yields simpler and
faster code, and some others because of the desire to share code
between backend and frontend.
While there is a string buffer implementation available to frontend
code, libpq's PQExpBuffer, it is clunkier than stringinfo, it
introduces a libpq dependency, doesn't allow for sharing between
frontend and backend code, and has a higher API/ABI stability
requirement due to being exposed via libpq.
Therefore it seems best to just making StringInfo being usable by
frontend code. There's not much to do for that, except for rewriting
two subsequent elog/ereport calls into others types of error
reporting, and deciding on a maximum string length.
For the maximum string size I decided to privately define MaxAllocSize
to the same value as used in the backend. It seems likely that we'll
want to reconsider this for both backend and frontend code in the not
too far away future.
For now I've left stringinfo.h in lib/, rather than common/, to reduce
the likelihood of unnecessary breakage. We could alternatively decide
to provide a redirecting stringinfo.h in lib/, or just not provide
compatibility.
Author: Andres Freund
Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson
Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de
When maintaining or merging patches, one of the most common sources
for conflicts are the list of objects in makefiles. Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.
By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.
Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
The additional newline seems to have accidentally been introduced in
2c03216d83, in 9.5. The newline is only issued when an FPW is
present for the block reference.
While there could be an argument that removing the newlines in the
back branches could cause a problem for somebody parsing the
pg_waldump output, the likelihood of that seems small enough. It seems
at least equally likely that the randomness of when newlines are
issued causes problems.
Author: Andres Freund
Discussion: https://postgr.es/m/20191029233341.4gnyau7e5v2lh5sc@alap3.anarazel.de
Backpatch: 9.5, like 2c03216d83.
This would be all right, maybe, if it didn't also match a file that
definitely should not be ignored. We don't add rmgrs so often that
manual maintenance of this file list is impractical, so just write
out the list.
(I find the equivalent wildcard use in the Makefile pretty lazy and
unsafe as well, but will leave that alone until it actually causes a
problem.)
Per bug #16042 from Denis Stuchalin.
Discussion: https://postgr.es/m/16042-c174ee692ac21cbd@postgresql.org