2022-09-04 00:33:31 -04:00
%top{
2011-01-14 10:30:33 -05:00
/*-------------------------------------------------------------------------
 *
 * repl_scanner.l
 *    a lexical scanner for the replication commands
 *
2024-01-03 20:49:05 -05:00
 * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
2011-01-14 10:30:33 -05:00
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *    src/backend/replication/repl_scanner.l
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

Split up guc.c for better build speed and ease of maintenance.

guc.c has grown to be one of our largest .c files, making it
a bottleneck for compilation. It's also acquired a bunch of
knowledge that'd be better kept elsewhere, because of our not
very good habit of putting variable-specific check hooks here.
Hence, split it up along these lines:

* guc.c itself retains just the core GUC housekeeping mechanisms.
* New file guc_funcs.c contains the SET/SHOW interfaces and some
  SQL-accessible functions for GUC manipulation.
* New file guc_tables.c contains the data arrays that define the
  built-in GUC variables, along with some already-exported constant
  tables.
* GUC check/assign/show hook functions are moved to the variable's
  home module, whenever that's clearly identifiable. A few hard-
  to-classify hooks ended up in commands/variable.c, which was
  already a home for miscellaneous GUC hook functions.

To avoid cluttering a lot more header files with #include "guc.h",
I also invented a new header file utils/guc_hooks.h and put all
the GUC hook functions' declarations there, regardless of their
originating module. That allowed removal of #include "guc.h"
from some existing headers. The fallout from that (hopefully
all caught here) demonstrates clearly why such inclusions are
best minimized: there are a lot of files that, for example,
were getting array.h at two or more levels of remove, despite
not having any connection at all to GUCs in themselves.

There is some very minor code beautification here, such as
renaming a couple of inconsistently-named hook functions
and improving some comments. But mostly this just moves
code from point A to point B and deals with the ensuing
needs for #include adjustments and exporting a few functions
that previously weren't exported.

Patch by me, per a suggestion from Andres Freund; thanks also
to Michael Paquier for the idea to invent guc_funcs.c.

Discussion: https://postgr.es/m/587607.1662836699@sss.pgh.pa.us
2022-09-13 11:05:07 -04:00
#include "nodes/parsenodes.h"
Allow a streaming replication standby to follow a timeline switch.

Before this patch, streaming replication would refuse to start replicating
if the timeline in the primary doesn't exactly match the standby. The
situation where it doesn't match is when you have a master, and two
standbys, and you promote one of the standbys to become new master.
Promoting bumps up the timeline ID, and after that bump, the other standby
would refuse to continue.

There's significantly more timeline related logic in streaming replication
now. First of all, when a standby connects to primary, it will ask the
primary for any timeline history files that are missing from the standby.
The missing files are sent using a new replication command TIMELINE_HISTORY,
and stored in standby's pg_xlog directory. Using the timeline history files,
the standby can follow the latest timeline present in the primary
(recovery_target_timeline='latest'), just as it can follow new timelines
appearing in an archive directory.

START_REPLICATION now takes a TIMELINE parameter, to specify exactly which
timeline to stream WAL from. This allows the standby to request the primary
to send over WAL that precedes the promotion. The replication protocol is
changed slightly (in a backwards-compatible way although there's little hope
of streaming replication working across major versions anyway), to allow
replication to stop when the end of the timeline is reached, putting the
walsender back into accepting a replication command.

Many thanks to Amit Kapila for testing and reviewing various versions of
this patch.
2012-12-13 12:00:00 -05:00
#include "utils/builtins.h"
2014-01-31 22:45:17 -05:00
#include "parser/scansup.h"
2012-12-13 12:00:00 -05:00

2022-09-04 00:33:31 -04:00
/*
 * NB: include repl_gram.h only AFTER including walsender_private.h, because
 * walsender_private includes headers that define XLogRecPtr.
 */
#include "replication/walsender_private.h"
#include "repl_gram.h"
}

%{
2011-01-14 10:30:33 -05:00
/* Avoid exit() on fatal scanner errors (a bit ugly -- see yy_fatal_error) */
#undef fprintf
Improve handling of ereport(ERROR) and elog(ERROR).

In commit 71450d7fd6c7cf7b3e38ac56e363bff6a681973c, we added code to inform
suitably-intelligent compilers that ereport() doesn't return if the elevel
is ERROR or higher. This patch extends that to elog(), and also fixes a
double-evaluation hazard that the previous commit created in ereport(),
as well as reducing the emitted code size.

The elog() improvement requires the compiler to support __VA_ARGS__, which
should be available in just about anything nowadays since it's required by
C99. But our minimum language baseline is still C89, so add a configure
test for that.

The previous commit assumed that ereport's elevel could be evaluated twice,
which isn't terribly safe --- there are already counterexamples in xlog.c.
On compilers that have __builtin_constant_p, we can use that to protect the
second test, since there's no possible optimization gain if the compiler
doesn't know the value of elevel. Otherwise, use a local variable inside
the macros to prevent double evaluation. The local-variable solution is
inferior because (a) it leads to useless code being emitted when elevel
isn't constant, and (b) it increases the optimization level needed for the
compiler to recognize that subsequent code is unreachable. But it seems
better than not teaching non-gcc compilers about unreachability at all.

Lastly, if the compiler has __builtin_unreachable(), we can use that
instead of abort(), resulting in a noticeable code savings since no
function call is actually emitted. However, it seems wise to do this only
in non-assert builds. In an assert build, continue to use abort(), so that
the behavior will be predictable and debuggable if the "impossible"
happens.

These changes involve making the ereport and elog macros emit do-while
statement blocks not just expressions, which forces small changes in
a few call sites.

Andres Freund, Tom Lane, Heikki Linnakangas
2013-01-13 18:39:20 -05:00
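The commit message above contrasts two ways of writing a reporting macro: one that evaluates its level argument twice, and one that captures it once in a local variable. The following is a minimal C sketch of that contrast only. The names `LVL_WARNING`, `LVL_ERROR`, `log_report`, `REPORT_UNSAFE`, `REPORT_SAFE`, and `counting_level` are hypothetical and do not reproduce PostgreSQL's actual ereport/elog definitions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical severity levels, for illustration only. */
#define LVL_WARNING 1
#define LVL_ERROR   2

/* Hypothetical sink for reports. */
static void
log_report(int level, const char *msg)
{
    fprintf(stderr, "level %d: %s\n", level, msg);
}

/* Evaluates its elevel argument twice: once to log, once to test. */
#define REPORT_UNSAFE(elevel, msg) \
    do { \
        log_report((elevel), (msg)); \
        if ((elevel) >= LVL_ERROR) \
            abort(); \
    } while (0)

/* Captures elevel in a local variable, so it is evaluated exactly once. */
#define REPORT_SAFE(elevel, msg) \
    do { \
        const int elevel_ = (elevel); \
        log_report(elevel_, (msg)); \
        if (elevel_ >= LVL_ERROR) \
            abort(); \
    } while (0)

/* Helper with a visible side effect: counts how often it is called. */
static int
counting_level(int *calls)
{
    (*calls)++;
    return LVL_WARNING;
}
```

Passing `counting_level(&calls)` as the level makes the difference observable: the counter advances once under `REPORT_SAFE` and twice under `REPORT_UNSAFE`. Both macros expand to do-while blocks rather than expressions, which is the shape the commit message says forced small call-site changes.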
#define fprintf(file, fmt, msg) fprintf_to_ereport(fmt, msg)

static void
fprintf_to_ereport(const char *fmt, const char *msg)
{
    ereport(ERROR, (errmsg_internal("%s", msg)));
}
2011-01-14 10:30:33 -05:00

/* Handle to the buffer that the lexer uses internally */
static YY_BUFFER_STATE scanbufhandle;

Fix limitations on what SQL commands can be issued to a walsender.

In logical replication mode, a WalSender is supposed to be able
to execute any regular SQL command, as well as the special
replication commands. Poor design of the replication-command
parser caused it to fail in various cases, notably:

* semicolons embedded in a command, or multiple SQL commands
  sent in a single message;
* dollar-quoted literals containing odd numbers of single
  or double quote marks;
* commands starting with a comment.

The basic problem here is that we're trying to run repl_scanner.l
across the entire input string even when it's not a replication
command. Since repl_scanner.l does not understand all of the
token types known to the core lexer, this is doomed to have
failure modes.

We certainly don't want to make repl_scanner.l as big as scan.l,
so instead rejigger stuff so that we only lex the first token of
a non-replication command. That will usually look like an IDENT
to repl_scanner.l, though a comment would end up getting reported
as a '-' or '/' single-character token. If the token is a replication
command keyword, we push it back and proceed normally with repl_gram.y
parsing. Otherwise, we can drop out of exec_replication_command()
without examining the rest of the string.

(It's still theoretically possible for repl_scanner.l to fail on
the first token; but that could only happen if it's an unterminated
single- or double-quoted string, in which case you'd have gotten
largely the same error from the core lexer too.)

In this way, repl_gram.y isn't involved at all in handling general
SQL commands, so we can get rid of the SQLCmd node type. (In
the back branches, we can't remove it because renumbering enum
NodeTag would be an ABI break; so just leave it sit there unused.)

I failed to resist the temptation to clean up some other sloppy
coding in repl_scanner.l while at it. The only externally-visible
behavior change from that is it now accepts \r and \f as whitespace,
same as the core lexer.

Per bug #17379 from Greg Rychlewski. Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
2022-01-24 15:33:34 -05:00
/* Pushed-back token (we only handle one) */
static int repl_pushed_back_token;

/* Work area for collecting literals */
2011-01-14 10:30:33 -05:00
static StringInfoData litbuf;

static void startlit(void);
static char *litbufdup(void);
static void addlit(char *ytext, int yleng);
static void addlitchar(unsigned char ychar);

2017-08-10 23:33:47 -04:00
/* LCOV_EXCL_START */

2011-01-14 10:30:33 -05:00
%}

%option 8bit
%option never-interactive
%option nodefault
%option noinput
%option nounput
%option noyywrap
%option warn
%option prefix="replication_yy"

2022-01-24 15:33:34 -05:00
/*
 * Exclusive states:
 *  <xd> delimited identifiers (double-quoted identifiers)
 *  <xq> standard single-quoted strings
 */
%x xd
%x xq

Handle \v as a whitespace character in parsers

This commit comes as a continuation of the discussion that has led to
d522b05, as \v was handled inconsistently when parsing array values or
anything going through the parsers, and changing a parser behavior in
stable branches is a scary thing to do. The parsing of array values now
uses the more central scanner_isspace() and array_isspace() is removed.

As pointed out by Peter Eisentraut, fix a confusing reference to
horizontal space in the parsers with the term "horiz_space". \f was
included in this set since 3cfdd8f from 2000, but it is not horizontal.
"horiz_space" is renamed to "non_newline_space", to refer to all
whitespace characters except newlines.

The changes impact the parsers for the backend, psql, seg, cube, ecpg
and replication commands. Note that JSON should not escape \v, as per
RFC 7159, so these are not touched.

Reviewed-by: Peter Eisentraut, Tom Lane
Discussion: https://postgr.es/m/ZJKcjNwWHHvw9ksQ@paquier.xyz
2023-07-05 19:16:24 -04:00
space [ \t\n\r\f\v]
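The `space` class above accepts six characters: space, tab, newline, carriage return, form feed, and vertical tab. As a rough illustration, the same membership test can be written in plain C. The helper names `is_scanner_space` and `skip_space` are hypothetical; this is a sketch of the character class, not PostgreSQL's `scanner_isspace()` itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the flex class [ \t\n\r\f\v]. */
static bool
is_scanner_space(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\n' ||
           ch == '\r' || ch == '\f' || ch == '\v';
}

/* Return the offset of the first non-whitespace character in s. */
static size_t
skip_space(const char *s)
{
    size_t i = 0;

    assert(s != NULL);
    /* The terminating '\0' is not whitespace, so the loop always stops. */
    while (is_scanner_space(s[i]))
        i++;
    return i;
}
```

For example, `skip_space(" \t\v\f\r\nSHOW")` skips all six leading characters and returns the offset of `S`.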
2022-01-24 15:33:34 -05:00

quote '
quotestop {quote}
2011-01-14 10:30:33 -05:00

/* Extended quote
 * xqdouble implements embedded quote, ''''
 */
xqstart {quote}
xqdouble {quote}{quote}
xqinside [^']+

2014-01-31 22:45:17 -05:00
/* Double quote
 * Allows embedded spaces and other special characters into identifiers.
 */
dquote \"
xdstart {dquote}
xdstop {dquote}
xddouble {dquote}{dquote}
xdinside [^"]+
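The `xqdouble` and `xddouble` patterns above describe the usual SQL convention in which a doubled quote character inside a quoted literal stands for one literal quote. A minimal standalone C sketch of that convention follows. The function `unquote` and its return convention are hypothetical; the scanner itself collects literals through `startlit()`, `addlit()`, and `addlitchar()`, whose bodies are not shown in this part of the file.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy the body of a quoted literal into dst, collapsing each doubled
 * quote character into a single one.  src must point just past the
 * opening quote.  Returns the number of source characters consumed,
 * including the closing quote, or -1 if the literal is unterminated or
 * dst is too small.
 */
static long
unquote(const char *src, char quote, char *dst, size_t dstsize)
{
    size_t in = 0;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (;;)
    {
        char c = src[in];

        if (c == '\0')
            return -1;              /* unterminated literal */
        if (c == quote)
        {
            if (src[in + 1] != quote)
            {
                /* A lone quote closes the literal. */
                if (out >= dstsize)
                    return -1;
                dst[out] = '\0';
                return (long) (in + 1);
            }
            in += 2;                /* doubled quote: emit one copy */
        }
        else
            in++;
        if (out + 1 >= dstsize)
            return -1;              /* keep room for the terminator */
        dst[out++] = c;
    }
}
```

Called with `quote` set to `'\''`, the input `it''s'` yields the text `it's`; called with `'"'`, the same logic handles delimited identifiers.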
2022-01-24 15:33:34 -05:00
digit [0-9]
hexdigit [0-9A-Fa-f]
2011-01-14 10:30:33 -05:00

2014-01-31 22:45:17 -05:00
ident_start [A-Za-z\200-\377_]
ident_cont [A-Za-z\200-\377_0-9\$]

identifier {ident_start}{ident_cont}*

2011-01-14 10:30:33 -05:00
%%

2022-01-24 15:33:34 -05:00
%{
    /* This code is inserted at the start of replication_yylex() */

    /* If we have a pushed-back token, return that. */
    if (repl_pushed_back_token)
    {
        int result = repl_pushed_back_token;

        repl_pushed_back_token = 0;
        return result;
    }
%}
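The block above is the consumer side of a one-token pushback: if a token was pushed back, the next call to the lexer returns it and clears the slot. The following standalone C sketch shows both sides of that pattern under simplifying assumptions. The `Lexer` struct, the token codes, and the functions `lexer_next` and `lexer_push_back` are hypothetical; in particular, the input here is a pre-tokenized array rather than a flex buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; 0 is reserved to mean "nothing pushed back". */
enum { TOK_EOF = 0, TOK_IDENT = 258, TOK_KEYWORD = 259 };

typedef struct Lexer
{
    const int *tokens;      /* pre-tokenized input, for illustration only */
    size_t     ntokens;
    size_t     pos;
    int        pushed_back; /* one-slot pushback buffer, 0 when empty */
} Lexer;

/* Return the pushed-back token if there is one, else the next input token. */
static int
lexer_next(Lexer *lx)
{
    if (lx->pushed_back)
    {
        int result = lx->pushed_back;

        lx->pushed_back = 0;
        return result;
    }
    if (lx->pos >= lx->ntokens)
        return TOK_EOF;
    return lx->tokens[lx->pos++];
}

/* Push one token back; only a single slot is available. */
static void
lexer_push_back(Lexer *lx, int token)
{
    assert(token != 0);             /* 0 cannot be told apart from "empty" */
    assert(lx->pushed_back == 0);   /* the slot must be free */
    lx->pushed_back = token;
}
```

A caller can peek at the first token, decide whether it is a command keyword, and push it back so that a subsequent parse sees the full token stream from the beginning. Using 0 as the empty marker only works if 0 is never pushed back as a real token, which the sketch enforces with an assertion.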

2011-01-14 10:30:33 -05:00
BASE_BACKUP { return K_BASE_BACKUP; }
IDENTIFY_SYSTEM { return K_IDENTIFY_SYSTEM; }
2021-10-24 18:40:42 -04:00
READ_REPLICATION_SLOT { return K_READ_REPLICATION_SLOT; }
2017-01-24 16:59:18 -05:00
SHOW { return K_SHOW; }
2012-12-13 12:00:00 -05:00
TIMELINE { return K_TIMELINE; }
2011-01-14 10:30:33 -05:00
START_REPLICATION { return K_START_REPLICATION; }
2014-01-31 22:45:17 -05:00
CREATE_REPLICATION_SLOT { return K_CREATE_REPLICATION_SLOT; }
DROP_REPLICATION_SLOT { return K_DROP_REPLICATION_SLOT; }
2012-12-13 12:00:00 -05:00
TIMELINE_HISTORY { return K_TIMELINE_HISTORY; }
2014-01-31 22:45:17 -05:00
PHYSICAL { return K_PHYSICAL; }
2015-09-06 07:17:23 -04:00
RESERVE_WAL { return K_RESERVE_WAL; }
2014-03-10 13:50:28 -04:00
LOGICAL { return K_LOGICAL; }
2014-01-31 22:45:17 -05:00
SLOT { return K_SLOT; }
2016-12-08 12:00:00 -05:00
TEMPORARY { return K_TEMPORARY; }
2021-06-29 23:15:47 -04:00
TWO_PHASE { return K_TWO_PHASE; }
2017-03-14 17:13:56 -04:00
EXPORT_SNAPSHOT { return K_EXPORT_SNAPSHOT; }
NOEXPORT_SNAPSHOT { return K_NOEXPORT_SNAPSHOT; }
2017-03-23 08:36:36 -04:00
USE_SNAPSHOT { return K_USE_SNAPSHOT; }
2017-09-01 07:44:14 -04:00
WAIT { return K_WAIT; }
2023-12-20 09:49:12 -05:00
UPLOAD_MANIFEST { return K_UPLOAD_MANIFEST; }
2014-01-31 22:45:17 -05:00

2022-01-24 15:33:34 -05:00
{space}+ { /* do nothing */ }
2011-01-14 10:30:33 -05:00

2012-12-13 12:00:00 -05:00
{digit}+ {
2022-09-04 00:33:31 -04:00
    replication_yylval.uintval = strtoul(yytext, NULL, 10);
2013-08-14 23:18:49 -04:00
    return UCONST;
2012-12-13 12:00:00 -05:00
|
|
|
}
|
|
|
|
|
|
2011-01-14 10:30:33 -05:00
|
|
|
{hexdigit}+\/{hexdigit}+ {
    uint32 hi,
           lo;

    if (sscanf(yytext, "%X/%X", &hi, &lo) != 2)
        replication_yyerror("invalid streaming start location");
    replication_yylval.recptr = ((uint64) hi) << 32 | lo;
    return RECPTR;
}

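
The RECPTR rule above turns text such as `16/B374D848` into one 64-bit value: the part before the slash is the high 32 bits, the part after it is the low 32 bits. The standalone sketch below shows the same arithmetic outside of flex. `parse_lsn` is a hypothetical name, and the code uses `unsigned int` and `uint64_t` in place of PostgreSQL's `uint32` and `uint64`; it reports failure through its return value instead of raising an error.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse "HI/LO" (both hexadecimal) into a 64-bit
 * position.  Returns 1 on success and 0 if the text does not contain
 * two hexadecimal fields separated by a slash.
 */
int
parse_lsn(const char *text, uint64_t *result)
{
    unsigned int hi;
    unsigned int lo;

    /* sscanf returns the number of fields it converted. */
    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return 0;

    /* High half goes into bits 32..63, low half into bits 0..31. */
    *result = ((uint64_t) hi << 32) | lo;
    return 1;
}
```

Note that the cast to a 64-bit type has to happen before the shift; shifting a 32-bit value left by 32 would be undefined behavior in C.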
{xqstart} {
    BEGIN(xq);
    startlit();
}

<xq>{quotestop} {
    yyless(1);
    BEGIN(INITIAL);
    replication_yylval.str = litbufdup();
    return SCONST;
}

<xq>{xqdouble} {
    addlitchar('\'');
}

<xq>{xqinside} {
    addlit(yytext, yyleng);
}

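
The four `xq` rules above cooperate to scan a single-quoted literal: the opening quote enters the `xq` state, a doubled quote contributes one literal quote character, any other text is copied through, and a lone quote ends the literal. The hand-written loop below is a minimal sketch of that behavior without flex. `scan_quoted` is a hypothetical function, and the fixed-size output buffer stands in for the scanner's growable `litbuf`.

```c
#include <stddef.h>

/*
 * Hypothetical helper: scan a single-quoted literal starting at in[0].
 * The unquoted contents are written to out (NUL-terminated).
 *
 * Returns the number of input bytes consumed, including both quotes,
 * or -1 if the input does not start with a quote, the literal is
 * unterminated, or out is too small.
 */
long
scan_quoted(const char *in, char *out, size_t outlen)
{
    const char *p = in;
    size_t      n = 0;

    if (outlen == 0 || *p != '\'')
        return -1;
    p++;                        /* like {xqstart}: enter the literal */

    for (;;)
    {
        char        c;

        if (*p == '\0')
            return -1;          /* like <xq><<EOF>>: unterminated */

        if (*p == '\'')
        {
            if (p[1] != '\'')
            {
                p++;            /* like {quotestop}: literal ends */
                break;
            }
            c = '\'';           /* like {xqdouble}: '' means one quote */
            p += 2;
        }
        else
            c = *p++;           /* like {xqinside}: copy as-is */

        if (n + 1 >= outlen)
            return -1;          /* no room left for c plus the NUL */
        out[n++] = c;
    }

    out[n] = '\0';
    return (long) (p - in);
}
```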
{xdstart} {
    BEGIN(xd);
    startlit();
}

<xd>{xdstop} {
    int len;

    yyless(1);
    BEGIN(INITIAL);
    replication_yylval.str = litbufdup();
    len = strlen(replication_yylval.str);
    truncate_identifier(replication_yylval.str, len, true);
    return IDENT;
}

<xd>{xdinside} {
    addlit(yytext, yyleng);
}

{identifier} {
    int len = strlen(yytext);

    replication_yylval.str = downcase_truncate_identifier(yytext, len, true);
    return IDENT;
}

Fix limitations on what SQL commands can be issued to a walsender.

In logical replication mode, a WalSender is supposed to be able
to execute any regular SQL command, as well as the special
replication commands. Poor design of the replication-command
parser caused it to fail in various cases, notably:

* semicolons embedded in a command, or multiple SQL commands
  sent in a single message;
* dollar-quoted literals containing odd numbers of single
  or double quote marks;
* commands starting with a comment.

The basic problem here is that we're trying to run repl_scanner.l
across the entire input string even when it's not a replication
command. Since repl_scanner.l does not understand all of the
token types known to the core lexer, this is doomed to have
failure modes.

We certainly don't want to make repl_scanner.l as big as scan.l,
so instead rejigger stuff so that we only lex the first token of
a non-replication command. That will usually look like an IDENT
to repl_scanner.l, though a comment would end up getting reported
as a '-' or '/' single-character token. If the token is a replication
command keyword, we push it back and proceed normally with repl_gram.y
parsing. Otherwise, we can drop out of exec_replication_command()
without examining the rest of the string.

(It's still theoretically possible for repl_scanner.l to fail on
the first token; but that could only happen if it's an unterminated
single- or double-quoted string, in which case you'd have gotten
largely the same error from the core lexer too.)

In this way, repl_gram.y isn't involved at all in handling general
SQL commands, so we can get rid of the SQLCmd node type. (In
the back branches, we can't remove it because renumbering enum
NodeTag would be an ABI break; so just leave it sit there unused.)

I failed to resist the temptation to clean up some other sloppy
coding in repl_scanner.l while at it. The only externally-visible
behavior change from that is it now accepts \r and \f as whitespace,
same as the core lexer.

Per bug #17379 from Greg Rychlewski. Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
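
The "lex only the first token, then push it back" approach described in this commit message can be sketched with a tiny stand-alone lexer. Everything here is illustrative: `Lexer`, `next_token`, `is_replication_command`, the token codes and the three-entry keyword table are assumptions made for the example, not the scanner's real definitions (the real code keeps its state in file-level variables and recognizes keywords through flex rules).

```c
#include <ctype.h>
#include <stddef.h>

/* Token codes; single characters are returned as themselves (< 256). */
enum { TOK_EOF = 0, TOK_IDENT = 258, TOK_KEYWORD = 259 };

typedef struct Lexer
{
    const char *pos;            /* next unread input character */
    int         pushed_back;    /* pushed-back token, or 0 if none */
} Lexer;

/* Illustrative subset of replication keywords (upper case). */
static const char *const keywords[] = {
    "IDENTIFY_SYSTEM", "START_REPLICATION", "SHOW"
};

/* Case-insensitive comparison of word[0..len) against keyword kw. */
static int
word_equals(const char *word, size_t len, const char *kw)
{
    size_t      i;

    for (i = 0; i < len; i++)
    {
        if (kw[i] == '\0' || toupper((unsigned char) word[i]) != kw[i])
            return 0;
    }
    return kw[len] == '\0';
}

/* Return the next token, honoring a pushed-back token first. */
int
next_token(Lexer *lx)
{
    const char *start;
    size_t      len;
    size_t      i;

    if (lx->pushed_back != 0)
    {
        int         tok = lx->pushed_back;

        lx->pushed_back = 0;
        return tok;
    }

    while (isspace((unsigned char) *lx->pos))
        lx->pos++;
    if (*lx->pos == '\0')
        return TOK_EOF;

    /* Anything that cannot start a word is returned as itself. */
    if (!isalpha((unsigned char) *lx->pos) && *lx->pos != '_')
        return (unsigned char) *lx->pos++;

    start = lx->pos;
    while (isalnum((unsigned char) *lx->pos) || *lx->pos == '_')
        lx->pos++;
    len = (size_t) (lx->pos - start);

    for (i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++)
    {
        if (word_equals(start, len, keywords[i]))
            return TOK_KEYWORD;
    }
    return TOK_IDENT;
}

/* Peek at the first token; keep it for the parser only if it's a keyword. */
int
is_replication_command(Lexer *lx)
{
    int         first = next_token(lx);

    if (first == TOK_KEYWORD)
    {
        lx->pushed_back = first;
        return 1;
    }
    return 0;                   /* the rest of the input is never lexed */
}
```

With this shape, input such as `-- comment` followed by a general SQL statement yields a `'-'` token first, so the check returns false without the lexer ever touching constructs it does not understand.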

. {
    /* Any char not recognized above is returned as itself */
    return yytext[0];
}

<xq,xd><<EOF>> { replication_yyerror("unterminated quoted string"); }

<<EOF>> {
    yyterminate();
}

%%

/* LCOV_EXCL_STOP */

static void
startlit(void)
{
    initStringInfo(&litbuf);
}

static char *
litbufdup(void)
{
    return litbuf.data;
}

static void
addlit(char *ytext, int yleng)
{
    appendBinaryStringInfo(&litbuf, ytext, yleng);
}

static void
addlitchar(unsigned char ychar)
{
    appendStringInfoChar(&litbuf, ychar);
}

void
replication_yyerror(const char *message)
{
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg_internal("%s", message)));
}

void
replication_scanner_init(const char *str)
{
    Size        slen = strlen(str);
    char       *scanbuf;

    /*
     * Might be left over after ereport()
     */
    if (YY_CURRENT_BUFFER)
        yy_delete_buffer(YY_CURRENT_BUFFER);

    /*
     * Make a scan buffer with special termination needed by flex.
     */
    scanbuf = (char *) palloc(slen + 2);
    memcpy(scanbuf, str, slen);
    scanbuf[slen] = scanbuf[slen + 1] = YY_END_OF_BUFFER_CHAR;
    scanbufhandle = yy_scan_buffer(scanbuf, slen + 2);

    /* Make sure we start in proper state */
    BEGIN(INITIAL);
    repl_pushed_back_token = 0;
}

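
The "special termination needed by flex" in `replication_scanner_init` refers to the requirement of `yy_scan_buffer` that the last two bytes of the buffer be `YY_END_OF_BUFFER_CHAR`, which flex defines as the NUL byte; that is why the buffer is allocated with `slen + 2` bytes. The sketch below reproduces just the buffer preparation in plain C. `make_scan_buffer` is a hypothetical name, and `malloc` stands in for `palloc`, so unlike the original it has to report allocation failure to the caller.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy str into a freshly allocated buffer that ends
 * with two NUL bytes, the layout flex's yy_scan_buffer() expects.
 *
 * On success, returns the buffer and stores its total size (string length
 * plus the two terminators) in *buflen.  Returns NULL if allocation fails.
 * The caller owns the buffer and must free() it.
 */
char *
make_scan_buffer(const char *str, size_t *buflen)
{
    size_t      slen = strlen(str);
    char       *buf = malloc(slen + 2);

    if (buf == NULL)
        return NULL;

    memcpy(buf, str, slen);
    buf[slen] = buf[slen + 1] = '\0';   /* double terminator */
    *buflen = slen + 2;
    return buf;
}
```

Passing the total size, not the string length, to the scanner matters: the two terminator bytes count as part of the buffer.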
void
replication_scanner_finish(void)
{
    yy_delete_buffer(scanbufhandle);
    scanbufhandle = NULL;
}

/*
 * Check to see if the first token of a command is a WalSender keyword.
 *
 * To keep repl_scanner.l minimal, we don't ask it to know every construct
 * that the core lexer knows. Therefore, we daren't lex more than the
 * first token of a general SQL command. That will usually look like an
 * IDENT token here, although some other cases are possible.
 */
bool
replication_scanner_is_replication_command(void)
{
    int         first_token = replication_yylex();

    switch (first_token)
    {
        case K_IDENTIFY_SYSTEM:
        case K_BASE_BACKUP:
        case K_START_REPLICATION:
        case K_CREATE_REPLICATION_SLOT:
        case K_DROP_REPLICATION_SLOT:
        case K_READ_REPLICATION_SLOT:
        case K_TIMELINE_HISTORY:
        case K_UPLOAD_MANIFEST:
        case K_SHOW:
            /* Yes; push back the first token so we can parse later. */
            repl_pushed_back_token = first_token;
            return true;
        default:
            /* Nope; we don't bother to push back the token. */
            return false;
    }
}