redis

mirror of https://github.com/redis/redis.git synced 2026-05-28 04:02:46 -04:00

Author	SHA1	Message	Date
Tian	7dae142a2e	Reclaim page cache of RDB file (#11248) # Background The RDB file is usually generated and used once and seldom used again, but its content resides in the page cache until the OS evicts it. A potential problem is that once free memory is exhausted, the OS has to reclaim some memory from the page cache or swap anonymous pages out, which may result in jitter for the Redis service. Consider a concrete scenario: a high-capacity machine hosts many Redis instances, and we are upgrading them together. The page cache on the host machine grows as RDBs are generated. Once free memory drops to the low watermark (which is more likely to happen in older Linux kernels like 3.10, before [watermark_scale_factor](https://lore.kernel.org/lkml/1455813719-2395-1-git-send-email-hannes@cmpxchg.org/) was introduced, where the `low watermark` is linear to the `min watermark` and there is not much buffer space for `kswapd` to be woken up to reclaim memory), a `direct reclaim` happens, which means the process stalls waiting for memory allocation. # What the PR does The PR introduces a capability to reclaim the cache when the RDB is operated on. Generally there are two cases, reading and writing the RDB. For reads it is a little messy to do the reclaim incrementally, so the reclaim is done in one go in the background after the load is finished, to avoid blocking the work thread. For writes, incremental reclaim amortizes the work of reclaiming, so there is no need to put it into the background, and the peak watermark of the cache can be reduced this way. Two cases are addressed specially, replication and restart; in both, the cache is leveraged to speed up processing, so the reclaim is postponed to the right time. To do this, a flag is added to `rdbSave` and `rdbLoad` to control whether the cache needs to be kept, with the default value false. # Things worth noting 1. Though `posix_fadvise` is a POSIX standard, only a few platforms support it, e.g. Linux, FreeBSD 10.0. 2. In Linux `posix_fadvise` only takes effect on written-back pages, so a `sync` (or `fsync`, `fdatasync`) is needed to flush the dirty pages before `posix_fadvise` if we reclaim the write cache. # About test A unit test is added to verify the effect of `posix_fadvise`. In the integration test the overall cache increase is checked, as well as the cache backed by the RDB, as a specific TCL test is executed in an isolated GitHub Actions job.	2023-02-12 09:23:29 +02:00
DarrenJiang13	bb1de082ea	Adds isolated netstats for replication. (#10062) The amount of `server.stat_net_output_bytes/server.stat_net_input_bytes` is actually the sum of replication flow and users' data flow. It may cause confusion like this: "Why does my server get such a large output_bytes while I am doing nothing?". After discussions and revisions, here is what this PR brings (final version before merge): - 2 server variables to count the network bytes during replication, including fullsync and propagate bytes. - `server.stat_net_repl_output_bytes`/`server.stat_net_repl_input_bytes` - 3 info fields to print the input and output of repl bytes and instantaneous value of total repl bytes. - `total_net_repl_input_bytes` / `total_net_repl_output_bytes` - `instantaneous_repl_total_kbps` - 1 new API `rioCheckType()` to check the type of rio, so we can use it to distinguish between diskless and disk-based replication - 2 new counting items to keep network statistics consistent between master and slave - rdb portion during diskless replica, in `rdbLoadProgressCallback()` - first line of the full sync payload, in `readSyncBulkPayload()` Co-authored-by: Oran Agra <oran@redislabs.com>	2022-05-31 08:07:33 +03:00
Oran Agra	5a47794606	diskless replication rdb transfer uses pipe, and writes to sockets from the parent process. misc: - handle SSL_has_pending by iterating through these in beforeSleep, and setting timeout of 0 to aeProcessEvents - fix issue with epoll signaling EPOLLHUP and EPOLLERR only to the write handlers. (needed to detect the rdb pipe was closed) - add key-load-delay config for testing - trim connShutdown which is no longer needed - rioFdsetWrite -> rioFdWrite - simplified since there's no longer need to write to multiple FDs - don't detect rdb child exited (don't call wait3) until we detect the pipe is closed - Cleanup bad optimization from rio.c, add another one	2019-10-07 21:06:30 +03:00
Yossi Gottlieb	b087dd1db6	TLS: Connections refactoring and TLS support. * Introduce a connection abstraction layer for all socket operations and integrate it across the code base. * Provide an optional TLS connections implementation based on OpenSSL. * Pull a newer version of hiredis with TLS support. * Tests, redis-cli updates for TLS support.	2019-10-07 21:06:13 +03:00
antirez	b2e10131c0	Rio: fix flag name, function is never used btw. Thanks to @tnclong for reporting the problem.	2019-09-04 13:01:07 +02:00
antirez	6b72b04a37	Rio: when in error condition avoid doing the operation.	2019-07-17 16:46:22 +02:00
antirez	48d91cf4cc	Rio: remember read/write error conditions.	2019-07-17 16:46:22 +02:00
Oran Agra	2de544cfcc	diskless replication on slave side (don't store rdb to file), plus some other related fixes The implementation of the diskless replication was currently diskless only on the master side. The slave side was still storing the received rdb file to the disk before loading it back in and parsing it. This commit adds two modes to load rdb directly from socket: 1) when-empty 2) using "swapdb" the third mode of using diskless slave by flushdb is risky and currently not included. other changes: -------------- distinguish between aof configuration and state so that we can re-enable aof only when sync eventually succeeds (and not when exiting from readSyncBulkPayload after a failed attempt) also a CONFIG GET and INFO during rdb loading would have lied When loading rdb from the network, don't kill the server on short read (that can be a network error) Fix rdb check when performed on preamble AOF tests: run replication tests for diskless slave too make replication test a bit more aggressive Add test for diskless load swapdb	2019-07-08 15:37:48 +03:00
Oran Agra	60a4f12f8b	fix processing of large bulks (above 2GB) - protocol parsing (processMultibulkBuffer) was limited to 32bit positions in the buffer readQueryFromClient potential overflow - rioWriteBulkCount used int, although rioWriteBulkString gave it size_t - several places in sds.c that used int for string length or index. - bugfix in RM_SaveAuxField (return was 1 or -1 and not length) - RM_SaveStringBuffer was limited to 32bit length	2017-12-29 12:24:19 +02:00
antirez	8ec28002be	Modules: support for modules native data types.	2016-06-03 18:14:04 +02:00
Oran Agra	5e3880a492	various cleanups and minor fixes	2016-04-25 16:49:57 +03:00
antirez	10aafdad56	Diskless replication: rio fdset target now supports buffering. To perform a socket write() for each RDB rio API write call was extremely inefficient, so now rio has minimal buffering capabilities. Writes are accumulated into a buffer and only when a given limit is reached are actually written to the N slaves FDs. Trivia: rio lacked support for buffering since our targets were: 1) Memory buffers. 2) C standard I/O. Both were buffered already.	2014-10-17 11:36:12 +02:00
antirez	2a436aaeab	rio.c fdset target: tolerate (and report) a subset of FDs in error. Fdset target is used when we want to write an RDB file directly to slave's sockets. In this setup as long as there is a single slave that is still receiving our payload, we want to continue sending instead of aborting. However rio calls should abort if no FD is ok. Also we want the errors reported so that we can signal the parent who is ok and who is broken, so there is a new set of integers with the state of each fd. Zero is ok, non-zero is the errno of the failure, if available, or a generic EIO.	2014-10-14 17:19:42 +02:00
antirez	850ea57c37	rio.c: draft implementation of fdset target implemented.	2014-10-10 17:44:06 +02:00
antirez	f590dd82ce	Fixed typo in rio.h, simgle -> single.	2013-07-16 15:43:36 +02:00
yoav	63d15dfc87	Chunked loading of RDB to prevent redis from stalling reading very large keys.	2013-07-16 15:41:24 +02:00
antirez	91f4213ddf	rio.c: added ability to fdatasync() from time to time while writing.	2013-04-24 10:26:30 +02:00
antirez	8419397665	Make rio.c comment 80-columns friendly.	2013-04-03 12:41:14 +02:00
antirez	4365e5b2d3	BSD license added to every C source and header file.	2012-11-08 18:31:32 +01:00
antirez	c44ab51da1	Make inline functions rioRead/Write/Tell static. This fixes issue #447.	2012-04-11 11:58:32 +02:00
antirez	8491f1d9fd	Fixed compilation of new rio.c changes (typos and so forth.)	2012-04-09 12:36:44 +02:00
antirez	736b7c3f04	Add checksum computation to rio.c	2012-04-09 12:33:09 +02:00
antirez	f96a8a8054	rioInitWithFile and rioInitWithBuffer functions now take a rio structure pointer to avoid copying a structure to return value to the caller.	2011-09-22 16:00:40 +02:00
antirez	4c0462972e	comment on top of the _rio structure modified for correctness as actually fwrite/fread semantics is different in general, but was 0/1 in our old usage before rio.c as we always used 1 as number items, and the actual number of bytes to read as item length.	2011-09-22 15:47:48 +02:00
Pieter Noordhuis	2e4b0e7727	Abstract file/buffer I/O to support in-memory serialization	2011-05-13 17:31:00 +02:00

25 commits
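
The write-side reclaim described in commit 7dae142a2e (flush dirty pages first, then advise the kernel to drop them) can be sketched as follows. This is a minimal illustration assuming Linux, not the code from the commit: the helper name `reclaim_file_cache` and its signature are hypothetical, and only the standard `fdatasync` and `posix_fadvise` calls are used.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a Redis function): ask the kernel to drop the
 * page cache backing the byte range [offset, offset + length) of `fd`.
 * A length of 0 means "from offset to the end of the file".
 * Returns 0 on success, or an error number on failure. */
int reclaim_file_cache(int fd, off_t offset, off_t length) {
    /* On Linux, POSIX_FADV_DONTNEED skips dirty pages, so write them
     * back before issuing the advice. */
    if (fdatasync(fd) == -1) return errno;

    /* posix_fadvise returns the error number directly; it does not
     * set errno. */
    return posix_fadvise(fd, offset, length, POSIX_FADV_DONTNEED);
}
```

Calling such a helper periodically while writing, on the range written so far, would amortize the reclaim in the way the commit message describes for the write path; the interval to use is a tuning choice that this sketch does not address.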