26 commits

Author SHA1 Message Date
Marian Beermann
fafd5e0399 hashindex: separate endian-dependent defs from endian detection
also make macro style consistent with other macros in the codebase.
2017-01-21 17:25:38 +01:00
Marian Beermann
90ae9076a4 hashindex: detect mingw byte order 2017-01-21 15:04:07 +01:00
Marian Beermann
02bb79dcbb hashindex.c: hashindex_resize check hashindex_set rc (contract) 2016-07-09 01:35:01 +02:00
Marian Beermann
29ebdbadae refcounting: use uint32_t, protect against overflows, fix merging for BE 2016-04-14 23:38:56 +02:00
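The overflow protection mentioned in this commit can be illustrated with a saturating add. This is a minimal sketch only: the commit message does not say how overflow is handled, so clamping at a maximum is an assumption, and `refcount_add` and `REFCOUNT_MAX` are hypothetical names that do not come from hashindex.c.

```c
#include <stdint.h>

/* Hypothetical cap; the real code may reserve a different maximum. */
#define REFCOUNT_MAX UINT32_MAX

/* Add two refcounts, clamping at REFCOUNT_MAX instead of wrapping around. */
uint32_t refcount_add(uint32_t a, uint32_t b)
{
    /* Compute in 64 bits so the sum itself cannot wrap. */
    uint64_t sum = (uint64_t)a + (uint64_t)b;
    return sum > REFCOUNT_MAX ? REFCOUNT_MAX : (uint32_t)sum;
}
```

A plain `a + b` on `uint32_t` would silently wrap to a small value, which for a reference count could make a still-referenced chunk look unreferenced.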
Marian Beermann
c90745cdbb Port hashindex_summarize into ChunkIndex.summarize 2016-04-14 11:46:12 +02:00
Marian Beermann
90a9fbd21d hashindex_summarize: fix missing byte-order conversion
Fixes #886
2016-04-11 22:22:24 +02:00
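A missing byte-order conversion of this kind typically shows up only on big-endian hosts. The sketch below decodes a stored 32-bit value independently of host byte order. It assumes the stored values are little-endian, which the commit message does not state, and `load_le32` is a hypothetical helper name.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value from a byte buffer.
 * Works the same on little-endian and big-endian hosts because it
 * assembles the value from individual bytes rather than casting. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Reading the same four bytes through a `uint32_t *` cast would yield a byte-swapped number on a big-endian machine, so any sums built from it would be wrong.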
Alexander Pyhalov
e494d24d6a failing hashindex tests on netbsd, fixes #804 2016-03-31 21:32:27 +02:00
Alexander Pyhalov
f63be63347 Fix build on illumos 2016-03-21 17:39:55 +03:00
Thomas Waldmann
5cb47cbedd hashindex: explain hash_sizes 2016-01-14 14:39:59 +01:00
Thomas Waldmann
083f5e31ef hashindex: fix upper limit
use num_buckets (== fully use what we currently have allocated)
2016-01-14 14:39:59 +01:00
Thomas Waldmann
09665805e8 move func defs to avoid implicit declaration compiler warning 2016-01-14 14:39:59 +01:00
Thomas Waldmann
91cde721b4 hashindex: minor refactor
- rename BUCKET_(LOWER|UPPER)_LIMIT to HASH_(MIN|MAX)_LOAD
   as this value is usually called the hash table's minimum/maximum load factor.
- remove MAX_BUCKET_SIZE (not used)
- regroup/reorder definitions
2016-01-14 14:39:59 +01:00
Thomas Waldmann
d88df3edc6 hashtable size follows a growth policy, fixes #527
also: refactor / dedupe some code into functions
2016-01-14 14:39:59 +01:00
Thomas Waldmann
720fc49498 hashindex_add C implementation
this was also the loop contents of hashindex_merge, but we also need it callable from Cython/Python code.

this saves some cycles, esp. if the key is already present in the index.
2015-12-07 19:13:58 +01:00
Thomas Waldmann
610300c1ce misc. hash table tuning
BUCKET_UPPER_LIMIT: 90% load degrades hash table performance severely,
so I lowered that to 75% (which is a usual value - java uses 75%, python uses 66%).
I chose the higher value of both because we also should not consume too much
memory, considering the RAM usage already is rather high.

MIN_BUCKETS: I can't explain why, but benchmarks showed that choosing 2^N as
table size severely degrades performance (by 3 orders of magnitude!). So a prime
start value improves this a lot, even if we later still use the grow-by-2x algorithm.

hashindex_resize: removed the hashindex_get() call as we already know that the values
come at key + key_size address.

hashindex_init: do not calloc X*Y elements of size 1, but rather X elements of size Y.
Makes the code simpler, not sure if it affects performance.

The tests needed fixing as the resulting hashtable blob is now of course different due
to the above changes, so its sha hash changed.
2015-12-01 21:18:58 +01:00
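Two of the points in this commit message, the 75% upper load limit and the `calloc` call shape, can be sketched as follows. The 0.75 value comes from the message; the helper names `upper_limit` and `alloc_buckets` are hypothetical and the real functions in hashindex.c may differ.

```c
#include <stddef.h>
#include <stdlib.h>

/* Maximum load factor, as described in the commit message. */
#define BUCKET_UPPER_LIMIT 0.75

/* Number of used buckets above which the table should grow. */
int upper_limit(int num_buckets)
{
    return (int)(num_buckets * BUCKET_UPPER_LIMIT);
}

/* Allocate num_buckets zeroed buckets of bucket_size bytes each.
 * Passing the count and the element size separately, rather than
 * calloc(num_buckets * bucket_size, 1), keeps the call readable. */
void *alloc_buckets(size_t num_buckets, size_t bucket_size)
{
    return calloc(num_buckets, bucket_size);
}
```

With 1024 buckets this limit is 768 entries. Many `calloc` implementations also check the product of the two arguments for overflow, which a hand-written multiplication would not.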
Thomas Waldmann
7247043db0 get rid of C compiler warnings, fixes #391 2015-11-21 22:08:30 +01:00
Thomas Waldmann
d779057b79 fix issue with negative "all archives" size, fixes #165
This fixes an infrequent problem when (refcount * chunksize) overflowed an int32_t.
chunksize is always <= 8MiB and usually rather ~64KiB (with default chunker params).
Thus, this happened only for high refcounts and/or unusually big chunks.
2015-08-29 04:46:13 +02:00
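The overflow described in this commit can be avoided by widening both operands before multiplying. This is a sketch of the general technique, not the actual fix: `total_size` is a hypothetical helper and the commit message does not say which type the real code switched to.

```c
#include <stdint.h>

/* Total bytes referenced by a chunk: refcount * chunk size.
 * Both operands are widened to 64 bits first, so the product is
 * computed in 64 bits and cannot overflow a 32-bit intermediate. */
int64_t total_size(uint32_t refcount, uint32_t chunk_size)
{
    return (int64_t)refcount * (int64_t)chunk_size;
}
```

For example, 1000 references to an 8 MiB chunk give 8388608000 bytes, which is well above INT32_MAX (2147483647) and would appear negative or truncated in a 32-bit signed result.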
Thomas Waldmann
e06b0b3612 use C99's uintmax_t and %ju format
whatever size_t and off_t are, their values should fit in there
2015-08-12 01:04:03 +02:00
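The `uintmax_t` and `%ju` pairing from this commit can be sketched like this. `format_size` is a hypothetical helper used only for illustration; the cast is the relevant part.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a size_t as decimal text without knowing its exact width.
 * uintmax_t is the widest unsigned integer type, and %ju is its
 * matching C99 printf conversion. */
int format_size(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "%ju", (uintmax_t)value);
}
```

The same cast works for non-negative `off_t` values, so one format string covers platforms where these types have different widths.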
Thomas Waldmann
197ca9c0d3 C merge code: cast to correct pointer type, silences warning 2015-08-09 16:19:53 +02:00
Thomas Waldmann
a1e039ba21 reimplement the chunk index merging in C
the python code could take a rather long time and likely most of it was converting stuff from python to C and back.
2015-08-06 23:32:53 +02:00
Thomas Waldmann
6d0a00496a determine and report chunk counts in chunks index
borg info repo::archive now reports unique chunks count, total chunks count

also: use index->key_size instead of hardcoded value
2015-06-19 23:53:23 +02:00
Thomas Waldmann
614261604e don't hardcode MAGIC length 2015-06-02 02:41:23 +02:00
Thomas Waldmann
926454c0d8 explicitly specify binary mode to open binary files
on POSIX OSes, it doesn't make a difference, but it is cleaner and also good for portability.
2015-05-31 17:57:45 +02:00
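The portability point can be illustrated by writing and reading back bytes that a text-mode stream might translate on some platforms. This is a minimal sketch: `roundtrip_bytes` is a hypothetical helper, not a function from hashindex.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Write len bytes to path and read them back, using "wb" and "rb".
 * Returns 0 on success, -1 on any I/O failure. */
int roundtrip_bytes(const char *path, const unsigned char *data,
                    size_t len, unsigned char *out)
{
    FILE *fd = fopen(path, "wb");   /* binary mode: no newline translation */
    if (fd == NULL)
        return -1;
    size_t written = fwrite(data, 1, len, fd);
    if (fclose(fd) != 0 || written != len)
        return -1;

    fd = fopen(path, "rb");
    if (fd == NULL)
        return -1;
    size_t got = fread(out, 1, len, fd);
    fclose(fd);
    return got == len ? 0 : -1;
}
```

On POSIX systems the `b` flag has no effect, as the commit message notes. On platforms that distinguish text and binary streams, omitting it can alter bytes such as 0x0A or 0x1A in an index file.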
Thomas Waldmann
776bb9fabc hashindex: improve error messages 2015-05-31 17:48:19 +02:00
Thomas Waldmann
91e10fec5f Merge branch 'master' of github.com:jborg/attic 2015-05-31 17:37:02 +02:00
Thomas Waldmann
78bfc58b47 rename package directory to borg 2015-05-22 17:48:54 +02:00
Renamed from attic/_hashindex.c