function changes the size of the previously allocated memory referenced by
\fIptr\fR
to
\fIsize\fR
bytes\&. The contents of the memory are unchanged up to the lesser of the new and old sizes\&. If the new size is larger, the contents of the newly allocated portion of the memory are undefined\&. Upon success, the memory referenced by
\fIptr\fR
is freed and a pointer to the newly allocated memory is returned\&. Note that
argument that can be used to specify options\&. The functions only check the options that are contextually relevant\&. Use bitwise or (|) operations to specify one or more of the following:
.PP
\fBMALLOCX_LG_ALIGN(\fR\fB\fIla\fR\fR\fB) \fR
.RS4
Align the memory allocation to start at an address that is a multiple of
(1 << \fIla\fR)\&. This macro does not validate that
\fIla\fR
is within the valid range\&.
.RE
.PP
\fBMALLOCX_ALIGN(\fR\fB\fIa\fR\fR\fB) \fR
.RS4
Align the memory allocation to start at an address that is a multiple of
\fIa\fR, where
\fIa\fR
is a power of two\&. This macro does not validate that
\fIa\fR
is a power of 2\&.
.RE
.PP
\fBMALLOCX_ZERO\fR
.RS4
Initialize newly allocated memory to contain zero bytes\&. In the growing reallocation case, the real size prior to reallocation defines the boundary between untouched bytes and those that are initialized to contain zero bytes\&. If this macro is absent, newly allocated memory is uninitialized\&.
bytes, and returns a pointer to the base address of the resulting allocation, which may or may not have moved from its original location\&. Behavior is undefined if
parameter to allow the caller to pass in the allocation size as an optimization\&. The minimum valid input size is the original requested size of the allocation, and the maximum valid input size is the corresponding value returned by
function provides a general interface for introspecting the memory allocator, as well as setting modifiable parameters and triggering actions\&. The period\-separated
\fIname\fR
argument specifies a location in a tree\-structured namespace; see the
MALLCTL NAMESPACE
section for documentation on the tree contents\&. To read a value, pass a pointer via
\fIoldp\fR
to adequate space to contain the value, and a pointer to its length via
\fIoldlenp\fR; otherwise pass
\fBNULL\fR
and
\fBNULL\fR\&. Similarly, to write a value, pass a pointer to the value via
function provides a way to avoid repeated name lookups for applications that repeatedly query the same portion of the namespace, by translating a name to a
that is smaller than the number of period\-separated name components, which results in a partial MIB that can be used as the basis for constructing a complete MIB\&. For name components that are integers (e\&.g\&. the 2 in
arenas\&.bin\&.2\&.size), the corresponding MIB component will always be that integer\&. Therefore, it is legitimate to construct code like the following:
\fBNULL\fR\&. The statistics are presented in human\-readable form unless
\(lqJ\(rq
is specified as a character within the
\fIopts\fR
string, in which case the statistics are presented in
\m[blue]\fBJSON format\fR\m[]\&\s-2\u[2]\d\s+2\&. This function can be called repeatedly\&. General information that never changes during execution can be omitted by specifying
can be specified to omit merged arena, destroyed merged arena, and per arena statistics, respectively;
\(lqb\(rq
and
\(lql\(rq
can be specified to omit per size class statistics for bins and large objects, respectively;
\(lqx\(rq
can be specified to omit all mutex statistics\&. Unrecognized characters are silently ignored\&. Note that thread caching may prevent some statistics from being completely up to date, since extra locking would be required to merge counters that track thread cache operations\&.
realloc(); rather it is provided solely as a tool for introspection purposes\&. Any discrepancy between the requested allocation size and the size reported by
Once, when the first call is made to one of the memory allocation routines, the allocator initializes its internals based in part on various options that can be specified at compile\- or run\-time\&.
options\&. Some options have boolean values (true/false), others have integer values (base 8, 10, or 16, depending on prefix), and yet others have raw string values\&.
.SH"IMPLEMENTATION NOTES"
.PP
Traditionally, allocators have used
\fBsbrk\fR(2)
to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory\&. If
This allocator uses multiple arenas in order to reduce lock contention for threaded programs on multi\-processor systems\&. This works well with regard to threading scalability, but incurs some costs\&. There is a small fixed per\-arena overhead, and additionally, arenas manage memory completely independently of each other, which means a small fixed increase in overall memory fragmentation\&. These overheads are not generally an issue, given the number of arenas normally used\&. Note that using substantially more arenas than the default is not likely to improve performance, mainly due to reduced cache performance\&. However, it may make sense to reduce the number of arenas if an application does not make much use of the allocation functions\&.
In addition to multiple arenas, this allocator supports thread\-specific caching, in order to make it possible to completely avoid synchronization for most allocation requests\&. Such caching allows very fast allocation in the common case, but it increases memory usage and fragmentation, since a bounded number of objects can remain allocated in each thread cache\&.
Memory is conceptually broken into extents\&. Extents are always aligned to multiples of the page size\&. This alignment makes it possible to find metadata for user objects quickly\&. User objects are broken into two categories according to size: small and large\&. Contiguous small objects comprise a slab, which resides within a single extent, whereas large objects each have their own extents backing them\&.
Small objects are managed in groups by slabs\&. Each slab maintains a bitmap to track which regions are in use\&. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least
sizeof(\fBdouble\fR)\&. All other object size classes are multiples of the quantum, spaced such that there are four size classes for each doubling in size, which limits internal fragmentation to approximately 20% for all but the smallest size classes\&. Small size classes are smaller than four times the page size, and large size classes extend from four times the page size up to the largest size class that does not exceed
Allocations are packed tightly together, which can be an issue for multi\-threaded applications\&. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when allocating\&.
to grow e\&.g\&. a 9\-byte allocation to 16 bytes, or shrink a 16\-byte allocation to 9 bytes\&. Growth and shrinkage trivially succeeds in place as long as the pre\-size and post\-size both round up to the same size class\&. No other API guarantees are made regarding in\-place resizing, but the current implementation also tries to resize large allocations in place, as long as the pre\-size and post\-size are both large\&. For shrinkage to succeed, the extent allocator must support splitting (see
arena\&.<i>\&.extent_hooks)\&. Growth only succeeds if the trailing memory is currently available, and the extent allocator supports merging\&.
functions report values, and increment the epoch\&. Return the current epoch\&. This is useful for detecting whether another thread caused a refresh\&.
Enable/disable internal background worker threads\&. When set to true, background threads are created on demand (the number of background threads will be no more than the number of CPUs or active arenas)\&. Threads run periodically, and handle
purging
asynchronously\&. When switching off, background threads are terminated synchronously\&. Note that after
\fBfork\fR(2)
function, the state in the child process will be disabled regardless the state in parent process\&. See
stats\&.background_thread
for related stats\&.
opt\&.background_thread
can be used to set the default option\&. This option is only available on selected pthread\-based platforms\&.
Abort\-on\-invalid\-configuration enabled/disabled\&. If true, invalid runtime options are fatal\&. The process will call
\fBabort\fR(3)
in these cases\&. This option is disabled by default unless
\fB\-\-enable\-debug\fR
is specified during configuration, in which case it is enabled by default\&.
.RE
.PP
opt\&.retain (\fBbool\fR) r\-
.RS4
If true, retain unused virtual memory for later reuse rather than discarding it by calling
\fBmunmap\fR(2)
or equivalent (see
stats\&.retained
for related details)\&. This option is disabled by default unless discarding virtual memory is known to trigger platform\-specific performance problems, e\&.g\&. for [64\-bit] Linux, which has a quirk in its virtual memory allocation algorithm that causes semi\-permanent VM map holes under normal jemalloc operation\&. Although
\fBmunmap\fR(2)
causes issues on 32\-bit Linux as well, retaining virtual memory for 32\-bit Linux is disabled by default due to the practical possibility of address space exhaustion\&.
Maximum number of arenas to use for automatic multiplexing of threads and arenas\&. The default is four times the number of CPUs, or one if there is a single CPU\&.
setting to enable this feature, which uses number of CPUs to determine number of arenas, and bind threads to arenas dynamically based on the CPU the thread runs on currently\&.
\(lqphycpu\(rq
setting uses one arena per physical CPU, which means the two hyper threads on the same CPU share one arena\&. Note that no runtime checking regarding the availability of hyper threading is done at the moment\&. When set to
\(lqdisabled\(rq, narenas and thread to arena association will not be impacted by this option\&. The default is
Approximate time in milliseconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged (i\&.e\&. converted to muzzy via e\&.g\&.
madvise(\fI\&.\&.\&.\fR\fI\fBMADV_FREE\fR\fR)
if supported by the operating system, or converted to clean otherwise) and/or reused\&. Dirty pages are defined as previously having been potentially written to by the application, and therefore consuming physical memory, yet having no current use\&. The pages are incrementally purged according to a sigmoidal decay curve that starts and ends with zero purge rate\&. A decay time of 0 causes all unused dirty pages to be purged immediately upon creation\&. A decay time of \-1 disables purging\&. The default decay time is 10 seconds\&. See
Approximate time in milliseconds from the creation of a set of unused muzzy pages until an equivalent set of unused muzzy pages is purged (i\&.e\&. converted to clean) and/or reused\&. Muzzy pages are defined as previously having been unused dirty pages that were subsequently purged in a manner that left them subject to the reclamation whims of the operating system (e\&.g\&.
madvise(\fI\&.\&.\&.\fR\fI\fBMADV_FREE\fR\fR)), and therefore in an indeterminate state\&. The pages are incrementally purged according to a sigmoidal decay curve that starts and ends with zero purge rate\&. A decay time of 0 causes all unused muzzy pages to be purged immediately upon creation\&. A decay time of \-1 disables purging\&. The default decay time is 10 seconds\&. See
is specified during configuration, this has the potential to cause deadlock for a multi\-threaded process that exits while one or more threads are executing in the memory allocation functions\&. Furthermore,
function with equivalent functionality)\&. Therefore, this option should only be used with care; it is primarily intended as a performance tuning aid during application development\&. This option is disabled by default\&.
Zero filling enabled/disabled\&. If enabled, each byte of uninitialized allocated memory will be initialized to 0\&. Note that this initialization only happens once for each byte, so
calls do not zero memory that was previously allocated\&. This is intended for debugging and will impact performance negatively\&. This option is disabled by default\&.
Abort\-on\-out\-of\-memory enabled/disabled\&. If enabled, rather than returning failure for any allocation function, display a diagnostic message on
\fBSTDERR_FILENO\fR
and cause the program to drop core (using
\fBabort\fR(3))\&. If an application is designed to depend on this behavior, set the option at compile time by including the following in the source code:
Thread\-specific caching (tcache) enabled/disabled\&. When there are multiple threads, each thread uses a tcache for objects up to a certain size\&. Thread\-specific caching allows many allocations to be satisfied without performing any thread synchronization, at the cost of increased memory use\&. See the
Maximum size class (log base 2) to cache in the thread\-specific cache (tcache)\&. At a minimum, all small size classes are cached, and at a maximum all large size classes are cached\&. The default maximum is 32 KiB (2^15)\&.
Filename prefix for profile dumps\&. If the prefix is set to the empty string, no automatic dumps will occur; this is primarily useful for disabling the automatic final heap dump (which also disables leak reporting, if enabled)\&. The default prefix is
Profiling activated/deactivated\&. This is a secondary control mechanism that makes it possible to start the application with profiling enabled (see the
Average interval (log base 2) between allocation samples, as measured in bytes of allocation activity\&. Increasing the sampling interval decreases profile fidelity, but also decreases the computational overhead\&. The default sample interval is 512 KiB (2^19 B)\&.
Reporting of cumulative object/byte counts in profile dumps enabled/disabled\&. If this option is enabled, every unique backtrace must be stored for the duration of execution\&. Depending on the application, this can impose a large memory overhead, and the cumulative counts are not always of interest\&. This option is disabled by default\&.
Average interval (log base 2) between memory profile dumps, as measured in bytes of allocation activity\&. The actual interval between dumps may be sporadic because decentralized allocation counters are used to avoid synchronization bottlenecks\&. Profiles are dumped to files named according to the pattern
prof\&.gdump, which when enabled triggers a memory profile dump every time the total virtual memory exceeds the previous maximum\&. This option is disabled by default\&.
Get the total number of bytes ever allocated by the calling thread\&. This counter has the potential to wrap around; it is up to the application to appropriately interpret the counter in such cases\&.
Get the total number of bytes ever deallocated by the calling thread\&. This counter has the potential to wrap around; it is up to the application to appropriately interpret the counter in such cases\&.
Flush calling thread\*(Aqs thread\-specific cache (tcache)\&. This interface releases all cached objects and internal data structures associated with the calling thread\*(Aqs tcache\&. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits\&. However, garbage collection is triggered by allocation activity, so it is possible for a thread that stops allocating/deallocating to retain its cache indefinitely, in which case the developer may find manual flushing useful\&.
Get/set the descriptive name associated with the calling thread in memory profile dumps\&. An internal copy of the name string is created, so the input string need not be maintained after this interface completes execution\&. The output string of this interface should be copied for non\-ephemeral uses, because multiple implementation details can cause asynchronous string deallocation\&. Furthermore, each invocation of this interface can only read or write; simultaneous read/write is not supported due to string lifetime limitations\&. The name string must be nil\-terminated and comprised only of characters in the sets recognized by
Create an explicit thread\-specific cache (tcache) and return an identifier that can be passed to the
\fBMALLOCX_TCACHE(\fR\fB\fItc\fR\fR\fB)\fR
macro to explicitly use the specified cache rather than the automatically managed one that is used by default\&. Each explicit cache can be used by only one thread at a time; the application must assure that this constraint holds\&.
Get whether the specified arena\*(Aqs statistics are initialized (i\&.e\&. the arena was initialized prior to the current epoch)\&. This interface can also be nominally used to query whether the merged statistics corresponding to
Discard all of the arena\*(Aqs extant allocations\&. This interface can only be used with arenas explicitly created via
arenas\&.create\&. None of the arena\*(Aqs discarded/cached allocations may accessed afterward\&. As part of this requirement, all thread caches which were used to allocate/deallocate in conjunction with the arena must be flushed beforehand\&.
.RE
.PP
arena\&.<i>\&.destroy (\fBvoid\fR) \-\-
.RS4
Destroy the arena\&. Discard all of the arena\*(Aqs extant allocations using the same mechanism as for
arena\&.<i>\&.reset
(with all the same constraints and side effects), merge the arena stats into those accessible at arena index
\fBMALLCTL_ARENAS_DESTROYED\fR, and then completely discard all metadata associated with the arena\&. Future calls to
arenas\&.create
may recycle the arena index\&. Destruction will fail if any threads are currently associated with the arena as a result of calls to
Current per\-arena approximate time in milliseconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. Each time this interface is set, all currently unused dirty pages are considered to have fully decayed, which causes immediate purging of all unused dirty pages unless the decay time is set to \-1 (i\&.e\&. purging disabled)\&. See
Current per\-arena approximate time in milliseconds from the creation of a set of unused muzzy pages until an equivalent set of unused muzzy pages is purged and/or reused\&. Each time this interface is set, all currently unused muzzy pages are considered to have fully decayed, which causes immediate purging of all unused muzzy pages unless the decay time is set to \-1 (i\&.e\&. purging disabled)\&. See
Get or set the extent management hook functions for arena <i>\&. The functions must be capable of operating on all extant extents associated with arena <i>, usually by passing unknown extents to the replaced functions\&. In practice, it is feasible to control allocation for arenas explicitly created via
arenas\&.create
such that all extents originate from an application\-supplied extent allocator (by specifying the custom extent hook functions during arena creation), but the automatically created arenas will have already created extents prior to the application having an opportunity to take over extent allocation\&.
structure comprises function pointers which are described individually below\&. jemalloc uses these functions to manage extent lifetime, which starts off with allocation of mapped committed memory, in the simplest case followed by deallocation\&. However, there are performance and platform reasons to retain extents for later reuse\&. Cleanup attempts cascade from deallocation to decommit to forced purging to lazy purging, which gives the extent management functions opportunities to reject the most permanent cleanup operations in favor of less permanent (and often less costly) operations\&. All operations except allocation can be universally opted out of by setting the hook pointers to
\fBNULL\fR, or selectively opted out of by returning failure\&.
on error\&. Committed memory may be committed in absolute terms as on a system that does not overcommit, or in implicit terms as on a system that overcommits and satisfies physical memory needs on demand via soft page faults\&. Note that replacing the default extent allocation function makes the arena\*(Aqs
\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates opt\-out from deallocation; the virtual memory mapping associated with the extent remains mapped, in the same commit state, and available for future use, in which case it will be automatically retained for later reuse\&.
\fIarena_ind\fR, returning false upon success\&. Committed memory may be committed in absolute terms as on a system that does not overcommit, or in implicit terms as on a system that overcommits and satisfies physical memory needs on demand via soft page faults\&. If the function returns true, this indicates insufficient physical memory to satisfy the request\&.
\fIarena_ind\fR, returning false upon success, in which case the pages will be committed via the extent commit function before being reused\&. If the function returns true, this indicates opt\-out from decommit; the memory remains committed and available for future use, in which case it will be automatically retained for later reuse\&.
\fIarena_ind\fR\&. A lazy extent purge function (e\&.g\&. implemented via
madvise(\fI\&.\&.\&.\fR\fI\fBMADV_FREE\fR\fR)) can delay purging indefinitely and leave the pages within the purged virtual memory range in an indeterminite state, whereas a forced extent purge function immediately purges, and the pages within the virtual memory range will be zero\-filled the next time they are accessed\&. If the function returns true, this indicates failure to purge\&.
\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates that the extent remains unsplit and therefore should continue to be operated on as a whole\&.
\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates that the extents remain distinct mappings and therefore should continue to be operated on independently\&.
Current default per\-arena approximate time in milliseconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused, used to initialize
Current default per\-arena approximate time in milliseconds from the creation of a set of unused muzzy pages until an equivalent set of unused muzzy pages is purged and/or reused, used to initialize
Explicitly create a new arena outside the range of automatically managed arenas, with optionally specified extent hooks, and return the new arena index\&.
When enabled, trigger a memory profile dump every time the total virtual memory exceeds the previous maximum\&. Profiles are dumped to files named according to the pattern
Maximum number of bytes in physically resident data pages mapped by the allocator, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages\&. This is a maximum rather than precise because pages may not actually be physically resident if they correspond to demand\-zeroed virtual memory that has not yet been touched\&. This is a multiple of the page size, and is larger than
Total number of bytes in active extents mapped by the allocator\&. This is larger than
stats\&.active\&. This does not include inactive extents, even those that contain unused dirty pages, which means that there is no strict ordering between this and
stats\&.mutexes\&.ctl\&.{counter}; (\fBcounter specific type\fR) r\- [\fB\-\-enable\-stats\fR]
.RS4
Statistics on
\fIctl\fR
mutex (global scope; mallctl related)\&.
{counter}
is one of the counters below:
.PP
.RS4
\fInum_ops\fR
(\fBuint64_t\fR): Total number of lock acquisition operations on this mutex\&.
.sp
\fInum_spin_acq\fR
(\fBuint64_t\fR): Number of times the mutex was spin\-acquired\&. When the mutex is currently locked and cannot be acquired immediately, a short period of spin\-retry within jemalloc will be performed\&. Acquired through spin generally means the contention was lightweight and not causing context switches\&.
.sp
\fInum_wait\fR
(\fBuint64_t\fR): Number of times the mutex was wait\-acquired, which means the mutex contention was not solved by spin\-retry, and blocking operation was likely involved in order to acquire the mutex\&. This event generally implies higher cost / longer delay, and should be investigated if it happens often\&.
.sp
\fImax_wait_time\fR
(\fBuint64_t\fR): Maximum length of time in nanoseconds spent on a single wait\-acquired lock operation\&. Note that to avoid profiling overhead on the common path, this does not consider spin\-acquired cases\&.
.sp
\fItotal_wait_time\fR
(\fBuint64_t\fR): Cumulative time in nanoseconds spent on wait\-acquired lock operations\&. Similarly, spin\-acquired cases are not considered\&.
.sp
\fImax_num_thds\fR
(\fBuint32_t\fR): Maximum number of threads waiting on this mutex simultaneously\&. Similarly, spin\-acquired cases are not considered\&.
.sp
\fInum_owner_switch\fR
(\fBuint64_t\fR): Number of times the current mutex owner is different from the previous one\&. This event does not generally imply an issue; rather it is an indicator of how often the protected data are accessed by different threads\&.
.RE
.RE
.PP
stats\&.mutexes\&.background_thread\&.{counter} (\fBcounter specific type\fR) r\- [\fB\-\-enable\-stats\fR]
.RS4
Statistics on
\fIbackground_thread\fR
mutex (global scope;
background_thread
related)\&.
{counter}
is one of the counters in
mutex profiling counters\&.
.RE
.PP
stats\&.mutexes\&.prof\&.{counter} (\fBcounter specific type\fR) r\- [\fB\-\-enable\-stats\fR]
Approximate time in milliseconds from the creation of a set of unused dirty pages until an equivalent set of unused dirty pages is purged and/or reused\&. See
Approximate time in milliseconds from the creation of a set of unused muzzy pages until an equivalent set of unused muzzy pages is purged and/or reused\&. See
Number of bytes dedicated to internal allocations\&. Internal allocations differ from application\-originated allocations in that they are for internal use, and that they are omitted from heap profiles\&.
Maximum number of bytes in physically resident data pages mapped by the arena, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages\&. This is a maximum rather than precise because pages may not actually be physically resident if they correspond to demand\-zeroed virtual memory that has not yet been touched\&. This is a multiple of the page size\&.
\m[blue]\fBgperftools package\fR\m[]\&\s-2\u[3]\d\s+2, the addition of per thread heap profiling functionality required a different heap profile format\&. The
When debugging, it is a good idea to configure/build jemalloc with the
\fB\-\-enable\-debug\fR
and
\fB\-\-enable\-fill\fR
options, and recompile the program with suitable options and symbols for debugger support\&. When so configured, jemalloc incorporates a wide variety of run\-time assertions that catch application errors such as double\-free, write\-after\-free, etc\&.
option) eliminates the symptoms of such bugs\&. Between these two options, it is usually possible to quickly detect, diagnose, and eliminate such bugs\&.
This implementation does not provide much detail about the problems it detects, because the performance impact for storing such information would be prohibitive\&.
malloc_stats_print(), followed by a string pointer\&. Please note that doing anything which tries to allocate memory in this function is likely to result in a crash or deadlock\&.