Systemd/src/basic/hashmap.h

432 lines
19 KiB
C
Raw Normal View History

/* SPDX-License-Identifier: LGPL-2.1+ */
#pragma once
2009-11-18 00:42:52 +01:00
#include <limits.h>
2009-11-18 00:42:52 +01:00
#include <stdbool.h>
#include <stddef.h>
2009-11-18 00:42:52 +01:00
#include "hash-funcs.h"
#include "macro.h"
#include "util.h"
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/*
* A hash table implementation. As a minor optimization a NULL hashmap object
* will be treated as empty hashmap for all read operations. That way it is not
* necessary to instantiate an object for each Hashmap use.
*
* If ENABLE_DEBUG_HASHMAP is defined (by configuring with --enable-debug=hashmap),
* the implementation will:
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
* - store extra data for debugging and statistics (see tools/gdb-sd_dump_hashmaps.py)
* - perform extra checks for invalid use of iterators
*/
2009-11-18 00:42:52 +01:00
#define HASH_KEY_SIZE 16
typedef void* (*hashmap_destroy_t)(void *p);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/* The base type for all hashmap and set types. Many functions in the
* implementation take (HashmapBase*) parameters and are run-time polymorphic,
* though the API is not meant to be polymorphic (do not call functions
* internal_*() directly). */
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
typedef struct HashmapBase HashmapBase;
/* Specific hashmap/set types */
typedef struct Hashmap Hashmap; /* Maps keys to values */
typedef struct OrderedHashmap OrderedHashmap; /* Like Hashmap, but also remembers entry insertion order */
typedef struct Set Set; /* Stores just keys */
basic: implement the IteratedCache Adds the basics of the IteratedCache and constructor support for the Hashmap and OrderedHashmap types. iterated_cache_get() is responsible for synchronizing the cache with the associated Hashmap and making it available to the caller at the supplied result pointers. Since iterated_cache_get() may need to allocate memory, it may fail, so callers must check the return value. On success, pointer arrays containing pointers to the associated Hashmap's keys and values, in as-iterated order, are returned in res_keys and res_values, respectively. Either may be supplied as NULL to inhibit caching of the keys or values, respectively. Note that if the cached Hashmap hasn't changed since the previous call to iterated_cache_get(), and it's not a call activating caching of the values or keys, the cost is effectively zero as the resulting pointers will simply refer to the previously returned arrays as-is. A cleanup function has also been added, iterated_cache_free(). This only frees the IteratedCache container and related arrays. The associated Hashmap, its keys, and values are not affected. Also note that the associated Hashmap does not automatically free its associated IteratedCache when freed. One could, in theory, safely access the arrays returned by a successful iterated_cache_get() call after its associated Hashmap has been freed, including the referenced values and keys. Provided the iterated_cache_get() was performed prior to the hashmap free, and that the type of hashmap free performed didn't free keys and/or values as well.
2018-01-27 01:38:01 +01:00
typedef struct IteratedCache IteratedCache; /* Caches the iterated order of one of the above */
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/* Ideally the Iterator would be an opaque struct, but it is instantiated
* by hashmap users, so the definition has to be here. Do not use its fields
* directly. */
typedef struct {
unsigned idx; /* index of an entry to be iterated next */
const void *next_key; /* expected value of that entry's key pointer */
#if ENABLE_DEBUG_HASHMAP
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
unsigned put_count; /* hashmap's put_count recorded at start of iteration */
unsigned rem_count; /* hashmap's rem_count in previous iteration */
unsigned prev_idx; /* idx in previous iteration */
#endif
} Iterator;
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
#define _IDX_ITERATOR_FIRST (UINT_MAX - 1)
#define ITERATOR_FIRST ((Iterator) { .idx = _IDX_ITERATOR_FIRST, .next_key = NULL })
2009-11-18 00:42:52 +01:00
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/* Macros for type checking */
#define PTR_COMPATIBLE_WITH_HASHMAP_BASE(h) \
(__builtin_types_compatible_p(typeof(h), HashmapBase*) || \
__builtin_types_compatible_p(typeof(h), Hashmap*) || \
__builtin_types_compatible_p(typeof(h), OrderedHashmap*) || \
__builtin_types_compatible_p(typeof(h), Set*))
#define PTR_COMPATIBLE_WITH_PLAIN_HASHMAP(h) \
(__builtin_types_compatible_p(typeof(h), Hashmap*) || \
__builtin_types_compatible_p(typeof(h), OrderedHashmap*)) \
#define HASHMAP_BASE(h) \
__builtin_choose_expr(PTR_COMPATIBLE_WITH_HASHMAP_BASE(h), \
(HashmapBase*)(h), \
(void)0)
#define PLAIN_HASHMAP(h) \
__builtin_choose_expr(PTR_COMPATIBLE_WITH_PLAIN_HASHMAP(h), \
(Hashmap*)(h), \
(void)0)
#if ENABLE_DEBUG_HASHMAP
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
# define HASHMAP_DEBUG_PARAMS , const char *func, const char *file, int line
# define HASHMAP_DEBUG_SRC_ARGS , __func__, __FILE__, __LINE__
# define HASHMAP_DEBUG_PASS_ARGS , func, file, line
#else
# define HASHMAP_DEBUG_PARAMS
# define HASHMAP_DEBUG_SRC_ARGS
# define HASHMAP_DEBUG_PASS_ARGS
#endif
Hashmap *internal_hashmap_new(const struct hash_ops *hash_ops HASHMAP_DEBUG_PARAMS);
OrderedHashmap *internal_ordered_hashmap_new(const struct hash_ops *hash_ops HASHMAP_DEBUG_PARAMS);
#define hashmap_new(ops) internal_hashmap_new(ops HASHMAP_DEBUG_SRC_ARGS)
#define ordered_hashmap_new(ops) internal_ordered_hashmap_new(ops HASHMAP_DEBUG_SRC_ARGS)
HashmapBase *internal_hashmap_free(HashmapBase *h, free_func_t default_free_key, free_func_t default_free_value);
static inline Hashmap *hashmap_free(Hashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), NULL, NULL);
}
static inline OrderedHashmap *ordered_hashmap_free(OrderedHashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), NULL, NULL);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
static inline Hashmap *hashmap_free_free(Hashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), NULL, free);
}
static inline OrderedHashmap *ordered_hashmap_free_free(OrderedHashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), NULL, free);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline Hashmap *hashmap_free_free_key(Hashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), free, NULL);
}
static inline OrderedHashmap *ordered_hashmap_free_free_key(OrderedHashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), free, NULL);
}
static inline Hashmap *hashmap_free_free_free(Hashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), free, free);
}
static inline OrderedHashmap *ordered_hashmap_free_free_free(OrderedHashmap *h) {
return (void*) internal_hashmap_free(HASHMAP_BASE(h), free, free);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
basic: implement the IteratedCache Adds the basics of the IteratedCache and constructor support for the Hashmap and OrderedHashmap types. iterated_cache_get() is responsible for synchronizing the cache with the associated Hashmap and making it available to the caller at the supplied result pointers. Since iterated_cache_get() may need to allocate memory, it may fail, so callers must check the return value. On success, pointer arrays containing pointers to the associated Hashmap's keys and values, in as-iterated order, are returned in res_keys and res_values, respectively. Either may be supplied as NULL to inhibit caching of the keys or values, respectively. Note that if the cached Hashmap hasn't changed since the previous call to iterated_cache_get(), and it's not a call activating caching of the values or keys, the cost is effectively zero as the resulting pointers will simply refer to the previously returned arrays as-is. A cleanup function has also been added, iterated_cache_free(). This only frees the IteratedCache container and related arrays. The associated Hashmap, its keys, and values are not affected. Also note that the associated Hashmap does not automatically free its associated IteratedCache when freed. One could, in theory, safely access the arrays returned by a successful iterated_cache_get() call after its associated Hashmap has been freed, including the referenced values and keys. Provided the iterated_cache_get() was performed prior to the hashmap free, and that the type of hashmap free performed didn't free keys and/or values as well.
2018-01-27 01:38:01 +01:00
IteratedCache *iterated_cache_free(IteratedCache *cache);
int iterated_cache_get(IteratedCache *cache, const void ***res_keys, const void ***res_values, unsigned *res_n_entries);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
HashmapBase *internal_hashmap_copy(HashmapBase *h);
static inline Hashmap *hashmap_copy(Hashmap *h) {
return (Hashmap*) internal_hashmap_copy(HASHMAP_BASE(h));
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline OrderedHashmap *ordered_hashmap_copy(OrderedHashmap *h) {
return (OrderedHashmap*) internal_hashmap_copy(HASHMAP_BASE(h));
}
2009-11-18 00:42:52 +01:00
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
int internal_hashmap_ensure_allocated(Hashmap **h, const struct hash_ops *hash_ops HASHMAP_DEBUG_PARAMS);
int internal_ordered_hashmap_ensure_allocated(OrderedHashmap **h, const struct hash_ops *hash_ops HASHMAP_DEBUG_PARAMS);
#define hashmap_ensure_allocated(h, ops) internal_hashmap_ensure_allocated(h, ops HASHMAP_DEBUG_SRC_ARGS)
#define ordered_hashmap_ensure_allocated(h, ops) internal_ordered_hashmap_ensure_allocated(h, ops HASHMAP_DEBUG_SRC_ARGS)
basic: implement the IteratedCache Adds the basics of the IteratedCache and constructor support for the Hashmap and OrderedHashmap types. iterated_cache_get() is responsible for synchronizing the cache with the associated Hashmap and making it available to the caller at the supplied result pointers. Since iterated_cache_get() may need to allocate memory, it may fail, so callers must check the return value. On success, pointer arrays containing pointers to the associated Hashmap's keys and values, in as-iterated order, are returned in res_keys and res_values, respectively. Either may be supplied as NULL to inhibit caching of the keys or values, respectively. Note that if the cached Hashmap hasn't changed since the previous call to iterated_cache_get(), and it's not a call activating caching of the values or keys, the cost is effectively zero as the resulting pointers will simply refer to the previously returned arrays as-is. A cleanup function has also been added, iterated_cache_free(). This only frees the IteratedCache container and related arrays. The associated Hashmap, its keys, and values are not affected. Also note that the associated Hashmap does not automatically free its associated IteratedCache when freed. One could, in theory, safely access the arrays returned by a successful iterated_cache_get() call after its associated Hashmap has been freed, including the referenced values and keys. Provided the iterated_cache_get() was performed prior to the hashmap free, and that the type of hashmap free performed didn't free keys and/or values as well.
2018-01-27 01:38:01 +01:00
IteratedCache *internal_hashmap_iterated_cache_new(HashmapBase *h);
static inline IteratedCache *hashmap_iterated_cache_new(Hashmap *h) {
return (IteratedCache*) internal_hashmap_iterated_cache_new(HASHMAP_BASE(h));
}
static inline IteratedCache *ordered_hashmap_iterated_cache_new(OrderedHashmap *h) {
return (IteratedCache*) internal_hashmap_iterated_cache_new(HASHMAP_BASE(h));
}
2009-11-18 00:42:52 +01:00
int hashmap_put(Hashmap *h, const void *key, void *value);
static inline int ordered_hashmap_put(OrderedHashmap *h, const void *key, void *value) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_put(PLAIN_HASHMAP(h), key, value);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
int hashmap_update(Hashmap *h, const void *key, void *value);
static inline int ordered_hashmap_update(OrderedHashmap *h, const void *key, void *value) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_update(PLAIN_HASHMAP(h), key, value);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
int hashmap_replace(Hashmap *h, const void *key, void *value);
static inline int ordered_hashmap_replace(OrderedHashmap *h, const void *key, void *value) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_replace(PLAIN_HASHMAP(h), key, value);
}
void *internal_hashmap_get(HashmapBase *h, const void *key);
static inline void *hashmap_get(Hashmap *h, const void *key) {
return internal_hashmap_get(HASHMAP_BASE(h), key);
}
static inline void *ordered_hashmap_get(OrderedHashmap *h, const void *key) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_get(HASHMAP_BASE(h), key);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
2013-04-26 18:40:08 +02:00
void *hashmap_get2(Hashmap *h, const void *key, void **rkey);
static inline void *ordered_hashmap_get2(OrderedHashmap *h, const void *key, void **rkey) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_get2(PLAIN_HASHMAP(h), key, rkey);
}
bool internal_hashmap_contains(HashmapBase *h, const void *key);
static inline bool hashmap_contains(Hashmap *h, const void *key) {
return internal_hashmap_contains(HASHMAP_BASE(h), key);
}
static inline bool ordered_hashmap_contains(OrderedHashmap *h, const void *key) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_contains(HASHMAP_BASE(h), key);
}
void *internal_hashmap_remove(HashmapBase *h, const void *key);
static inline void *hashmap_remove(Hashmap *h, const void *key) {
return internal_hashmap_remove(HASHMAP_BASE(h), key);
}
static inline void *ordered_hashmap_remove(OrderedHashmap *h, const void *key) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_remove(HASHMAP_BASE(h), key);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
void *hashmap_remove2(Hashmap *h, const void *key, void **rkey);
static inline void *ordered_hashmap_remove2(OrderedHashmap *h, const void *key, void **rkey) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_remove2(PLAIN_HASHMAP(h), key, rkey);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
void *internal_hashmap_remove_value(HashmapBase *h, const void *key, void *value);
static inline void *hashmap_remove_value(Hashmap *h, const void *key, void *value) {
return internal_hashmap_remove_value(HASHMAP_BASE(h), key, value);
}
static inline void *ordered_hashmap_remove_value(OrderedHashmap *h, const void *key, void *value) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_remove_value(PLAIN_HASHMAP(h), key, value);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
int hashmap_remove_and_put(Hashmap *h, const void *old_key, const void *new_key, void *value);
static inline int ordered_hashmap_remove_and_put(OrderedHashmap *h, const void *old_key, const void *new_key, void *value) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_remove_and_put(PLAIN_HASHMAP(h), old_key, new_key, value);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
int hashmap_remove_and_replace(Hashmap *h, const void *old_key, const void *new_key, void *value);
static inline int ordered_hashmap_remove_and_replace(OrderedHashmap *h, const void *old_key, const void *new_key, void *value) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return hashmap_remove_and_replace(PLAIN_HASHMAP(h), old_key, new_key, value);
}
2009-11-18 00:42:52 +01:00
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/* Since merging data from a OrderedHashmap into a Hashmap or vice-versa
* should just work, allow this by having looser type-checking here. */
int internal_hashmap_merge(Hashmap *h, Hashmap *other);
#define hashmap_merge(h, other) internal_hashmap_merge(PLAIN_HASHMAP(h), PLAIN_HASHMAP(other))
#define ordered_hashmap_merge(h, other) hashmap_merge(h, other)
int internal_hashmap_reserve(HashmapBase *h, unsigned entries_add);
static inline int hashmap_reserve(Hashmap *h, unsigned entries_add) {
return internal_hashmap_reserve(HASHMAP_BASE(h), entries_add);
}
static inline int ordered_hashmap_reserve(OrderedHashmap *h, unsigned entries_add) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_reserve(HASHMAP_BASE(h), entries_add);
}
int internal_hashmap_move(HashmapBase *h, HashmapBase *other);
/* Unlike hashmap_merge, hashmap_move does not allow mixing the types. */
static inline int hashmap_move(Hashmap *h, Hashmap *other) {
return internal_hashmap_move(HASHMAP_BASE(h), HASHMAP_BASE(other));
}
static inline int ordered_hashmap_move(OrderedHashmap *h, OrderedHashmap *other) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_move(HASHMAP_BASE(h), HASHMAP_BASE(other));
}
int internal_hashmap_move_one(HashmapBase *h, HashmapBase *other, const void *key);
static inline int hashmap_move_one(Hashmap *h, Hashmap *other, const void *key) {
return internal_hashmap_move_one(HASHMAP_BASE(h), HASHMAP_BASE(other), key);
}
static inline int ordered_hashmap_move_one(OrderedHashmap *h, OrderedHashmap *other, const void *key) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_move_one(HASHMAP_BASE(h), HASHMAP_BASE(other), key);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
unsigned internal_hashmap_size(HashmapBase *h) _pure_;
static inline unsigned hashmap_size(Hashmap *h) {
return internal_hashmap_size(HASHMAP_BASE(h));
}
static inline unsigned ordered_hashmap_size(OrderedHashmap *h) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_size(HASHMAP_BASE(h));
}
static inline bool hashmap_isempty(Hashmap *h) {
return hashmap_size(h) == 0;
}
static inline bool ordered_hashmap_isempty(OrderedHashmap *h) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return ordered_hashmap_size(h) == 0;
}
unsigned internal_hashmap_buckets(HashmapBase *h) _pure_;
static inline unsigned hashmap_buckets(Hashmap *h) {
return internal_hashmap_buckets(HASHMAP_BASE(h));
}
static inline unsigned ordered_hashmap_buckets(OrderedHashmap *h) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_buckets(HASHMAP_BASE(h));
}
2009-11-18 00:42:52 +01:00
bool internal_hashmap_iterate(HashmapBase *h, Iterator *i, void **value, const void **key);
static inline bool hashmap_iterate(Hashmap *h, Iterator *i, void **value, const void **key) {
return internal_hashmap_iterate(HASHMAP_BASE(h), i, value, key);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
static inline bool ordered_hashmap_iterate(OrderedHashmap *h, Iterator *i, void **value, const void **key) {
return internal_hashmap_iterate(HASHMAP_BASE(h), i, value, key);
}
2009-11-18 00:42:52 +01:00
void internal_hashmap_clear(HashmapBase *h, free_func_t default_free_key, free_func_t default_free_value);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline void hashmap_clear(Hashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), NULL, NULL);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
static inline void ordered_hashmap_clear(OrderedHashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), NULL, NULL);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
static inline void hashmap_clear_free(Hashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), NULL, free);
}
static inline void ordered_hashmap_clear_free(OrderedHashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), NULL, free);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline void hashmap_clear_free_key(Hashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), free, NULL);
}
static inline void ordered_hashmap_clear_free_key(OrderedHashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), free, NULL);
}
static inline void hashmap_clear_free_free(Hashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), free, free);
}
static inline void ordered_hashmap_clear_free_free(OrderedHashmap *h) {
internal_hashmap_clear(HASHMAP_BASE(h), free, free);
}
2012-07-03 16:09:36 +02:00
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/*
* Note about all *_first*() functions
*
* For plain Hashmaps and Sets the order of entries is undefined.
* The functions find whatever entry is first in the implementation
* internal order.
*
* Only for OrderedHashmaps the order is well defined and finding
* the first entry is O(1).
*/
void *internal_hashmap_first_key_and_value(HashmapBase *h, bool remove, void **ret_key);
static inline void *hashmap_steal_first_key_and_value(Hashmap *h, void **ret) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), true, ret);
}
static inline void *ordered_hashmap_steal_first_key_and_value(OrderedHashmap *h, void **ret) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), true, ret);
}
static inline void *hashmap_first_key_and_value(Hashmap *h, void **ret) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), false, ret);
}
static inline void *ordered_hashmap_first_key_and_value(OrderedHashmap *h, void **ret) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), false, ret);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline void *hashmap_steal_first(Hashmap *h) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), true, NULL);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
static inline void *ordered_hashmap_steal_first(OrderedHashmap *h) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), true, NULL);
}
static inline void *hashmap_first(Hashmap *h) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), false, NULL);
}
static inline void *ordered_hashmap_first(OrderedHashmap *h) {
return internal_hashmap_first_key_and_value(HASHMAP_BASE(h), false, NULL);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
static inline void *internal_hashmap_first_key(HashmapBase *h, bool remove) {
void *key = NULL;
(void) internal_hashmap_first_key_and_value(HASHMAP_BASE(h), remove, &key);
return key;
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline void *hashmap_steal_first_key(Hashmap *h) {
return internal_hashmap_first_key(HASHMAP_BASE(h), true);
}
static inline void *ordered_hashmap_steal_first_key(OrderedHashmap *h) {
return internal_hashmap_first_key(HASHMAP_BASE(h), true);
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
static inline void *hashmap_first_key(Hashmap *h) {
return internal_hashmap_first_key(HASHMAP_BASE(h), false);
}
static inline void *ordered_hashmap_first_key(OrderedHashmap *h) {
return internal_hashmap_first_key(HASHMAP_BASE(h), false);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
}
#define hashmap_clear_with_destructor(_s, _f) \
({ \
void *_item; \
while ((_item = hashmap_steal_first(_s))) \
_f(_item); \
})
#define hashmap_free_with_destructor(_s, _f) \
({ \
hashmap_clear_with_destructor(_s, _f); \
hashmap_free(_s); \
})
#define ordered_hashmap_clear_with_destructor(_s, _f) \
({ \
void *_item; \
while ((_item = ordered_hashmap_steal_first(_s))) \
_f(_item); \
})
#define ordered_hashmap_free_with_destructor(_s, _f) \
({ \
ordered_hashmap_clear_with_destructor(_s, _f); \
ordered_hashmap_free(_s); \
})
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/* no hashmap_next */
void *ordered_hashmap_next(OrderedHashmap *h, const void *key);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
char **internal_hashmap_get_strv(HashmapBase *h);
static inline char **hashmap_get_strv(Hashmap *h) {
return internal_hashmap_get_strv(HASHMAP_BASE(h));
}
static inline char **ordered_hashmap_get_strv(OrderedHashmap *h) {
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
return internal_hashmap_get_strv(HASHMAP_BASE(h));
}
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
/*
* Hashmaps are iterated in unpredictable order.
* OrderedHashmaps are an exception to this. They are iterated in the order
* the entries were inserted.
* It is safe to remove the current entry.
*/
#define HASHMAP_FOREACH(e, h, i) \
for ((i) = ITERATOR_FIRST; hashmap_iterate((h), &(i), (void**)&(e), NULL); )
2009-11-18 00:42:52 +01:00
#define ORDERED_HASHMAP_FOREACH(e, h, i) \
for ((i) = ITERATOR_FIRST; ordered_hashmap_iterate((h), &(i), (void**)&(e), NULL); )
#define HASHMAP_FOREACH_KEY(e, k, h, i) \
for ((i) = ITERATOR_FIRST; hashmap_iterate((h), &(i), (void**)&(e), (const void**) &(k)); )
2010-01-19 04:15:20 +01:00
#define ORDERED_HASHMAP_FOREACH_KEY(e, k, h, i) \
for ((i) = ITERATOR_FIRST; ordered_hashmap_iterate((h), &(i), (void**)&(e), (const void**) &(k)); )
DEFINE_TRIVIAL_CLEANUP_FUNC(Hashmap*, hashmap_free);
DEFINE_TRIVIAL_CLEANUP_FUNC(Hashmap*, hashmap_free_free);
DEFINE_TRIVIAL_CLEANUP_FUNC(Hashmap*, hashmap_free_free_key);
DEFINE_TRIVIAL_CLEANUP_FUNC(Hashmap*, hashmap_free_free_free);
DEFINE_TRIVIAL_CLEANUP_FUNC(OrderedHashmap*, ordered_hashmap_free);
DEFINE_TRIVIAL_CLEANUP_FUNC(OrderedHashmap*, ordered_hashmap_free_free);
DEFINE_TRIVIAL_CLEANUP_FUNC(OrderedHashmap*, ordered_hashmap_free_free_key);
DEFINE_TRIVIAL_CLEANUP_FUNC(OrderedHashmap*, ordered_hashmap_free_free_free);
hashmap: rewrite the implementation This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-15 01:27:16 +02:00
#define _cleanup_hashmap_free_ _cleanup_(hashmap_freep)
#define _cleanup_hashmap_free_free_ _cleanup_(hashmap_free_freep)
#define _cleanup_hashmap_free_free_free_ _cleanup_(hashmap_free_free_freep)
#define _cleanup_ordered_hashmap_free_ _cleanup_(ordered_hashmap_freep)
#define _cleanup_ordered_hashmap_free_free_ _cleanup_(ordered_hashmap_free_freep)
#define _cleanup_ordered_hashmap_free_free_free_ _cleanup_(ordered_hashmap_free_free_freep)
basic: implement the IteratedCache Adds the basics of the IteratedCache and constructor support for the Hashmap and OrderedHashmap types. iterated_cache_get() is responsible for synchronizing the cache with the associated Hashmap and making it available to the caller at the supplied result pointers. Since iterated_cache_get() may need to allocate memory, it may fail, so callers must check the return value. On success, pointer arrays containing pointers to the associated Hashmap's keys and values, in as-iterated order, are returned in res_keys and res_values, respectively. Either may be supplied as NULL to inhibit caching of the keys or values, respectively. Note that if the cached Hashmap hasn't changed since the previous call to iterated_cache_get(), and it's not a call activating caching of the values or keys, the cost is effectively zero as the resulting pointers will simply refer to the previously returned arrays as-is. A cleanup function has also been added, iterated_cache_free(). This only frees the IteratedCache container and related arrays. The associated Hashmap, its keys, and values are not affected. Also note that the associated Hashmap does not automatically free its associated IteratedCache when freed. One could, in theory, safely access the arrays returned by a successful iterated_cache_get() call after its associated Hashmap has been freed, including the referenced values and keys. Provided the iterated_cache_get() was performed prior to the hashmap free, and that the type of hashmap free performed didn't free keys and/or values as well.
2018-01-27 01:38:01 +01:00
DEFINE_TRIVIAL_CLEANUP_FUNC(IteratedCache*, iterated_cache_free);
#define _cleanup_iterated_cache_free_ _cleanup_(iterated_cache_freep)