Skip to content

Commit af2431f

Browse files
committed
BUG#22246001 UNDOCUMENTED NEGATIVE LOW ESTIMATES IN P_S.MEMORY_SUMMARY_GLOBAL_BY_EVENT_NAME
Problem: ======== When monitoring the content of table performance_schema.memory_summary_global_by_event_name, with a production workload: - low water marks (LOW_COUNT_USED, LOW_NUMBER_OF_BYTES_USED) can have negative values - high watermarks (HIGH_COUNT_USED, HIGH_NUMBER_OF_BYTES_USED) have forever growing values, even when the server does not really keep increasing memory usage. Context: ======== The memory instrumentation maintains statistics as described below. When an ALLOC is instrumented: - the current count / size is increased - the 'alloc count / size capacity', which represents the distance between the current level and the high water mark, is (conditionally) reduced. - the 'free count / size capacity', which represents the distance between the current level and the low water mark, is (always) increased. Likewise, when a FREE is instrumented: - the current count / size is decreased - the 'alloc count / size capacity' is (always) increased. - the 'free count / size capacity' is (conditionally) increased. When an ALLOC followed by a corresponding FREE is instrumented, for the *same* stat buffer: - the current count / size is unchanged - the high watermark may be increased to reflect the top memory usage - the low watermark is unchanged. When repeating N balanced ALLOC + FREE operations as part of a workload, still for the same stat buffer: - the current count / size is unchanged, - the high watermark reflects the top usage, and is stable, - the low watermark reflects the bottom usage, and is stable. Root cause: =========== For some memory instruments however, the pattern seen at runtime is different. For memory instruments which are not flagged 'global', statistics are maintained per thread, then account / user / host, and then finally in a global buffer. For workloads where one thread acts as a producer of data, and *another* thread acts as a consumer of that data, and if the piece of memory transferred between the producer and the consumer is: - never UN CLAIMED (as in pfs_memory_claim_vc(false)) by the producer - never CLAIMED (as in pfs_memory_claim_vc(true)) by the consumer then the instrumentation behaves as follows. For the producer thread (here, the workload): - the current count / size continuously grows, because the workload is not balanced - the 'alloc count / size capacity' is reduced to 0 as the current level grows, which reflects that this thread is and stays at the high water mark - the 'free count / size capacity' is continuously growing, which reflects that the current level is further and further away from the low water mark For the consumer thread (typically a system thread like innodb cleaners): - the current count / size continuously decreases, is negative, and continues top decrease without limit, - the 'free count / size capacity' is reduced to 0 as the current level decreases, which reflects that this thread is and stays at the low water mark - the 'alloc count / size capacity' is continuously growing, which reflects that the current level is further and further away from the high water mark When aggregating each parts to table memory_summary_global_by_event_name: - the current count / size balances are resolved, and the result is accurate - the high watermark diverges to plus infinity, leading to forever growing high water marks - the low watermark diverges to negative infinity, leading to negative and forever decreasing low water marks The fundamental root cause is that the system fails to reconcile ALLOC and FREE statistics counted in different buckets, when a producer thread ALLOC and gives (without telling the instrumentation) and memory to FREE by a consumer thread. Fix: ==== A) A preliminary fix is a refactoring, to move complex inline code from pfs_stat.h to a new implementation file pfs_stat.cc. The functional fix itself is composed of several parts. B) Without explicit instrumentation (UN CLAIM), it is impossible to know if allocated memory will be later freed by the workload, or contributed by a producer thread, so alloc stats are unchanged when a thread runs. Upon thread termination however, when a thread statistics are aggregated to a parent account / user / host, the instrumentation: - inspects the thread balance - detects if the thread contributes net memory - aggregates the net memory contributed directly to the global buffer, instead of the parent. This is to avoid having intermediate buffers (account / user / host) hold statistics for unbalanced contributions. C) Given that the memory instrumentation marks an allocated block with the owner, it is possible upon FREE to detect that a consumer thread releases memory it did not allocate in the first place. In this case, in pfs_memory_free_vc(), the free operation is counted against the global bucket directly. The intent of B) and C) is to force the system to use the *same* statistics bucket in the producer / consumer scenario. D) For every memory summary tables: - memory_summary_by_thread_by_event_name - memory_summary_by_account_by_event_name - memory_summary_by_user_by_event_name - memory_summary_by_host_by_event_name - memory_summary_global_by_event_name the aggregation code is more complex, and inspects intermediate statistics closer, while performing sums. In particular, when a given bucket contains a net loss (current count / size is negative), we know that the FREE operation counted did not consume 'free count / size capacity' because there was none present from a previous ALLOC, so the capacity that should have been consumed is counted in members m_missing_free_count_capacity (respectively m_missing_free_size_capacity). Once the aggregation is complete, which means that the other bucket containing the matching net gain was also aggregated, a final normalization step occurs, to consume the low watermarks collected with the missing capacity collected. This helps to balance the low watermarks. E) In table memory_summary_global_by_event_name, low watermarks are by definition positive or zero, so the code adjust them if low estimates diverged. F) Finally, a performance improvement fix is implemented. When extending watermarks with the former ::apply_delta() method, the same code was used to adjust both high and low watermarks. This is un necessary, causing too much code to be executed, as only one watermark is moved at a time: the instrumentation records either an ALLOC or a FREE, but never both. As a result, apply_delta() is split into apply_alloc_delta() and apply_free_delta(), and all the calling code adjusted. A side effect is that realloc is no more instrumented as a realloc operation. It is now an ALLOC(new size) followed by a FREE(old size), which is actually more correct for the high water marks, because the total memory usage can reach old size + new size. Approved by: Chris Powers <[email protected]>
1 parent c72fa97 commit af2431f

21 files changed

+1414
-768
lines changed

storage/perfschema/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,7 @@ pfs_server.h
6767
pfs_setup_actor.h
6868
pfs_setup_object.h
6969
pfs_stat.h
70+
pfs_stat.cc
7071
pfs_status.h
7172
pfs_timer.h
7273
pfs_tls_channel.h

storage/perfschema/pfs.cc

Lines changed: 54 additions & 48 deletions
Original file line numberDiff line numberDiff line change
@@ -7483,16 +7483,16 @@ PSI_memory_key pfs_memory_alloc_vc(PSI_memory_key key, size_t size,
74837483

74847484
PFS_memory_safe_stat *event_name_array;
74857485
PFS_memory_safe_stat *stat;
7486-
PFS_memory_stat_delta delta_buffer;
7487-
PFS_memory_stat_delta *delta;
7486+
PFS_memory_stat_alloc_delta delta_buffer;
7487+
PFS_memory_stat_alloc_delta *delta;
74887488

74897489
/* Aggregate to MEMORY_SUMMARY_BY_THREAD_BY_EVENT_NAME */
74907490
event_name_array = pfs_thread->write_instr_class_memory_stats();
74917491
stat = &event_name_array[index];
74927492
delta = stat->count_alloc(size, &delta_buffer);
74937493

74947494
if (delta != nullptr) {
7495-
pfs_thread->carry_memory_stat_delta(delta, index);
7495+
pfs_thread->carry_memory_stat_alloc_delta(delta, index);
74967496
}
74977497

74987498
/* Flag this memory as owned by the current thread. */
@@ -7524,8 +7524,6 @@ PSI_memory_key pfs_memory_realloc_vc(PSI_memory_key key, size_t old_size,
75247524
}
75257525

75267526
uint index = klass->m_event_name_index;
7527-
PFS_memory_stat_delta delta_buffer;
7528-
PFS_memory_stat_delta *delta;
75297527

75307528
if (flag_thread_instrumentation && !klass->is_global()) {
75317529
PFS_thread *pfs_thread = my_thread_get_THR_PFS();
@@ -7548,17 +7546,25 @@ PSI_memory_key pfs_memory_realloc_vc(PSI_memory_key key, size_t old_size,
75487546
event_name_array = pfs_thread->write_instr_class_memory_stats();
75497547
stat = &event_name_array[index];
75507548

7549+
PFS_memory_stat_free_delta free_delta_buffer;
7550+
PFS_memory_stat_free_delta *free_delta;
7551+
75517552
if (flag_global_instrumentation && klass->m_enabled) {
7552-
delta = stat->count_realloc(old_size, new_size, &delta_buffer);
7553+
PFS_memory_stat_alloc_delta alloc_delta_buffer;
7554+
PFS_memory_stat_alloc_delta *alloc_delta;
7555+
alloc_delta = stat->count_alloc(new_size, &alloc_delta_buffer);
7556+
if (alloc_delta != nullptr) {
7557+
pfs_thread->carry_memory_stat_alloc_delta(alloc_delta, index);
7558+
}
75537559
*owner_thread_hdl = pfs_thread;
75547560
} else {
7555-
delta = stat->count_free(old_size, &delta_buffer);
75567561
*owner_thread_hdl = nullptr;
75577562
key = PSI_NOT_INSTRUMENTED;
75587563
}
75597564

7560-
if (delta != nullptr) {
7561-
pfs_thread->carry_memory_stat_delta(delta, index);
7565+
free_delta = stat->count_free(old_size, &free_delta_buffer);
7566+
if (free_delta != nullptr) {
7567+
pfs_thread->carry_memory_stat_free_delta(free_delta, index);
75627568
}
75637569
return key;
75647570
}
@@ -7572,7 +7578,8 @@ PSI_memory_key pfs_memory_realloc_vc(PSI_memory_key key, size_t old_size,
75727578
stat = &event_name_array[index];
75737579

75747580
if (flag_global_instrumentation && klass->m_enabled) {
7575-
stat->count_global_realloc(old_size, new_size);
7581+
stat->count_global_alloc(new_size);
7582+
stat->count_global_free(old_size);
75767583
} else {
75777584
stat->count_global_free(old_size);
75787585
key = PSI_NOT_INSTRUMENTED;
@@ -7606,8 +7613,6 @@ PSI_memory_key pfs_memory_claim_vc(PSI_memory_key key, size_t size,
76067613
*/
76077614

76087615
uint index = klass->m_event_name_index;
7609-
PFS_memory_stat_delta delta_buffer;
7610-
PFS_memory_stat_delta *delta;
76117616

76127617
PFS_thread *old_thread = sanitize_thread(*owner_thread);
76137618
PFS_thread *new_thread = my_thread_get_THR_PFS();
@@ -7639,19 +7644,22 @@ PSI_memory_key pfs_memory_claim_vc(PSI_memory_key key, size_t size,
76397644

76407645
if (old_thread != nullptr) {
76417646
/* 1: A FREE is counted against X. */
7647+
PFS_memory_stat_free_delta free_delta_buffer;
7648+
PFS_memory_stat_free_delta *free_delta;
7649+
76427650
event_name_local_array = old_thread->write_instr_class_memory_stats();
76437651
local_stat = &event_name_local_array[index];
7644-
delta = local_stat->count_free(size, &delta_buffer);
7652+
free_delta = local_stat->count_free(size, &free_delta_buffer);
76457653

7646-
if (delta != NULL) {
7647-
old_thread->carry_memory_stat_delta(delta, index);
7654+
if (free_delta != NULL) {
7655+
old_thread->carry_memory_stat_free_delta(free_delta, index);
76487656
}
76497657

76507658
/* 2: A MALLOC is counted globally. */
76517659
event_name_global_array = global_instr_class_memory_array;
76527660
if (event_name_global_array) {
76537661
global_stat = &event_name_global_array[index];
7654-
(void)global_stat->count_alloc(size, &delta_buffer);
7662+
global_stat->count_global_alloc(size);
76557663
}
76567664

76577665
/* 3: verify owner was X. */
@@ -7675,16 +7683,18 @@ PSI_memory_key pfs_memory_claim_vc(PSI_memory_key key, size_t size,
76757683
event_name_global_array = global_instr_class_memory_array;
76767684
if (event_name_global_array) {
76777685
global_stat = &event_name_global_array[index];
7678-
(void)global_stat->count_free(size, &delta_buffer);
7686+
global_stat->count_global_free(size);
76797687
}
76807688

76817689
/* 2: A MALLOC is counted against Y. */
7690+
PFS_memory_stat_alloc_delta alloc_delta_buffer;
7691+
PFS_memory_stat_alloc_delta *alloc_delta;
76827692
event_name_local_array = new_thread->write_instr_class_memory_stats();
76837693
local_stat = &event_name_local_array[index];
7684-
delta = local_stat->count_alloc(size, &delta_buffer);
7694+
alloc_delta = local_stat->count_alloc(size, &alloc_delta_buffer);
76857695

7686-
if (delta != nullptr) {
7687-
new_thread->carry_memory_stat_delta(delta, index);
7696+
if (alloc_delta != nullptr) {
7697+
new_thread->carry_memory_stat_alloc_delta(alloc_delta, index);
76887698
}
76897699

76907700
/* 3: set owner to Y. */
@@ -7710,40 +7720,36 @@ void pfs_memory_free_vc(PSI_memory_key key, size_t size,
77107720
*/
77117721

77127722
uint index = klass->m_event_name_index;
7713-
PFS_memory_stat_delta delta_buffer;
7714-
PFS_memory_stat_delta *delta;
7723+
PFS_memory_stat_free_delta delta_buffer;
7724+
PFS_memory_stat_free_delta *delta;
77157725

77167726
if (flag_thread_instrumentation && !klass->is_global()) {
77177727
PFS_thread *pfs_thread = my_thread_get_THR_PFS();
7728+
PFS_thread *owner_thread = reinterpret_cast<PFS_thread *>(owner);
77187729
if (likely(pfs_thread != nullptr)) {
7719-
#ifdef PFS_PARANOID
7720-
PFS_thread *owner_thread = reinterpret_cast<PFS_thread *>(owner);
7721-
7722-
if (owner_thread != pfs_thread) {
7723-
owner_thread = sanitize_thread(owner_thread);
7724-
if (owner_thread != nullptr) {
7725-
report_memory_accounting_error("pfs_memory_free_vc", pfs_thread, size,
7726-
klass, owner_thread);
7730+
if (pfs_thread == owner_thread) {
7731+
/*
7732+
Do not check pfs_thread->m_enabled.
7733+
If a memory alloc was instrumented,
7734+
the corresponding free must be instrumented.
7735+
*/
7736+
/* Aggregate to MEMORY_SUMMARY_BY_THREAD_BY_EVENT_NAME */
7737+
PFS_memory_safe_stat *event_name_array;
7738+
PFS_memory_safe_stat *stat;
7739+
event_name_array = pfs_thread->write_instr_class_memory_stats();
7740+
stat = &event_name_array[index];
7741+
delta = stat->count_free(size, &delta_buffer);
7742+
7743+
if (delta != nullptr) {
7744+
pfs_thread->carry_memory_stat_free_delta(delta, index);
77277745
}
7746+
return;
77287747
}
7729-
#endif /* PFS_PARANOID */
7730-
7731-
/*
7732-
Do not check pfs_thread->m_enabled.
7733-
If a memory alloc was instrumented,
7734-
the corresponding free must be instrumented.
7735-
*/
7736-
/* Aggregate to MEMORY_SUMMARY_BY_THREAD_BY_EVENT_NAME */
7737-
PFS_memory_safe_stat *event_name_array;
7738-
PFS_memory_safe_stat *stat;
7739-
event_name_array = pfs_thread->write_instr_class_memory_stats();
7740-
stat = &event_name_array[index];
7741-
delta = stat->count_free(size, &delta_buffer);
7742-
7743-
if (delta != nullptr) {
7744-
pfs_thread->carry_memory_stat_delta(delta, index);
7745-
}
7746-
return;
7748+
#ifdef PFS_PARANOID
7749+
report_memory_accounting_error("pfs_memory_free_vc", pfs_thread, size,
7750+
klass, owner_thread);
7751+
#endif
7752+
/* Fall back to global aggregate below. */
77477753
}
77487754
}
77497755

storage/perfschema/pfs_account.cc

Lines changed: 36 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -615,32 +615,60 @@ void PFS_account::rebase_memory_stats() {
615615
}
616616
}
617617

618-
void PFS_account::carry_memory_stat_delta(PFS_memory_stat_delta *delta,
619-
uint index) {
618+
void PFS_account::carry_memory_stat_alloc_delta(
619+
PFS_memory_stat_alloc_delta *delta, uint index) {
620620
PFS_memory_shared_stat *event_name_array;
621621
PFS_memory_shared_stat *stat;
622-
PFS_memory_stat_delta delta_buffer;
623-
PFS_memory_stat_delta *remaining_delta;
622+
PFS_memory_stat_alloc_delta delta_buffer;
623+
PFS_memory_stat_alloc_delta *remaining_delta;
624624

625625
event_name_array = write_instr_class_memory_stats();
626626
stat = &event_name_array[index];
627-
remaining_delta = stat->apply_delta(delta, &delta_buffer);
627+
remaining_delta = stat->apply_alloc_delta(delta, &delta_buffer);
628628

629629
if (remaining_delta == nullptr) {
630630
return;
631631
}
632632

633633
if (m_user != nullptr) {
634-
m_user->carry_memory_stat_delta(remaining_delta, index);
634+
m_user->carry_memory_stat_alloc_delta(remaining_delta, index);
635635
/* do not return, need to process m_host below */
636636
}
637637

638638
if (m_host != nullptr) {
639-
m_host->carry_memory_stat_delta(remaining_delta, index);
639+
m_host->carry_memory_stat_alloc_delta(remaining_delta, index);
640640
return;
641641
}
642642

643-
carry_global_memory_stat_delta(remaining_delta, index);
643+
carry_global_memory_stat_alloc_delta(remaining_delta, index);
644+
}
645+
646+
void PFS_account::carry_memory_stat_free_delta(
647+
PFS_memory_stat_free_delta *delta, uint index) {
648+
PFS_memory_shared_stat *event_name_array;
649+
PFS_memory_shared_stat *stat;
650+
PFS_memory_stat_free_delta delta_buffer;
651+
PFS_memory_stat_free_delta *remaining_delta;
652+
653+
event_name_array = write_instr_class_memory_stats();
654+
stat = &event_name_array[index];
655+
remaining_delta = stat->apply_free_delta(delta, &delta_buffer);
656+
657+
if (remaining_delta == nullptr) {
658+
return;
659+
}
660+
661+
if (m_user != nullptr) {
662+
m_user->carry_memory_stat_free_delta(remaining_delta, index);
663+
/* do not return, need to process m_host below */
664+
}
665+
666+
if (m_host != nullptr) {
667+
m_host->carry_memory_stat_free_delta(remaining_delta, index);
668+
return;
669+
}
670+
671+
carry_global_memory_stat_free_delta(remaining_delta, index);
644672
}
645673

646674
PFS_account *sanitize_account(PFS_account *unsafe) {

storage/perfschema/pfs_account.h

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -44,7 +44,8 @@ struct PFS_global_param;
4444
struct PFS_user;
4545
struct PFS_host;
4646
struct PFS_thread;
47-
struct PFS_memory_stat_delta;
47+
struct PFS_memory_stat_alloc_delta;
48+
struct PFS_memory_stat_free_delta;
4849
struct PFS_memory_shared_stat;
4950

5051
/**
@@ -88,7 +89,10 @@ struct PFS_ALIGNED PFS_account : PFS_connection_slice {
8889
/** Reset all memory statistics. */
8990
void rebase_memory_stats();
9091

91-
void carry_memory_stat_delta(PFS_memory_stat_delta *delta, uint index);
92+
void carry_memory_stat_alloc_delta(PFS_memory_stat_alloc_delta *delta,
93+
uint index);
94+
void carry_memory_stat_free_delta(PFS_memory_stat_free_delta *delta,
95+
uint index);
9296

9397
void set_instr_class_memory_stats(PFS_memory_shared_stat *array) {
9498
m_has_memory_stats = false;

storage/perfschema/pfs_host.cc

Lines changed: 22 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -290,19 +290,35 @@ void PFS_host::rebase_memory_stats() {
290290
}
291291
}
292292

293-
void PFS_host::carry_memory_stat_delta(PFS_memory_stat_delta *delta,
294-
uint index) {
293+
void PFS_host::carry_memory_stat_alloc_delta(PFS_memory_stat_alloc_delta *delta,
294+
uint index) {
295295
PFS_memory_shared_stat *event_name_array;
296296
PFS_memory_shared_stat *stat;
297-
PFS_memory_stat_delta delta_buffer;
298-
PFS_memory_stat_delta *remaining_delta;
297+
PFS_memory_stat_alloc_delta delta_buffer;
298+
PFS_memory_stat_alloc_delta *remaining_delta;
299299

300300
event_name_array = write_instr_class_memory_stats();
301301
stat = &event_name_array[index];
302-
remaining_delta = stat->apply_delta(delta, &delta_buffer);
302+
remaining_delta = stat->apply_alloc_delta(delta, &delta_buffer);
303303

304304
if (remaining_delta != nullptr) {
305-
carry_global_memory_stat_delta(remaining_delta, index);
305+
carry_global_memory_stat_alloc_delta(remaining_delta, index);
306+
}
307+
}
308+
309+
void PFS_host::carry_memory_stat_free_delta(PFS_memory_stat_free_delta *delta,
310+
uint index) {
311+
PFS_memory_shared_stat *event_name_array;
312+
PFS_memory_shared_stat *stat;
313+
PFS_memory_stat_free_delta delta_buffer;
314+
PFS_memory_stat_free_delta *remaining_delta;
315+
316+
event_name_array = write_instr_class_memory_stats();
317+
stat = &event_name_array[index];
318+
remaining_delta = stat->apply_free_delta(delta, &delta_buffer);
319+
320+
if (remaining_delta != nullptr) {
321+
carry_global_memory_stat_free_delta(remaining_delta, index);
306322
}
307323
}
308324

storage/perfschema/pfs_host.h

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,8 @@
4141
#include "storage/perfschema/pfs_lock.h"
4242

4343
struct PFS_global_param;
44-
struct PFS_memory_stat_delta;
44+
struct PFS_memory_stat_alloc_delta;
45+
struct PFS_memory_stat_free_delta;
4546
struct PFS_memory_shared_stat;
4647
struct PFS_thread;
4748

@@ -86,7 +87,10 @@ struct PFS_ALIGNED PFS_host : PFS_connection_slice {
8687
/** Reset all memory statistics. */
8788
void rebase_memory_stats();
8889

89-
void carry_memory_stat_delta(PFS_memory_stat_delta *delta, uint index);
90+
void carry_memory_stat_alloc_delta(PFS_memory_stat_alloc_delta *delta,
91+
uint index);
92+
void carry_memory_stat_free_delta(PFS_memory_stat_free_delta *delta,
93+
uint index);
9094

9195
void set_instr_class_memory_stats(PFS_memory_shared_stat *array) {
9296
m_has_memory_stats = false;

0 commit comments

Comments
 (0)