TreeInterpreter creates reference cycle, causing GC pressure #291

Open
@mrry

Description

We recently noticed that a heavy JMESPath workload was triggering a large number of garbage collection runs. We are using jmespath.compile(), and we tracked this down to the jmespath.visitor.TreeInterpreter that is created on every call to ParsedResult.search():

interpreter = visitor.TreeInterpreter(options)

It appears that TreeInterpreter creates a reference cycle, which leads to the GC being triggered frequently to clean up the cycles. As far as I can tell, the problem comes from the Visitor._method_cache:

method = getattr(
    self, 'visit_%s' % node['type'], self.default_visit)
self._method_cache[node_type] = method

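...which stores references to methods that are bound to self in a member of self, so each interpreter keeps itself alive (self -> _method_cache -> bound method -> self) until the cycle collector runs.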
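To make the cycle concrete, here is a minimal sketch that does not use jmespath at all. CachingVisitor is a hypothetical stand-in that only copies the caching pattern quoted above, and count_cyclic_garbage is a helper invented for this illustration. It relies on gc.collect() returning the number of unreachable objects it found.

```python
import gc


class CachingVisitor:
    """Hypothetical stand-in: caches bound methods on the instance itself."""

    def __init__(self):
        self._method_cache = {}

    def visit(self, node):
        node_type = node['type']
        method = self._method_cache.get(node_type)
        if method is None:
            method = getattr(self, 'visit_%s' % node_type, self.default_visit)
            # The bound method references self, and self references the cache:
            # self -> _method_cache -> method -> self.
            self._method_cache[node_type] = method
        return method(node)

    def visit_field(self, node):
        return node['value']

    def default_visit(self, node):
        raise NotImplementedError(node['type'])


def count_cyclic_garbage(populate_cache):
    """Return how many unreachable objects one discarded visitor leaves behind."""
    gc.collect()                    # start from a clean slate
    was_enabled = gc.isenabled()
    gc.disable()                    # keep automatic collections out of the way
    try:
        visitor = CachingVisitor()
        if populate_cache:
            visitor.visit({'type': 'field', 'value': 'bar'})
        del visitor                 # with a populated cache, the cycle keeps it alive
        return gc.collect()         # objects that only the cycle collector could free
    finally:
        if was_enabled:
            gc.enable()
```

A visitor whose cache was never populated is freed by reference counting alone, so the helper should report nothing for it. Once a single visit() has filled the cache, the discarded visitor can only be reclaimed by the collector.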

Possible solution

We worked around the problem by monkey patching ParsedResult so that it (1) caches a default_interpreter for use when options=None, and (2) uses it in search(). If I understand correctly, we could go further and use a single global TreeInterpreter for all ParsedResult instances. The TreeInterpreter seems to be stateless apart from self._method_cache, and that cache seems to be thread-safe: the only risk in a multithreaded case is that several threads look up the same method and each store it.

I'd be happy to contribute a PR for either version if this would be welcome.
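The shape of the workaround described above can be sketched roughly as follows. This is not the exact patch we run. install_shared_interpreter is a hypothetical helper, and StubInterpreter and StubParsedResult are stand-ins so the sketch runs without jmespath installed. It assumes search() does nothing more than build an interpreter and call visit(self.parsed, value).

```python
import gc


def install_shared_interpreter(parsed_cls, interpreter_cls):
    """Patch parsed_cls.search to reuse one interpreter when options is None."""
    default_interpreter = interpreter_cls()  # built once, shared by every call

    def search(self, value, options=None):
        if options is None:
            interpreter = default_interpreter
        else:
            # Custom options still get a dedicated interpreter.
            interpreter = interpreter_cls(options)
        return interpreter.visit(self.parsed, value)

    parsed_cls.search = search


# Hypothetical stand-ins that mimic the pattern, not the real jmespath classes.
class StubInterpreter:
    def __init__(self, options=None):
        self._options = options
        self._method_cache = {}

    def visit(self, node, value):
        node_type = node['type']
        method = self._method_cache.get(node_type)
        if method is None:
            method = getattr(self, 'visit_%s' % node_type)
            self._method_cache[node_type] = method  # creates the cycle
        return method(node, value)

    def visit_field(self, node, value):
        return value.get(node['value'])


class StubParsedResult:
    def __init__(self, parsed):
        self.parsed = parsed

    def search(self, value, options=None):
        # Unpatched behaviour: a fresh interpreter on every call.
        return StubInterpreter(options).visit(self.parsed, value)


def garbage_after_searches(n=100):
    """Return how many unreachable objects n searches leave for the collector."""
    pattern = StubParsedResult({'type': 'field', 'value': 'foo'})
    gc.collect()
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        for _ in range(n):
            pattern.search({'foo': 'bar'})
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()
```

Against jmespath itself the call would presumably be install_shared_interpreter(jmespath.parser.ParsedResult, jmespath.visitor.TreeInterpreter), but that depends on the assumption about search() stated above. The shared interpreter still contains a cycle, but it is created once and stays reachable, so it never shows up as garbage.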

How to reproduce

The following reproducer shows the problem:

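import gc

import jmespath

# Log every object that the cycle collector finds unreachable.
gc.set_debug(gc.DEBUG_COLLECTABLE)

pattern = jmespath.compile("foo")
value = {"foo": "bar"}

# Each search() creates a new TreeInterpreter, which ends up as cyclic garbage.
for _ in range(1000000):
    pattern.search(value)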

...where the output contains one million repetitions of something like:

gc: collectable <TreeInterpreter 0x10f634fa0>
gc: collectable <dict 0x10f63e780>
gc: collectable <Options 0x10f634520>
gc: collectable <Functions 0x10f6345b0>
gc: collectable <method 0x10f63ee80>
gc: collectable <dict 0x10f63eb00>
