Skip to content

Investigative Todos #89

@eseiler

Description

@eseiler

membership_for

I noticed that each membership_for call will call counting_agent on each IBF that it traverses.
And each counting_agent will allocate memory, etc.

Since membership_for will probably be called multiple times (multiple queries to one HIBF), it might be beneficial to store the counting agents.

This needs some more benchmarking. It might already be beneficial when conducting 4096 queries on an HIBF with 4096 user bins. In this case, there was a 10% speed-up, and it did not slow down in other cases.

Click to show diff
diff --git a/include/hibf/hierarchical_interleaved_bloom_filter.hpp b/include/hibf/hierarchical_interleaved_bloom_filter.hpp
index b3ff075..bdd300f 100644
--- a/include/hibf/hierarchical_interleaved_bloom_filter.hpp
+++ b/include/hibf/hierarchical_interleaved_bloom_filter.hpp
@@ -196,11 +196,14 @@ private:
     //!\brief A pointer to the augmented hierarchical_interleaved_bloom_filter.
     hierarchical_interleaved_bloom_filter const * const hibf_ptr{nullptr};
 
+    //!\brief Stores counting_agents of all IBFs.
+    std::vector<interleaved_bloom_filter::counting_agent_type<uint16_t>> agents{};
+
     //!\brief Helper for recursive membership querying.
     template <std::ranges::forward_range value_range_t>
     void membership_for_impl(value_range_t && values, int64_t const ibf_idx, size_t const threshold)
     {
-        auto agent = hibf_ptr->ibf_vector[ibf_idx].template counting_agent<uint16_t>();
+        auto & agent = agents[ibf_idx];
         auto & result = agent.bulk_count(values);
 
         uint16_t sum{};
@@ -243,7 +246,12 @@ public:
      * \param hibf The hierarchical_interleaved_bloom_filter.
      */
     explicit membership_agent_type(hierarchical_interleaved_bloom_filter const & hibf) : hibf_ptr(std::addressof(hibf))
-    {}
+    {
+        size_t const number_of_agents = hibf_ptr->ibf_vector.size();
+        agents.reserve(number_of_agents);
+        for (size_t i = 0; i < number_of_agents; ++i)
+            agents.emplace_back(hibf_ptr->ibf_vector[i].template counting_agent<uint16_t>());
+    }
     //!\}
 
     //!\brief Stores the result of membership_for().

Other

  • uint64_t for UB indices, use max value as merged bin indicator
  • pmr for agents

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions