Added mem-dbg across all structs of the crate as optional feature #5
Conversation
@LucaCappelletti94 thanks for adding optional mem-dbg support! We'd love to have this change in; however, it seems a few size-calculation inconsistencies should be addressed first. For example, if I apply a diff that compares the two estimates and then run the tests, I can see quite a few unexplained discrepancies between the crate's `size_of` results and the mem-dbg ones.
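The diff and test command themselves aren't reproduced above. As an editorial sketch, a consistency check of this kind might look as follows; this is hypothetical: it assumes the crate's public `size_of()` method, mem-dbg's `MemSize` trait, and `CardinalityEstimator::<P, W>` generics, all of which may differ between versions.

```rust
use cardinality_estimator::CardinalityEstimator;
use mem_dbg::{MemSize, SizeFlags};

#[test]
fn size_of_matches_mem_size() {
    let mut estimator = CardinalityEstimator::<12, 6>::new();
    for i in 0..10_000u64 {
        estimator.insert(&i);
        // The two estimates should agree in every representation
        // (Small, Array, HyperLogLog) the estimator passes through.
        assert_eq!(
            estimator.size_of(),
            estimator.mem_size(SizeFlags::default())
        );
    }
}
```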
Basically, in most cases the derive is enough to cover everything, and honestly, when I opened the PR, I had just done that, thinking it would be enough. Afterwards, as I started benchmarking, I realized that, due to the pointer trick used to handle dynamic size, the derive told us the struct was only 8 bytes or so. Therefore, I quickly wrote up the trait implementations, but I only covered the inner representation. I will try to fix it shortly.
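To illustrate the pointer-trick problem described here (hypothetical struct, not the crate's actual layout): a derived `MemSize` only sees a plain `usize` field and cannot know that it sometimes holds a pointer to heap data.

```rust
use mem_dbg::{MemSize, SizeFlags};

#[derive(MemSize)]
struct Tagged {
    // Secretly either an inline value or a tagged pointer to a heap
    // allocation; the derive just sees a plain usize.
    data: usize,
}

fn main() {
    let v = vec![0u32; 1024];
    let t = Tagged { data: v.as_ptr() as usize };
    // Reports only 8 bytes on a 64-bit target, missing v's 4 KiB buffer,
    // which is why manual trait implementations were needed.
    assert_eq!(t.mem_size(SizeFlags::default()), 8);
}
```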
Small update: I fixed the size-estimation errors on your crate's side, but also identified an error on the mem-dbg side. Working on that now.
Now, if you rerun the same script as above, you will find that all estimates match.
```diff
@@ -34,7 +35,7 @@ name = "cardinality_estimator"
 harness = false

 [features]
-default = []
+default = ["mem_dbg"]
```
Is it really necessary to enable the `mem_dbg` feature by default?
Only if by the end of this PR we were to replace `size_of` with `mem_dbg`: the two methods are by definition very overlapping in scope. Possibly, reading the other comments, you intended `size_of` to only include the size of the stored object, while `mem_dbg` has the goal of computing the size of the object in memory.
If `mem_dbg` aims to show real heap memory usage similarly to `size_of`, then it could be a good replacement for, or alternative to, the current `size_of` implementation.
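The distinction between the two measurements is easy to demonstrate in isolation (standard-library types only, assuming mem-dbg's `MemSize` trait): `std::mem::size_of_val` reports just the inline part of a value, while a deep measurement also counts the heap allocation behind it.

```rust
use mem_dbg::{MemSize, SizeFlags};

fn main() {
    let v: Vec<u32> = vec![0; 1024];
    // Shallow size: pointer + length + capacity, i.e. 24 bytes on 64-bit targets.
    println!("size_of_val: {}", std::mem::size_of_val(&v));
    // Deep size: also counts the 4096-byte heap buffer behind the Vec.
    println!("mem_size:    {}", v.mem_size(SizeFlags::default()));
}
```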
src/array.rs
```diff
@@ -110,7 +111,7 @@ impl<'a, const P: usize, const W: usize> RepresentationTrait for Array<'a, P, W>
     /// Return memory size of `Array` representation
     #[inline]
     fn size_of(&self) -> usize {
-        size_of::<usize>() + size_of_val(self.arr)
+        size_of::<usize>() + size_of::<usize>() + size_of::<&[u32]>() + size_of_val(self.arr)
```
I'm not sure about this `size_of` calculation logic. Originally, I was counting it as the sum of:
- `data: usize`, which stores either the data or a pointer to the data
- an optional `u32` slice
Let me comment it better
Done that in commit 20f3d0d; let me know if all is clear.
src/hyperloglog.rs
```diff
@@ -151,7 +152,8 @@ impl<'a, const P: usize, const W: usize> RepresentationTrait for HyperLogLog<'a,
     /// Return memory size of `HyperLogLog`
     #[inline]
     fn size_of(&self) -> usize {
-        size_of::<usize>() + size_of_val(self.data)
+        // The length of the slice, plus the pointer of the slice reference, and the size of the slice itself.
+        size_of::<usize>() + size_of::<usize>() + size_of_val(self.data)
```
We technically don't store the pointer of the slice reference in memory, as it's stored in the `data` field of the `CardinalityEstimator` struct.
I'd suggest measuring allocations of a `Vec<CardinalityEstimator>` of, let's say, 1,000 elements of the same size.
Heap allocations can also be measured with https://docs.rs/dhat/latest/dhat/ as it's done in https://github.com/cloudflare/cardinality-estimator/blob/main/benches/cardinality_estimator.rs#L148.
We could try to compare `mem_dbg` to `size_of` to dhat-measured allocations.
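A sketch of such a dhat measurement (hypothetical wiring: the exact `CardinalityEstimator` generics and the availability of `size_of()` on the public type are assumptions here; the dhat setup follows the crate's documented global-allocator pattern):

```rust
use cardinality_estimator::CardinalityEstimator;

#[global_allocator]
static ALLOC: dhat::Alloc = dhat::Alloc;

fn main() {
    let _profiler = dhat::Profiler::new_heap();

    // Build 1,000 estimators of the same size.
    let estimators: Vec<CardinalityEstimator<12, 6>> = (0..1_000u64)
        .map(|i| {
            let mut e = CardinalityEstimator::new();
            for j in 0..100u64 {
                e.insert(&(i * 100 + j));
            }
            e
        })
        .collect();

    // dhat's view of the live heap at this point versus the crate's own estimate.
    let stats = dhat::HeapStats::get();
    let size_of_total: usize = estimators.iter().map(|e| e.size_of()).sum();
    println!("dhat curr_bytes:  {}", stats.curr_bytes);
    println!("sum of size_of(): {}", size_of_total);
}
```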
Added more comments regarding this in commit 953cea8. While I am all for comparing the results of mem-dbg with those of other crates (I think the authors have already done some of that), in this case I am pretty sure about the correctness of the estimation of the size of a slice: len (usize) + pointer (usize) + size of the values therein.
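The slice arithmetic itself is easy to verify in isolation: a `&[u32]` is a fat pointer (data pointer plus length), and `size_of_val` returns the payload size.

```rust
fn main() {
    let arr: &[u32] = &[1, 2, 3, 4];
    // Fat pointer: one usize for the data pointer, one usize for the length.
    assert_eq!(
        std::mem::size_of::<&[u32]>(),
        2 * std::mem::size_of::<usize>()
    );
    // Payload: 4 elements of 4 bytes each.
    assert_eq!(std::mem::size_of_val(arr), 16);
}
```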
```rust
fn size_of(&self) -> usize {
    // A priori, we do not know which of the child structs is the largest and
    // therefore composes the bulk of the enum's memory usage. Therefore, we
    // subtract the size of the variants, and then add the size of the enum itself.
```
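For background on why variant sizes cannot simply be summed (illustrative enum, not the crate's actual `Representation`): an enum occupies roughly the size of its largest variant plus discriminant and padding, modulo niche optimizations.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Repr<'a> {
    Small(usize),
    Array(&'a [u32]),
}

fn main() {
    println!("Small payload: {}", size_of::<usize>()); // 8 on 64-bit
    println!("Array payload: {}", size_of::<&[u32]>()); // 16 on 64-bit
    // The whole enum: largest variant plus whatever tag/padding the compiler adds.
    println!("whole enum:    {}", size_of::<Repr<'static>>());
}
```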
Usually the `Representation` enum is not stored directly and has `pub(crate)` visibility. It's only created when accessing `CardinalityEstimator` methods and is usually optimized away by the compiler anyway. What's stored is the `CardinalityEstimator` struct and its data.
Ok, I see: so basically what I am currently measuring would be the 'runtime peak size' rather than the storage size. I suppose the question then is which one we are more interested in measuring. Which one would you rather go for?
We're primarily interested in measuring the heap-allocated memory usage of the overall `CardinalityEstimator`, and not of the temporarily used individual sub-structs `Small`, `Array`, and `HyperLogLog`, so that we can reliably and accurately measure how much memory a `Vec` or `HashMap` of 1M or 100M `CardinalityEstimator`s uses.
The `Representation` enum and its sub-structs are never actually stored on the heap; these are stack-allocated and quickly disposed structs. I think we should only report how much memory `CardinalityEstimator` uses.
```rust
let cap_size = size_of::<usize>();
// The size of the arr attribute, which is a slice
// and therefore composed of a pointer (usize) and a length (usize)
let arr_size = size_of::<usize>() * 2;
```
Not sure I understand the "multiply by 2" aspect of this logic. However, I don't think we should measure memory usage of `Array` and the other structs except `CardinalityEstimator`, since these are stack-allocated and not actually stored or exposed beyond `pub(crate)` visibility.
mem-dbg is a crate that computes the in-memory size of a struct. I have added the derives throughout the crate as an optional feature, so that it can easily be used to compare this implementation with others.
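A minimal sketch of how the optional feature is meant to be used (the struct below is a stand-in, not one of the crate's actual types; it assumes mem-dbg's `MemSize`/`MemDbg` derives and flags):

```rust
use mem_dbg::{DbgFlags, MemDbg, MemSize, SizeFlags};

#[derive(MemSize, MemDbg)]
struct Example {
    data: usize,
    extra: Vec<u32>,
}

fn main() {
    let e = Example { data: 42, extra: vec![1, 2, 3] };
    // Total size in bytes, following owned heap allocations.
    println!("{}", e.mem_size(SizeFlags::default()));
    // Hierarchical, per-field breakdown of the size.
    e.mem_dbg(DbgFlags::default()).unwrap();
}
```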
Cheers!