Add new API iter.CurrentBufferSize() #465

Closed
wants to merge 1 commit into from

@kamstrup kamstrup commented Jun 3, 2020

I am working on a project where it would be immensely useful to know (approximately) how many bytes each JSON object we pull off an iterator used. We can almost do this by passing in an io.Reader that counts the number of bytes read, but we still need to discount anything that is still sitting in the iterator's buffer. This simple API addition makes that possible.

Alternatively we could count all bytes read, and provide an InputOffset() method like the stdlib json.Decoder, but that feels like going over the top?

@kamstrup
Author

kamstrup commented Jun 3, 2020

And thanks for a great project btw! I love it! :-D

@codecov

codecov bot commented Jun 3, 2020

Codecov Report

Merging #465 into master will increase coverage by 0.08%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #465      +/-   ##
==========================================
+ Coverage   86.52%   86.61%   +0.08%     
==========================================
  Files          41       41              
  Lines        5100     5102       +2     
==========================================
+ Hits         4413     4419       +6     
+ Misses        546      542       -4     
  Partials      141      141              
Impacted Files Coverage Δ
iter.go 89.25% <0.00%> (-0.85%) ⬇️
reflect_struct_decoder.go 85.35% <0.00%> (+0.79%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cd6773e...10b5d6d.

@AllenX2018
Collaborator

Thanks for the PR. Could you please show an example of how you're using this API?

@kamstrup
Author

kamstrup commented Jun 8, 2020

Sure. Concretely, the intent is to figure out how many bytes we read for each JSON object in some stream. To this end I wrap the base reader in a custom ByteCountingReader:

// ByteCountingReader implements io.Reader and counts the total number of bytes read from R.
type ByteCountingReader struct {
  N int64     // incremented on every .Read() by the number of bytes returned
  R io.Reader // the wrapped base reader
}

So pseudo-go looks something like:

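// r is the base io.Reader; parseObj and reportBytesForObj are caller-side helpers.
bcr := &ByteCountingReader{R: r}
iter.Reset(bcr)
for {
  beginN := bcr.N
  beginBuf := iter.CurrentBufferSize() // API proposed in this PR: bytes buffered but not yet consumed
  obj := parseObj(iter)
  if iter.Error != nil {
    break // io.EOF at end of stream, or a parse error
  }
  endN := bcr.N
  endBuf := iter.CurrentBufferSize()
  // bytes pulled from the reader, plus the net number of bytes drained from the buffer
  reportBytesForObj(obj, (endN - beginN) + int64(beginBuf - endBuf))
}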

Looking at it, I realize that it requires some tight bookkeeping on the caller side. Maybe it is cleaner to have the Iterator do the bookkeeping and expose iter.InputOffset() instead. This would also help bring the API to jsoniter.Decoder and bring it on par with go 1.14...

@kamstrup
Author

Superseded by #493

@kamstrup kamstrup closed this Sep 14, 2020