avoiding key encoding/decoding costs #30
Seems like a non-negligible amount of time is spent in encoding and decoding the keys. Sometimes, in the case of a scan, I don't want the keys at all. Other times, it would be handy to get the encoded key string and do manipulations directly on that (and run queries with it) without creating a bunch of short-lived arrays. Is this something you'd be willing to support? I could make a PR at some point (post-rewrite), if you're not opposed.
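To make the request concrete, here is a minimal sketch of the kind of API shape being described, against a hypothetical in-memory store. None of these names (`RawKeyStore`, `scanRaw`, the toy codec) come from tuple-database, and a real codec such as lexicodec would handle type tags, escaping, and sort order properly.

```ts
// Hypothetical sketch of the requested API shape; none of these names exist
// in tuple-database. A toy codec stands in for a real order-preserving one.

type Tuple = (string | number)[]

const encodeKey = (t: Tuple) => t.map(String).join("\x00") // illustration only
const decodeKey = (k: string): Tuple => k.split("\x00")

class RawKeyStore {
  private entries = new Map<string, unknown>()

  set(tuple: Tuple, value: unknown) {
    this.entries.set(encodeKey(tuple), value)
  }

  // Range scan over *encoded* keys: no tuple arrays are allocated here,
  // and the bounds are ordinary strings the caller can manipulate directly.
  scanRaw(gte: string, lt: string): [key: string, value: unknown][] {
    return [...this.entries]
      .filter(([k]) => k >= gte && k < lt)
      .sort(([a], [b]) => (a < b ? -1 : 1))
  }
}

// Usage: scan with plain string bounds, decode only the entries we care about.
const store = new RawKeyStore()
store.set(["user", 1], { name: "ada" })
store.set(["user", 2], { name: "grace" })
store.set(["post", 7], { title: "hello" })

for (const [key, value] of store.scanRaw("user\x00", "user\x01")) {
  console.log(decodeKey(key), value) // decoding is opt-in, per entry
}
```

The point of the sketch is that the scan itself never allocates tuple arrays; decoding is opt-in, per entry.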
Yeah, you're right about the performance hit. This was a good improvement a little while back: #24. Are there any particular places you have in mind? Encoding and decoding has to happen somewhere; it's really just a matter of how you want to do it. I plan on moving this library over to use LexiCodec so that you have more control over it. Then you can simply work with strings if you want and only encode/decode at your pleasure.
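As a rough illustration of what "work with strings if you want" could look like, a store can be parameterized over a codec so that callers who prefer raw strings plug in an identity codec. The `Codec` interface and `KeyedStore` below are assumptions made for the sketch, not lexicodec's or tuple-database's actual exports.

```ts
// Sketch only: the Codec shape below is an assumption for illustration,
// not lexicodec's actual exports.

interface Codec<K> {
  encode(key: K): string
  decode(key: string): K
}

// Callers who want full control can plug in an identity codec and hand the
// store pre-encoded strings, paying encode/decode costs only where they choose.
const identityCodec: Codec<string> = {
  encode: (k) => k,
  decode: (k) => k,
}

class KeyedStore<K> {
  private data = new Map<string, unknown>()
  constructor(private codec: Codec<K>) {}

  set(key: K, value: unknown) {
    this.data.set(this.codec.encode(key), value)
  }
  get(key: K): unknown {
    return this.data.get(this.codec.encode(key))
  }
}

// With the identity codec, keys are plain strings end to end.
const raw = new KeyedStore(identityCodec)
raw.set("user\x001", { name: "ada" })
console.log(raw.get("user\x001"))
```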
Do you mean instances where I don't want to encode/decode?
That sounds ideal. If I could only encode/decode when I need it, that would be awesome. Do you have any ideas about what the API would look like to support that sort of thing?
You can check out what I'm doing here: https://github.com/ccorcos/database-experiments/blob/35c21accc274b948248b750d7d2ea8bffeda9967/src/tuple-okv.ts

But from what it sounds like, you should just use lexicodec on top of an ordered key-value store like LevelDB. Then you have all the control that you want.

On a separate note, I would caution you against premature optimization. Is this actually a performance bottleneck for you? Are you worried about the CPU cost? An O(n) decode, where n is the length of each key, is pretty efficient. Or are you worried about the memory cost of allocating arrays and objects instead of dealing with raw strings? In my experience, if either of those is a concern for you, you're probably either (1) focusing on something that isn't actually a problem or (2) shouldn't be using a language with a garbage collector like JavaScript. There's a big difference between theoretical performance and actual performance. I would urge you to come up with an example that you feel is too slow so we have something concrete to iterate on.
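For reference, here is a hedged sketch of that suggestion: tuple keys are encoded once into order-preserving strings and LevelDB range-scans them. It assumes the promise-based `level` (v8+) package API, and `encodeTuple` is a toy stand-in for a real codec such as lexicodec.

```ts
// Assumes the promise-based `level` (v8+) API; encodeTuple is a toy stand-in
// for a real order-preserving codec such as lexicodec.

import { Level } from "level"

type Tuple = (string | number)[]
const encodeTuple = (t: Tuple) => t.map(String).join("\x00") // illustration only

const db = new Level<string, { name: string }>("./mydb", { valueEncoding: "json" })

async function main() {
  await db.put(encodeTuple(["user", 1]), { name: "ada" })
  await db.put(encodeTuple(["user", 2]), { name: "grace" })

  // Range scan directly over encoded string bounds; keys stay encoded
  // unless the caller chooses to decode them.
  for await (const [key, value] of db.iterator({
    gte: "user\x00",
    lt: "user\x01",
  })) {
    console.log(key, value.name)
  }
}

main().catch(console.error)
```

Because the encoding preserves sort order, prefix and range queries reduce to plain string bounds on the underlying store.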
Thanks, this is a great reference.
I haven't done any rigorous performance tests yet, but from the times I ran the Chrome profiler, the encode/decode utilities took up a larger percentage of the overall time than I had anticipated.
I'd love to write the core of my app for compilation to WebAssembly, but I want it to be usable from JavaScript, and I think the overhead of passing values back and forth across the interface might be more costly than just running the core in JavaScript. I don't have enough experience with wasm to know for sure, and I don't see a fast path to getting more insight that isn't just building the app twice and running benchmarks. So I'm playing it safe with JavaScript at the moment.
That's fair enough. Feel free to close this issue until I find something more concrete.
Encoding / decoding across process boundaries is very commonly the most expensive thing in any application... If you use the in-memory tuple database, there's no encoding/decoding going on. But serializing to and from a database is always going to have that cost.
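A small illustration of that distinction, with made-up helper names: in memory, tuple keys can stay as live arrays and be compared element-wise, while anything that crosses a disk or process boundary has to be serialized and parsed again.

```ts
// Illustration only: nothing here is a library API.

type Value = string | number
type Tuple = Value[]

// Compare two tuple elements: numbers numerically, otherwise as strings.
function compareValues(a: Value, b: Value): number {
  if (typeof a === "number" && typeof b === "number") return a - b
  const [as, bs] = [String(a), String(b)]
  return as < bs ? -1 : as > bs ? 1 : 0
}

// In memory: tuples are compared element-wise as live arrays; nothing is
// encoded or decoded.
function compareTuples(a: Tuple, b: Tuple): number {
  const len = Math.min(a.length, b.length)
  for (let i = 0; i < len; i++) {
    const cmp = compareValues(a[i], b[i])
    if (cmp !== 0) return cmp
  }
  return a.length - b.length
}

const inMemory: [Tuple, unknown][] = [
  [["user", 2], { name: "grace" }],
  [["user", 1], { name: "ada" }],
]
inMemory.sort(([a], [b]) => compareTuples(a, b)) // keys never leave tuple form

// At a boundary, the key has to become a string or bytes and come back again.
const wireKey = JSON.stringify(["user", 1]) // pay to serialize...
const decoded = JSON.parse(wireKey) as Tuple // ...and pay again to parse
console.log(inMemory[0], wireKey, decoded)
```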