Try this. Find a photo on your phone that you love. Now squint, or zoom out until it's the size of a stamp. It's still the same photo. You can still tell what's in it. But something about it has gone a little flat — the part that made you take it in the first place has quietly walked out of the room. Most of us would describe what just happened with a shrug: "it's just smaller." But the truth is m
Hello Developers! 👋 Most developers today pick a side. Let's talk instead about combining C++ and JavaScript: a hybrid stack for high-performance applications. 👇 1. The Core Engine (C++) ⚙️ 2. The Browser Bridge (WebAssembly) 🌉 3. The Cinematic Experience (Vanilla JS + UI/UX) ✨ The Takeaway 🎯 Keep optimizing, keep building! 💻✨ ~ Ujjwal Sharma | @stackbyujjwal About the Author 👨💻 Ujjwal
I built a Vamana-based vector search engine in C++ called sembed-engine. Recently I made a pull request that sped up queries by 16x and builds by 9x. The algorithm stayed exactly the same. The recall stayed at 1.0. The number of visited nodes did not change. The speedup came from data layout. The original code stored vectors as separate objects pointed to by shared_ptr: struct Record { int64_t
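The kind of layout change described here can be sketched roughly like this (hypothetical names, not the actual sembed-engine code): the original pointer-per-record layout scatters vectors across the heap, while a flat buffer keeps them contiguous so graph traversal stays cache-friendly.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Before: one heap object per record, reached through a shared_ptr.
// Every distance computation chases a pointer to a scattered allocation.
struct Record {
    int64_t id;
    std::vector<float> vec;
};
using PtrStore = std::vector<std::shared_ptr<Record>>;

// After: all vectors packed into one contiguous buffer,
// addressed by (node * dim). No pointer chasing, no per-record
// control blocks, and neighboring nodes share cache lines.
struct FlatStore {
    std::size_t dim;
    std::vector<int64_t> ids;   // ids[node]
    std::vector<float> data;    // size == ids.size() * dim

    const float* vec(std::size_t node) const {
        return data.data() + node * dim;
    }
};
```

The algorithm never changes; only the address pattern of the reads does, which is consistent with recall and visited-node counts staying identical.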
The first time I implemented Vamana from the DiskANN paper, my approximate nearest neighbor index was slower than brute force. On tiny test fixtures, brute force took 0.27 ms per query. My Vamana implementation took 22.98 ms. That sounds absurd. ANN exists to skip work. The problem was not the algorithm. It was how I mapped the paper's abstractions to actual data structures. The DiskANN pseudocode
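For reference, the brute-force baseline being compared against is just a linear scan; a minimal sketch (hypothetical function name, squared L2 distance) of the exact search an ANN index has to beat:

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Exact nearest neighbor by linear scan: O(n * dim) per query.
// Slow asymptotically, but purely sequential memory access,
// which is why it can win on tiny test fixtures.
std::size_t brute_force_nn(const std::vector<std::vector<float>>& db,
                           const std::vector<float>& query) {
    std::size_t best = 0;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < db.size(); ++i) {
        float d = 0.0f;
        for (std::size_t j = 0; j < query.size(); ++j) {
            float diff = db[i][j] - query[j];
            d += diff * diff;  // squared L2: same argmin, no sqrt per candidate
        }
        if (d < best_dist) { best_dist = d; best = i; }
    }
    return best;
}
```

An ANN index only pays off once the constant factors of its data structures stop dominating the work it skips, which is the crux of the story above.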
Hash tables feel like the default choice for membership tests. std::unordered_set promises average O(1) lookup, so we reach for it automatically. In performance-sensitive C++ code, that habit can cost you an order of magnitude. I ran into this while building a Vamana graph index for approximate nearest neighbor search. The algorithm needs to track visited nodes. Node ids are dense integers, and th
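When ids are dense integers in [0, n), the hash set can be replaced with a flat byte array; a sketch of that substitution (illustrative class, not the actual index code):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Visited-set for dense integer node ids in [0, n).
// One byte per node: testing and marking are a single indexed
// load/store, with no hashing, no buckets, no pointer chasing.
class VisitedSet {
public:
    explicit VisitedSet(std::size_t n) : flags_(n, 0) {}

    // Returns true if the id was already visited; marks it either way.
    bool test_and_set(std::size_t id) {
        if (flags_[id]) return true;
        flags_[id] = 1;
        return false;
    }

    // Reset between queries (O(n); an epoch counter can avoid even this).
    void clear() { std::fill(flags_.begin(), flags_.end(), 0); }

private:
    std::vector<char> flags_;
};
```

The tradeoff is O(n) memory per query context and an O(n) clear, both of which are cheap for dense ids; std::unordered_set only earns its keep when the key space is sparse or unbounded.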
A production-grade embedded system enabling communication across speech, text, Morse, and haptic signals within a single unified pipeline. Official Project Page: https://anandps.in/projects/unified-assistive-communication-system GitHub Repository: https://github.com/anand-ps/unified-assistive-communication-system Problem Assistive communication systems are fragmented. Most tools so
The problem: Pattern matching on a large set of literal values looks clean in code but hits a wall at runtime. Every on() call constructs case objects for every arm. With 128 arms, that is 128 object constructions per match call. At 11 ns per call, this is fine for one-off use. Inside a hot loop, it is a disaster. // Clean syntax, 128 case objects constructed per call return match(x) | on( lit(0
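One common fix, sketched here with hypothetical handler names rather than the library's own API, is to pay the construction cost once: build a flat dispatch table outside the hot loop and index into it per call.

```cpp
#include <array>
#include <cstddef>

// Stand-in for the body of a match arm.
using Handler = int (*)(int);

// Build the 128-arm dispatch once, instead of reconstructing
// 128 case objects on every match call.
std::array<Handler, 128> make_table() {
    std::array<Handler, 128> t{};
    for (auto& h : t) h = [](int x) { return -x; };  // default arm
    t[0] = [](int) { return 100; };                  // arm for lit(0)
    t[1] = [](int) { return 200; };                  // arm for lit(1)
    return t;
}

int dispatch(int x) {
    static const auto table = make_table();  // constructed once, not per call
    return table[static_cast<std::size_t>(x)](x);
}
```

Per-call work shrinks to one bounds-free array index and an indirect call, which is the same shape a compiler emits for a dense switch.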
KMRI is a chunk-based MRI compression format for .nii files (Python + Zstd and C++). Check it out at https://github.com/Kiamehr5/KMRI and let me know what you think 💻