I have a slightly absurd #RetroComputing project for the #PDP8 that I find kind of fun because it forces me to find new and novel data structures and algorithms for things. You can't Just Use A Hash Table on a system with 12-bit values, 8k of core, and no multiply. It's hilarious finding algorithms that say "this structure is good for space-constrained systems, because the main table only takes up 64kB of RAM" and realising how far over-spec that is.
So I've mostly been digging through disco-era algorithms and data representations. Lots of neat stuff, some overlooked gems (including a cool variable-length four-bit text representation I found via Wikibooks). But text processing is still pretty memory-hungry, with some huge tree structures needed for the heavy lifting, so I'm mostly building abbreviation tables and such via Python scripts.
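(To give a flavour of the nibble idea, here's a rough sketch of one common variant, an escape-nibble code. This is my own illustration, not necessarily the exact Wikibooks scheme: the 15 most frequent characters cost one nibble each and the 16th code escapes to a second nibble for rarer ones.)

```python
# Illustrative escape-nibble text encoding (assumed variant, NOT the exact
# Wikibooks scheme): frequent characters cost one nibble, rare ones cost two.
COMMON = " etaoinshrdlucm"      # 15 one-nibble characters (made-up frequency order)
RARE   = "wfgypbvkjxqz.,-'"     # up to 16 two-nibble characters (illustrative)
ESCAPE = 15                     # nibble value meaning "next nibble indexes RARE"

def encode(text):
    nibbles = []
    for ch in text:
        if ch in COMMON:
            nibbles.append(COMMON.index(ch))
        else:
            nibbles.append(ESCAPE)
            nibbles.append(RARE.index(ch))   # sketch only: chars outside both tables would need handling
    if len(nibbles) % 2:
        nibbles.append(0)                    # pad; a real decoder needs a length or end marker
    return bytes(16 * hi + lo for hi, lo in zip(nibbles[::2], nibbles[1::2]))

print(len("the cat sat on the mat"), len(encode("the cat sat on the mat")))  # 22 chars -> 11 bytes
```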
But then I found LCP arrays, which I had missed for some reason. I mean, not "some reason": they were actually published after my university algorithms books were written! They're not like SUPER new, but most of the work on them was done by bioinformatics nerds in the early 2000s, so I didn't really notice them until now.
They're super neat! You build a suffix array by sorting indices into your string lexicographically (each index standing for the suffix from there to the end), then you walk down it in order, compare each suffix to the one before it, and record the length of the longest common prefix the pair shares. Bingo! Two arrays the size of your working string, and you can do all sorts of queries on them in linear time to find valuable repeated sequences for compression tools like LZ77 or an abbreviation tool.
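Here's roughly what that looks like as a quick Python sketch of the naive build described above (nowhere near the fancy linear-time constructions; the sample string is just for show):

```python
def suffix_array(s):
    # Indices into s, sorted by the suffix that starts at each index.
    return sorted(range(len(s)), key=lambda i: s[i:])

def lcp_array(s, sa):
    # lcp[k] = length of the longest common prefix between the suffix at
    # sa[k] and the one just before it in sorted order (lcp[0] stays 0).
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b = s[sa[k - 1]:], s[sa[k]:]
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        lcp[k] = n
    return lcp

text = "the cat sat on the mat"
sa = suffix_array(text)
lcp = lcp_array(text, sa)
# The maximum LCP value marks the longest repeated substring, which is
# exactly the kind of candidate an abbreviation table wants.
k = max(range(len(lcp)), key=lcp.__getitem__)
print(repr(text[sa[k]:sa[k] + lcp[k]]))   # -> 'the ' (with the trailing space)
```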
It really *feels* like a 1970s algorithm, but back then they were all doing this in like O(N*log(N)) via dynamic programming techniques. I may still bubble sort just to save code size...
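(For what it's worth, the bubble-sort version of the suffix sort is about this much logic; a throwaway Python sketch, where the real thing would be a few dozen PDP-8 instructions comparing suffixes word by word:)

```python
def suffix_array_bubble(s):
    # Same result as sorted(range(len(s)), key=lambda i: s[i:]), just built
    # with a dead-simple exchange sort: tiny code, O(N^2) comparisons.
    idx = list(range(len(s)))
    for limit in range(len(idx) - 1, 0, -1):
        for j in range(limit):
            if s[idx[j]:] > s[idx[j + 1]:]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
    return idx
```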