8 comments

  • elil17 44 minutes ago
    I would love to see real examples of what reduced quality means in practice. Are you able to recover a document from the vector in a human readable format? If so, what sort of changes come up?

    I could imagine a scenario where differences tend to be more substantive than you'd expect because of how less frequent words with fine distinctions in meaning - the very words that make the document special - may be embedded in the vector space.

  • purple-leafy 54 minutes ago
    Hey breadislove; amazing article, I’ll be sending mixedbread an email in the morning that may interest you (email will be <5-characters>@pm.me)

    I have also been working in compression and performance engineering, and managed to get a 99+% size reduction versus conventional approaches (100+ KB down to 1 KB) for 30-minute massively multiplayer game replays in a “game+engine” I’m developing

    I think there’s a synergy between these 2 concepts, and I’d love to chat some more

  • functionmouse 25 minutes ago
    there is no such thing as "near lossless"
    • ttoinou 3 minutes ago
      There is, once you define what you’re ready to lose and understand the lossy space. That’s how we came up with mobile phones, audio and video codecs, etc., which literally power all the modern devices we use.
  • Ameo 3 hours ago
    I can't wait until we get to 100% storage/cost/compute reduction for LLMs. Every thought you could have thought pre-conceived in high-fidelity super-resolution. Every action you could have taken predicted and simulated in advance courtesy of Openthropic and the USA Sovereign Wealth Fund.
    • peheje 2 hours ago
      Reminds me of 'Learning to be me' by Greg Egan
    • throwaw12 2 hours ago
      A 100% reduction is impossible for something that should still work, because -100% means it is now 0
      • neonstatic 1 hour ago
        They were clearly being sarcastic
  • johnathan101 3 hours ago
    97% is impressive, but I'm curious what the latency tradeoff looks like in production. Storage is only half the story for retrieval systems.
  • rq1 1 hour ago
    The Pi compression algorithm is better.