Agree with OP that LLMs are a great tool for this use case. It's made possible because OP diligently created useful input data. Unfortunately OP's conclusion goes against the AI hype machine. If "consuming" is the "superpower" of AI, then the current level of investment/attention would not be justified.
> No human could read all of this in a lifetime. AI consumes it in seconds.
And therefore it's impossible to test the accuracy if it's consuming your own data. AI can hallucinate on any data you feed it, and it's been proven that it doesn't summarize, but rather abridges and abbreviates data.
In the authors example
> "What patterns emerge from my last 50 one-on-ones?" AI found that performance issues always preceded tool complaints by 2-3 weeks. I'd never connected those dots.
Maybe that's a pattern from 50 one-on-ones. Or maybe it's only in the first two and the last one.
I'd be wary of using AI to summarize like this and expecting accurate insights
> it's been proven that it doesn't summarize, but rather abridges and abbreviates data
Do you have more resources on that? I'd love to read about the methodology.
> And therefore it's impossible to test the accuracy if it's consuming your own data.
Isn't it only if it's hard to verify the result? If it's a result that's hard to produce but easy to verify, a class which many problems fall into, you'd just need to look at the synthetized results.
If you ask it "given these arbitrary metrics, what is the best business plan for my company?" It'd be really hard to verify the result. I'd be hard to verify the result from anyone for that matter, even specialists.
So I think it's less about expecting the LLM to do autonomous work and more about using LLMs to more efficiently help you search the latent space for interesting correlations, so that you and not the LLM come up with the insights.
Similar to P/NP, verification can often be faster than solving. For example, you can then ask the AI to give you the list of tool complaints and the performance issues. Then a text search can easily validate the claim.
Far ahead in producing bugs, far ahead in losing their skills, far ahead in becoming irrelevant, far ahead in being unable to think critically, that's absolutely right.
I often see things like this and get a little bit of FOMO because I'd love to see what I can get out of this but I'm just not willing to upload all these private documents of mine to other people's computers where they're likely to be stored for training or advertising purposes.
How are you guys dealing with this risk? I'm sure on this site nobody is naive to the potential harms of tech, but if you're able to articulate how you've figured out that the risk is worth the benefits to you I'd love to hear it. I don't think I'm being to cynical to wait for either local LLMs to get good or for me to be able to afford expensive GPUs for current local LLMs, but maybe I should be time-discounting a bit harder?
I'm happy to elaborate on why I find it dangerous, too, if this is too vague. Just really would like to have a more nuanced opinion here.
OP's post is trying to con you into using a LLMs for a task that they do not perform well at.
This is specific, but if you start replying to LLM summaries of emails, instead of reading and responding to the content of the email itself, you are quickly going to become a burden socially.
The people you are responding to __will__ be able to tell, and will dislike you for your lack of consideration.
The docs I upload are ones I'd be OK getting leaked. That also includes code. Even more broadly, it also includes whatever pics I put onto social media, including chat groups like Telegram.
This does mean that, useful as e.g. Claude Code is, for any business with NDA-type obligations, I don't think I could recommend it over a locally hosted model, even though the machine needed to run a decent local model might cost €10k (with current price increases due to demand exceeding supply), that the machine is still slower than what hosts the hosted models, that the rapid rate of improvement means a 3-month delay between SOTA in open-weights and private-weights is enough to matter*.
But until then? If I'm vibe coding a video game I'd give away for free anyway, or copy-editing a blog post that's public anyway, or using it to help with some short stories that I'd never be able to charge money for, or uploading pictures of the plants in my garden right by the public road… that's fine.
* When the music (money for training) stops, it could be just about any provider whose model is best, whatever that is is likely to still get distilled down fairly cheaply and/or some 3-month-old open-weights model is likely to get fine-tuned for each task fairly cheaply; independently of this, without the hyper-scalers the supply chains may shift back from DCs to PCs and make local models much more affordable.
I don't really buy this post. LLMs are still pretty weak at long contexts and asking them to find some patterns in data usually leads to very superficial results.
No one said you cannot run LLMs with the same task more than once. For my local tooling, I usually use the process of "Do X with previously accumulated results, add new results if they come up, otherwise reply with just Y" and then you put that into a loop until LLM signals it's done. Software-wise, you could add so it continues beyond that too, for extra assurance.
In general for chat platforms you're right though, uploading/copy-pasting long documents and asking the LLM to find not one, but multiple needles in a haystack tend to give you really poor results. You need a workflow/process for getting accuracy for those sort of tasks.
I think this will be a significant thing in the future, but right now I think the reasoning abilities are too limited. It can reasonably approximate a vector database where it can find related things, but I think that success can hide the failure to find important things.
I'd like to be able to point a model at a news story and have it follow every fact and claim back to an origin, (or lack of one). I'm not sure when they will be able to do that, they aren't up to the task yet. Reading the news would be so much different if you could separate the 'we report this happened' from the 'we report that someone else reported this happened"
The article is more about offloading your thinking to the machine than a real usage of what notes is. You may as well make every decision rely on a coin toss.
I take notes for remembrance and relevance (what is interesting for me). But linking concepts is all my thinking. Doing whatever rhe article is prescribing is like sending someone on a tourist trip to take pictures and then bragging that you visited the country. While knowing that some pictures are photoshopped.
I disagree, what ai brings to thr table is instant and total recall of our thoughts/notes/experiences. Deep analysis of that vast data storage is only possible via ai, which should trigger the aha!!! Moment, or the "you're crazy ai" moment. Either way, it's very useful. And we haven't even talked about the knowl she we have collecting digital dust in emails, and notes and reports of past employees.
At least half of AI's "superpower" in OP's case is the fact that he has everything in Obsidian already. With all of that background context, any tool becomes super valuable in evaluating & guiding future actions.
Still, all credit to him for creating that asset in the first place.
I was in a research math lecture the other day, and the speaker used some obscure technical terminology I didn't know. So I dug out my phone and googled it.
The AI summary at the top was surprisingly good! Of course, the AI isn't doing anything original; instead, it created a summary of whatever written material is already out there. Which is exactly what I wanted.
My counterpoint to this is, if someone cannot verify the validity of the summary then is it truly a summary? And what would the end result be if the vast majority of people opted to adopt or deny a position based on the summary written by a third party?
This isn't strictly a case against AI, just a case that we have a contradiction on the definition of "well informed". We value over-consumption, to the point where we see learning 3 things in 5 minutes as better than learning 1 thing in 5 minutes, even if that means being fully unable to defend or counterpoint what we just read.
I'm speficially referring to what you said: "the speaker used some obscure technical terminology I didn't know" this is due to lack of assumed background knowledge, which makes it hard to verify a summary on your own.
At least with pre-AI search, the info is provided with a source. So there is a small level of reputation that can be considered. With AI, it's a black box that someone decides what to train it on, and as someone said elsewhere, there's no way to police its sources. To get the best results, you have to turn it loose on everything.
So someone who wants a war or wants Tweedledum to get more votes than Tweedledee has incentives to poison the well and disseminate fake content that makes it into the training set. Then there's a whole department of "safety" that has to manually untrain it to not be politically incorrect, racist etc. Because the whole thesis is don't think for yourself, let the AI think for you.
The issue is even deeper - the 1 thing in 5 minutes was probably already surface knowledge. We don’t usually really ‘know’ the thing that quickly. But we might have a chance.
The 3 things in 5 minutes is even worse - it’s like taking Google Maps everywhere without even thinking about how to get from point A to point B - the odds of knowing anything at all from that are near zero.
And since it summarizes the original content, it’s an even bigger issue - we never even have contact with the thing we’re putatively learning from, so it’s even harder to tell bullshit from reality.
It’s like we never even drove the directions Google Maps was giving us.
We’re going to end up with a huge number of extremely disconnected and useless people, who all absolutely insist they know things and can do stuff. :s
I have to agree. People moan that the ai summary is rubbish but that misses the point. If i need a quick overview of a subject i don't necessarily need anything more then a low quality summary. It's easier then wading through a bunch of blogs of unknown quality.
In my experience Google's AI summaries are consistently accurate when retrieving technical information. In particular, documentation for long-lived, infrequently changing software packages tends to be accurate.
If you ask Google about news, world history, pop culture, current events, places of interest, etc., it will lie to you frequently and confidently. In these cases, the "low quality summary" is very often a completely idiotic and inane fabrication.
I looked up a medical term, that is frequently misused (eg. "retarded"), and asked the Gemini to compare it with similar conditions.
Because I have enough of a background in the subject matter, I could tell what it had construed by its mixing the many incorrect references with the much fewer correct references in the training data.
I asked it for sources, and it failed to provide anything useful. But once I am looking at sources, I would be MUCH better off searching and only reading the sources might actually be useful.
I was sitting with a medical professional at the time (who is not also a programmer) and he completely swallowed what Gemini was feeding him. He commented that he appreciates that these summaries let him know when he is not up to date with the latest advances, and he learnt alot from the response.
As an aside, I am not sure I appreciate that Google's profile would now associate me with that particular condition.
This is just garbage in, garbage out. Would you better off if I gave you an incorrect source? What about three incorrect ones? And a search engine would also associate you with this term now. Nothing you describe here seems specific to AU.
Sorry, is this new? Providing the right data to LLMs supercharges them. Yes, I agree. I've been doing this since March 2025 when there was a blog post about using MCP on HN. I'm not the only one who's doing that.
I've written my whole lifestory, the parts I'm willing to share that is, and posted it in Claude. It helped me way better with all kinds of things. It took me 2 days to write without formatting, pretty much how I write all my HN comments (but then 2 days straight: eat, sleep, write).
I've also exported all my notes, but it's too big for the context. That's why I wrote my life story.
From a practical standpoint I think the focus is on context management. Obsidian can help with this (I haven't used it so don't know the details). For code, it means doing things like static and dynamic analysis to see which functions calls what and create a topology of function calls and send that as context, then Claude Code can more easily know what to edit, and it doesn't need to read all the code.
Curious, what did you get out of it? Counseling? Some action plan? A reflection? Seems intriguing to do, but would like to know how it helped you exactly if you don’t mind sharing.
Career planning at the moment, tailoring resumes. Currently, it's not tailoring it well enough yet because it's hallucinating too much, so I need to write a specific prompt for that. But I know for work, where I do similar things (text generation with a human in the loop), that I can tackle that problem.
So yea, I definitely, add to the "AI generated" text part but I read over all the texts, and usually they don't get sent out. Ultimately, it's still a lot quicker to do it this way.
For career planning, so far it hasn't beaten my own insights but it came close. For example, it mentioned that I should actually be a developer advocate instead of a software engineer. 2 to 3 years ago I came to that same thought. I ultimately rejected the idea due to how I am but it is a good one to think about.
What I see now, I think the best job for me would be a tech consultant. Or as I'd also like to call it: a data analyst that spots problems and then uses his software engineering or teaching skills to solve that problem. I don't think that job has a good catch all title as it is a pretty generalist job. I'm currently at a company that allows me to do this but the pay is quite low, so I'm looking for a tech company where I could do something similar. Maybe a product manager role? It really depends on the company culture.
What I also noticed it did better: it doesn't reduce me to data engineering anymore. It understands that I aspire to learn everything and anything I can get my hands on. It's my mode of living and Claude understands that.
So nothing too spectacular yet, but it'll come. It requires more prompt/context engineering and fine tuning of certain things. I didn't get around to it yet.
What is the approach used? It seems everything gets done in context by plain text searches with some agent like Claude code or is there RAG involved? (was the article written by AI? it has that LinkedIn-groove all over the place)
I would say the AI consumption aspect was a side effect: the primary goal was to "generate" new stuff. So far, to me, the significant boost is the coding aspect. Still, for the rest of the people, I think you are right: 90% of the benefits come from being an interactive, conversational search on top of the available information that AI can read/consume.
Ironically this article/blog itself is giving off an AI-generated smell as it's tone and cadence seem very similar to LinkedIn posts or rather output of prompts to create LinkedIn posts.
If we pair this with a wearable ai pendant like plaid or limitless, we can increase the amount and frequency of depositing into our knowledge vault. Op, do you type your thoughts and notes or dictate them?
This is the right approach. I exported my 25k Evernote notes to markdown (I'm using Emacs' Howm mode) and I use Codex CLI to ask questions about my notes. It is great and powerful!
This is akin to using AI as a 'second brain', just getting started with Obsidian, my main challenge is loading it up with every communication trace I have...but haven't given up.
And the surveillance could be inversely correlated to profitability. If they pour billions into these chat bots and can't monetize them to the revolutionary oracles they touted, one minor consolation is to sell detailed profiles about the people using them. You could probably sort out the less intelligent people based on what they were asking.
It is scary! my coping mechanism, which I admit is stupid, is to believe no matter what I do as long as I am online they have my data. But you are right most people just give absurd amount of data for willingly.
I found that out while working with music models like Suno! I love creating music for my own listening experience as a hobbyist and when I give suno a prompt no matter how well crafted it is the outcome varies from "meh" to "that's good" ... while when I upload semi finished beat I made and prompt it to cover it the results consistently leave me speechless! Could be a bias since the music has a lot of elements I created but this workflow is similar across other generative models for me.
Is this like your belief? Transformers et al were invented by researchers with the explicit goal of surveillance and mass profiling? You think maybe that could have been an unintended effect of something/someone else? Or it's all the researchers fault?
And therefore it's impossible to test the accuracy if it's consuming your own data. AI can hallucinate on any data you feed it, and it's been proven that it doesn't summarize, but rather abridges and abbreviates data.
In the authors example
> "What patterns emerge from my last 50 one-on-ones?" AI found that performance issues always preceded tool complaints by 2-3 weeks. I'd never connected those dots.
Maybe that's a pattern from 50 one-on-ones. Or maybe it's only in the first two and the last one.
I'd be wary of using AI to summarize like this and expecting accurate insights
Do you have more resources on that? I'd love to read about the methodology.
> And therefore it's impossible to test the accuracy if it's consuming your own data.
Isn't it only if it's hard to verify the result? If it's a result that's hard to produce but easy to verify, a class which many problems fall into, you'd just need to look at the synthetized results.
If you ask it "given these arbitrary metrics, what is the best business plan for my company?" It'd be really hard to verify the result. I'd be hard to verify the result from anyone for that matter, even specialists.
So I think it's less about expecting the LLM to do autonomous work and more about using LLMs to more efficiently help you search the latent space for interesting correlations, so that you and not the LLM come up with the insights.
How are you guys dealing with this risk? I'm sure on this site nobody is naive to the potential harms of tech, but if you're able to articulate how you've figured out that the risk is worth the benefits to you I'd love to hear it. I don't think I'm being to cynical to wait for either local LLMs to get good or for me to be able to afford expensive GPUs for current local LLMs, but maybe I should be time-discounting a bit harder?
I'm happy to elaborate on why I find it dangerous, too, if this is too vague. Just really would like to have a more nuanced opinion here.
This is specific, but if you start replying to LLM summaries of emails, instead of reading and responding to the content of the email itself, you are quickly going to become a burden socially.
The people you are responding to __will__ be able to tell, and will dislike you for your lack of consideration.
This does mean that, useful as e.g. Claude Code is, for any business with NDA-type obligations, I don't think I could recommend it over a locally hosted model, even though the machine needed to run a decent local model might cost €10k (with current price increases due to demand exceeding supply), that the machine is still slower than what hosts the hosted models, that the rapid rate of improvement means a 3-month delay between SOTA in open-weights and private-weights is enough to matter*.
But until then? If I'm vibe coding a video game I'd give away for free anyway, or copy-editing a blog post that's public anyway, or using it to help with some short stories that I'd never be able to charge money for, or uploading pictures of the plants in my garden right by the public road… that's fine.
* When the music (money for training) stops, it could be just about any provider whose model is best, whatever that is is likely to still get distilled down fairly cheaply and/or some 3-month-old open-weights model is likely to get fine-tuned for each task fairly cheaply; independently of this, without the hyper-scalers the supply chains may shift back from DCs to PCs and make local models much more affordable.
In general for chat platforms you're right though, uploading/copy-pasting long documents and asking the LLM to find not one, but multiple needles in a haystack tend to give you really poor results. You need a workflow/process for getting accuracy for those sort of tasks.
I'd like to be able to point a model at a news story and have it follow every fact and claim back to an origin, (or lack of one). I'm not sure when they will be able to do that, they aren't up to the task yet. Reading the news would be so much different if you could separate the 'we report this happened' from the 'we report that someone else reported this happened"
I take notes for remembrance and relevance (what is interesting for me). But linking concepts is all my thinking. Doing whatever rhe article is prescribing is like sending someone on a tourist trip to take pictures and then bragging that you visited the country. While knowing that some pictures are photoshopped.
Still, all credit to him for creating that asset in the first place.
The AI summary at the top was surprisingly good! Of course, the AI isn't doing anything original; instead, it created a summary of whatever written material is already out there. Which is exactly what I wanted.
This isn't strictly a case against AI, just a case that we have a contradiction on the definition of "well informed". We value over-consumption, to the point where we see learning 3 things in 5 minutes as better than learning 1 thing in 5 minutes, even if that means being fully unable to defend or counterpoint what we just read.
I'm speficially referring to what you said: "the speaker used some obscure technical terminology I didn't know" this is due to lack of assumed background knowledge, which makes it hard to verify a summary on your own.
So someone who wants a war or wants Tweedledum to get more votes than Tweedledee has incentives to poison the well and disseminate fake content that makes it into the training set. Then there's a whole department of "safety" that has to manually untrain it to not be politically incorrect, racist etc. Because the whole thesis is don't think for yourself, let the AI think for you.
The 3 things in 5 minutes is even worse - it’s like taking Google Maps everywhere without even thinking about how to get from point A to point B - the odds of knowing anything at all from that are near zero.
And since it summarizes the original content, it’s an even bigger issue - we never even have contact with the thing we’re putatively learning from, so it’s even harder to tell bullshit from reality.
It’s like we never even drove the directions Google Maps was giving us.
We’re going to end up with a huge number of extremely disconnected and useless people, who all absolutely insist they know things and can do stuff. :s
If you ask Google about news, world history, pop culture, current events, places of interest, etc., it will lie to you frequently and confidently. In these cases, the "low quality summary" is very often a completely idiotic and inane fabrication.
I looked up a medical term, that is frequently misused (eg. "retarded"), and asked the Gemini to compare it with similar conditions.
Because I have enough of a background in the subject matter, I could tell what it had construed by its mixing the many incorrect references with the much fewer correct references in the training data.
I asked it for sources, and it failed to provide anything useful. But once I am looking at sources, I would be MUCH better off searching and only reading the sources might actually be useful.
I was sitting with a medical professional at the time (who is not also a programmer) and he completely swallowed what Gemini was feeding him. He commented that he appreciates that these summaries let him know when he is not up to date with the latest advances, and he learnt alot from the response.
As an aside, I am not sure I appreciate that Google's profile would now associate me with that particular condition.
Scary!
I've written my whole lifestory, the parts I'm willing to share that is, and posted it in Claude. It helped me way better with all kinds of things. It took me 2 days to write without formatting, pretty much how I write all my HN comments (but then 2 days straight: eat, sleep, write).
I've also exported all my notes, but it's too big for the context. That's why I wrote my life story.
From a practical standpoint I think the focus is on context management. Obsidian can help with this (I haven't used it so don't know the details). For code, it means doing things like static and dynamic analysis to see which functions calls what and create a topology of function calls and send that as context, then Claude Code can more easily know what to edit, and it doesn't need to read all the code.
So yea, I definitely, add to the "AI generated" text part but I read over all the texts, and usually they don't get sent out. Ultimately, it's still a lot quicker to do it this way.
For career planning, so far it hasn't beaten my own insights but it came close. For example, it mentioned that I should actually be a developer advocate instead of a software engineer. 2 to 3 years ago I came to that same thought. I ultimately rejected the idea due to how I am but it is a good one to think about.
What I see now, I think the best job for me would be a tech consultant. Or as I'd also like to call it: a data analyst that spots problems and then uses his software engineering or teaching skills to solve that problem. I don't think that job has a good catch all title as it is a pretty generalist job. I'm currently at a company that allows me to do this but the pay is quite low, so I'm looking for a tech company where I could do something similar. Maybe a product manager role? It really depends on the company culture.
What I also noticed it did better: it doesn't reduce me to data engineering anymore. It understands that I aspire to learn everything and anything I can get my hands on. It's my mode of living and Claude understands that.
So nothing too spectacular yet, but it'll come. It requires more prompt/context engineering and fine tuning of certain things. I didn't get around to it yet.
I would love to try this out but don’t feel comfortable sharing all my personal notes with a third party.
Well for most humans that's the more super of the powers too ;)