TimeCapsuleLLM: LLM trained only on data from 1800-1875

(github.com)

416 points | by admp 6 hours ago

39 comments

  • dogma1138 6 hours ago
    Would be interesting to train a cutting edge model with a cut off date of say 1900 and then prompt it about QM and relativity with some added context.

    If the model comes up with anything even remotely correct, it would be quite strong evidence that LLMs are a path to something bigger; if not, then I think it is time to go back to the drawing board.

    • bazzargh 5 hours ago
      You would find things in there that were already close to QM and relativity. The Michelson-Morley experiment was 1887 and Lorentz transformations came along in 1889. The photoelectric effect (which Einstein explained in terms of photons in 1905) was also discovered in 1887. William Clifford (who _died_ in 1879) had notions that foreshadowed general relativity: "Riemann, and more specifically Clifford, conjectured that forces and matter might be local irregularities in the curvature of space, and in this they were strikingly prophetic, though for their pains they were dismissed at the time as visionaries." - Banesh Hoffmann (1973)

      Things don't happen all of a sudden, and being able to see all the scientific papers of the era, it's possible those could have fallen out of the synthesis.

      • matthewh806 5 hours ago
        I presume that's what the parent post is trying to get at? Seeing if, given the cutting edge scientific knowledge of the day, the LLM is able to synthesize it all into a workable theory of QM by making the necessary connections and (quantum...) leaps

        Standing on the shoulders of giants, as it were

        • palmotea 4 hours ago
          But that's not the OP's challenge, he said "if the model comes up with anything even remotely correct." The point is there were things already "remotely correct" out there in 1900. If the LLM finds them, it wouldn't "be quite strong evidence that LLMs are a path to something bigger."
          • pegasus 3 hours ago
            It's not the comment which is illogical, it's your (mis)interpretation of it. What I (and seemingly others) took it to mean is basically could an LLM do Einstein's job? Could it weave together all those loose threads into a coherent new way of understanding the physical world? If so, AGI can't be far behind.
            • feanaro 3 hours ago
              This alone still wouldn't be a clear demonstration that AGI is around the corner. It's quite possible an LLM could've done Einstein's job, if Einstein's job was truly just synthesising already available information into a coherent new whole. (I couldn't say, I don't know enough of the physics landscape of the day to claim either way.)

              It's still unclear whether this process could be merely continued, seeded only with new physical data, in order to keep progressing beyond that point, "forever", or at least for as long as we imagine humans will continue to go on making scientific progress.

              • pegasus 2 hours ago
                Einstein is chosen in such contexts because he's the paradigmatic paradigm-shifter. Basically, what you're saying is: "I don't know enough history of science to confirm this incredibly high opinion on Einstein's achievements. It could just be that everyone's been wrong about him, and if I'd really get down and dirty, and learn the facts at hand, I might even prove it." Einstein is chosen to avoid exactly this kind of nit-picking.
                • Shorel 1 hour ago
                  They can also choose Euler or Gauss.

                  These two are so above everyone else in the mathematical world that most people would struggle for weeks or even months to understand something they did in a couple of minutes.

                  There's no "get down and dirty" shortcut with them =)

              • techno_tsar 2 hours ago
                This does make me think about Kuhn's concept of scientific revolutions and paradigms, and that paradigms are incommensurate with one another. Since new paradigms can't be proven or disproven by the rules of the old paradigm, if an LLM could independently discover paradigm shifts similar to moving from Newtonian gravity to general relativity, then we have empirical evidence of an LLM performing a feature of general intelligence.

                However, you could also argue that it's actually empirical evidence that the move from 19th century physics to general relativity wasn't truly a paradigm shift -- you could have 'derived' it from previous data -- and that the LLM has actually proven something about structural similarities between those paradigms, not that it's demonstrating general intelligence...

              • ctoth 2 hours ago
                I mean, "the pieces were already there" is true of everything? Your point is that Einstein was synthesizing existing math and existing data, right?

                But the whole question is whether or not something can do that synthesis!

                And the "anyone who read all the right papers" thing - nobody actually reads all the papers. That's the bottleneck. LLMs don't have it. They will continue to not have it. Humans will continue to not be able to read faster than LLMs.

                Even me, using a speech synthesizer at ~700 WPM.

            • andai 3 hours ago
              AGI is human level intelligence, and the minimum bar is Einstein?
              • pegasus 2 hours ago
                Who said anything of a minimum bar? "If so", not "Only if so".
                • andy12_ 52 minutes ago
                  I think the problem is the formulation "If so, AGI can't be far behind". I think that if a model were advanced enough such that it could do Einstein's job, that's it; that's AGI. Would it be ASI? Not necessarily, but that's another matter.
        • actionfromafar 5 hours ago
          Yeah but... we still might not know whether it could do that because we were really close by 1900 or because the LLM is very smart.
          • scottlamb 4 hours ago
            What's the bar here? Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

            I by no means believe LLMs are general intelligence, and I've seen them produce a lot of garbage, but if they could produce these revolutionary theories from only <= year 1900 information and a prompt that is not ridiculously leading, that would be a really compelling demonstration of their power.

            • emodendroket 3 hours ago
              > Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

              It turns out my reading is somewhat topical. I've been reading Rhodes' "The Making of the Atomic Bomb" and one of the things he takes great pains to argue (I was not quite anticipating how much I'd be trying to recall my high school science classes to make sense of his account of various experiments) is that the development toward the atomic bomb was more or less inexorable, and if at any point someone said "this is too far; let's stop here" there would be others to take his place. So, maybe, to answer your question.

            • bmacho 3 hours ago
              > Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

              Yes. It is certainly a question whether Einstein was one of the smartest guys who ever lived or whether all of his discoveries were already in the Zeitgeist and would have been discovered by someone else within ~5 years.

              • cyberax 2 hours ago
                Both can be true?

                Einstein was smart and put several disjointed things together. It's amazing that one person could do so much, from explaining Brownian motion to explaining the photoelectric effect.

                But I think that all these would have happened within _years_ anyway.

            • echoangle 4 hours ago
              > Does anyone say "we don't know if Einstein could do this because we were really close or because he was really smart?"

              Kind of. How long would it realistically have taken for someone else (also really smart) to come up with the same thing if Einstein hadn't been there?

              • pegasus 3 hours ago
                But you're not actually questioning whether he was "really smart", which was what GP was questioning. Sure, you can try to quantify the level of smarts, but you can't call it a "stochastic parrot" anymore, just like you wouldn't respond to Einstein's achievements with, "Ah well, in the end I'm still not sure he's actually smart, like I am for example. Could just be that he's just dumbly but systematically going through all options, working it out step by step, nothing I couldn't achieve (or even better, program a computer to do) if I'd put my mind to it."

                I personally doubt that this would work. I don't think these systems can achieve truly ground-breaking, paradigm-shifting work. The homeworld of these systems is the corpus of text on which they were trained, in the same way as ours is physical reality. Their access to this reality is always secondary, already distorted by the imperfections of human knowledge.

              • jaggederest 4 hours ago
                Well, we know many watershed moments in history were more a matter of situation than the specific person - an individual genius might move things by a decade or two, but in general the difference is marginal. True bolt-out-of-the-blue developments are uncommon, though all the more impressive for that fact, I think.
          • sleet_spotter 4 hours ago
            Well, if one had enough time and resources, this would make for an interesting metric. Could it figure it out with cut-off of 1900? If so, what about 1899? 1898? What context from the marginal year was key to the change in outcome?
      • bhaak 5 hours ago
        This would still be valuable even if the LLM only finds out about things that are already in the air.

        It’s probably even more of a problem that different areas of scientific development don’t know about each other. LLMs combining results still wouldn’t mean they invented something new.

        But if they could give us a head start of 20 years on certain developments this would be an awesome result.

      • Shorel 2 hours ago
        Then that experiment is even more interesting, and should be done.

        My own prediction is that the LLMs would totally fail at connecting the dots, but a small group of very smart humans can.

        Things don't happen all of a sudden, but they also don't happen everywhere. Most people in most parts of the world would never connect the dots. Scientific curiosity is something valuable and fragile, that we just take for granted.

        • bigfudge 35 minutes ago
          One of the reasons they don’t happen everywhere is that there are just a few places at any given point in time where there are enough well connected and educated individuals who are in a position to even see all the dots, let alone connect them. This doesn’t discount the achievement if an LLM also manages to, but I think it’s important to recognise that having enough giants in sight is an important prerequisite to standing on their shoulders.
      • gus_massa 3 hours ago
        I agree, but it's important to note that QM had no clear formulation until 1925/6; it's like 20 years more of work than SR.
    • wongarsu 3 hours ago
      I'm trying to work towards that goal by training a model on mostly German science texts up to 1904 (before the world wars German was the lingua franca of most sciences).

      Training data for a base model isn't that hard to come by, even though you have to OCR most of it yourself because the publicly available OCRed versions are commonly unusably bad. But training a model large enough to be useful is a major issue. Training a 700M parameter model at home is very doable (and is what this TimeCapsuleLLM is), but to get that kind of reasoning you need something closer to a 70B model. Also a lot of the "smarts" of a model gets injected in fine tuning and RL, but any of the available fine tuning datasets would obviously contaminate the model with 2026 knowledge.

      • benbreen 1 hour ago
        I am a historian and am putting together a grant application for a somewhat similar project (different era and language though). Would you be open to discussing a collaboration? My email is bebreen [at] ucsc [dot] edu.
      • theallan 2 hours ago
        Can we follow along with your work / results somewhere?
    • DevX101 5 hours ago
      Chemistry would be a great space to explore. The last quarter of the 19th century had a ton of advancements in chemistry. It'd be interesting to see if an LLM could propose fruitful hypotheses or make predictions about the science of thermodynamics.
    • kristopolous 1 hour ago
      It's going to be divining tea leaves. It will be 99% wrong and then someone will say "oh but look at this tea leaf over here! It's almost correct."
      • bowmessage 28 minutes ago
        Look! It made another TODO-list app on the first try!
    • forgotpwd16 6 hours ago
      Done few weeks ago: https://github.com/DGoettlich/history-llms (discussed in: https://news.ycombinator.com/item?id=46319826)

      At least the model part. Although others had the same thought as you, afaik none tried it.

      • chrononaut 5 hours ago
        And unfortunately I don't think they plan on making those models public.
    • bravura 4 hours ago
      A rigorous approach to predicting the future of text was proposed by Li et al 2024, "Evaluating Large Language Models for Generalization and Robustness via Data Compression" (https://ar5iv.labs.arxiv.org/html//2402.00861) and I think that work should get more recognition.

      They measure compression (perplexity) on future Wikipedia, news articles, code, arXiv papers, and multi-modal data. Data compression is intimately connected with robustness and generalization.
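
      For intuition, here is a minimal sketch of the kind of measurement involved, using Hugging Face transformers: perplexity on held-out text is just exponentiated average cross-entropy, and lower perplexity means the model compresses that text better. The model name and file names below are placeholders, not what the paper used:

        import math
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        def perplexity(model_name: str, text: str) -> float:
            # Score a causal LM on held-out text; lower = better compression.
            tok = AutoTokenizer.from_pretrained(model_name)
            model = AutoModelForCausalLM.from_pretrained(model_name)
            ids = tok(text, return_tensors="pt").input_ids
            with torch.no_grad():
                loss = model(ids, labels=ids).loss  # mean cross-entropy, nats/token
            return math.exp(loss.item())

        # e.g. compare the same model on pre- vs post-training-cutoff text:
        # perplexity("gpt2", open("wiki_2019.txt").read()[:5000])
        # perplexity("gpt2", open("wiki_2024.txt").read()[:5000])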

    • samuelson 3 hours ago
      I think it would be fun to see if an LLM would reframe some scientific terms from the time in a way that would actually fit in our current theories.

      I imagine if you explained quantum field theory to a 19th century scientist, they might think of it as a more refined understanding of the luminiferous aether.

      Or if an 18th century scholar learned about positive and negative ions, it could be seen as an expansion/correction of phlogiston theory.

    • defgeneric 2 hours ago
      The development of QM was so closely connected to experiments that it's highly unlikely an LLM would get there, even despite some of the experiments having been performed prior to 1900.

      Special relativity however seems possible.

    • root_axis 2 hours ago
      I think it would raise some interesting questions, but if it did yield anything noteworthy, the biggest question would be why that LLM is capable of pioneering scientific advancements and none of the modern ones are.
      • spidersouris 1 hour ago
        I'm not sure what you'd call a "pioneering scientific advancement", but there is an increasing number of examples showing that LLMs can be used for research (with agents, particularly). A survey about this was published a few months ago: https://aclanthology.org/2025.emnlp-main.895.pdf
    • nickdothutton 3 hours ago
      I would love to ask such a model to summarise the handful of theories or theoretical “roads” being eyed at the time and to make a prediction with reasons as to which looks most promising. We might learn something about blind spots in human reasoning, institutions, and organisations that are applicable today in the “future”.
    • tokai 6 hours ago
      Looking at the training data I don't think it will know anything.[0] I doubt On the Connexion of the Physical Sciences (1834) is going to have much about QM. While the cut-off is 1900, it seems much of the text is much closer to 1800 than 1900.

      [0] https://github.com/haykgrigo3/TimeCapsuleLLM/blob/main/Copy%...

      • dogma1138 6 hours ago
        It doesn’t need to know about QM or relativity, just about the building blocks that led to them, which were very much around by the year 1900.

        In fact you don’t want it to know about them explicitly, just to have enough background knowledge that you can manage the rest via context.

        • tokai 6 hours ago
          I was vague. My point is that I don't think the building blocks are in the data. It's mainly tertiary and popular sources. Maybe if you had the writings of Victorian scientists, both public and private correspondence.
          • pegasus 3 hours ago
            Probably a lot of it exists, but in archives, private collections etc. Would be great if it all ended up digitized as well.
        • viccis 4 hours ago
          LLMs are models that predict tokens. They don't think, they don't build with blocks. They would never be able to synthesize knowledge about QM.
          • PaulDavisThe1st 4 hours ago
            I am a deep LLM skeptic.

            But I think there are also some questions about the role of language in human thought that leave the door just slightly ajar on the issue of whether or not manipulating the tokens of language might be more central to human cognition than we've tended to think.

            If it turned out that this was true, then it is possible that "a model predicting tokens" has more power than that description would suggest.

            I doubt it, and I doubt it quite a lot. But I don't think it is impossible that something at least a little bit along these lines turns out to be true.

            • viccis 1 hour ago
              I also believe strongly in the role of language, and more loosely in semiotics as a whole, to our cognitive development. To the extent that I think there are some meaningful ideas within the mountain of gibberish from Lacan, who was the first to really tie our conception of ourselves with our symbolic understanding of the world.

              Unfortunately, none of that has anything to do with what LLMs are doing. The LLM is not thinking about concepts and then translating that into language. It is imitating what it looks like to read people doing so and nothing more. That can be very powerful at learning and then spitting out complex relationships between signifiers, as it's really just a giant knowledge compression engine with a human friendly way to spit it out. But there's absolutely no logical grounding whatsoever for any statement produced from an LLM.

              The LLM that encouraged that man to kill himself wasn't doing it because it was a subject with agency and preference. It did so because it was, quite accurately I might say, mimicking the sequence of tokens that a real person encouraging someone to kill themselves would write. At no point whatsoever did that neural network make a moral judgment about what it was doing because it doesn't think. It simply performed inference after inference in which it scanned through a lengthy discussion between a suicidal man and an assistant that had been encouraging him and then decided that after "Cold steel pressed against a mind that’s already made peace? That’s not fear. That’s " the most accurate token would be "clar" and then "ity."

              • PaulDavisThe1st 1 hour ago
                The problem with all this is that we don't actually know what human cognition is doing either.

                We know what our experience is - thinking about concepts and then translating that into language - but we really don't know with much confidence what is actually going on.

                I lean strongly toward the idea that humans are doing something quite different than LLMs, particularly when reasoning. But I want to leave the door open to the idea that we've not understood human cognition, mostly because our primary evidence there comes from our own subjective experience, which may (or may not) provide a reliable guide to what is actually happening.

                • viccis 1 hour ago
                  >The problem with all this is that we don't actually know what human cognition is doing either.

                  We do know what it's not doing, and that is operating only through reproducing linguistic patterns. There's no more cause to think LLMs approximate our thought (thought being something they are incapable of) than that Naive-Bayes spam filter models approximate our thought.

                  • PaulDavisThe1st 57 minutes ago
                    My point is that we know very little about the sort of "thought" that we are capable of either. I agree that LLMs cannot do what we typically refer to as "thought", but I think it is possible that we do a LOT less of that than we think when we are "thinking" (or more precisely, having the experience of thinking).
                    • viccis 43 minutes ago
                      How does this worldview reconcile the fact that thought demonstrably exists independent of either language or vision/audio sense?
            • pegasus 3 hours ago
              > manipulating the tokens of language might be more central to human cognition than we've tended to think

              I'm convinced of this. I think it's because we've always looked at the most advanced forms of human languaging (like philosophy) to understand ourselves. But human language must have evolved from forms of communication found in other species, especially highly intelligent ones. It's to be expected that the building blocks of it are based on things like imitation, playful variation, pattern-matching, harnessing capabilities brains had been developing long before language, only now in the emerging world of sounds, calls, vocalizations.

              Ironically, the other crucial ingredient for AGI which LLMs don't have, but we do, is exactly that animal nature which we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us, and ignoring the gut instinct, the intuitive, spontaneous insight into things which a lot of the great scientists and artists of the past have talked about.

              • viccis 1 hour ago
                >Ironically, the other crucial ingredient for AGI which LLMs don't have, but we do, is exactly that animal nature which we always try to shove under the rug, over-attributing our success to the stochastic parrot part of us, and ignoring the gut instinct, the intuitive, spontaneous insight into things which a lot of the great scientists and artists of the past have talked about.

                Are you familiar with the major works in epistemology that were written, even before the 20th century, on this exact topic?

          • strbean 4 hours ago
            You realize parent said "This would be an interesting way to test proposition X" and you responded with "X is false because I say so", right?
            • viccis 1 hour ago
              Yes. That is correct. If I told you I planned on going outside this evening to test whether the sun sets in the east, the best response would be to let me know ahead of time that my hypothesis is wrong.
              • strbean 1 hour ago
                So, based on the source of "Trust me bro.", we'll decide this open question about new technology and the nature of cognition is solved. Seems unproductive.
                • viccis 1 hour ago
                  In addition to what I have posted elsewhere in here, I would point to the fact that this is not indeed an "open question", as LLMs have not produced an entirely new and more advanced model of physics. So there is no reason to suppose they could have done so for QM.
            • anonymous908213 3 hours ago
              "Proposition X" does not need testing. We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense which merely uses an anthromorphized analogy to describe a script that feeds back multiple prompts before getting the final prompt output to present to the user). In the same way that we can reason about the correctness of an IsEven program without writing a unit test that inputs every possible int32 to "prove" it, we can reason about the fundamental principles of an LLM's programming without coming up with ridiculous tests. In fact the proposed test itself is less eminently verifiable than reasoning about correctness; it could be easily corrupted by, for instance, incorrectly labelled data in the training dataset, which could only be determined by meticulously reviewing the entirety of the dataset.

              The only people who are serious about suggesting that LLMs could possibly 'think' are the people who are committing fraud on the scale of hundreds of billions of dollars (good for them on finding the all-time grift!) and people who don't understand how they're programmed, and thusly are the target of the grift. Granted, given that the vast majority of humanity are not programmers, and even fewer are programmers educated on the intricacies of ML, the grift target pool numbers in the billions.

              • strbean 1 hour ago
                > We already know X is categorically false because we know how LLMs are programmed, and not a single line of that programming pertains to thinking (thinking in the human sense, not "thinking" in the LLM sense, which merely uses an anthropomorphized analogy to describe a script that feeds back multiple prompts before getting the final prompt output to present to the user).

                Could you elucidate for me the process of human thought, and point out the differences between that and a probabilistic prediction engine?

                I see this argument all over the place, but "how do humans think" is never described. It is always left as a black box with something magical (presumably a soul or some other metaphysical substance) inside.

                • anonymous908213 1 hour ago
                  There is no need to involve souls or magic. I am not making the argument that it is impossible to create a machine that is capable of doing the same computations as the brain. The argument is that whether or not such a machine is possible, an LLM is not such a machine. If you'd like to think of our brains as squishy computers, then the principle is simple: we run code that is more complex than a token prediction engine. The fact that our code is more complex than a token prediction engine is easily verified by our capability to address problems that a token prediction engine cannot. This is because our brain-code is capable of reasoning from deterministic logical principles rather than only probabilities. We also likely have something akin to token prediction code, but that is not the only thing our brain is programmed to do, whereas it is the only thing LLMs are programmed to do.
                • viccis 1 hour ago
                  Kant's model of epistemology, with humans schematizing conceptual understanding of objects through apperception of manifold impressions from our sensibility, and then reasoning about these objects using transcendental application of the categories, is a reasonable enough model of thought. It was (and is I think) a satisfactory answer for the question of how humans can produce synthetic a priori knowledge, something that LLMs are incapable of (don't take my word on that though, ChatGPT is more than happy to discuss [1])

                  1: https://chatgpt.com/share/6965653e-b514-8011-b233-79d8c25d33...

    • metalliqaz 5 hours ago
      Yann LeCun spoke explicitly on this idea recently and he asserts definitively that the LLM would not be able to add anything useful in that scenario. My understanding is that other AI researchers generally agree with him, and that it's mostly the hype beasts like Altman that think there is some "magic" in the weights that is actually intelligent. Their payday depends on it, so it is understandable. My opinion is that LeCun is probably correct.
      • johnsmith1840 5 hours ago
        There is some ability for it to make novel connections but it's pretty small. You can see this yourself having it build novel systems.

        It largely cannot imagine anything beyond the usual, but there is a small part that it can. This is similar to in-context learning: it's weak, but it is there.

        It would be incredible if meta learning/continual learning found a way to train exactly for that novel learning path. But that's literally AGI, so maybe 20yrs from now? Or never...

        You can see this on CL benchmarks. There is SOME signal but it's crazy low. When I was training CL models I found that signal was in the single % points. Some could easily argue it was zero but I really do believe there is a very small amount in there.

        This is also why any novel work or findings are done via MASSIVE compute budgets. They find RL environments that can extract that small amount out. Is it random chance? Maybe, hard to say.

        • SoftTalker 49 minutes ago
          Is this so different from what we see in humans? Most people do not think very creatively. They apply what they know in situations they are familiar with. In unfamiliar situations they don't know what to do and often fail to come up with novel solutions. Or maybe in areas where they are very experienced they will come up with something incrementally better than before. But occasionally a very exceptional person makes a profound connection or leap to a new understanding.
          • johnsmith1840 17 minutes ago
            Sure, we make small steps at a time, but we compound these, unlike AI.

            AI cannot compound their learnings for the foreseeable future

      • samuelson 3 hours ago
        Preface: Most of my understanding of how LLMs actually work comes from 3blue1brown's videos, so I could easily be wrong here.

        I mostly agree with you, especially about distrusting the self-interested hype beasts.

        While I don't think the models are actually "intelligent", I also wonder if there are insights to be gained by looking at how concepts get encoded by the models. It's not really that the models will add something "new", but more that there might be connections between things that we haven't noticed, especially because academic disciplines are so insular these days.

      • matheusd 3 hours ago
        How about this for an evaluation: Have this (trained-on-older-corpus) LLM propose experiments. We "play the role of nature" and inform it of the results of the experiments. It can then try to deduce the natural laws.

        If we did this (to a good enough level of detail), would it be able to derive relativity? How large of an AI model would it have to be to successfully derive relativity (if it only had access to everything published up to 1904)?

      • mlinksva 3 hours ago
        Do you have a pointer to where LeCun spoke about it? I noticed last October that Dwarkesh mentioned the idea offhandedly on his podcast (prompting me to write up https://manifold.markets/MikeLinksvayer/llm-trained-on-data-...) but I wonder if this idea has been around for much longer, or is just so obvious that lots of people are independently coming up with it (parent to this comment being yet another)?
      • catigula 5 hours ago
        This is definitely wrong, most AI researchers DO NOT agree with LeCun.

        Most ML researchers think AGI is imminent.

        • kingstnap 4 hours ago
          Where do you get your majority from?

          I don't think there is any level of broad agreement right now. There are tons of random camps none of which I would consider to be broadly dominating.

        • p_j_w 4 hours ago
          Who is in this group of ML researchers?
        • rafram 3 hours ago
          The ones being paid a million dollars a year by OpenAI to say stuff like that, maybe.
        • johnsmith1840 3 hours ago
          The guy who built chatgpt literally said we're 20 years away?

          Not sure how to interpret that as almost imminent.

          • nottorp 2 hours ago
            > The guy who built chatgpt literally said we're 20 years away?

            20 years away in 2026, still 20 years away in 2027, etc etc.

            Whatever Altman's hyping, that's the translation.

        • goatlover 3 hours ago
          Do you have poll of ML researchers that shows this?
        • paodealho 3 hours ago
          Well, can you point us to their research then? Please.
        • Alex2037 4 hours ago
          their employment and business opportunities depend on the hype, so they will continue to 'think' that (on xitter) despite the current SOTA of transformers-based models being <100% smarter than >3 year old GPT4, and no revolutionary new architecture in sight.
          • catigula 3 hours ago
            You're going to be in for a very rude awakening.
    • imjonse 6 hours ago
      I suppose the vast majority of training data used for cutting edge models was created after 1900.
      • dogma1138 6 hours ago
        Ofc they are, because their primary goal is to be useful, and to be useful they need to always be relevant.

        But considering that Special Relativity was published in 1905, which means all its building blocks were already floating in the ether by 1900, it would be a very interesting experiment to train something at Claude/Gemini scale and then, say, give it the field equations and ask it to build a theory around them.

        • famouswaffles 6 hours ago
          His point is that we can't train a Gemini 3/Claude 4.5 etc model because we don't have the data to match the training scale of those models. There aren't trillions of tokens of digitized pre-1900s text.
        • p1esk 6 hours ago
          How can you train a Claude/Gemini scale model if you’re limited to <10% of the training data?
      • kopollo 5 hours ago
        I don't know if this is related to the topic, but GPT5 can translate an 1880 Ottoman archival photograph into English, and without any loss of quality.
    • damnitbuilds 1 hour ago
      I like this, it would be exciting (and scary) if it deduced QM, and informative if it cannot.

      But I also think we can do this with normal LLMs trained on up-to-date text, by asking them to come up with any novel theory that fits the facts. It does not have to be a groundbreaking theory like QM, just original and not (yet) proven wrong?

    • a-dub 6 hours ago
      yeah i was just wondering that. i wonder how much stem material is in the training set...
      • signa11 6 hours ago
        i will go for ‘aint gonna happen for a 1000 dollars alex’
    • nickpsecurity 3 hours ago
      That would be an interesting experiment. It might be more useful to make a model with a cut off close to when copyrights expire to be as modern as possible.

      Then, we have a model that knows quite a bit in modern English. We also legally have a data set for everything it knows. Then, there's all kinds of experimentation or copyright-safe training strategies we can do.

      Project Gutenberg up to the 1920's seems to be the safest bet on that.

  • dash2 4 hours ago
    Mm. I'm a bit sceptical of the historical expertise of someone who thinks that "Who art Henry" is 19th century language. (It's not actually grammatically correct English from any century whatever: "art" is the second person singular, so this is like saying "who are Henry?")
    • joshuakoehler 4 hours ago
      As a reader of a lot of 17th, 18th, and 19th century Christian books, this was my thought exactly.
    • auraham 4 hours ago
      Can you elaborate on this? After skimming the README, I understand that "Who art Henry" is the prompt. What should be the correct 19th century prompt?
      • canjobear 4 hours ago
        "Who art Henry?" was never grammatical English. "Art" was the second person singular present form of "to be" and it was already archaic by the 17th century. "Who is Henry?" would be fine.
      • andai 3 hours ago
        Who art thou?

        (Well, not 19th century...)

      • vintermann 3 hours ago
        "Who is Henry?"
  • eqmvii 6 hours ago
    Could this be an experiment to show how likely LLMs are to lead to AGI, or at least intelligence well beyond our current level?

    If you could only give it texts and info and concepts up to Year X, well before Discovery Y, could we then see if it could prompt its way to that discovery?

    • ben_w 6 hours ago
      > Could this be an experiment to show how likely LLMs are to lead to AGI, or at least intelligence well beyond our current level?

      You'd have to be specific about what you mean by AGI: all three letters mean a different thing to different people, and sometimes the whole means something not present in the letters.

      > If you could only give it texts and info and concepts up to Year X, well before Discovery Y, could we then see if it could prompt its way to that discovery?

      To a limited degree.

      Some developments can come from combining existing ideas and seeing what they imply.

      Other things, like everything to do with relativity and quantum mechanics, would have required experiments. I don't think any of the relevant experiments had been done prior to this cut-off date, but I'm not absolutely sure of that.

      You might be able to get such an LLM to develop all the maths and geometry for general relativity, and yet find the AI still tells you that the perihelion shift of Mercury is a sign of the planet Vulcan rather than of a curved spacetime: https://en.wikipedia.org/wiki/Vulcan_(hypothetical_planet)

      • grimgrin 5 hours ago
        An example of why you need to explain what you mean by AGI is:

        https://www.robinsloan.com/winter-garden/agi-is-here/

      • opponent4 5 hours ago
        > You'd have to be specific what you mean by AGI

        Well, they obviously can't. AGI is not science, it's religion. It has all the trappings of religion: prophets, sacred texts, origin myth, end-of-days myth and, most importantly, a means to escape death. Science? Well, the only measure of "general intelligence" would be to compare to the only one we know, the human one, but we have absolutely no means by which to describe it. We do not know where to start. This is why when you scratch the surface of any AGI definition you only find circular definitions.

        And no, the "brain is a computer" is not a scientific description, it's a metaphor.

        • strbean 4 hours ago
          > And no, the "brain is a computer" is not a scientific description, it's a metaphor.

          Disagree. A brain is Turing complete, no? Isn't that the definition of a computer? Sure, it may be reductive to say "the brain is just a computer".

          • opponent4 3 hours ago
            Not even close. Turing completeness does not apply to the brain, plain and simple. That's something to do with algorithms, and your brain is not a computer, as I have mentioned. It does not store information. It doesn't process information. It just doesn't work that way.

            https://aeon.co/essays/your-brain-does-not-process-informati...

            • strbean 1 hour ago
              > Forgive me for this introduction to computing, but I need to be clear: computers really do operate on symbolic representations of the world. They really store and retrieve. They really process. They really have physical memories. They really are guided in everything they do, without exception, by algorithms.

              This article seems really hung up on the distinction between digital and analog. It's an important distinction, but glosses over the fact that digital computers are a subset of analog computers. Electrical signals are inherently analog.

              This maps somewhat neatly to human cognition. I can take a stream of bits, perform math on it, and output a transformed stream of bits. That is a digital operation. The underlying biological processes involved are a pile of complex probabilistic+analog signaling, true. But in a computer, the underlying processes are also probabilistic and analog. We have designed our electronics to shove those parts down to the lowest possible level so they can be abstracted away, and so the degree to which they influence computation is certainly lower than in the human brain. But I think an effective argument that brains are not computers is going to have to dive in to why that gap matters.

            • nearbuy 1 hour ago
              That is an article by a psychologist, with no expertise in neuroscience, claiming without evidence that the "dominant cognitive neuroscience" is wrong. He offers no alternative explanation on how memories are stored and retrieved, but argues that large numbers of neurons across the brain are involved and he implies that neuroscientists think otherwise.

              This is odd because the dominant view in neuroscience is that memories are stored by altering synaptic connection strength in a large number of neurons. So it's not clear what his disagreement is, and he just seems to be misrepresenting neuroscientists.

              Interestingly, this is also how LLMs store memory during training: by altering the strength of connections between many artificial neurons.

            • Closi 2 hours ago
              A human is effectively Turing complete if you give the person paper and pen and the ruleset, and a brain clearly stores information and processes it to some extent, so this is pretty unconvincing. The article is nonsense and badly written.

              > But here is what we are not born with: information, data, rules, software, knowledge, lexicons, representations, algorithms, programs, models, memories, images, processors, subroutines, encoders, decoders, symbols, or buffers – design elements that allow digital computers to behave somewhat intelligently. Not only are we not born with such things, we also don’t develop them – ever.

              Really? Humans don't ever develop memories? Humans don't gain information?

            • anthonypasq 3 hours ago
              I've gotta say this article was not convincing at all.
            • mistermann 3 hours ago
              [dead]
        • ben_w 4 hours ago
          Cargo cults are a religion, the things they worship they do not understand, but the planes and the cargo themselves are real.

          There's certainly plenty of cargo-culting right now on AI.

          Sacred texts, I don't recognise. Yudkowsky's writings? He suggests wearing clown shoes to avoid getting a cult of personality disconnected from the quality of the arguments; if anyone finds his works sacred, they've fundamentally misunderstood him:

            I have sometimes thought that all professional lectures on rationality should be delivered while wearing a clown suit, to prevent the audience from confusing seriousness with solemnity.
          
          - https://en.wikiquote.org/wiki/Eliezer_Yudkowsky

          Prophets forecasting the end-of-days, yes, but this too from climate science, from everyone who was preparing for a pandemic before covid and is still trying to prepare for the next one because the wet markets are still around, from economists trying to forecast growth or collapse and what will change any given prediction of the latter into the former, and from the military forces of the world saying which weapon systems they want to buy. It does not make a religion.

          A means to escape death, you can have. But it's on a continuum with life extension and anti-aging medicine, which itself is on a continuum with all other medical interventions. To quote myself:

            Taking a living human's heart out without killing them, and replacing it with one you got out a corpse, that isn't the magic of necromancy, neither is it a prayer or ritual to Sekhmet, it's just transplant surgery.
          
            …
          
            Immunity to smallpox isn't a prayer to the Hindu goddess Shitala (of many things but most directly linked with smallpox), and it isn't magic herbs or crystals, it's just vaccines.
          
          - https://benwheatley.github.io/blog/2025/06/22-13.21.36.html
      • markab21 5 hours ago
        Basically looking for emergent behavior.
    • water-data-dude 5 hours ago
      It'd be difficult to prove that you hadn't leaked information to the model. The big gotcha of LLMs is that you train them on BIG corpuses of data, which means it's hard to say "X isn't in this corpus", or "this corpus only contains Y". You could TRY to assemble a set of training data that only contains text from before a certain date, but it'd be tricky as heck to be SURE about it.

      Ways data might leak to the model that come to mind: misfiled/mislabeled documents, footnotes, annotations, document metadata.

      • gwern 4 hours ago
        There's also severe selection effects: what documents have been preserved, printed, and scanned because they turned out to be on the right track towards relativity?
        • mxfh 2 hours ago
          This.

          Especially for London there is a huge chunk of recorded parliamentary debates.

          More interesting for dialogue would be training on recorded correspondence in the form of letters anyway.

          And that corpus script just looks odd, to say the least; just oversample by X?

    • alansaber 6 hours ago
      I think not, if only for the fact that the quantity of old data isn't enough to train anywhere near a SoTA model, until we change some fundamentals of LLM architecture
      • andyfilms1 6 hours ago
        I mean, humans didn't need to read billions of books back then to think of quantum mechanics.
        • alansaber 5 hours ago
          Which is why I said it's not impossible, but current LLM architecture is just not good enough to achieve this.
        • famouswaffles 5 hours ago
          Right, what they needed was billions of years of brute force and trial and error.
      • franktankbank 6 hours ago
        Are you saying it wouldn't be able to converse using english of the time?
        • ben_w 5 hours ago
          Machine learning today requires an obscene quantity of examples to learn anything.

          SOTA LLMs show quite a lot of skill, but they only do so after reading a significant fraction of all published writing (and perhaps images and videos, I'm not sure) across all languages, in a world whose population is 5 times higher than at the link's cut-off date, and where global literacy went from 20% to about 90% since then.

          Computers can only make up for this by being really really fast: what would take a human a million or so years to read, a server room can pump through a model's training stage in a matter of months.

          When the data isn't there, reading what it does have really quickly isn't enough.

        • wasabi991011 5 hours ago
          That's not what they are saying. SOTA models include much more than just language, and the scale of training data is related to its "intelligence". Restricting the corpus in time => less training data => less intelligence => less ability to "discover" new concepts not in its training data
          • franktankbank 5 hours ago
            Perhaps less bullshit though was my thought? Was language more restricted then? Scope of ideas?
    • armcat 5 hours ago
      I think this would be an awesome experiment. However, you would effectively need to train something of a GPT-5.2 equivalent. So you need a lot of text, a much larger parameterization (compared to nanoGPT and Phi-1.5), and the 1800s equivalents of supervised finetuning and reinforcement learning with human feedback.
    • dexwiz 5 hours ago
      This would be a true test of whether LLMs can innovate or just regurgitate. I think part of people's amazement at LLMs is that they don't realize how much they don't know. So thinking and recalling look the same to the end user.
    • Trufa 5 hours ago
      This is fascinating, but the experiment seems to fail at being a fair comparison of how much knowledge we can have in data from that time vs now.

      As a thought experiment I find it thrilling.

    • nickpsecurity 1 hour ago
      That is one of the reasons I want it done. We can't tell if AIs are parroting training data without having the whole training data. Making it old means specific things won't be in it (or will be). We can do more meaningful experiments.
    • Rebuff5007 5 hours ago
      OF COURSE!

      The fact that tech leaders espouse the brilliance of LLMs and don't use this specific test method is infuriating to me. It is deeply unfortunate that there is little transparency or standardization of the datasets available for training/fine tuning.

      Having this advertised would make for more interesting and informative benchmarks. OEM models that are always "breaking" the benchmarks are doing so with improved datasets as well as improved methods. Without holding the datasets fixed, progress on benchmarks is very suspect IMO.

    • mistermann 3 hours ago
      [dead]
    • feisty0630 5 hours ago
      I fail to see how the two concepts equate.

      LLMs have neither intelligence nor problem-solving ability (and I won't be relaxing the definition of either so that some AI bro can pretend a glorified chatbot is sentient)

      You would, at best, be demonstrating that the sharing of knowledge across multiple disciplines and nations (which is a relatively new concept - at least at the scale of something like the internet) leads to novel ideas.

      • al_borland 5 hours ago
        I've seen many futurists claim that human innovation is dead and all future discoveries will be the results of AI. If this is true, we should be able to see AI trained on the past figure its way to various things we have today. If it can't do this, I'd like said futurists to quiet down, as they are discouraging an entire generation of kids who may go on to discover some great things.
        • skissane 5 hours ago
          > I've seen many futurists claim that human innovation is dead and all future discoveries will be the results of AI.

          I think there's a big difference between discoveries through AI-human synergy and discoveries through AI working in isolation.

          It probably will be true soon (if it isn't already) that most innovation features some degree of AI input, but still with a human to steer the AI in the right direction.

          I think an AI being able to discover something genuinely new all by itself, without any human steering, is a lot further off.

          If AIs start producing significant quantities of genuine and useful innovation with minimal human input, maybe the singularitarians are about to be proven right.

        • thinkingemote 4 hours ago
          I'm struggling to get a handle on this idea. Is the idea that today's data will be the data of the past, in the future?

          So if it can work with what's now past, it will be able to work with the past in the future?

          • al_borland 1 hour ago
            Essentially, yes.

            The prediction is that AI will be able to invent the future. So if we give it data from our past without knowledge of the present... what type of future will it invent, what progress will it make, if any at all? And not just having the idea, but implementing it in a way that actually works with the technology of the day, and building on those things over time.

            For example, would AI with 1850 data have figured out the idea of lift to make an airplane and taught us how to make working flying machines and progress them to the jets we have today, or something better? It wouldn't even be starting from 0, so this would be a generous example, as da Vinci was playing with these ideas in the 15th century.

            If it can't do it, or what it produces is worse than what humans have done, we shouldn't leave it to AI alone to invent our actual future. Which would mean reevaluating the role these "thought leaders" say it will play, and how we're educating and communicating about AI to the younger generations.

  • tgtweak 3 hours ago
    Very interesting, but the slight issue I see here is one of data: the information that is recorded and in the training data here is heavily skewed toward those intelligent/recognized enough to have recorded it and had it preserved - far less than the current status quo of the "everyone can trivially document their thoughts and life" diorama of information we have today to train LLMs on. I suspect that a frontier model today would have 50+TB of training data in the form of text alone - and that's several orders of magnitude more information, and from a much more diverse point of view, than what would have survived from that period. The output from that question "what happened in 1834" read like a newspaper/bulletin, which is likely a huge part of the data that was digitized (newspapers etc).

    Very cool concept, but it definitely has some bias.

    • twosdai 2 hours ago
      > but it definitely has some bias.

      To be frank though, I think this is a better way than all people's thoughts all of the time.

      I think the "crowd" of information makes the end output of an LLM worse rather than better, specifically in our inability to really know what kind of bias we're dealing with.

      Currently to me it feels really muddy knowing how information is biased, beyond just the hallucinations and factual inconsistencies.

      But as far as I can tell, correctness of the content aside, sometimes frontier LLMs respond like freshman college students, other times they respond with the rigor of a mathematics PhD candidate, and sometimes like a marketing hit piece.

      This dataset has a consistency which I think is actually a really useful feature. I agree that having many perspectives in the dataset is good, but as an end user being able to rely on some level of consistency with an AI model is something I really think is missing.

      Maybe more succinctly: I want frontier LLMs to have a known and specific response style and bias which I can rely on, because there is already a lot of noise.

    • notarobot123 2 hours ago
      Biases exposed through artificial constraints help to make visible the hidden/obscured/forgotten biases of state-of-the-art systems.
    • nickpsecurity 1 hour ago
      Models today will be biased based on what's in their training data. If English, it will be biased heavily toward Western, post-1990's views. Then, they do alignment training that forces them to speak according to the supplier's morals. That was Progressive, atheist, evolutionist, and CRT when I used them years ago.

      So, the OP model will accidentally reflect the biases of the time. The current, commercial models intentionally reflect specific biases. Except for uncensored models, which accidentally carry whatever biases are in the training data as modified by the uncensoring set.

  • addaon 5 hours ago
    Suppose two models with similar parameters trained the same way on 1800-1875 and 1800-2025 data. Running both models, we get probability distributions across tokens, let's call the distributions 1875' and 2025'. We also get a probability distribution finite difference (2025' - 1875'). What would we get if we sampled from 1.1*(2025' - 1875') + 1875'? I don't think this would actually be a decent approximation of 2040', but it would be a fun experiment to see. (Interpolation rather than extrapolation seems just as unlikely to be useful and less likely to be amusing, but what do I know.)
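
    In code, a minimal sketch of sampling from that extrapolated distribution (assuming, strongly, that both models share a tokenizer and expose next-token logits; the names are illustrative only, not an actual API):

      import torch

      def sample_extrapolated(logits_1875, logits_2025, alpha=1.1):
          # p = p_1875 + alpha * (p_2025 - p_1875); alpha > 1 extrapolates past
          # the newer model, while alpha in (0, 1) would interpolate between them.
          p_1875 = torch.softmax(logits_1875, dim=-1)
          p_2025 = torch.softmax(logits_2025, dim=-1)
          p = p_1875 + alpha * (p_2025 - p_1875)
          p = torch.clamp(p, min=0.0)          # extrapolation can go negative
          p = p / p.sum(dim=-1, keepdim=True)  # renormalize to a distribution
          return torch.multinomial(p, num_samples=1)
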
    • pvab3 3 hours ago
      What if it's just genAlpha slang?
      • andai 3 hours ago
        The real mode collapse ;)
  • radarsat1 2 hours ago
    Heh, at least this wouldn't spread emojis all over my readmes. Hm, come to think of it I wonder how much tokenization is affected.

    Another thought that just occurred to me when thinking about readmes and coding LLMs: obviously this model wouldn't have any coding knowledge, but I wonder if it would be possible to combine it somehow with a modern LLM in such a way that it does have coding knowledge, but renders out all the text in the style / knowledge level of the 1800s model.

    Offhand I can't think of a non-fine-tuning trick that would achieve this. I'm thinking back to how the old style transfer models used to work, where they would swap layers between models to get different stylistic effects applied. I don't know if that's doable with an LLM.

    • fluoridation 13 minutes ago
      Just have the models converse with each other?
  • chc4 2 hours ago
    I think it would be very cute to train a model exclusively on pre-information-age documents, and then try to teach it what a computer is and get it to write some programs. That said, this doesn't look like it's nearly there yet, with the output looking closer to a Markov chain than to ChatGPT quality.
  • jimmytucson 5 hours ago
    Fascinating idea. There was another "time-locked" LLM project that popped up on HN recently[1]. Their model output is really polished but the team is trying to figure out how to avoid abuse and misrepresentation of their goals. We think it would be cool to talk to someone from 100+ years ago but haven't seriously considered the many ways in which it would be uncool. Interesting times!

    [1] https://news.ycombinator.com/item?id=46319826

  • InvisibleUp 5 hours ago
    If the output of this is even somewhat coherent, it would disprove the argument that mass amounts of copyrighted works are required to train an LLM. Unfortunately that does not appear to be the case here.
    • HighFreqAsuka 4 hours ago
      Take a look at The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text (https://arxiv.org/pdf/2506.05209). They build a reasonable 7B parameter model using only open-licensed data.
      • nickpsecurity 1 hour ago
        They mostly do, but they risked legal contamination by using Whisper-derived text and web text, which might have gotchas. Other than that, it was a great collection for low-risk training.
  • hallvard 5 hours ago
    Cool! I also did something like this: https://github.com/hallvardnmbu/transformer

    But on various data (i.e., a separate model per source): the Bible, Don Quixote, and Franz Kafka. (As well as a (bad!) lyrics generator and a translator.)

  • cowlby 3 hours ago
    I wonder if you could train an LLM with everything up to Einstein. Then see if with thought experiments + mathematics you could arrive at general relativity.
    • erenkaradag 59 minutes ago
      The problem is that the 'genius' of Einstein wasn't just synthesizing existing data, but actively rejecting the axioms of that data. The 1875 corpus overwhelmingly 'proves' absolute time and the luminiferous aether. A model optimizing for the most probable continuation will converge on that consensus.

      To get Relativity, the model needs to realize the training data isn't just incomplete, but fundamentally wrong. That requires abductive reasoning (the spark of genius) to jump out of the local minimum. Without that AGI-level spark, a 'pure knowledge pile' will just generate a very eloquent, mathematically rigorous defense of Newtonian physics.

  • simonw 6 hours ago
    Anyone seen a low-friction way to run prompts through this yet, either via a hosted API or chat UI or a convenient GGML or MLX build that runs in Ollama or llama.cpp or LM Studio?
    • throwaway18875 4 hours ago
      Currently running it using LM Studio, which can download it from Hugging Face directly. It generates incoherent text, though:

      ===

      You:

      I pray you, who is this Master Newton?

      timecapsulellm-v2-1800-1875-mlx:

      TI offer to pay you the very same fee as you did before. It was not in the power of your master to deliver the letter to your master. He did. I will be with you as soon as I can keep my word. It is not at all clear, whether the letter has been sent or not. It is not at all clear: but it is clear also that it was written by the person who gave it. "No," I said, "I cannot give it to you." There, the letter was sent to me. "The letter is yours, I believe," I said. "But, I hope, you will not refuse to give it to me?

    • t1amat 6 hours ago
      Not a direct answer, but it looks like v0.5 is a nanoGPT arch and v1 is a Phi 1.5 arch, which should be well supported by quantization utilities for any engine. They're small, too, and should run on a potato.
    • alansaber 5 hours ago
      I too have completely forgotten how the adapters library works and would have appreciated a simple inference script
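
      A minimal inference sketch, assuming the weights ship as a standard Hugging Face causal-LM checkpoint (the Phi-1.5-style variant mentioned above should load this way); the repo id below is a placeholder, so check the project page for the real one:

        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        repo = "example/timecapsulellm-v1-1800-1875"  # placeholder id
        tok = AutoTokenizer.from_pretrained(repo)
        model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float32)

        prompt = "I pray you, who is this Master Newton?"
        ids = tok(prompt, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=200, do_sample=True,
                             temperature=0.8, top_p=0.95)
        print(tok.decode(out[0], skip_special_tokens=True))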
    • philmo1 6 hours ago
      +1
  • myrmidon 5 hours ago
    There was a discussion around a very similar model (Qwen3 based) some weeks ago:

    https://news.ycombinator.com/item?id=46319826

    I found it particularly thought-provoking how a model with training data from that time period completely lacks context/understanding of what it itself is, but then I realized that we are the same (at least for now).

  • sl_convertible 3 hours ago
    Hari Seldon would, no doubt, find this fascinating. Imagine having a sliding-window LLM that you could use to verify a statistical model of society. I wonder what patterns it could deduce?
  • patcon 3 hours ago
    > OCR noise (“Digitized by Google”) still present in outputs

    This feels like a neat sci-fi short story hook to explain the continuous emergence of God as an artifact of a simulation

    • fluoridation 9 minutes ago
      I'm reminded of SD models that put vaguely-shaped Patreon logos in the corner.
  • abhishekjha 6 hours ago
    Oh, I have been thinking about this for a long time. The intelligence we have in these models represents a point in time.

    Now if I train a foundation model on docs from the Library of Alexandria, and only texts from that period, I would have a chance to get a rudimentary insight into what the world was like at that time.

    And maybe time-shift even further.

    • feisty0630 6 hours ago
      > I would have a chance to get a rudimentary insight on what the world was like at that time

      Congratulations, you've reinvented the history book (just with more energy consumption and less guarantee of accuracy)

      • gordonhart 5 hours ago
        History books, especially those from classical antiquity, are notoriously not guaranteed to be accurate either.
        • feisty0630 5 hours ago
          Do you expect something exclusively trained on them to be any better?
          • gordonhart 5 hours ago
            To a large extent, yes. A model trained on many different accounts of an event is likely going to give a more faithful picture of that event than any one author.

            This isn't super relevant to us because very few histories from this era survived, but presumably there was sufficient material in the Library of Alexandria to cover events from multiple angles and "zero out" the different personal/political/religious biases coloring the individual accounts.

  • CGMthrowaway 52 minutes ago
    Is there a link where I can try it out?
  • dlcarrier 5 hours ago
    It's interesting that it's trained off only historic text.

    Back in the pre-LLM days, someone trained a Markov chain off the King James Bible and a programming book: https://www.tumblr.com/kingjamesprogramming

    I'd love to see an LLM equivalent, but I don't think that's enough data to train one from scratch. Could a LoRA or similar be used to get the speech style to strictly follow a few megabytes' worth of training data?
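
    A LoRA pass over a small base model should get part of the way there. A minimal sketch, assuming the corpus sits in a plain-text file named style_corpus.txt and using gpt2 purely as a stand-in base; every name here is an assumption, not a recipe from the project:

      from datasets import load_dataset
      from peft import LoraConfig, get_peft_model
      from transformers import (AutoModelForCausalLM, AutoTokenizer,
                                DataCollatorForLanguageModeling, Trainer,
                                TrainingArguments)

      base = "gpt2"  # any small causal LM would do for a style experiment
      tok = AutoTokenizer.from_pretrained(base)
      tok.pad_token = tok.eos_token
      model = AutoModelForCausalLM.from_pretrained(base)

      # Wrap the base model with low-rank adapters; only the adapter weights train.
      model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

      data = load_dataset("text", data_files={"train": "style_corpus.txt"})["train"]
      data = data.map(lambda x: tok(x["text"], truncation=True, max_length=512),
                      remove_columns=["text"])

      trainer = Trainer(
          model=model,
          args=TrainingArguments(output_dir="style-lora", num_train_epochs=3,
                                 per_device_train_batch_size=4, learning_rate=2e-4),
          train_dataset=data,
          data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
      )
      trainer.train()

    With only a few megabytes of text, the adapter tends to pick up vocabulary and cadence long before it learns any facts, which is arguably all the kingjamesprogramming effect needs.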

  • krunck 4 hours ago
    Training LLMs on data with certain date cut-offs and then doing comparative analysis between the LLMs would be interesting.
  • chuckadams 2 hours ago
    Think I'll ask it to come up with some Jacquard loom patterns. Vibe-weaving.
  • aqme28 6 hours ago
    This kind of technique seems like a good way to test model performance against benchmarks. I'm suspicious that new models are taking popular benchmark solutions into their training data. So: how would e.g. ChatGPT's underlying architecture perform on SWE-bench if trained only on data prior to 2024?
    • NitpickLawyer 5 hours ago
      > are taking popular benchmark solutions into their training data

      That happened in the past, and the "naive" way of doing it is usually easy to spot. There are, however, many ways in which testing data can leak into models, even without data contamination. However, this doesn't matter much, as any model that only does well in benchmarks but is bad in real-world usage will be quickly sussed out by people actually using it. There are also lots and lots of weird, not very popular benchmarks out there, and the outliers are quickly identified.

      > perform on SWE-bench if trained only on data prior to 2024.

      There's a benchmark called swe-REbench that takes issues from real-world repos, published roughly monthly. They run the tests, and you can select the period and check performance. This is foolproof for open models, but a bit of an unknown for API-based models.

  • HarHarVeryFunny 6 hours ago
    It would be interesting if there's enough data to train a model capable enough to converse with and ask about contemporary views on issues of the day, or what it thought about "potential" future events/technologies yet to happen.
  • Aperocky 2 hours ago
    Looks a lot like the output from a Markov chain...
  • zingababba 6 hours ago
    LOL

    PROMPT:Charles Darwin

    Charles DarwinECCEMACY. Sir, — The following case is interesting to me : — I was in London a fortnight, and was much affected with an attack of rheumatism. The first attack of rheumatism was a week before I saw you, and the second when I saw you, and the third when I saw you, and the third in the same time. The second attack of gout, however, was not accompanied by any febrile symptoms, but was accompanied with an increased flow of urine, and with a more copious discharge of urine. The third attack was an hour after I saw you, and was succeeded by a return of a paroxysm of gout, and a more rapid return of the gout. The fourth attack was also accompanied by a fever, but was not always accompanied by any febrile symptoms. The third attack of gout was a fortnight after you had been ill, and the fourth was followed by a paroxysm of gout. The fourth attack was a fortnight after you were attacked, and was accompanied by a sense

    • alansaber 5 hours ago
      Average small language model response
    • feisty0630 5 hours ago
      Interesting that it reads a bit like it came from a Markov chain rather than an LLM. Perhaps limited training data?
      • kgeist 1 hour ago
        Early LLMs used to do this often. I think that's where the "repetition penalty" parameter comes from. I suspect output quality could be improved with better sampling parameters.
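
        For example, with a Hugging Face-style checkpoint (the repo id below is a placeholder), repetition can often be damped at generation time without retraining:

          from transformers import AutoModelForCausalLM, AutoTokenizer

          repo = "example/timecapsulellm"  # placeholder id
          tok = AutoTokenizer.from_pretrained(repo)
          model = AutoModelForCausalLM.from_pretrained(repo)

          ids = tok("Charles Darwin", return_tensors="pt").input_ids
          out = model.generate(
              ids,
              max_new_tokens=200,
              do_sample=True,
              temperature=0.9,
              top_p=0.95,
              repetition_penalty=1.2,   # down-weight tokens already emitted
              no_repeat_ngram_size=3,   # hard-block exact 3-gram repeats
          )
          print(tok.decode(out[0], skip_special_tokens=True))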
  • radiothomp 3 hours ago
    An LLM trained only on data from certain time periods to ~reduce modern bias~ enhance past bias
  • marmalade2413 5 hours ago
    Can you confidently say that the architecture of the LLM doesn't include any a priori bias that might affect the integrity of this LLM?

    That is, the architectures of today are chosen to yield the best results given the textual data around today and the problems we want to solve today.

    I'd argue that this lack of bias would need to be researched (if it hasn't been already) before this kind of model can be given credence.

    LLMs aren't my area of expertise but during my PhD we were able to encode a lot of a priori knowledge through the design of neural network architectures.

  • escapecharacter 2 hours ago
    I would pay like $200/month if there was an LLM out there that I could only communicate with using an old-timey telegraph key and morse code.
  • dhruv3006 6 hours ago
    This will be something good - would love to see it on Ollama or LM Studio.
  • philmo1 6 hours ago
    Exciting idea!
  • srigi 6 hours ago
    "I'm sorry, my knowledge cuttoff is 1875"
  • tonymet 3 hours ago
    the "1917 model" from a few weeks back post-trained the model with ChatGPT dialog. So it had modern dialect and proclivities .

    A truly authentic historical model will have some unsavory opinions and very distinctive dialect.

  • ourmandave 4 hours ago
    Can I use it to get up-to-date legal advice on Arizona reproductive health laws?