Logos Language Guide: Compile English to Rust

(logicaffeine.com)

52 points | by tristenharr 5 days ago

10 comments

gouggoug 1 day ago
I wonder who the target is for such a language.
What's difficult with programming isn't the language itself, it's everything else: understanding concepts, algorithms, programming patterns, the science.
It feels a bit like trying to reinvent the language of mathematics. It'd be really inefficient writing math in plain English.
- forgotpwd16 1 day ago
  There's no target. Someone is just experimenting with Claude. 2026 gonna be year of slop. And note this project is not FOSS. (Not sure what author is thinking. Don't they know nowadays someone can code launder their code through Claude?)
  P.S. The English-to-AST part could be useful to other projects that want natural-ish language input without having to resort to an LLM, e.g. a tool for modifying CSVs in natural language, like the one posted yesterday.
  - tristenharr 1 day ago
    Hello, author here! The license is BSL 1.1, based on the MariaDB license, and the source transitions to MIT on December 24th, 2029. We're a small bootstrapped team, and I was worried that if I went fully FOSS from the get-go, a big player might resell it with an easy one-click button to deploy things like the playground that's coming soon, and I'd struggle to feed myself while maintaining a potentially growing project while others reaped the fruits of the labor. I've seen that kind of thing happen a lot in recent years. I'm also aware somebody could code-launder things, but personally I'd take that as a compliment: if somebody truly wants to copy my programming language, then I'd be glad to have inspired someone, haha! We're tiny, bootstrapped, and nobody has ever heard of us, so that kind of attention alone would be awesome!
    It's free for individuals, orgs with < 25 people, educators, students, and non-profits. I'm still working through monetization, but I'm thinking of taking two paths: one being payment for the Z3 verification feature that lets you mathematically verify that the code won't panic at runtime, the other being payment to use the tokenizer that will be built with this. If you look here you can see the lexicon to get a better idea of how the English compile pipeline works: https://github.com/Brahmastra-Labs/logicaffeine/blob/main/as...
    This also makes the language highly configurable, as you can change any of the keywords to better suit your brain if you so choose.
    In my personal opinion, current LLMs' biggest bottlenecks are the tokenizers and the way they get their info. Imagine if you got fed random chunks of tokens the way they do. If you could create an AST of the English and use that to tokenize things instead... well, at least I have some hare-brained theories here I want to test out. Standard LLM tokenizers are statistical: they chop words into chunks based on frequency, often breaking semantic units. This lexer could perform morphological normalization on the fly. An LLM spends millions of parameters to learn that the word "The" usually precedes a noun, but this parser knows that deterministically. This could be used to break things into clauses rather than arbitrary windows. My theory is that even just as a tool for compaction, goal tracking, and rule following, this could be super useful. A semantic tokenizer could potentially feed an LLM all parse trees to teach it ambiguity.
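    As a rough illustration of the idea in the paragraph above (this is not the LOGOS lexer; the tiny lexicon, the `Tag` enum, and the function names are all hypothetical), a deterministic tokenizer can normalize simple morphology and split on clause boundaries with plain table lookups:

    ```rust
    // Hypothetical sketch: a fixed lexicon instead of statistical subword chunks.
    #[derive(Debug, PartialEq, Clone, Copy)]
    enum Tag {
        Determiner,
        Conjunction,
        Word,
    }

    /// Lowercase a word and strip a plural "s" (a deliberately naive normalizer).
    fn normalize(word: &str) -> String {
        let lower = word.to_lowercase();
        if lower.len() > 3 && lower.ends_with('s') && !lower.ends_with("ss") {
            lower[..lower.len() - 1].to_string()
        } else {
            lower
        }
    }

    /// Tag a normalized word by table lookup; no learned parameters involved.
    fn tag(word: &str) -> Tag {
        match word {
            "the" | "a" | "an" => Tag::Determiner,
            "and" | "but" | "or" => Tag::Conjunction,
            _ => Tag::Word,
        }
    }

    /// Split a sentence into clauses at commas and coordinating conjunctions.
    fn clauses(sentence: &str) -> Vec<Vec<(String, Tag)>> {
        let mut out: Vec<Vec<(String, Tag)>> = vec![Vec::new()];
        for raw in sentence.split_whitespace() {
            let ends_clause = raw.ends_with(',');
            let cleaned: String = raw.chars().filter(|c| c.is_alphanumeric()).collect();
            if cleaned.is_empty() {
                continue;
            }
            let norm = normalize(&cleaned);
            let t = tag(&norm);
            if t == Tag::Conjunction {
                // A conjunction opens a new clause and is not kept as a token.
                if !out.last().unwrap().is_empty() {
                    out.push(Vec::new());
                }
                continue;
            }
            out.last_mut().unwrap().push((norm, t));
            if ends_clause {
                out.push(Vec::new());
            }
        }
        out.retain(|c| !c.is_empty());
        out
    }

    fn main() {
        for clause in clauses("The dogs bark, and the cat sleeps.") {
            println!("{:?}", clause);
        }
    }
    ```

    A real lexicon would need far more than a plural rule and three determiners; the sketch only shows that this kind of tagging is a lookup, not something learned.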
    There is a test suite of over 1500 passing tests. I do utilize Claude, but I try really hard to prevent it from becoming slop. Development follows a strict RED/GREEN TDD cycle, where the feature gets specced out first, the plan and spec get refined and tests get designed, then the tests get written, and then implementation occurs. It is somewhat true that I can't make as many promises about the code regarding untested behavior, but I can make promises regarding the things that have been tested. The test suite is wired directly into CI. I guess it is fair that some people will feel any code written with the assistance of an LLM is slop, but everyone is still working out their workflows, and you can find mine here: https://github.com/Brahmastra-Labs/logicaffeine/blob/main/Tr...
    TLDR of it would be:
    1. Don't Vibe-Code
    2. One-shot things in a loop and if you fail use git stash.
    3. Spend 95% of the time cleaning the project and writing specifications, spend 5% of the time implementing.
    4. Create a generate-docs.sh script that dumps your entire project into a single markdown file.
    5. Summon a council of experts and have them roleplay.
    6. Use the council to create a specification for the thing you are working on.
    7. Iterate and refine the specification until it is pristine.
    8. Only begin to code when the specification is ready. Use TDD with red/green tests.
    I'm always learning though, so please if you've got suggestions on better ways share them!
    - Folcon 1 day ago
      Hey, so serious question, have you thought about making this language bidirectional?
      If it's supposed to be a learning resource, that would allow students to process rust and at least begin to understand what it means?
      - tristenharr 20 hours ago
        Yes, absolutely! I definitely want to look into this, although it's not the top of the current roadmap.
        To me, the first step is going to be to really work through this and try to get it right. Do user studies. Watch people write code in this. Watch people with lots of experience, and people with none, get tossed into a project written in LOGOS and told nothing.
        Once the language surface is more solid and not as likely to go through major changes, I want to focus our efforts in that direction.
        - Folcon 17 hours ago
          Don't take this the wrong way, but my understanding was that you're vibe coding it?
          If that's the case, I'd do this from day 1: your parser should be a one-to-one mapping of some text to code, which you can easily and rigorously test. Then, if you want to, you can build other things on top.
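          A minimal sketch of what such a one-to-one mapping and its round-trip test could look like. The two statement forms are borrowed from the snippet quoted elsewhere in the thread, but the `Stmt` enum, `parse`, and `render` are hypothetical and are not the LOGOS grammar or parser:

          ```rust
          // Hypothetical sketch of a one-to-one text <-> AST mapping.
          #[derive(Debug, PartialEq, Clone)]
          enum Stmt {
              Listen(String),
              Connect(String),
          }

          /// Parse one statement: `Listen on "<addr>".` or `Connect to "<addr>".`
          fn parse(line: &str) -> Option<Stmt> {
              let line = line.trim();
              if let Some(rest) = line.strip_prefix("Listen on \"") {
                  return rest.strip_suffix("\".").map(|a| Stmt::Listen(a.to_string()));
              }
              if let Some(rest) = line.strip_prefix("Connect to \"") {
                  return rest.strip_suffix("\".").map(|a| Stmt::Connect(a.to_string()));
              }
              None
          }

          /// Render a statement back to its canonical text.
          fn render(stmt: &Stmt) -> String {
              match stmt {
                  Stmt::Listen(a) => format!("Listen on \"{}\".", a),
                  Stmt::Connect(a) => format!("Connect to \"{}\".", a),
              }
          }

          fn main() {
              let src = "Listen on \"/ip4/0.0.0.0/tcp/8080\".";
              let ast = parse(src).expect("valid statement");
              // Round trip: text -> AST -> text gives back the original text.
              assert_eq!(render(&ast), src);
              println!("{:?}", ast);
          }
          ```

          Because every AST node has exactly one rendering, the same `render` function is also the starting point for going the other direction, from code back to English.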
      - yumberjack 1 day ago
        I like this idea a lot…
- layer8 1 day ago
  Math started out as being written in plain natural language, until the 17th century or so, and yes, it was inefficient, that’s why it eventually changed.
tristenharr 5 days ago
Happy new year HN! After a few late nights this past week, I've finally got this one over the finish line just in time to kick the year off.
I'm happy to answer any questions that might pop up...
Some things of note:
Built-in P2P Mesh Networking
Listen on "/ip4/0.0.0.0/tcp/8080".
Connect to "/ip4/192.168.1.5/tcp/8080".
Sync counter on "game-room".
That's all you need for libp2p, QUIC transport, mDNS discovery, GossipSub pub/sub
Full conflict-free replicated data types:
- GCounter, PNCounter: distributed counters
- ORSet with configurable AddWins/RemoveWins bias
- RGA, YATA: sequence CRDTs for collaborative text editing
- Vector clocks, dot contexts, delta CRDTs
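For readers unfamiliar with the CRDTs listed above, here is a minimal sketch of the simplest one, a grow-only counter. It follows the textbook G-Counter definition; the struct layout and method names are assumptions for illustration, not the LOGOS implementation:

```rust
use std::collections::HashMap;

/// Grow-only counter: one monotonically increasing slot per node.
#[derive(Debug, Clone, Default, PartialEq)]
struct GCounter {
    counts: HashMap<String, u64>,
}

impl GCounter {
    /// Record `n` increments performed locally on `node`.
    fn increment(&mut self, node: &str, n: u64) {
        *self.counts.entry(node.to_string()).or_insert(0) += n;
    }

    /// The counter's value is the sum over all nodes.
    fn value(&self) -> u64 {
        self.counts.values().sum()
    }

    /// Merge takes the per-node maximum, which makes it commutative,
    /// associative, and idempotent.
    fn merge(&mut self, other: &GCounter) {
        for (node, &count) in &other.counts {
            let slot = self.counts.entry(node.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }
}

fn main() {
    let mut a = GCounter::default();
    let mut b = GCounter::default();
    a.increment("a", 2);
    b.increment("b", 3);
    // Replicas exchange state in either order and converge.
    a.merge(&b);
    b.merge(&a);
    println!("a = {}, b = {}", a.value(), b.value());
}
```

Both replicas end at 5, and merging the same state again changes nothing, which is what lets gossip deliver updates more than once safely.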
Wrap any CRDT in Distributed<T> and get:
- Automatic journaling to disk (CRC32 checksums, auto-compaction at 1000 entries)
- Automatic GossipSub replication to all peers
- Unified flow: Local mutation → Journal → Network. Remote update → RAM → Journal.
- Survives restarts, offline nodes, network partitions.
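To make the "journal with CRC32 checksums" idea concrete, here is a small sketch of checksummed journal entries held in memory. The framing (length, payload, checksum) and the function names are assumptions made for illustration; the post does not describe the actual on-disk format:

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

/// Frame a payload as: length (u32 LE) | payload | CRC-32 of payload (u32 LE).
fn encode_entry(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 8);
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out
}

/// Replay a journal, stopping at the first truncated or corrupt entry.
fn replay(mut journal: &[u8]) -> Vec<Vec<u8>> {
    let mut entries = Vec::new();
    while journal.len() >= 8 {
        let len = u32::from_le_bytes([journal[0], journal[1], journal[2], journal[3]]) as usize;
        if journal.len() < 8 + len {
            break; // truncated tail, e.g. a crash mid-write
        }
        let payload = &journal[4..4 + len];
        let c = &journal[4 + len..8 + len];
        let stored = u32::from_le_bytes([c[0], c[1], c[2], c[3]]);
        if stored != crc32(payload) {
            break; // checksum mismatch: do not trust this entry or anything after it
        }
        entries.push(payload.to_vec());
        journal = &journal[8 + len..];
    }
    entries
}

fn main() {
    let mut journal = encode_entry(b"inc a 2");
    journal.extend(encode_entry(b"inc b 3"));
    println!("{} entries replayed", replay(&journal).len());
}
```

Stopping at the first bad entry is one possible recovery policy; a real journal might instead skip ahead or rely on peers to resend the missing updates.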
Go-Style Concurrency
- TaskHandle<T>: spawnable async tasks with abort
- Pipe<T>: bounded channels (sender/receiver split)
- check_preemption(): cooperative yielding every 10ms for fairness
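The bounded-channel behaviour described for Pipe<T> can be approximated with the standard library alone. This sketch uses OS threads and `std::sync::mpsc::sync_channel` as a stand-in; the `pipe` and `produce_and_collect` helpers are hypothetical, and the real Pipe<T> presumably runs on async tasks instead:

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Create a bounded pipe; `send` blocks once `capacity` items are queued.
fn pipe<T>(capacity: usize) -> (SyncSender<T>, Receiver<T>) {
    sync_channel(capacity)
}

/// Produce 0..n on a worker thread and collect the results on the caller's thread.
fn produce_and_collect(n: u32, capacity: usize) -> Vec<u32> {
    let (tx, rx) = pipe::<u32>(capacity);
    let producer = thread::spawn(move || {
        for i in 0..n {
            // Blocks while the pipe is full, which is the backpressure.
            tx.send(i).expect("receiver still alive");
        }
        // `tx` is dropped here, which closes the pipe.
    });
    // Iteration ends once the sender is dropped and the queue is drained.
    let received: Vec<u32> = rx.iter().collect();
    producer.join().expect("producer panicked");
    received
}

fn main() {
    println!("{:?}", produce_and_collect(5, 2));
}
```

The capacity bounds how far the producer can run ahead of the consumer, which is the main reason to prefer a bounded channel over an unbounded one.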
There's more... but those are my personal favorite features.
I've put a good bit of work into this, so I hope you all can appreciate and maybe find some uses for it here on my favorite place on the interwebs!
- tombert 1 day ago
  These features seem considerably more interesting than the "English to Rust" feature. These data structures and the concurrency stuff seem pretty neat.
- nextaccountic 1 day ago
  > Wrap any CRDT in Distributed<T>
  Is this in a crate in crates.io?
  - tristenharr 21 hours ago
    Crates coming soon! :)
- koakuma-chan 1 day ago
  Do I really want to yield every 10 ms?
  - gpm 1 day ago
    You already do, to the kernel. It's probably not much more costly to do so an extra time in userspace.
    - koakuma-chan 1 day ago
      I yield to the kernel to allow other threads that do some kind of background work to run. Do I want my application's async tasks to yield every 10ms? I assume that is what is being meant here.
      - tristenharr 1 day ago
        That is a valid concern! To clarify:
        Configurability: We absolutely plan to make the 10ms yield interval configurable (or opt-out) in the runtime settings. It is currently a default safety rail to prevent async starvation, not a hard constraint.
        Concurrency Options: It is important to note that LOGOS has three distinct execution primitives, and this yield logic doesn't apply to all of them:
        Simultaneously: (Parallel CPU): This compiles to rayon::join or dedicated threads via std::thread. It does not use the cooperative yield check, allowing full blocking CPU usage on separate cores.
        Attempt all: (Async Join) & Launch a task: (Green Threads): These compile to tokio async tasks. The cooperative yield is specifically here to prevent a single heavy async task from blocking the shared reactor loop.
        So if you need raw, uninterrupted CPU cycles, you would use Simultaneously, and you wouldn't be forced to yield.
      - nextaccountic 1 day ago
        With cooperative scheduling, yes. This is indeed something missing from the Rust async ecosystem: tasks are meant to be IO-bound, and if they accidentally become CPU-bound they will starve other tasks (async-std tried to fix this, but had some backlash due to overhead, IIRC). Go actually puts a yield on every loop body (or used to) to prevent starvation. A 10ms thing will have negligible impact.
        Also: yielding to the kernel is very costly. Yielding to another task in your own thread is practically free in comparison
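        A toy illustration of the time-slice idea discussed in this sub-thread, using only the standard library. The `Budget`, `Task`, and `run_all` names are hypothetical, and this round-robin loop is a stand-in for an async runtime, not how LOGOS or tokio implement yielding:

        ```rust
        use std::collections::VecDeque;
        use std::time::{Duration, Instant};

        /// Tracks how long the current task has run since it was last scheduled.
        struct Budget {
            slice: Duration,
            started: Instant,
        }

        impl Budget {
            fn new(slice: Duration) -> Self {
                Budget { slice, started: Instant::now() }
            }

            /// True once the task has used up its time slice and should yield.
            fn should_yield(&self) -> bool {
                self.started.elapsed() >= self.slice
            }
        }

        /// A CPU-bound task that counts down `remaining` in resumable steps.
        struct Task {
            name: &'static str,
            remaining: u64,
        }

        impl Task {
            /// Run until done or until the budget is spent; returns true when finished.
            fn run(&mut self, budget: &Budget) -> bool {
                while self.remaining > 0 {
                    self.remaining -= 1;
                    // Checking the clock on every step is wasteful; real code
                    // would check far less often.
                    if budget.should_yield() {
                        return self.remaining == 0;
                    }
                }
                true
            }
        }

        /// Round-robin scheduler; returns task names in completion order.
        fn run_all(tasks: Vec<Task>, slice: Duration) -> Vec<&'static str> {
            let mut queue: VecDeque<Task> = tasks.into();
            let mut finished = Vec::new();
            while let Some(mut task) = queue.pop_front() {
                if task.run(&Budget::new(slice)) {
                    finished.push(task.name);
                } else {
                    queue.push_back(task); // yielded: go to the back of the line
                }
            }
            finished
        }

        fn main() {
            let tasks = vec![
                Task { name: "heavy", remaining: 5_000_000 },
                Task { name: "light", remaining: 10 },
            ];
            println!("{:?}", run_all(tasks, Duration::from_millis(10)));
        }
        ```

        Without the budget check, the heavy task would run to completion before the light one ever started, which is the starvation case described above.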
tristenharr 1 day ago
Oh, I'm glad this got picked up! I posted it on New Year's Day and wasn't sure if it was going to be!
As a bit of background on myself and my goals/targets with this.
I started my career as an embedded software developer, writing code for the uC/OS-III RTOS on medical devices at Cardinal Health, where I primarily worked on enteral feeding pumps. From there, I spent a couple of years in fintech before trying my hand at my first startup, where I co-founded a company in the quick commerce space (think DoorDash DashMarts). When that fell apart, I took a job at Hasura, where I wrote GraphQL-to-SQL transpilers and other neat things for about 18 months. I've worked across a few different domains, and the motivation behind writing this language is that I am preparing to teach my 13-year-old brother how to code and wanted something I could get low-level with him on without losing him altogether.
This is a dual-purpose language with a dual AST. Having switched gears to spend a couple of days on the Logical side of things, I'm currently adding an actual prover to the Logical AST. I'm getting ready to add a derivation tree struct and incorporate the ability to use a solver to do things like Modus Ponens, Universal Instantiation, etc. I also want to upgrade the engine to be able to reason about numbers and recursion, similar to Lean with inductive types.
This is an early-access product; it is free for students, educators, non-profits, orgs with < 25 people, and individuals.
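For readers who have not met the rule by name: Modus Ponens derives Q from P and "P implies Q". A minimal propositional sketch of applying it repeatedly (forward chaining) follows. The `Rule` struct and `forward_chain` function are hypothetical and say nothing about how the Logical AST or the planned derivation trees are structured:

```rust
use std::collections::BTreeSet;

/// A propositional rule: if every premise holds, the conclusion holds.
struct Rule {
    premises: Vec<&'static str>,
    conclusion: &'static str,
}

/// Apply modus ponens repeatedly until no new facts can be derived.
fn forward_chain(facts: &[&'static str], rules: &[Rule]) -> BTreeSet<&'static str> {
    let mut known: BTreeSet<&'static str> = facts.iter().copied().collect();
    let mut changed = true;
    while changed {
        changed = false;
        for rule in rules {
            let fires = rule.premises.iter().all(|p| known.contains(p));
            // `insert` returns true only when the conclusion is new.
            if fires && known.insert(rule.conclusion) {
                changed = true;
            }
        }
    }
    known
}

fn main() {
    let rules = vec![
        Rule { premises: vec!["rain"], conclusion: "wet_ground" },
        Rule { premises: vec!["wet_ground", "cold"], conclusion: "ice" },
    ];
    let derived = forward_chain(&["rain", "cold"], &rules);
    println!("{:?}", derived);
}
```

A prover that records which rule produced each fact, instead of only the final set, would yield the derivation tree mentioned above.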
It would make my day if I could get some rigorous HN style discussions, and even the critiques!! The feedback and discussions are invaluable and will help shape the direction of this effort.
What a lovely surprise doing my daily HN check before bed and seeing this post. :)
EDIT: I will read and respond to comments when I get up in the morning, feel free to ask questions! Or make suggestions.
dented42 1 day ago
Ah. So we’re recreating COBOL in 2026 I see.
- iberator 1 day ago
  Cobol is still actively developed and maintained by IBM.
- tombert 1 day ago
  You said it before I did; wasn't this the basic point of COBOL? To make something that read more naturally, like English, but could be executed.
  It's a cute idea, though I think the consensus is that once you actually learn a programming language, it generally doesn't help to have it look like prose.
  - anonymous908213 1 day ago
    I wouldn't be so sure of that consensus, given that C# and Python exist and are generally well-regarded by their users. Clearly there are varying degrees to it, and taking the idea to its logical extreme is not by necessity going to produce the best result, but there's certainly merit to the idea of code that can be read more naturally.
    And I think that is really the point of syntax sugar: reading code, not writing code. It seems like a misconception about syntax sugar is that its primary purpose is to make code easier for beginners to learn to write. But I would contend that the real purpose is to make code easier for even experienced programmers to read at a glance, because reading code is actually far more important than writing it.
    ...granted a certain subsection of the population has determined that reading code is for chumps and boast about how quickly they can use a tool to write lines of code they haven't even read, and that this is the future of software development. Despite their boasts I have yet to see any software I would actually want to use that was written in this manner, though.
    - wtetzner 1 day ago
      I don't think C# and Python are particularly close to natural language. I also don't think making a language read more like English really makes it more readable. If that was true people wouldn't struggle with reading legalese.
      - anonymous908213 1 day ago
        You can absolutely write C# that reads close to natural language. I do so on a daily basis.
        Legalese is a bit of a non-sequitur. Despite English legalese ostensibly being written in English, it is specifically obfuscated, using terminology that is not encountered in everyday English so as to be more difficult for laymen to understand. In fact it is common for legalese to use English that is not English, that is, words that look like English words but have completely different definitions that are not in accord with how those words are used in regular communication.
        - voidUpdate 1 day ago
          I use a lot more brackets in C# than I would in natural language...
          - anonymous908213 1 day ago
            And the original submission billed as "Compile English to Rust" uses a lot more ## and quotation marks than natural language. Perhaps we could establish an understanding that we are still talking about programming languages, not natural language, and that there is a scale of "further from natural language" and "closer to natural language", wherein decisions made about and within the programming language can move it along the scale while still being a programming language.
khimaros 1 day ago
i also built something in this space, but English to any language. it is also self hosting (the "compiler" itself is built from an English language spec and chilled to several language implementations). supports any LLM backend including llama.cpp: https://github.com/khimaros/enc -- designed for Makefile driven workflows, mimicking the 'cc' CLI. many examples in the repo.
elcapitan 1 day ago
I get Applescript PTSD from this. Could never remember the "easier natural" way to write even small pieces of code correctly in it.
- stevedonovan 1 day ago
  Yep, there's always syntax, whether it's 'friendly' or not.
  And it still isn't natural language, where you can paraphrase, use synonyms etc. Good job for an LLM
lasgawe 15 hours ago
A great idea. I think this is a pseudocode to language compiler, similar to Pascal but easier than that. This is more suitable for language learners rather than developers imho.
Surac 1 day ago
Reinventing COBOL?
chupchap 1 day ago
OMG this is so fascinating. We were taught LOGO in school for a year when we were kids
- gus_massa 1 day ago
  A totally different language, but I guess the name clash is a problem.
colordrops 1 day ago
At first I thought this was going to be some LLM thing. I had an idea a while ago, "context-based module development", that I started on a prototype of but never followed up on. The idea is to have a standard format for defining modules, black boxes with interfaces and clear definitions, that can be composed hierarchically. Module definitions and their code should not be larger than the context window of an LLM. As long as each module is well defined and tested and treated like a black box, you could have a system composed of both human and AI built modules that should behave as expected and be somewhat comprehensible. Not all architectures would work with this and I don't know if it would have worked in the end, but I do expect that at some point a more formal system for defining software from the ground up for AI development will emerge.