No AI or OCR here, just an idiot with too much free time. After seeing that the AltaBASIC source posted was a scan of paper tape, I thought having it in a digital document proper would be good for preservation and novelty's sake. I've never written Assembly, and I'm not a typography expert! Feel free to point and laugh at any silly mistakes. Let's see how many days it takes!
It's not "paper tape" (that's a digital storage medium), just a printout. And the numbers in the left half are not actually part of the source file, they are line numbers and machine code output produced by the assembler. You'd probably better not waste time transcribing them.
Not trying to dissuade you, but here are some things you should consider:
• Turn off your spell checker, it will only make this more difficult! It certainly won't help with the code itself, and it seems like you want to reproduce everything perfectly, including typos in the comments.
• I'd strongly suggest becoming at least a bit familiar with 8080 assembly language before attempting this.
• The tools used to produce this output add another layer of complications. They used the PDP10 system's assembler with a set of macros to adapt it to generate 8080 code, so it's using somewhat different syntax and directives than those of 8080-native assemblers (like the ones from Intel or Digital Research).
• Some characters are hard to read, and without knowledge of the context and at least some of the PDP10-specific syntax it will be impossible to just guess. E.g. decimal numbers are sometimes prefixed with '^D', and octal numbers with '^O', which look quite similar in this scan. The 'RADIX' directive changes the default for when there is no such prefix; it should be 10 for most of the listing, but I think it starts out as octal. Memory addresses will be octal (like 'RAMBOT==^O20000' in line 13); ASCII characters could be either, but they seem to prefer decimal for those ('^D13' is CR, '^D10' is LF).
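To make the radix rules above concrete, here is a minimal Python sketch of how a transcriber might sanity-check a numeric literal. The function name `parse_number` and the `default_radix` parameter are made up for illustration, and only the '^D' and '^O' prefixes mentioned above are handled; this is not how the PDP-10 assembler itself is implemented.

```python
def parse_number(token: str, default_radix: int = 8) -> int:
    """Parse a literal with an optional ^D (decimal) or ^O (octal) prefix.

    default_radix stands in for whatever the most recent RADIX directive
    set; it only applies when the token carries no prefix.
    """
    token = token.strip().upper()
    if token.startswith("^D"):
        return int(token[2:], 10)
    if token.startswith("^O"):
        return int(token[2:], 8)
    return int(token, default_radix)


print(parse_number("^O20000"))                  # octal 20000 -> 8192
print(parse_number("^D13"))                     # carriage return -> 13
print(parse_number("^D10"))                     # line feed -> 10
print(parse_number("20000", default_radix=10))  # no prefix, decimal default -> 20000
```

Misreading '^O' as '^D' (or the reverse) changes the value silently, which is why the two prefixes being hard to tell apart in the scan matters so much.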
There are PDP-10 emulators with well-maintained copies of the different operating systems for them, so someone could check that the typed up source can be assembled.
I do have some (surface-level) experience with GB/GBC assembly, but other than that I'm new. As for spell-checker, I've figured out how to rid myself of that. And the paper tape mix-up was just my inexperience.
If you've spent 7 days so far and got as far as "F3", I'm wondering if this is actually some kind of elaborate troll. At the rate of one byte per week, you might finish transcribing the 4K ROM within 8 decades.
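The "8 decades" figure is easy to check. A rough sketch, assuming 4K means 4096 bytes and a steady one byte per week (the names below are just for illustration):

```python
ROM_BYTES = 4 * 1024        # assuming "4K" means 4096 bytes
BYTES_PER_WEEK = 1          # the rate joked about above
WEEKS_PER_YEAR = 365.25 / 7

years = ROM_BYTES / BYTES_PER_WEEK / WEEKS_PER_YEAR
print(f"{years:.1f} years")  # a little under 80 years
```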
Oh haha, just seen the reply below mine that has a link to the photos of the printout...
Interestingly, the large F3 appears to be a batch number for the print job, so it's just coincidental that it's the same as the first byte in the ROM. The first byte starts on line 732 and is printed in octal, not hex, so it shows up as "000363" (the first byte of the output seems to be some kind of markup for addresses in multi-byte opcodes; immediate loads seem to have 000 for these).
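For anyone who wants to verify that octal "000363" really is the same byte as hex F3, a quick Python check (the variable names are made up for illustration):

```python
listing_field = "000363"            # as printed in the listing, in octal
value = int(listing_field, 8)       # interpret the digits as base 8

print(value)                        # 243
print(format(value, "02X"))         # F3
```

3×64 + 6×8 + 3 = 243, which is 0xF3.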
You're not an idiot. I have transcribed several listings manually. I also tried OCR. In my experience, fixing OCR mistakes takes at least as much work as just typing the whole thing.
I recruited a team to transcribe a particularly important document, and I had us type each page to catch typos.
Even if the source is already transcribed somewhere, I'm interested in what your experience will be if you keep going all the way through. A lot of authors have copied out Shakespeare plays or famous novels by hand and claim it gave them the feeling of getting into the author's mind. I wonder if typing this out could be similarly enlightening.
Am I alone in thinking that it's revealing that Gates himself hasn't released this as open source? I'm not sure how that would work as presumably the copyright rests with MS and not Gates but I'm sure that he could have found a way.
I have been researching/playing around with Macro-10 using the simh PDP-10 emulator just for fun recently. I have always been curious about the inner workings of the 8080 emulator Paul Allen developed for running this code on a "fake" Altair. Maybe this source code could provide some insight into how Micro-Soft went about getting to a fully working BASIC for the Altair step by step from the very beginning.
The wiki article implies the Professor was fine with it, so stolen is maybe not correct.
" Harvard officials were not pleased that Gates and Allen (who was not a student) had used the PDP-10 to develop a commercial product, but determined that the computer itself, which technically belonged to the military, was not covered by any Harvard policy; the PDP-10 was controlled by Professor Thomas Cheatham, who felt that students could use the machine for personal use. Harvard placed restrictions on the computer's use, and Gates and Allen had to use a commercial time share computer in Boston to finalize the software."
Impressive work on AltaBasic! The bytecode compiler and VM approach is clever. Using Lua for the runtime and Python for the compiler is a pragmatic choice. I'd be curious to see benchmarks vs. alternatives like tinybasic. Well documented code and a solid start to an open-source retro BASIC.
"Tiny BASIC was designed by Dennis Allison and the People's Computer Company (PCC) in response to the open letter published by Bill Gates complaining about users pirating Altair BASIC, which sold for $150. Tiny BASIC was intended to be a completely free version of BASIC that would run on the same early microcomputers."
Trolling or nah? If the author of this link is struggling with spell check and hasn't written assembly language before, this doesn't provide much confidence for an accurate transcription of the source code printout.
Good luck on the exercise, though! Don't worry, Bill and his legal team will be watching you closely. They'll be cheering you on (but not for the reasons you think they will be.)
Please read Bill Gates's Letter to Hobbyists and don't do this. He's not going to be happy that people are ripping off his Altair BASIC and not giving him any money. What you're doing is just not fair to him.
That letter was written decades ago lol. I am a bit upset that you didn’t say that and made some of us look it up.
Gates has many faults, but his current life consists of mostly positive stuff (such as fighting climate change, global hunger, diseases, lack of clean water) that I and others support.
I will absolutely never support his past actions, but currently? I can name a whole bunch of billionaires that are doing worse. Like nearly all of them.
All super interesting info!
https://images.gatesnotes.com/12514eb8-7b51-008e-41a9-512542...
Actually this happened twice: once, someone else's code but I had the listing. Another, my own code but no listing.
I reckon there are ~40,000 lines in the printout. So: ~10 years.
I wonder if MS has this code somewhere in a digital vault.
We have a decompile of the BASIC, and the binary.
I picture him and Paul discussing the code he had just written when Bill walks in and asks "Are you done yet?"
https://en.wikipedia.org/wiki/An_Open_Letter_to_Hobbyists
Heck, even CP/M 2.2 is free software today.
Gates is truly the Thomas Midgley Jr. of software.