C extensions, portability, and alternative compilers

(lemon.rip)

121 points | by xngbuilds 8 hours ago

10 comments

  • kscarlet 20 minutes ago
    I think the Common Lisp ecosystem sets a good example of how a dozen implementations can move ahead together. Implementations experiment with extensions, the really useful ones get implemented multiple times, and some portability library emerges as the de facto standard if it's good enough. You can see the result at https://portability.cl/: the language is evolving like never before, even though the standards committee dissolved around 30 years ago!
  • whizzter 7 hours ago
    One of my pet-peeves with C projects is that it's so often more or less "works on my machine" when written by Linux users (as a Windows and FreeBSD user it often hits you on both those platforms).

    The article highlights a typical piece:

      #if !(defined __GNUC__ || defined __clang__ || defined __TINYC__)
      # define __attribute__(xyz)     /* Ignore */
      #endif
    
    There is no reason for that !defined check not to also test whether __attribute__ is already defined (a custom compiler author could then force a define for __attribute__ that translates to an internal __mycompiler__attribute__ replacement by default).
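
    A minimal sketch of that extra check. This is not the actual glibc header; `add_one` is a made-up function that exists only to exercise the macro:

```c
#include <assert.h>
#include <limits.h>

/* Same guard as quoted above, plus a test for __attribute__ already being
   defined as a macro (by the compiler's own predefines or via -D). */
#if !(defined __GNUC__ || defined __clang__ || defined __TINYC__ \
      || defined __attribute__)
# define __attribute__(xyz) /* Ignore */
#endif

/* Hypothetical function: this declaration compiles whether the attribute is
   honored, remapped by the compiler, or expanded to nothing. */
static int add_one(int x) __attribute__((unused));

static int add_one(int x)
{
    assert(x < INT_MAX); /* avoid signed overflow */
    return x + 1;
}
```

    With this shape, a compiler that predefines __attribute__ as its own macro keeps that definition instead of having it silently replaced by the no-op.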

    But outside of that, just trying to compile on FreeBSD you often run into systemd dependencies or other non-posix behaviors (Not to mention on Windows but I'm not here to bring on flamewars so I'll leave that part).

    • matheusmoreira 3 hours ago
      > One of my pet-peeves with C projects is that it's so often more or less "works on my machine" when written by Linux users (as a Windows and FreeBSD user it often hits you on both those platforms).

      Windows users singling out Linux users for not catering to their platform. How the times change...

      > you often run into systemd dependencies or other non-posix behaviors

      Not a problem. POSIX is irrelevant, systemd is great and we should all be using Linux to its fullest extent. Linux has great features and there is absolutely no reason not to use them all. Nobody complains about the fact BSDs have cool things like kqueue and unveil.

      • neonz80 36 minutes ago
        > Windows users singling out Linux users for not catering to their platform. How the times change...

        In my experience this was a problem over 25 years ago when I developed for Solaris and other non-Linux operating systems.

    • rwmj 4 hours ago
      MSVC support for C is fairly terrible. For the projects we write that are portable to Windows we insist you use GCC or Clang on Windows. No one has time to deal with the lack of even standard C1x/C2x features (never mind useful extensions like attribute cleanup).

      Surprised about FreeBSD. My experience is that porting Linux software is usually pretty easy as long as it's not using some Linux-only feature (io_uring for instance).

      • bigbadfeline 1 hour ago
        > Surprised about FreeBSD. My experience is that porting Linux software is usually pretty easy as long as it's not using some Linux-only feature (io_uring for instance).

        I'm not sure why you're surprised; the parent of your comment clearly stated

        "on FreeBSD you often run into systemd dependencies or other non-posix behaviors"

        which means, software written for Linux often uses "Linux-only features" such as systemd and other non-posix dependencies that are foreign to the BSDs and traditional UNIX. Thus, it shouldn't be surprising that Linux software is hard to port to the BSDs.

        Linux used to be a pretty good UNIX, I'm not sure what it is now.

      • jstimpfle 2 hours ago
        What do you even miss? Honestly, it works fine for me. In terms of platform APIs, I prefer the Windows ones on Windows anyway.
        • drwu 2 hours ago
          Complex numbers, for example. Also, C preprocessor expands macros differently on MSVC.
    • BadBadJellyBean 6 hours ago
      If it is an open source project then that is quite alright with me. An open source author doesn't need to support all platforms. Only those they care about. If someone else wants support for another platform they have the source.
    • kps 6 hours ago
      >One of my pet-peeves with C projects is that it's so often more or less "works on my machine"

      “All the world's a VAX”

      https://groups.google.com/g/comp.lang.c/c/CYgWkWdWCcQ/m/thMt...

      https://www.lysator.liu.se/c/ten-commandments.html

    • formerly_proven 7 hours ago
      For a bunch of software categories there isn't really much point to support Windows at all these days. We've had "developed for unix, ported to Windows" software for a long time and it often doesn't work that well, because the agreement even for fairly basic stuff is not that large between the two.
      • whizzter 6 hours ago
        1: My point isn't "developed for unix, ported to Windows", it's "developed on Linux, maybe works elsewhere".

        2: You could easily compile Samba yourself for FreeBSD in the past; the last time I tried a new version, it broke, as far as I remember due to Linux-isms (yes, there are ports, but being reliant on older versions when ports maintainers can't keep up isn't a good thing).

        3: The only "fairly basic" stuff that's hugely different is mostly the absence/reliance on shell-scripts (when building), but that has little to do with the actual code function (Personally I often used Node scripts in those scenarios, Python scripts would probably be an improvement since there's no reason it couldn't be everywhere).

        I used to use Tremor to decode Ogg audio (no UI needs, just binary data in, arrays of primitive values in audio buffers out). Early versions were easy to compile under Windows, but building later versions was buried in shell scripts generating headers, etc. for no really good reason (maybe to help port to other embedded devices when working on a Linux workstation, but it made the code less easily compilable by default). The core functionality only really needed a C compiler, as the early versions showed.

        I can agree that something with an advanced UI like Blender (which relies on GL/3D rendering for its UI) might not be easily portable, but when algorithm libraries often require heavy reworking it's not a good thing (here I think GitHub has helped since people have had an easier time contributing; it's a sad thing that people are moving away due to the AI-crap).

        In the end, it's not about _actual_ differences but more of a superiority complex of Linux users that is the main roadblock.

        • matheusmoreira 3 hours ago
          Superiority complex?

          How many times have we been told that we're entitled freeloaders for expecting Linux compatibility work from others? Insulted by people who use dominant platforms that get all the commercial support while we get literally nothing? Reduced to reverse engineering stuff with no documentation and zero help?

          Pretty wild to watch this unfold. Now that Linux is finally coming out ahead, as it should, because people are finally writing software for it... Suddenly we're the bullies.

          • justin66 3 hours ago
            > Now that Linux is finally coming out ahead

            Where have you been for the last quarter century?

            • matheusmoreira 3 hours ago
              I was using Linux on my machines despite all the difficulties involved.
        • rootnod3 5 hours ago
          Exactly, the amount of patches needed in many FreeBSD or other BSD ports just to appease the Linux-centricity is bonkers. And many times the changes aren't even that grave.
      • jdw64 6 hours ago
        In Asia, and over here, Windows is the de facto standard while Linux is actually a poor option. Almost all of the infrastructure is written for Windows, so the second you switch to Linux, you're fighting an uphill battle just to do basic tasks. Seeing this makes you realize that our worldview is entirely shaped by the ecosystem we live in.
      • zephen 7 hours ago
        There's portability between systems, which as you note, has ever-diminishing returns.

        Then there's portability between compilers, which, as the article notes, glibc is also completely hostile to (except for anointed compilers) for no good reason whatsoever.

    • dooglius 6 hours ago
      The preceding comment indicates that the intent is to support other compilers. I think a better approach is to define __glibc_attribute__ based on compiler support and to stick to that within glibc since there's no reason to think that another compiler's attributes have the same semantics as GNU C's.
      • rurban 2 hours ago
        That ship sailed already. You simply have to mimic gcc.

        Which is at least better than with MSVC, where they did everything differently, and only half of it.

    • einpoklum 4 hours ago
      systemd is indeed the bane of Linux, and a pain in the ass for a lot of FOSS. Once the main distributions made it mandatory to install (not just the default, but mandatory) - we started to see this sort of bifurcation of a lot more FOSS away from being standards-based and multi-platform to being Linux-specific.

      That said - I think a rule of thumb one can follow is that any inclusion of a file with a directory prefix, especially `<sys/whatever>`, needs a guarantee of availability in your build configuration phase, e.g. CMake `find_package()`, or at least `check_include_file()` and such. That way, you might be more likely to fail to build, but at least you'll be telling the user "I expect these things to be present".
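
      The same idea can be approximated inside the source with `__has_include` (standard in C23, and a long-standing extension in GCC and Clang). A rough sketch, where `HAVE_SYS_EPOLL_H` and `event_backend` are made-up names and `<sys/epoll.h>` is just an example header; it complements a build-system check rather than replacing it:

```c
#include <assert.h> /* static_assert (C11) */

/* Nested on purpose: a preprocessor without __has_include would choke on
   `defined(__has_include) && __has_include(<...>)` in a single #if. */
#if defined(__has_include)
# if __has_include(<sys/epoll.h>)
#  define HAVE_SYS_EPOLL_H 1
# endif
#endif
#ifndef HAVE_SYS_EPOLL_H
# define HAVE_SYS_EPOLL_H 0
#endif

/* Sanity check in case the macro was supplied from the command line. */
static_assert(HAVE_SYS_EPOLL_H == 0 || HAVE_SYS_EPOLL_H == 1,
              "HAVE_SYS_EPOLL_H must be 0 or 1");

#if HAVE_SYS_EPOLL_H
# include <sys/epoll.h>
#endif

/* Hypothetical helper: report which code path was selected at compile time. */
static const char *event_backend(void)
{
#if HAVE_SYS_EPOLL_H
    return "epoll";
#else
    return "poll"; /* portable fallback path */
#endif
}
```

      The point is only that the dependency on the header becomes explicit and has a defined fallback, instead of being a bare `#include` that fails somewhere deep in the build.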

      • jeltz 1 hour ago
        No, systemd is not the bane of Linux. What existed before it was much worse. Upstart was a totally broken mess and almost all sysv init scripts contained several bugs.

        I don't like systemd but it is a lesser evil.

        • einpoklum 41 minutes ago
          systemd is not an init system; it _contains_ an init system. It is a huge swath of the whole userspace of a Linux system up to shell or GUI sessions - and having an init system was just an excuse; and in fact, the systemd point brought up in the linked article is unrelated to init systems.

          There are quite a few init systems: The venerable sysvinit, runit, s6, openrc and others. You don't like upstart? Ok, choose another one, there are many. Here is a comparison table by the Gentoo folks:

          https://wiki.gentoo.org/wiki/Comparison_of_init_systems

          As for the claim of "almost all sysvinit scripts contained several bugs" - that's both hyperbole and false. Plus, you seem to be implying that systemd has not been troubled by bugs, which of course it has (and that does not disqualify it; the fundamental design and organizational nature as a project are the disqualifiers).

  • WalterBright 6 hours ago
    Yes, when I implemented ImportC (a C compiler built in to the D compiler), I had to spend a lot of time finding ways to work with all the nutburger nonsense in the various .h files.

    https://github.com/dlang/dmd/blob/master/druntime/src/import...

    https://github.com/dlang/dmd/blob/master/druntime/src/__impo...

  • fuhsnn 6 hours ago
    For those who are making indie C compilers that don't pretend to be __GNUC__ but want to compile real world projects, slimcc's test script[1] and platform header hacks[2] might save you some time.

    [1] https://github.com/fuhsnn/slimcc/blob/main/scripts/linux_thi...

    [2] https://github.com/fuhsnn/slimcc/blob/main/slimcc_headers/pl...

    Some more fun stories:

    - Game projects default to using SIMD so for example SDL and STB you always need to pass -DSDL_DISABLE_IMMINTRIN_H and -DSTB_NO_SIMD

    - math.h's NAN usually falls back to (0.0f / 0.0f), which prints "-nan" with printf; some projects' test suites fail because of it (they expect "nan").

    - NetBSD's sys/cdefs.h straight up #error's if you don't pretend to be GCC or PCC.

    - Some projects can't compile without __attribute__((always_inline)) because they use it on non-static functions.

    - Many projects probe -fvisibility in the build system and pass -fvisibility=hidden to compile, but in the headers they gate __attribute__((visibility("default"))) behind __GNUC__ checks, so you'll get missing symbols.

    - Some projects use if(0) { undefined_function() } to fake static_assert(), there is even a bug report from QEMU to Clang because it failed to optimize in -O0 a certain `if` written this way.

    - Even if you define __STDC_NO_VLA__, projects might fall back to alloca() code path that's untested and broken (python and jemalloc both had this problem, already reported)

    - Valkey has broken __builtin_ctzll fallback nobody noticed (reported).

    - Zig's C bootstrap path expects the compiler to have GCC/Clang-tier optimization and stack overflows if you don't (reported).

    - I contributed stdatomic.h code path for Ruby just to compile it with slimcc, pretty sure it's still the only user of the code path.

    - I implemented __has_extension in the hope that projects can use it to query gnu_asm; but SQLite broke because they use __has_extension(c_atomic) to query GNU atomics builtin, but c_atomic actually is meant for C11 _Atomic (IMO they should use __has_builtin)
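
    To illustrate the last two __builtin-related items, here is a sketch of querying a builtin with __has_builtin and keeping a fallback that is simple enough to test directly. This is not Valkey's or SQLite's code; `ctz64` and `ctz64_fallback` are made-up names:

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: count trailing zero bits of a nonzero 64-bit value. */
static int ctz64_fallback(uint64_t x)
{
    assert(x != 0); /* like the builtin, undefined for zero */
    int n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Compilers without __has_builtin simply report "not available". */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

static int ctz64(uint64_t x)
{
#if __has_builtin(__builtin_ctzll)
    return __builtin_ctzll(x);
#else
    return ctz64_fallback(x);
#endif
}
```

    Because the fallback is an ordinary function rather than code hidden in an #else branch, it can be exercised in tests even on compilers that always take the builtin path.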

    • upvotrsuper 41 minutes ago
      Although I don't use slimcc, I follow its development because I learn so much about C - even though I've been using C and C++ for over 30 years. https://github.com/fuhsnn/slimcc/blob/main/scripts/linux_thi... is an incredible piece of work. I can't imagine how many hours it took to assemble this collection. If slimcc can pass that project list torture test, then surely it's bulletproof.
  • meghprkh 5 hours ago
    Why would they not do something like this?

      + #if !(defined __GLIBC_COMPILER_SUPPORTS_ATTRIBUTES__)
      - #if !(defined __GNUC__ || defined __clang__ || defined __TINYC__)
      # define __attribute__(xyz)     /* Ignore */
      #endif
    
    (or probably a more fine grained for each attribute they try to use)

    Such checks are fairly conventional in downstream C++ libraries, which test for the OS platform or compiler (e.g. Boost.Config, https://www.boost.org/doc/libs/latest/libs/config/). Modern C++ even went ahead and standardized this somewhat: https://en.cppreference.com/cpp/utility/feature_test
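
    A sketch of what the fine-grained, per-attribute variant could look like, using `__has_attribute` (supported by GCC and Clang; other compilers may or may not provide it). `MYLIB_NODISCARD` and `parse_digit` are made-up names, and none of this is taken from glibc:

```c
#include <assert.h>
#include <stddef.h>

/* Compilers without __has_attribute simply report "not supported". */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* One wrapper macro per attribute, enabled only where the compiler says so. */
#if __has_attribute(warn_unused_result)
# define MYLIB_NODISCARD __attribute__((warn_unused_result))
#else
# define MYLIB_NODISCARD /* no-op */
#endif

/* Hypothetical function using the wrapper instead of a raw __attribute__.
   Returns 0 and stores the digit value on success, -1 otherwise. */
MYLIB_NODISCARD static int parse_digit(char c, int *out)
{
    assert(out != NULL);
    if (c < '0' || c > '9')
        return -1;
    *out = c - '0';
    return 0;
}
```

    Each attribute is then asked about individually, so a compiler that supports some GNU attributes but not others does not have to claim to be __GNUC__.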

    • ventana 4 hours ago
      I have no knowledge of this specific "why", but my general experience shows that feature flags, while almost universally better than checks for the specific software version in hope that it has proper support for a feature, appear only as a replacement to platform / software detection, after a few years of struggle.

      It's probably just natural for software developers (myself included), whenever only FooZoid v5 supports frobnicate(), say "#ifdef FOOZOID_V5" and go back to your business, rather than introducing "FROBNICATE_SUPPORTED".

      Also, when you try to ask for a feature flag in the code review, people will throw YAGNI at you, and they might not be wrong, at least for the first few years. After that, it's a costly refactor.

      • fc417fc802 1 hour ago
        The ideal isn't feature flags (ie FOO_SUPPORTED) but rather feature tests (ie COMPILES( foo( int ) ) ). Yet another reason why languages with proper metaprogramming capabilities are better.
  • rurban 5 hours ago
    I just implemented a fast small compiler, rcc, to compete against tcc, and these glibc header quirks were simply fixed the same way clang does it: by defining all gcc predefines and implementing all gcc extensions. It took a day. And my headers are clean compared to glibc. The others in this league are slimcc, kefir and cproc. No other compilers can parse glibc headers. tcc has this special exception.
    • lelanthran 4 hours ago
      Back in my youth I implemented a C compiler that was almost impressive.

      Dealing with the preprocessor (essentially a different, crippled language) was too much headache at the start so I just used `cpp`, and at the end, I was just too lazy to implement it, so continued using `cpp`.

      (BTW: Anyone else here ever used the Cosmic C Compiler for Motorola microcontrollers? Amongst other idiosyncrasies, it had only one datatype - `byte` - and I had to implement macros to do 16-bit arithmetic operations. That project was easily the worst development experience of my life.)
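
      For anyone who has not had the pleasure, this is roughly what 16-bit addition on byte pairs looks like. It is a sketch in standard C, written as a function rather than a macro for readability, not the original code; `word16` and `add16` are made-up names:

```c
#include <assert.h> /* static_assert (C11) */
#include <limits.h>

typedef unsigned char byte;
static_assert(UCHAR_MAX == 255, "sketch assumes 8-bit bytes");

/* A 16-bit value stored as two bytes. */
typedef struct {
    byte lo;
    byte hi;
} word16;

static word16 add16(word16 a, word16 b)
{
    word16 r;
    r.lo = (byte)(a.lo + b.lo);         /* low byte wraps modulo 256 */
    byte carry = (byte)(r.lo < a.lo);   /* 1 if the low byte overflowed */
    r.hi = (byte)(a.hi + b.hi + carry); /* high byte absorbs the carry */
    return r;
}
```

      For example, adding 0x0001 to 0x00FF yields lo = 0x00 and hi = 0x01; any carry out of the high byte is discarded, as with a native 16-bit unsigned add.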

    • einpoklum 3 hours ago
      If the codebase is small, and the dependencies are minimal, you might look into trying to integrate your rcc in "bootstrapping" flows, which currently often rely on tcc as a stepping stone towards gcc. And perhaps even being able to skip gcc 4.7 in favor of your compiler, if it's capable of, say, compiling a modern gcc to work even suboptimally.
      • rurban 3 hours ago
        tcc is pretty good for this.

        What I want to explore is cheap and fast optimizations, without SSA and data flow tracking. I have huge source files compiled to 25MB .o files, which need to compile in less than 5 minutes. So far only tcc can do that. But I now have consteval and dead code elimination for free, which tcc cannot do.

  • gritzko 6 hours ago
    What is the feasible way to test code against the matrix of compilers/oses?
    • btrettel 5 hours ago
      One approach for testing with multiple compilers that I use on some Fortran projects (where testing against multiple compilers seems more common than in C) is to use a variable from the command line to specify the compiler, for example:

          make FC=ifx check
      
      On my Fortran projects, that will run the tests with Intel's Fortran compiler. The Makefile has logic to automatically change compiler flags as appropriate. I default to the GNU Fortran compiler, so `FC` isn't required.

      I have made a script to run through a series of compilers by alternating between `make check` and `make clean`.

      I have separate Makefiles for GNU Make and NMAKE/jom. My Fortran code works fine on various Linux distributions and Windows, though I'll add that achieving that is probably easier with Fortran than C. I've also tried a BSD Make that worked (on Ubuntu at least). My Makefiles are pretty close to the intersection of POSIX and NMAKE, so the main differences between the different Make versions are the conditional statements needed to handle the different compiler flags and the include statements (as I put the compiler flags in separate files).

    • aDyslecticCrow 6 hours ago
      And architectures. Probably a bunch of build servers or a swarm of docker, qemu, and VMs, with a good test coverage to detect behaviour differences.

      In practice, the compiler is often an omitted dependency of any C code.

      • gritzko 5 hours ago
        I think I did not get to that level yet where I trust Claude to create its own VMs. But one day I will.
    • pocksuppet 5 hours ago
      Realistically, you don't aim to support every platform. You have a finite set of platforms you want to support. As the size of that set goes up, so does the probability it'll work on platforms unknown to you, but it's never 100%. Did you know some platforms have 32-bit char?
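
      One cheap way to make the supported set explicit is to state the assumption in the code so that an unsupported platform fails at compile time. A sketch using C11 `static_assert`; `bits_of` is a made-up helper:

```c
#include <assert.h> /* static_assert (C11) */
#include <limits.h> /* CHAR_BIT */
#include <stddef.h>

/* Fail the build loudly on platforms outside the supported set, instead of
   miscompiling quietly. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit char");

/* Hypothetical helper: size of an object in bits, given its size in chars. */
static size_t bits_of(size_t size_in_chars)
{
    return size_in_chars * CHAR_BIT;
}
```

      On a platform with 32-bit char this refuses to compile, which is arguably the honest outcome for code that was never tested there.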
  • einpoklum 4 hours ago
    > Anyone who's written C knows that full ISO C standard-adhering code is an impractical rarity.

    I've written quite a bit of C code and do not know that to be a rarity. Especially when it comes to libraries rather than applications, and FOSS as opposed to proprietary code.

    > Most real world C code out there relies on non-standard behaviors and language extensions to varying extents

    Maybe it depends on which domain you're working in. At companies whose target platform is not a PC, relying on idiosyncratic behavior, or extensions, is difficult: the compiler for the target device may simply not support the bells and whistles of GCC or whatever, so you stick to C99 (or even C89, ugh) to be on the safe side. And even then there will be things which are standard, but... well, I would be wary of relying on them being supported robustly enough, e.g. variable-length arrays.

    And of course, once your code does not target just one single machine then you're forced to have to worry about portability and standard compliance etc.

    • kouosi 3 hours ago
      > And of course, once your code does not target just one single machine then you're forced to have to worry about portability and standard compliance etc.

      Well, Linux exclusively uses gcc to compile.

      • fanf2 3 hours ago
        • Maxatar 2 hours ago
          That's a good point, the more precise statement is that Linux exclusively uses gnu11 to compile.

          Clang happens to implement gnu11, I think it's the only non-GCC compiler to do so.

      • einpoklum 51 minutes ago
        Yes, sorry, mistype, that should have been "single kind of machine" or "single operating system".
    • bmink 2 hours ago
      Yes, while the article makes some good points the first paragraph was puzzling.
  • cocodill 4 hours ago
    If you complain about the portability of open source at the compiler level, then breathe in the spirit of open source and shut the fuck up, then exhale and port it to your magic niche compiler yourself.