> The core insight of Moebius can be summarized in a single equation:
> Synergy × (Architecture + Distillation) = Shattering the "Impossible Triangle" of Low Parameters, Fast Inference, and High Quality
Is it just me or is it weird seeing these clickbaity AI-generated taglines in an otherwise scientific work?
I think 3? I feel like that's often enough. Sometimes it's nice to do a quick dumb ass gag on a whim. If I am anything I am a man who loves a dumb ass gag.
(I'm counting only times I used generative editing options in my Galaxy phone - if I were to take your question literally, it would be "at least once every other day", simply due to rotating and cropping.)
Nitpick: in the showcase on that page, under Comparison of Natural Scenes, Moebius should definitely get a "structural confusion" tag for the back of the surfboard. If other models get deducted for truncating the surfboard, then surely the elongation that Moebius does should count too.
Also, what's going on behind the in-painted corner of the house? We'd need to see higher resolution pictures, but I'm not convinced that it too shouldn't get a flag. Likewise with the beach just behind the surfboard. Not terrible, but what gets flagged in the competitors is similar.
Scared for the same reason I found last year's 'Ghibli filter' craze upsetting, I would have personally hated to have seen this artist's legacy used for promoting AI image generation.
If that happened, the rest of the world would probably appreciate the art, and a subset of it would appreciate the artist (and even a small subset of the ~whole Internet-connected population is a lot of people). Some silver lining, perhaps.
> If that happened, the rest of the world would probably appreciate the art
What art?
We’re talking about generated pictures, aka slop, not art made by a real human.
And I don’t know if you’ve been paying attention but people seem to be pretty tired of the slop. I don’t think it would be appreciated nearly as much as you think.
This definition of "slop" doesn't quite cut reality at the joints.
People are tired of marketing. The AI-generated slop people are annoyed with is garbage produced for marketing reasons, and it's distinctly noticeable precisely because all the bottom-feeder marketing houses switched to using it. But it's not the AI itself that's the problem here. Slop was here before, but it was made with cheap protein-based image generators. Silicon-based generators are just cheaper.
> This definition of "slop" doesn't quite cut reality at the joints.
> People are tired of marketing.
You know what, I'll give you that one. I find most generated art pretty tasteless, but I have enjoyed the occasional piece of fiction with small generated elements for atmosphere. I still hesitate to call it 'art', but I will grant it's not all 'slop'.
But for the second part:
> But it's not the AI itself that's the problem here. Slop was here before, but it was made with cheap protein-based image generators. Silicon-based generators are just cheaper.
I think the problem is how much cheaper it is now. I would estimate generating a picture is at least 2 orders of magnitude cheaper than paying even a cheap human, so with the same amount of money being invested into slop we are due for - and seeing - a huge tidal wave of it, because the same amount of money turns out way more crap now.
Awnings, if I understand correctly (I just learned this word right now), are purely additive attachments to structure exteriors - so perhaps they wouldn't necessarily need a full inpainting model? Wouldn't it be enough to estimate an affine transform for a quad and blend the image of the awning directly (and do the same with a shadow map to fake shade)? Is classical photogrammetry up to such a task these days?
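To make the quad idea concrete, here's a rough sketch of what "estimate a transform for a quad and blend" could look like. One caveat: an affine transform can only map a rectangle onto a parallelogram, so for an arbitrary quad seen in perspective you'd want a projective transform (homography) from the four corners instead. Everything here is an assumption for illustration: the function names (`homography`, `paste_quad`) are made up, images are plain lists of grayscale rows, sampling is nearest-neighbour, and the corner positions are assumed to be given rather than detected. In practice you'd probably reach for OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` rather than hand-rolling this.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def homography(src, dst):
    """3x3 projective transform mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, x, y):
    """Apply homography h to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def paste_quad(photo, overlay, quad, alpha=1.0):
    """Blend overlay into photo so that its corners land on quad.

    quad lists the target corners as (x, y) in the order
    top-left, top-right, bottom-right, bottom-left.
    """
    oh, ow = len(overlay), len(overlay[0])
    corners = [(0, 0), (ow, 0), (ow, oh), (0, oh)]
    inv = homography(quad, corners)  # photo coords -> overlay coords
    out = [row[:] for row in photo]
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    # Only visit the bounding box of the quad, clipped to the photo.
    for y in range(max(0, int(min(ys))), min(len(photo), int(max(ys)) + 1)):
        for x in range(max(0, int(min(xs))), min(len(photo[0]), int(max(xs)) + 1)):
            u, v = apply_h(inv, x + 0.5, y + 0.5)  # sample at the pixel centre
            if 0 <= u < ow and 0 <= v < oh:
                out[y][x] = (1 - alpha) * photo[y][x] + alpha * overlay[int(v)][int(u)]
    return out


# Toy example: paste a 2x2 white "awning" onto a 6x6 black "photo".
photo = [[0.0] * 6 for _ in range(6)]
awning = [[1.0, 1.0], [1.0, 1.0]]
result = paste_quad(photo, awning, [(1, 1), (5, 1), (5, 5), (1, 5)])
```

The same warp could be applied to a shadow map with `alpha` below 1 to darken the wall under the awning. Finding the four corners from a single photo is the hard part, and this sketch does not attempt it.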
I'm quite perplexed by this comment. If I'm understanding you correctly, sure, what you describe is possible through significantly more effort, orchestration, and source photos. Or we can grab one still image and throw an inpainting model at it.
I have no idea but I think you might be onto something.
So you're saying that, if I can calculate from the picture the position (height, inclination and such), and I can render the model (should be doable) for that height and angle, my best course of action could be to combine original + render and only at the end use a visual model? That could be interesting.
I have an example of interior decorating inpainting where I replaced a large floor-to-ceiling window with a mirror, and the result was pretty impressive using NB Pro from nearly a year ago.
For locally hostable image editing models, the edit variant of the recently released Boogu-Image[1] model is very good. Anecdotally, I'd say way better than Flux.2 Klein 9B and Qwen-Edit.
As far as I know, gpt-image-2 doesn't even let you define a mask unless you've already run it through one iteration, and once you do define the mask, it just ignores it 90% of the time. It's utterly useless for inpainting. Also, this and other proprietary models are severely limited in their output resolution.
I do agree, however, that the Flux2 family is the SoTA at the moment. Running locally via something like Comfy gets incredible results.
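On masks getting ignored: one workaround that doesn't depend on the model at all is to composite the result back yourself, so anything outside the mask is restored from the original. A minimal sketch, assuming the edited image comes back at the same size as the input (which may not hold for every hosted model). The function name `enforce_mask` and the list-of-rows grayscale format are made up for illustration.

```python
def enforce_mask(original, edited, mask):
    """Keep edited pixels only where the mask allows it.

    mask values are in [0, 1]: 1 takes the edited pixel, 0 keeps the
    original, and values in between blend the two (a feathered edge).
    """
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("images and mask must have the same height")
    return [
        [m * e + (1.0 - m) * o for o, e, m in zip(orow, erow, mrow)]
        for orow, erow, mrow in zip(original, edited, mask)
    ]


# Toy example: the "model" repainted everything, but only part is allowed.
original = [[1.0, 1.0], [1.0, 1.0]]
edited = [[9.0, 9.0], [9.0, 9.0]]
mask = [[0.0, 1.0], [0.5, 0.0]]
composited = enforce_mask(original, edited, mask)
```

This only guarantees that the unmasked area is untouched. It does nothing about the model producing a poor fill inside the mask, and a hard-edged mask can leave visible seams, which is why a soft mask is allowed here.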
Edit: I think I found it https://huggingface.co/hustvl/Moebius
https://characterdesignreferences.com/artist-of-the-week-3/m...
[0] https://en.wikipedia.org/wiki/Jean_Giraud
I have a potential project for my e-commerce where I want to allow users to upload images of their house exteriors and inpaint awnings.
https://imgpb.com/ZXkiXV
Locally hostable? For my money I'd argue Flux.2 Klein but Qwen-Edit still puts in the work.
[1]: https://github.com/boogu-project/Boogu-Image
2) If these are reasonable, a WebGPU demo would be great.