• barsoap@lemm.ee
    2 months ago

    AI image generators don’t “consult” source images to generate an output.

    Well, you have an artist breaking things down for an audience that understands neither the technical nor the artistic side…

    Modern AI generators are increasingly good at generating text. They still struggle a bit

    I mean… SDXL still struggles a lot. The only thing you can get it to spell reliably is probably “Hooters”. There’s the odd LoRA that makes it not suck completely, but it’s still nowhere near actually good at generating text; the training just isn’t there. And even with that in place, things like signatures are probably going to be gibberish.

    While a naive (and cheaper) approach to AI generation doesn’t use layers, there are generators which do use layers,

    Unless you start off training by feeding the model 3D data (say, voxels) alongside 2D projections, I don’t think it’s ever going to develop a proper understanding of these kinds of things. Or, put differently: learning object permanence (or something closely related to it) is a meta-cognitive abstraction step that just won’t happen with the type of topologies we know how to engineer. It’s probably like 90% of the way towards AGI, so to get a simple topology to understand it we have to spoon-feed it permanence information alongside the (apparent) non-permanence.
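
    To make the voxel idea a bit more concrete, here's a toy sketch of what "3D data alongside 2D projections" could look like. Everything in it (the 3x3x3 grid, the function names, the silhouette-style projection) is made up for illustration and isn't how any actual generator is trained. It only shows the point about apparent non-permanence: one object hides another in the front view, while the voxel data and the side view still contain both.

```python
# Toy illustration only: a hypothetical voxel grid paired with 2D projections.
# Grid layout is grid[z][y][x], with 1 meaning "occupied".

def make_grid(size):
    """Create an empty size x size x size voxel grid."""
    return [[[0] * size for _ in range(size)] for _ in range(size)]

def project_front(grid):
    """Silhouette seen along the z axis; result is indexed [y][x]."""
    size = len(grid)
    return [
        [int(any(grid[z][y][x] for z in range(size))) for x in range(size)]
        for y in range(size)
    ]

def project_side(grid):
    """Silhouette seen along the x axis; result is indexed [y][z]."""
    size = len(grid)
    return [
        [int(any(grid[z][y][x] for x in range(size))) for z in range(size)]
        for y in range(size)
    ]

grid = make_grid(3)
grid[0][1][1] = 1  # object close to the viewer
grid[2][1][1] = 1  # second object directly behind it along z

front = project_front(grid)  # the rear object is hidden behind the front one
side = project_side(grid)    # both objects are visible from the side

# A training pair in this sketch would be (grid, [front, side]):
# the 2D views lose information that the 3D data keeps.
visible_front = sum(map(sum, front))
visible_side = sum(map(sum, side))
```

    With only the front view, nothing tells you the second object exists; the voxel grid is the "permanence information" that would have to be fed in alongside it.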