• FierySpectre@lemmy.world
    27 days ago

    That’s the crazy thing here: it is interacting with programs in a way that is wildly inefficient. At some point stuff like this will be properly integrated, and that both scares and excites me.

    • Hackworth@lemmy.world
      27 days ago

      Yeah, using image recognition on a screenshot of the desktop and directing a mouse around the screen with coordinates is definitely an intermediate implementation. Open Interpreter, Shell-GPT, LLM-Shell, and DemandGen make a little more sense to me for anything that can currently be done from a CLI, but I’ve never actually tested ’em.