- cross-posted to:
- technews@radiation.party
The Inventor Behind a Rush of AI Copyright Suits Is Trying to Show His Bot Is Sentient::Stephen Thaler’s series of high-profile copyright cases has made headlines worldwide. He’s done it to demonstrate his AI is capable of independent thought.
Correct, but I haven’t seen anything suggesting that DABUS is an LLM. My understanding is that it’s basically made up of two components:
EDIT: This article is the best one I’ve found that explains how DABUS works. See also this article, which I read when first writing this comment.
Other than using machine vision and machine hearing (“acoustic processing algorithms”) to supervise the neural networks, I haven’t found any description of how the thalamobot functions. Machine vision / hearing could leverage ML but might not, and either way I’d be more interested in how it determines what to prioritize / additional algorithms to trigger rather than how it integrates with the supervised system.
As far as I can tell, probably, but not necessarily.
Ignoring Thaler’s claims, theoretically a supervisor could be used in conjunction with an LLM to “learn” by re-training or fine-tuning the model. That’s expensive and doesn’t provide a ton of value, though.
That said, a database / external process for retaining and injecting context into an LLM isn’t smoke and mirrors when it comes to persistent memory; the main difference compared to re-training is that the LLM itself doesn’t change. There are other limitations, too. But if I have an LLM that can handle an 8k token context where the first 4k is used (including during training) to inject summaries of situational context and of topics/concepts that are currently relevant, and the last 4k is used like traditional context, then that gives you a lot of what persistent memory would. Combine that with the ability for the system to retrain as needed to assimilate new knowledge bases and you’re all the way there.
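To make the 4k/4k split concrete, here's a rough sketch of how the external process might assemble each prompt. Everything in it is my own assumption for illustration: the function names are made up, and whitespace-separated words stand in for real tokenizer tokens.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokenizer tokens.
    return len(text.split())


def fit_to_budget(items: List[str], budget: int, keep_latest: bool) -> List[str]:
    """Keep whole items, in priority order, until the token budget runs out."""
    ordered = reversed(items) if keep_latest else iter(items)
    kept: List[str] = []
    used = 0
    for item in ordered:
        cost = count_tokens(item)
        if used + cost > budget:
            break
        kept.append(item)
        used += cost
    # Restore chronological order if we walked the list backwards.
    return kept[::-1] if keep_latest else kept


def build_context(summaries: List[str], recent_turns: List[str],
                  window: int = 8192, memory_budget: int = 4096) -> List[str]:
    """First part of the window: injected summaries. Remainder: recent turns."""
    memory = fit_to_budget(summaries, memory_budget, keep_latest=False)
    chat = fit_to_budget(recent_turns, window - memory_budget, keep_latest=True)
    return memory + chat
```

The summaries would come from whatever database or supervisor sits outside the model; the model itself never changes, which is the difference from re-training.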
That’s still not an AGI or even an attempt at one, of course.
Just talking hypothetically, I think it may be possible to actually make an AGI with an LLM base and a threaded interpreted language like Forth. If it was integrated into the model, it might be able to add network layers like a LoRA in real time, or at least within a typical prompt-to-response time. The nature of Forth makes it possible to sidestep issues with code syntax, as a single token or two could trigger a Forth program of any complexity. I can imagine a scenario where Forth is fully integrated and able to modify the network with more than just LoRAs and embeddings, but I’m no expert; just a hobbyist. I fully expect any major breakthrough will come from white paper research, and not from someone who is using hype media nonsense and grandstanding for a spotlight. It will not involve external code.
Tacking systems together with databases is not what I would call a human-brain analog or AGI. I expect a plastic network with self-modifying behavior in near real time, along with the ability to expand or arbitrarily alter any layer. It would also require a self-test mechanism and a bookmarking system to roll back any unstable or unexpected behavior using self-generated tests.
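As a toy illustration of the bookmark-and-rollback idea (not a real self-modifying network), something like the following captures the control flow. The class name, the dict-of-floats "weights", and the test functions are all hypothetical stand-ins.

```python
import copy
from typing import Callable, Dict, List

Weights = Dict[str, float]


class CheckpointedModel:
    """Toy stand-in for a network that can bookmark and restore its state."""

    def __init__(self, weights: Weights) -> None:
        self.weights: Weights = dict(weights)
        self._bookmarks: List[Weights] = []

    def bookmark(self) -> None:
        # Save a copy of the current state so it can be restored later.
        self._bookmarks.append(copy.deepcopy(self.weights))

    def rollback(self) -> None:
        # Restore the most recently bookmarked state.
        self.weights = self._bookmarks.pop()

    def try_update(self, update: Callable[[Weights], None],
                   self_tests: List[Callable[[Weights], bool]]) -> bool:
        """Apply a self-modification; keep it only if every self-test passes."""
        self.bookmark()
        update(self.weights)
        if all(test(self.weights) for test in self_tests):
            self._bookmarks.pop()  # change accepted, bookmark no longer needed
            return True
        self.rollback()
        return False
```

For example, with a self-test like `lambda w: abs(w["w"]) < 10`, an update that sets `w` to 2.0 is kept, while one that sets it to 100.0 is rolled back.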
Agreed, and either of those is more than a system with persistent memory.
I think it would be wise for such a system to have a rollback mechanism, but I don’t think it’s necessary for it to qualify as a human brain analog or AGI - I don’t have the ability to roll back my brain to the way it was yesterday, for example, and neither does anyone I’ve ever heard of.
I don’t think this is realistic or necessary, either. If I want to learn a new, non-trivial skill, I have to practice it, generally over a period of days or longer. I would expect the same from an AI.
Sleeping after practicing / studying often helps to learn a concept or skill. It seems to me that this is analogous to a re-training / fine-tuning process that isn’t necessarily part of the same system.
It’s unclear to me why you say this. External, traditional code is necessary to link multiple AI systems together, like a supervisor and a chatbot model, right? (Maybe I’m missing how this is different from invoking a language from within the LLM itself - I’m not familiar with Forth, after all.) And given that human neurology is basically composed of multiple “systems” - left brain, right brain, frontal lobe, our five senses, etc. - why wouldn’t we expect the same to be true for more sophisticated AIs? I personally expect there to be breakthroughs if and when an AI that is trained on multi-modal data (sight + sound + touch + smell + taste + feedback from your body + anything else of relevance) is built (e.g., by wiring up people with sensors to pull down that data), and I believe that models capable of interacting with that kind of training data would comprise multiple systems.
At minimum you currently need an external system wrapped around the LLM to emulate “thinking,” which, as I understand it, is something ChatGPT already does (or did) to an extent. I think this is currently just a “check your work” kind of loop, but a more sophisticated supervisor / AI consciousness could be much more capable.
That said, I would expect an AGI to be able to leverage databases in the course of its work, much the same way that Bing can surf the web now or ChatGPT can integrate with Wolfram — separate from its own ability to remember, learn, and evolve.
Plus the marketing writes itself
Don’t miss DABUS!