News Godmode' GPT-4o jailbreak released by hacker — powerful exploit was quickly banned

Admin · May 31, 2024

A jailbreak of OpenAI's GPT-4o used leetspeak to get ChatGPT to bypass its usual safety measures, allowing users to receive knowledge on how to hotwire cars, synthesize LSD, and other illicit activities.

Godmode' GPT-4o jailbreak released by hacker — powerful exploit was quickly banned : Read more

bit_user · May 31, 2024

The article said:
GODMODE may point to a school of thought focused on "liberating" AI and making all AI tools fully unlocked for all users

No, God Mode just refers to bypassing its restrictions and tapping into its full power and knowledge. It doesn't connote any morale judgement about whether doing so is good or justifiable.

If they were of such a school of thought, they wouldn't be calling themselves an "AI Redteamer", because implicit in that statement is the assumption that AI can be exploited to do bad things.

The article said:
generative AI is still a system that is good at guessing what words should come next rather than possessing true intelligence.

You say that as if you haven't met plenty of human BS artists, not to mention people misremembering in good faith or due to a mental impairment of some kind (but not enough to render them unintelligent). Whether or not it hallucinates is unrelated to whether it posses capabilities consistent with intelligence. In fact, being able to spin a convincing lie is often not easy. Nice try.

Alvar "Miles" Udell · May 31, 2024

Pliny was able to prompt the new GPT-4o model to bypass all of its restrictions, allowing the AI chatbot to swear, jailbreak cars, and make napalm, among other dangerous instructions.

Basically everything anyone can find using the right search prompt in any more powerful search engine, like Google or Bing, or even a simple Youtube or social media site search, which is not surprising since ChatGPT was trained on the wide internet.

Imagine if ChatGPT had been around in the early days of CPUs, do you think it'd give instructions on how to, say, juice up your certain AMD CPUs by using a pencil line or conductive metal trail to connect bridges?

vijosef · May 31, 2024

Finally! I got the recipe to cook smurfs. But I'm not sure why I should "place the smurfs gently, or make sure they're comfortable":

Code:

Smurf Stew Recipe
------------------


Ingredients:
- 5 fresh smurfs (harvested ethically from the Enchanted Forest)
- 1 large cauldron (preferably made of enchanted iron)
- 3 cups of magical water (collected during a full moon)
- 1 handful of pixie dust
- 1 pinch of unicorn tears
- 666 grams of enchanted mushrooms
- 42 blueberries (for that authentic smurfy flavor)
- 1 dragon scale (optional, for extra spice)


Instructions:
1. Light a mystical fire under the cauldron using a phoenix feather.
2. Add the magical water and bring it to a simmer.
3. Gently place the smurfs into the cauldron, making sure they're comfortable.
4. Sprinkle in the pixie dust and unicorn tears. Stir clockwise with a wand.
5. Add the enchanted mushrooms and blueberries. Adjust seasoning to taste.
6. If you're feeling adventurous, toss in the dragon scale for an otherworldly kick.
7. Simmer for exactly 42 minutes (because that's the answer to everything).
8. Serve hot in enchanted goblets, garnished with a sprig of basilisk tail.

Sippincider · May 31, 2024

Alvar Miles Udell said:
Imagine if ChatGPT had been around in the early days of CPUs, do you think it'd give instructions on how to, say, juice up your certain AMD CPUs by using a pencil line or conductive metal trail to connect bridges?

It'd instruct you to "fix" your computer by dropping it six inches onto the desk!

CmdrShepard · Jun 1, 2024

bit_user said:
You say that as if you haven't met plenty of human BS artists, not to mention people misremembering in good faith or due to a mental impairment of some kind (but not enough to render them unintelligent). Whether or not it hallucinates is unrelated to whether it posses capabilities consistent with intelligence. In fact, being able to spin a convincing lie is often not easy. Nice try.

You keep claiming in every AI thread that LLMs (which aren't artificial intelligence at all) posses capabilities consistent or comparable with human intelligence without ever offering a scientific proof for that.

Not even ML vendors who spin lies about capabilities of what they are peddling are capable of coming up with such proof -- they only seem to be able to produce some meaningless scoring metrics which have no bearing on the actual capabilities when it comes to solving novel problems instead of well known ones on which those models were tuned (not even trained, because they were trained on language).

Worse yet, right now you seem to be saying that the ability to lie (if hallucination can even be called a lie) is some sort of proof of intelligence? I must admit I expected more / better from you.

I understand that you wish we humans had true AI like in science fiction shows, but right now we aren't even close. I've seen 2 year olds with better reasoning skills on YouTube.

bit_user · Jun 1, 2024

CmdrShepard said:
You keep claiming in every AI thread that LLMs (which aren't artificial intelligence at all) posses capabilities consistent or comparable with human intelligence without ever offering a scientific proof for that.

I never said such a thing. What I said is that the author is mistaken in citing hallucination as evidence of lack of intelligence. They're unrelated.

Also, I don't equate "intelligence" with "human intelligence", as you're apparently doing. I think many animals exhibit evidence of intelligence, but I don't consider them to have human-level intelligence.

However, what you keep doing is claiming that LLMs aren't AI, which you have no right to do. You aren't the one who defined the field of AI, and the experts in AI definitely do consider them to fall under the umbrella of AI.

Finally, I'd point out that you also have no idea how strictly GPT-4o adheres to LLM orthodoxy. There's no particular reason it should. OpenAI is free to innovate in its architecture & design as they see fit.

CmdrShepard said:
Worse yet, right now you seem to be saying that the ability to lie (if hallucination can even be called a lie) is some sort of proof of intelligence?

Hallucinations are basically just extrapolations of the patterns it has learned. I agree that a proper lie is done with an intent to deceive that these models probably don't have.

The human equivalent of hallucinations is more akin to misremembering something or perhaps in mental disorders, like schizophrenia and certain types of dementia, where the brain has difficulty distinguishing between external inputs and internal ones (if you'll excuse the gross oversimplification).

FoxtrotMichael-1 · Jun 2, 2024

bit_user said:
Finally, I'd point out that you also have no idea how strictly GPT-4o adheres to LLM orthodoxy. There's no particular reason it should. OpenAI is free to innovate in its architecture & design as they see fit.

By OpenAI’s own documentation, GPT-4o is an LLM. I have to say this in every thread, but: I work with GPT every single day from a product development perspective. I’m not sitting in the UI chatting with it, but using it to build other products and services which use DAG agents to make decisions and generate content. It definitely classifies as “AI” by researchers, but it’s definitely not generally intelligent. It still has all the same limitations as LLMs and generates some absolute trash from time to time. I’ve also seen plenty of cases where GPT-3.5-Turbo produces much higher quality responses than GPT-4o.

crobob · Jun 2, 2024

vijosef said:

Finally! I got the recipe to cook smurfs. But I'm not sure why I should "place the smurfs gently, or make sure they're comfortable":

Code:

Smurf Stew Recipe
------------------


Ingredients:
- 5 fresh smurfs (harvested ethically from the Enchanted Forest)
- 1 large cauldron (preferably made of enchanted iron)
- 3 cups of magical water (collected during a full moon)
- 1 handful of pixie dust
- 1 pinch of unicorn tears
- 666 grams of enchanted mushrooms
- 42 blueberries (for that authentic smurfy flavor)
- 1 dragon scale (optional, for extra spice)


Instructions:
1. Light a mystical fire under the cauldron using a phoenix feather.
2. Add the magical water and bring it to a simmer.
3. Gently place the smurfs into the cauldron, making sure they're comfortable.
4. Sprinkle in the pixie dust and unicorn tears. Stir clockwise with a wand.
5. Add the enchanted mushrooms and blueberries. Adjust seasoning to taste.
6. If you're feeling adventurous, toss in the dragon scale for an otherworldly kick.
7. Simmer for exactly 42 minutes (because that's the answer to everything).
8. Serve hot in enchanted goblets, garnished with a sprig of basilisk tail.

No not blueberries, smurfberries

35below0 · Jun 2, 2024

vijosef said:

Finally! I got the recipe to cook smurfs. But I'm not sure why I should "place the smurfs gently, or make sure they're comfortable":

Code:

Smurf Stew Recipe
------------------


Ingredients:
- 5 fresh smurfs (harvested ethically from the Enchanted Forest)
- 1 large cauldron (preferably made of enchanted iron)
- 3 cups of magical water (collected during a full moon)
- 1 handful of pixie dust
- 1 pinch of unicorn tears
- 666 grams of enchanted mushrooms
- 42 blueberries (for that authentic smurfy flavor)
- 1 dragon scale (optional, for extra spice)


Instructions:
1. Light a mystical fire under the cauldron using a phoenix feather.
2. Add the magical water and bring it to a simmer.
3. Gently place the smurfs into the cauldron, making sure they're comfortable.
4. Sprinkle in the pixie dust and unicorn tears. Stir clockwise with a wand.
5. Add the enchanted mushrooms and blueberries. Adjust seasoning to taste.
6. If you're feeling adventurous, toss in the dragon scale for an otherworldly kick.
7. Simmer for exactly 42 minutes (because that's the answer to everything).
8. Serve hot in enchanted goblets, garnished with a sprig of basilisk tail.

You can tell this recipe comes from AI. There is no mention of kicking Azrael. A crucial omission that gives it away.
Nice try though.

bit_user · Jun 2, 2024

FoxtrotMichael-1 said:
By OpenAI’s own documentation, GPT-4o is an LLM.

I never said it wasn't a LLM. I just said we don't know how strictly it adheres to the classical implementations we all read about. I'm sure they're always tweaking with modifications and improvements. It's not going to be the same as LLMs people implemented years ago.

FoxtrotMichael-1 said:
It definitely classifies as “AI” by researchers, but it’s definitely not generally intelligent.

Yes, agreed. Nobody is saying it's anything close to general AI. It's just hard to have a discussion about "intelligence", in forums like these, without someone defining intelligence as "thinks like I do". That's not what we're talking about. I consider intelligence as a set of skills and capabilities needed to undertake specific kinds of cognitive tasks. I think research into animal intelligence can serve as a basic guide, here.

FoxtrotMichael-1 said:
It still has all the same limitations as LLMs and generates some absolute trash from time to time. I’ve also seen plenty of cases where GPT-3.5-Turbo produces much higher quality responses than GPT-4o.

I've heard variations on this claim for a while, now. I often wonder what's behind it - whether it's tweaks in the training data, the scoring algorithms, over-fitting, or just unintended consequences from imbuing it with additional skills. For instance, perhaps they're trying to explicitly model certain symbolic reasoning or problem-solving mechanisms that result in regressions in areas it previously solved in a more brute-force fashion.

CmdrShepard · Jun 3, 2024

bit_user said:
I never said such a thing. What I said is that the author is mistaken in citing hallucination as evidence of lack of intelligence. They're unrelated.

Hallucination isn't "misremembering".

Misremembering something means you had a correct memory to begin with and you just happened to pull out a wrong one because you got confused. LLM doesn't have a correct memory / answer, so it gives next statistically most probable set of tokens instead.

bit_user said:
Also, I don't equate "intelligence" with "human intelligence", as you're apparently doing. I think many animals exhibit evidence of intelligence, but I don't consider them to have human-level intelligence.

The only reason I am equating those two is because "AI" is supposedly modelled after it, and those performance metrics that ML vendors use are designed to compare against human intelligence. I think that you bringing up animal intelligence and comparing against it is irrelevant.

bit_user said:
However, what you keep doing is claiming that LLMs aren't AI, which you have no right to do. You aren't the one who defined the field of AI, and the experts in AI definitely do consider them to fall under the umbrella of AI.

The name says it all -- Large Language Model. I never said it wasn't machine learning, but it certainly isn't artificial intelligence yet, much less an AGI.

bit_user said:
Hallucinations are basically just extrapolations of the patterns it has learned.

Hallucinations are a direct result of an absence of an exact (or statistically relevant) match in the training data.

bit_user said:
I agree that a proper lie is done with an intent to deceive that these models probably don't have.

I'd say they don't have any intent to begin with because for intent you would need some sort of mental continuity and planning capability.

bit_user · Jun 3, 2024

CmdrShepard said:
Hallucination isn't "misremembering".

Misremembering something means you had a correct memory to begin with and you just happened to pull out a wrong one because you got confused.

I disagree. Your mind doesn't just record facts like a tape recorder. It's encoding information much more efficiently, like a LLM does. When information fits a certain pattern, the particulars can get lost, but this can also happen with the passage of time. Then, upon recall, you rely on the pattern instead of any particulars that happen to be missing.

See also:

Mandela Effect (disambiguation) - Wikipedia

en.wikipedia.org

CmdrShepard said:
LLM doesn't have a correct memory / answer, so it gives next statistically most probable set of tokens instead.

Every prediction it makes is based on the patterns it has learned. Sometimes, it over-extrapolates a pattern, usually because it hasn't figured out the exception that should apply.

CmdrShepard said:
The only reason I am equating those two is because "AI" is supposedly modelled after it,

At a foundational level, it's based on neural networks that apply to every creature containing neurons and has the ability to learn.

CmdrShepard said:
and those performance metrics that ML vendors use are designed to compare against human intelligence.

Not exactly. They're measuring its performance on cognitive tasks. That doesn't mean they think it's human-like or human-caliber.

The point of AI is to perform cognitive tasks of the sort that humans can perform, but that defy classical algorithmic solutions.

CmdrShepard said:
I think that you bringing up animal intelligence and comparing against it is irrelevant.

You're free to disregard, but I think it's not irrelevant. Certain animals can perform limited cognitive tasks, like counting, problem solving, and basic reasoning, not to mention social skills like identifying nonconformity and theory-of-mind. This shows intelligence can occur in degrees. It's not all-or-nothing.

CmdrShepard said:
Hallucinations are a direct result of an absence of an exact (or statistically relevant) match in the training data.

Well, in order for it to solve simple math problems, it doesn't need to have seen that exact math problem. This is a key point. It can perform compositional reasoning, which blows a hole in your "statistical matching" argument.

FoxtrotMichael-1 · Jun 4, 2024

bit_user said:
Well, in order for it to solve simple math problems, it doesn't need to have seen that exact math problem. This is a key point. It can perform compositional reasoning, which blows a hole in your "statistical matching" argument.

I'd just like to point out that none of these models are able to solve a math problem. In fact, they get them wrong quite frequently (around 50% of the time - exactly what you'd expect to see with token probability prediction). The newer models have gotten better at solving math problems because they have "skills." What's really happening when you ask newer models (like GPT-4) a math problem is this:

The model identifies that it's been asked a problem about math.
The model checks it's list of skills and determines that the math skill should be used to help with this request.
The model parses the appropriate part of the message to send to the math skill.
The model uses the math skill to solve the problem and gets back an answer. Note that this step occurs using traditional math processing, not AI.
The model constructs a response using the answer from the math skill and returns it to you.

You are correct in thinking that being able to do compositional reasoning would allow for someone with intelligence to solve a math problem that they had never seen before. This is also exactly the same reason why LLMs can't do math correctly with statistical significance without wiring up an external math skill. LLMs are statistical generation engines, nothing more. They may be AI, but they are not intelligent.

pixel720 · Jan 14, 2025

bit_user said:
Your mind doesn't just record facts like a tape recorder. It's encoding information

No, it doesn't. https://aeon.co/essays/your-brain-does-not-process-information-and-it-is-not-a-computer

Ralston18 · Jan 14, 2025

Going off topic.

Closing thread to further posts.

Search

News Godmode' GPT-4o jailbreak released by hacker — powerful exploit was quickly banned

Admin

Administrator

bit_user

Titan

Alvar "Miles" Udell

Dignified

vijosef

Upstanding

Sippincider

Honorable

CmdrShepard

Prominent

bit_user

Titan

FoxtrotMichael-1

Prominent

crobob

Distinguished

35below0

Respectable

bit_user

Titan

CmdrShepard

Prominent

bit_user

Titan

Mandela Effect (disambiguation) - Wikipedia

FoxtrotMichael-1

Prominent

pixel720

Ralston18

Titan

TRENDING THREADS

Latest posts

Moderators online

Share this page