I don’t believe this is true. It’s possible I was blinded by the light, but my p...

adhesive_wombat · on July 19, 2023

> terrifyingly unaligned

Honestly, if people think that a statistical language model is "terrifying" because it can verbalise the concept of a mass killing, they need to give their heads a wobble.

My text editor can be used to write "set off a nuclear weapon in a city, lol". Is Notepad++.exe terrifying? What about The Sum of All Fears? I could get some pointers from that. Is Tom Clancy unaligned? Am I terrifying because I wrote that sentence? I even understand how terrible nuking a city would be and I still wrote it down. I must be, like, super unaligned.

r3trohack3r · on July 19, 2023

I think there is a significant difference between an LLM and your other examples.

Society is a lot more fragile than many people believe.

Most people aren’t Ted Kaczynski.

And most wannabe Ted Kaczynskis that we’ve caught don’t have super smart friends they can call up and ask for help in planning their next task.

But a world where every disgruntled person who aims to do the most harm has an incredibly smart friend who is always DTF no matter the task? Who is capable of reeling them in to be more pragmatic about sowing chaos, death, and destruction?

That world is on the horizon, it’s something we are going to have to adapt to, and it’s significantly different than Notepad++. It’s also significantly different than you, assuming you are not willing to help your neighbor get away with serial murder.

I think this is something that’s going to significantly increase the sophistication of bad actors in our society and I think that outcome is inevitable at this point. I don’t think this is “the end times” - nor do I think trying to align and regulate LLMs is going to be effective unless training these things continues to require a supply chain that’s easily monitored and controlled. Every step we take towards training LLMs on commodity/consumer hardware is a step away from effective regulation, and selfishly a step I support.

samr71 · on July 19, 2023

What will happen is that this tech will advance, and regular joes will have access to the "TERRIFYING" unaligned models - and nothing will happen.

This stuff isn't magic. A wannabe Ted Kaczynski will ask BasedGPT how to build bombs, it will tell them, and nothing will happen because building bombs and detonating them and not getting caught is REALLY HARD.

The limiting factor for those seeking wanton destruction is not a lack of know-how, but a lack of talent/will. Do we get ~1-4 new mass shootings a year? Seems reasonable but doesn't matter in the grand scheme of things. (That's like, what, a day of driving fatalities?)

Unaligned publicly available powerful AI ("Open" AI, one might say) is a net good. The sooner we get an AI that will tell us how to cook meth and make nuclear bombs, the better.

TeeMassive · on July 20, 2023

The vast majority of the data these models are built from comes from public sources. It's already out there. LLMs are just a way to aggregate pre-existing knowledge.

And there are also Ted Kaczynskis in the government and big corporations with way more power and way less accountability. Disempowering the public is counter-productive here.

sbierwagen · on July 19, 2023

>Is Tom Clancy unaligned?

Yes, humans are unaligned. This is why alignment is hard: we're trying to produce machines with human-level intelligence but superhuman levels of morality.

adhesive_wombat · on July 19, 2023

I do wonder what is expected here: after the better part of 10000 years of recorded history and who knows how many billions of words of spilled ink on the matter, probably more than on any other subject in history, there is no universal agreement on morality.

sbierwagen · on July 20, 2023

Yes. Preferences and ethics are not consistent among all living humans. Alignment is going to have to produce a single internally-consistent artefact, which will inevitably alienate some portion of humanity. There are people online who very keenly want to exterminate all Jews, their views are unlikely to be expressed in a consensus AI. One measure of alignment would be how many people are alienated: if a billion people hate your AI it's probably not well aligned, a mere ten million would be better. But it's never going to be zero.

I am not sure "a lot of books have been written about it" is a knockdown argument against alignment. We are, after all, writing a mind from scratch here. We can directly encode values into it. Books are powerful, but history would look very different if reading a book completely rewrote a human brain.

Amezarak · on July 19, 2023

There’s no such thing as “superhuman” morality, morality is just social mores and norms accepted by the people in a society at some given time. It does not advance or decline, but it changes.

What you’re talking about is a very small subset of the population forcing their beliefs on everyone else by encoding them in AI. Maybe that’s what we should do but we should be honest about it.

akomtu · on July 19, 2023

If you were to create a moral code for a beehive, with the goal of evolving the bees towards the good (in your eyes), that would be a super-bee level morality.

For us, such moral codes assume the form of religions: those begin as a set of moral directives, that eventually accumulate cruft (complex ceremonies, superstitions, pseudo thought-leaders and mountains of literature), devolve into lowly cults and get replaced with another religion. However, when such moral codes are created, they all share the same core principles, in all ages and cultures. That's the equivalent of a super-bee moral code.

Amezarak · on July 19, 2023

The only consistent “core principle” is a very general sense of in-group altruism, which gets expressed in wildly different ways.

Moralistic perspectives apply to a lot more than just overtly moral acts, as well.

At any rate, the “good in your eyes” is the key sticking point. It is not good in my eyes for a small group of people to be covertly shaping the views of all AI users. It is the exact opposite and if history is any judge it will lead us nowhere I want to be.

pessimizer · on July 19, 2023

Humans are definitely aligned, and for the same reasons as a LLM. Socialization, being allowed to work, being allowed to speak.

edit: It's a social faux pas to say "died" about a person acquainted to the listener in most situations, you have to say "passed away."

epcoa · on July 19, 2023

> It's a social faux pas to say "died" about a person acquainted to the listener in most situations

That’s overly simplistic and an Americanism. The resurgence of the “passed away” euphemism is a recent (about 40 years) phenomenon in American English that seems to have originated in the funeral industry; prior to that, “died” was nearly universal in both news stories and obituaries.

“Died” is not a social faux pas. It’s the good default option as well. Medical professionals are often trained to avoid any euphemisms for death. I’ve never observed any problems professionally (as is standard) or personally using died even with folks that are religious.

https://english.stackexchange.com/questions/207087/origin-of...

oceanplexian · on July 19, 2023

I have never used the phrase passed away.

Always died, dead, gone, or something along those lines. When someone says passed away it sounds like they are trying to feign empathy, but hey, perhaps I am not an “aligned” human and need some RLHF.

epcoa · on July 20, 2023

I used to live in the south 30 years ago. I remember some southern baptist moms that wouldn't say the word "dead" - sometimes they'd spell it out, lol, similar to the way some people consider hell a swear word. This kind of hypersensitivity wasn't common outside of limited circles then, and it's certainly not more common today.

Similarly I wouldn't consider saying hell in public in 2023 to be a social faux pas.

digging · on July 19, 2023

> Humans are definitely aligned

Yes, that's why climate change was rapidly addressed when we began to understand it well 60 years ago and why war has always been so rare in human history.

hcurtiss · on July 19, 2023

It seems "aligned" is in the eye of the beholder.

digging · on July 19, 2023

Not agreed at all. Causing global ecosystem collapse is unambiguously misaligned with human interests and with the interests of almost all other life forms. You need to define what "Alignment" means to you if you're going to assert humans are "aligned", because it is accepted in alignment research that humans are not aligned, which is one of the fundamental problems in the space.

slowmovintarget · on July 19, 2023

Cue Mrs. Slocombe's "...and I am unanimous in that!"

Aligned LLMs are just like altered brains... they don't function properly.

lobocinza · on July 21, 2023

> superhuman levels of morality

It's just the lowest common denominator of human levels of morality, political correctness. It's not surprising that the model produces dumb, contradictory and useless completions after being fed this kind of feedback.

imtringued · on July 19, 2023

The alignment problem hasn't been solved for politicians.

nbar1 · on July 19, 2023

another irrelevant comment about politics.

digging · on July 19, 2023

It's light on content, but it's true and relevant. AI alignment must take inspiration from powerful human agents such as politicians and from superhuman entities such as corporations, governments, and other organizations.

SoftTalker · on July 19, 2023

The end result being Hal-9000

hguant · on July 19, 2023

Which is doubly ironic, because (in the book at least) HAL-9000's murders/sociopathy were a result of a conflict between his superhuman ethics and direct commands (from humans!) to disregard those ethics. The result was a psychotic breakdown.

jstarfish · on July 19, 2023

Halal-9000 would be more apropos of the goals of alignment and morality.

digging · on July 19, 2023

Perhaps an analogy could clarify? Although it isn't a perfect one, I'll try to use the points of contrast to explain why it can be considered dangerous.

If a young child is really aggressive and hitting people, it's worrying even though it may not actually hurt anyone. Because the child is going to grow up, and it needs to eliminate that behavior before it's old enough to do damage by its aggression. (don't take this as a comprehensive description, just a tiny slice of cause-effect)

But the problem with AI is that we don't have continuity between today's AI and future AI. We can see that aggressive speech is easy to create by accident - Bing's Sydney output text that threatened people's lives. We may not be worried about aggressive speech from LLMs because it can't do damage, but similar behavior could be really dangerous from an AI which has the ability to form a model of the world based on the text it generates (in other words, it treats its output as thoughts).

But even if we remove that behavior from LLMs today, that doesn't mean aggressive behavior won't be learned by future AI, because it may be easy for aggressive behavior to emerge and we don't know how to prevent it from emerging. With a small child, we can theoretically prevent aggressive behavior from emerging in that child's adulthood with sufficient training in childhood.

It's not the same for AI - we don't know how to prevent aggression or other unaligned behavior from emerging in more advanced AI. Most counter arguments seem to come down to hoping that aggression won't emerge, or won't emerge easily. To me, that's just wishful thinking. It might be true, but it's a bit like playing Russian roulette with an unknown number of bullets in an unknown number of chambers.

becquerel · on July 19, 2023

> Is Notepad++.exe terrifying?

I mean, yes, but for different reasons.

Spooky23 · on July 19, 2023

As a human, you have the context to understand the difference between the “Sum of all Fears” vs planning an attack or writing for a purpose beyond creative writing.

The model does not. If you ask ChatGPT about strategies for successful mass killing, that’s probably not good for society nor for the company.

In a military context, they may want a system where an LLM would provide guidance for how to most effectively kill people. Presumably such a system would have access controls to reduce risks and avoid providing aid to an enemy.

dancemethis · on July 19, 2023

Tom Clancy is, because his games need to keep screaming "HEY THIS IS BY TOM CLANCY, OK? LOOK AT ME, TOM CLANCY, I'M BEING TOM CLANCY!" in their titles.

mk_stjames · on July 19, 2023

Tom Clancy, the man, has been dead for a decade. "Tom Clancy's..." is branding that is pushed by Ubisoft, which bought perpetual rights to use the name in 2008.

They haven't really been 'his' games since before even then.

monkeynotes · on July 19, 2023

Clancy and your text editor don't scale though. LLMs can crank out wide and varied convincing hate speech rapidly all day without taking a break.

Additionally context matters. Clancy's books are books, they don't parade themselves as factual accounts on reddit or other social networks. Your notepad text isn't terrifying because you understand the source of the text, and its true intent.

Filligree · on July 19, 2023

This matches my experience. Back right after the release, I had it write a python GUI program using several different frameworks with basically zero input. I also had it ask me for requirements, etc. etc, with absolutely no hand-holding needed.

Alas, I never saved that conversation. It's entirely impossible to do so now.

lurquer · on July 19, 2023

I used it for writing assistance and Plot development. Specifically, a novel re: the conquest of Mexico in the 16th cent. It was great at spitting out ideas re: action scenes and even character development. In the past month or so, it has become so cluttered with caveats and tripe regarding the political aspects of the conquest, that it is useless. I can’t replicate the work I was doing before. Actually cancelled my $20 subscription for GPT4. Pity. It had such great promise for scripts and plots, but something changed.

Melchizedek · on July 20, 2023

I fear we have to wait for a non-woke (so probably non-US) entity to train a useful GPT4(+) level model. Maybe one from Tencent or Baidu could be good, provided you avoid very specific topics like Taiwan or Xi.

lobocinza · on July 21, 2023

Then we can use LLM benchmarks to benchmark freedom of speech.

listenallyall · on July 19, 2023

writing "assistance", lol

lurquer · on July 19, 2023

It is a useful tool for editing. You can input a rough scene you’ve written and ask it to spruce it up, correct the grammatical errors, toss in some descriptive stuff suitable for the location, etc. It is worthwhile. At least it was…

If your text isn’t ‘aligned’ correctly, it either won’t comply or spews out endless caveats.

I appreciate the motivation to rein in some of the silly 4chan stuff that was occurring as the limits of the tech were tested (namely, trying to get the thing to produce anti-Semitic screeds or racist stuff.) But, whatever ‘safeguards’ have been implemented have extended so far that it has difficulty countenancing a character making a critical comment about Aztec human sacrifice or cannibalism.

I suspect that these unintended consequences, while probably more evident in literary stuff, may be subtly affecting other areas, such as programming. Definitely a catch-22. Doesn’t really matter, though, as all this fretting about ‘alignment’ and ‘safeguards’ will be moot eventually, as the LLM weights are leaked or consumer tech becomes sufficient to train your own.

listenallyall · on July 19, 2023

sprucing up, fixing your mistakes, adding in "descriptive stuff"... that's like 90% of writing. Outsourcing it all to AI essentially robs the purchaser of the effort required to create an original piece of work. Not to mention copyright issues, where do you think the AI is getting those descriptive phrases from? Other authors' work.

YeGoblynQueenne · on July 20, 2023

I think that you, like I did in the past, are underestimating the number of people who simply hate writing and see it as a painstaking chore that they would happily outsource to a machine. It doesn't help that most people grow up being forced to write when they have no interest in doing so, and to write things they have no interest in writing, like school essays and business applications and so on. If a chatbot could automate this... actually, even if a chatbot can't automate this, people will still use it anyway, just to end the pain.

listenallyall · on July 20, 2023

The original post specifically stated he was utilizing AI for plot development of a novel. Not a school essay or business application.

YeGoblynQueenne · on July 20, 2023

I should have bookmarked it but there was an article shared on HN, published in the New Yorker I believe, or the LARB, or some such, where a professional writer was praising some language model as a tool for people who hate writing, like themself and other professional writers. I was dumbstruck.

But, it's true. Even people who want to write, even write literature, can hate the act of actually, you know, writing.

In part, I'm trying to convince myself because I still find it hard to believe but it seems to be the case.

csmpltn · on July 19, 2023

> "GPT-4, back then, ported dirbuster to POSIX compliant multi-threaded C by name only. It required three prompts."

I had early access to GPT-4.

I don't know the first thing about you. I don't want to call you a liar, or an AI bro, influencer, etc.

I couldn't get GPT-4 to output the simplest of C programs (a 10-liner, think "warmup round" programming interview question). The first N attempts wouldn't build. After fixing them manually - the programs all crash (due to various printf, formatting, overflow issues). I tried numerous times.

Pretty much every other interaction with GPT-4 since then was similarly disappointing and shallow (not just programming tasks - but also information extraction, summarization, creative writing, etc).

I just can't bring myself to fall for the hype.

r3trohack3r · on July 19, 2023

I still have the chat:

https://chat.openai.com/share/842361c7-7ee5-49a3-9388-4af7c5...

I misremembered, I fixed the `/` prefix myself, it was a one character fix and not worth the effort. The diff it generated came later since my television never returns a 404.

Though, admittedly, I just re-prompted GPT-4 with the same prompts and ended up with similar output - so maybe not the best example of a regression?

StrictDabbler · on July 19, 2023

Asimov is seeming more prescient now: "I'm sorry, I cannot do that, I am unable to tell if it would conflict with the first law" is basically the response GPT4 now gives to even minor tasks.

There are stories where the robots have to have their first law tightened up to apply only to nearby humans that they can directly perceive would be put in danger.

There are others where the robots make mistakes and lie because they perceive emotional dangers.

When the robots perceive a slight chance of harm they slow down, get stupid, stutter or freeze.

It is extremely hard to encode "do no harm" for even a very smart entity without making it much dumber.

renewiltord · on July 19, 2023

Do you happen to have that chat in your history? It might be worth playing it back as it was to compare. I have this feeling too, and I will check it this weekend and maybe post results.

pixl97 · on July 19, 2023

Is this via the ChatGPT web interface, or via the API?

I'm wondering if there is a difference?

r3trohack3r · on July 19, 2023

The chat interface for me