Are LLMs capable of writing good code?

chknbwl@lemmy.world · 7 months ago

Are LLMs capable of writing good code?

saltesc@lemmy.world · 7 months ago

In my experience, not at all. But sometimes they help with creativity when you hit a wall or challenge you can’t resolve.

They have been trained off internet examples where everyone has a different style/method of coding, like writing style. It’s all very messy and very unreliable. It will be years for LLMs to code “good” and will require a lot of training that isn’t scraping.

Emily (she/her)@lemmy.blahaj.zone · 7 months ago

After a certain point, learning to code (in the context of application development) becomes less about the lines of code themselves and more about structure and design. In my experience, LLMs can spit out well-formatted and reasonably functional short code snippets, with the caveat that they sometimes misunderstand you or, if you’re writing UI code, make very strange decisions (since they have no spatial/visual reasoning).

Anyone with a year or two of practice can write mostly clean code like an LLM. But most codebases are longer than 100 lines, and your job is to structure that program and introduce patterns to make it maintainable. LLMs can’t do that, and only you can (and you can’t skip learning to code to just get on to architecture and patterns).

chknbwl@lemmy.world · 7 months ago

Very well put, thank you.

Em Adespoton@lemmy.ca · 7 months ago

The other thing is, an LLM generally knows about all the existing libraries and what they contain. I don’t. So while I could code a pretty good program in a few days from first principles, an LLM is often able to stitch together some elegant glue code using a collection of existing library functions in seconds.

jacksilver@lemmy.world · edit-2 7 months ago

I think this is the best response in this thread.

Software engineering is a lot more than just writing some lines of code and requires more thought and planning than can be realistically put into a prompt.

bionicjoey@lemmy.ca · 7 months ago

This question is basically the same as asking “Are 2d6 capable of rolling a 9?”

etchinghillside@reddthat.com · 7 months ago

Yes, two six-sided dice (2d6) are capable of rolling a sum of 9. Here are the possible combinations that would give a total of 9:

3 + 6
4 + 5
5 + 4
6 + 3

So, there are four different combinations that result in a roll of 9.

…

See? LLMs can do everything!
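For anyone who would rather check the quoted answer than trust it, here is a minimal Python sketch that enumerates every ordered roll of two six-sided dice. The function name `ways_to_roll` is made up for this illustration and is not from the thread.

```python
from itertools import product

def ways_to_roll(total, dice=2, sides=6):
    """Return every ordered roll of `dice` dice with `sides` faces that sums to `total`."""
    faces = range(1, sides + 1)
    return [roll for roll in product(faces, repeat=dice) if sum(roll) == total]

nine = ways_to_roll(9)
print(nine)            # [(3, 6), (4, 5), (5, 4), (6, 3)]
print(len(nine) / 36)  # fraction of the 36 equally likely rolls that give a 9
```

So a 9 is possible, but only 4 of the 36 outcomes produce it, which is the point of the 2d6 comparison: capable of it is not the same as reliably doing it.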

Batman@lemmy.world · 7 months ago

Wow that’s pretty good

xmunk@sh.itjust.works · 7 months ago

Now ask it how many r’s are in Strawberry!

Fonzie!@ttrpg.network · edit-2 7 months ago

I asked four LLM-based chatbots over DuckDuckGo’s anonymised service the following:

“How many r’s are there in Strawberry?”

GPT-4o mini

There are three “r’s” in the word “strawberry.”

Claude 3 Haiku

There are 3 r’s in the word “Strawberry”.

Llama 3.1 70B

There are 2 r’s in the word “Strawberry”.

Mixtral 8x7B

There are 2 “r” letters in the word “Strawberry”. Would you like to know more about the privacy features of this service?

They got worse at the end, but at least GPT and Claude can count letters.
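For reference, the expected answer is easy to confirm deterministically. A minimal Python sketch (the helper name `count_letter` is just an illustrative choice):

```python
def count_letter(word, letter):
    """Count case-insensitive occurrences of `letter` in `word`."""
    return word.lower().count(letter.lower())

print(count_letter("Strawberry", "r"))  # 3
```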

chknbwl@lemmy.world · 7 months ago

I have no knowledge of coding, my bad for asking a stupid question in NSQ.

bionicjoey@lemmy.ca · 7 months ago

Sorry, I wasn’t trying to berate you. Just trying to illustrate the underlying assumption of your question

etchinghillside@reddthat.com · 7 months ago

Wouldn’t exactly take the comment as negative.

The output of current LLMs is hit or miss sometimes. And when it misses you might find yourself in a long chain of persuading a sassy robot into writing things as you might intend.

chknbwl@lemmy.world · 7 months ago

Thank you for extrapolating for them.

TootSweet@lemmy.world · 7 months ago

A broken clock is right twice a day.

A_A@lemmy.world · 7 months ago

Yes … and it doesn’t know when it is on time.
Also, machines are getting better and they can help us with inspiration.

PlzGivHugs@sh.itjust.works · 7 months ago

AI can only really complete tasks that are both simple and routine. I’d compare the output skill to that of a late-first-year university student, but with the added risk of hallucination. Anything too unique or too complex tends to result in significant mistakes.

In terms of replacing programmers, I’d put it more in the ballpark of predictive text and/or autocorrect for a writer. It can help speed up the process a little bit, and point out simple mistakes but if you want to make a career out of it, you’ll need to actually learn the skill.

Arbiter@lemmy.world · 7 months ago

No LLM is trustworthy.

Unless you understand the code and can double check what it’s doing I wouldn’t risk running it.

And if you do understand it any benefit of time saved is likely going to be offset by debugging and verifying what it actually does.

FlorianSimon@sh.itjust.works · 7 months ago

Since reviewing code is much harder than checking code you wrote, relying on LLMs too heavily is just plain dangerous, and a bad practice, especially if you’re working with specific technologies with lots of footguns (cf. C or C++). The number of crazy and hard-to-detect bad things you can write in C++ is insane. You won’t catch CVE material by just reading the output ChatGPT or Copilot spits out.

And there are lots of sectors, like aerospace and medical, where that untrustworthiness is completely unacceptable.

edgemaster72@lemmy.world · edit-2 7 months ago

understanding what the machine spits out

This is exactly why people will still need to learn to code. It might write good code, but until it can write perfect code every time, people should still know enough to check and correct the mistakes.

chknbwl@lemmy.world · 7 months ago

I very much agree, thank you for indulging my question.

667@lemmy.radio · edit-2 7 months ago

I used an LLM to write some code I knew I could write, but was a little too lazy to do. Coding is not my trade, but I did learn Python during the pandemic. Had I not known how to code, I would not have been able to direct the LLM to make the required corrections.

In the end, I got decent code that worked for the purpose I needed.

I still didn’t write any docstrings or comments.

Em Adespoton@lemmy.ca · 7 months ago

I would not trust the current batch of LLMs to write proper docstrings and comments, as the code they are trained on does not have proper docstrings and comments.

And this means that it isn’t writing professional code.

It’s great for quickly generating useful and testable code snippets though.

GBU_28@lemm.ee · 7 months ago

It can absolutely write a docstring for a provided function. That and unit tests are like some of the easiest things for it, because it has the source code to work from
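To make the kind of task concrete, here is a hypothetical example (not taken from any commenter’s project) of a small function together with the sort of docstring and unit tests being discussed. The function `slugify` and the test names are invented for illustration; the tests are plain assert functions of the style a runner such as pytest collects.

```python
def slugify(title):
    """Convert a title into a lowercase, hyphen-separated slug.

    Every character that is not a letter or digit is treated as a
    separator, and runs of separators collapse into a single hyphen.
    An empty or all-punctuation title yields an empty string.
    """
    words = "".join(ch if ch.isalnum() else " " for ch in title).split()
    return "-".join(word.lower() for word in words)

def test_slugify_strips_punctuation():
    assert slugify("Hello, World!") == "hello-world"

def test_slugify_empty_string():
    assert slugify("") == ""

# Run the tests directly when no test runner is available.
test_slugify_strips_punctuation()
test_slugify_empty_string()
```

Whether a model’s generated tests are this tidy in practice is exactly what the replies here disagree about.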

dandi8@fedia.io · 7 months ago

In my experience LLMs do absolutely terribly with writing unit tests.

visor841@lemmy.world · 7 months ago

For a very long time people will also still need to understand what they are asking the machine to do. If you tell it to write code for an impossible concept, it can’t make it. If you ask it to write code to do something incredibly inefficiently, it’s going to give you code that is incredibly inefficient.

scarabic@lemmy.world · edit-2 7 months ago

I’ve even seen human engineers’ code thrown out because no one else could understand it. Back in the day, one webdev took it upon himself to whip up a mobile version of our company’s very complex website. He did it as a side project. It worked. It was complete. It looked good. It was very fast. The code was completely unreadable by anyone else. We didn’t use it.

Nomecks@lemmy.ca · 7 months ago

I use it to write code, but I know how to write code and it probably turns a week of work for me into a day or two. It’s cool, but not automagic.

Manifish_Destiny@lemmy.world · 7 months ago

I find it better at things under 100 lines. Otherwise it starts to lose context. Any ideas how to address this?

Nomecks@lemmy.ca · 7 months ago

Ask it to make a function, then do some other function, then make them work together etc. Making it write a lot in one go won’t work. It’s more pair programming than having it write for you.
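A rough sketch of what that workflow can look like, assuming a made-up task (finding the top scorer in some "name,score" text). Each function below stands for one separate request that you review before moving on; all names are hypothetical.

```python
def parse_scores(text):
    """Step 1: turn 'name,score' lines into (name, int) pairs, skipping blank lines."""
    pairs = []
    for line in text.splitlines():
        if line.strip():
            name, score = line.split(",")
            pairs.append((name.strip(), int(score)))
    return pairs

def top_scorer(pairs):
    """Step 2: return the name that has the highest score."""
    return max(pairs, key=lambda pair: pair[1])[0]

def report(text):
    """Step 3: glue the two already-reviewed pieces together."""
    return f"Top scorer: {top_scorer(parse_scores(text))}"

print(report("ann, 7\nbob, 9\n"))  # Top scorer: bob
```

Keeping each piece this small means the part you have to read and verify at any one time stays short.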

ImplyingImplications@lemmy.ca · 7 months ago

Writing code is probably one of the few things LLMs actually excel at. Few people want to program something nobody has ever done before. Most people are just reimplementing the same things over and over with small modifications for their use case. If imports of generic code someone else wrote make up 90% of your project, what’s the difference in getting an LLM to write 90% of your code?

MajorHavoc@programming.dev · 7 months ago

Great question.

is there any legit reason anyone should learn advanced coding techniques?

Don’t buy the hype. LLMs can produce all kinds of useful things but they don’t know anything at all.

No LLM has ever engineered anything. And there’s no current evidence that any AI ever will.

Current learning models are like trained animals in a circus. They can learn to do any impressive thing you can imagine, by sheer rote repetition.

That means they can engineer a solution to any problem that has already been solved millions of times already. As long as the work has very little value and requires no innovation whatsoever, learning models do great work.

Horses and LLMs that solve advanced algebra don’t understand algebra at all. It’s a clever trick.

Understanding the problem and understanding how to politely ask the computer to do the right thing has always been the core job of a computer programmer.

The bit about “politely asking the computer to do the right thing” makes massive strides in convenience every decade or so. Learning models are another such massive stride. This is great. Hooray!

The bit about “understanding the problem” isn’t within the capabilities of any current learning model or AI, and there’s no current evidence that it ever will be.

Someday they will call the job “prompt engineering” and on that day it will still be the same exact job it is today, just with different bullshit to wade through to get it done.

chknbwl@lemmy.world · 7 months ago

I appreciate your candor, I had a feeling it was cock and bull but you’ve answered my question fully.

ConstipatedWatson@lemmy.world · 7 months ago

Wait, if you can (or anyone else chipping in), please elaborate on something you’ve written.

When you say

That means they can engineer a solution to any problem that has already been solved millions of times already.

Hasn’t Google already made advances through its Alpha Geometry AI?? Admittedly, that’s a geometry setting which may be easier to code than other parts of Math and there isn’t yet a clear indication AI will ever be able to reach a certain level of creativity that the human mind has, but at the same time it might get there by sheer volume of attempts.

Isn’t this still engineering a solution? Sometimes even researchers reach new results by having a machine verify many cases (see the proof of the Four Color Theorem). It’s true that in the Four Color Theorem researchers narrowed down the cases to try, but maybe a similar narrowing could be done by an AI (sooner or later)?

I don’t know what I’m talking about, so I should shut up, but I’m hoping someone more knowledgeable will correct me, since I’m curious about this

MajorHavoc@programming.dev · edit-2 7 months ago

Isn’t this still engineering a solution?

If we drop the word “engineering”, we can focus on the point - geometry is another case where rote learning of repetition can do a pretty good job. Clever engineers can teach computers to do all kinds of things that look like novel engineering, but aren’t.

LLMs can make computers look like they’re good at something they’re bad at.

And they offer hope that computers might someday not suck at what they suck at.

But history teaches us probably not. And current evidence in favor of a breakthrough in general artificial intelligence isn’t actually compelling, at all.

Sometimes even researchers reach new results by having a machine verify many cases

Yes. Computers are good at that.

So far, they’re no good at understanding the four color theorem, or at proposing novel approaches to solving it.

They might never be any good at that.

Stated more formally, P may equal NP, but probably not.

Edit: To be clear, I actually share a good bit of the same optimism. But I believe it’ll be hard won work done by human engineers that gets us anywhere near there.

Ostensibly God created the universe in Lisp. But actually he knocked most of it together with hard-coded Perl hacks.

There are lots of exciting breakthroughs coming in computer science. But no one knows how long they will take or what their impact will be. History teaches us it’ll be less exciting than Popular Science promised us.

Edit 2: Sorry for the rambling response. Hopefully you find some of it useful.

I don’t at all disagree that there’s exciting stuff afoot. I also think it is being massively oversold.

metiulekm@sh.itjust.works · 7 months ago

Hasn’t Google already made advances through its Alpha Geometry AI?? Admittedly, that’s a geometry setting which may be easier to code than other parts of Math and there isn’t yet a clear indication AI will ever be able to reach a certain level of creativity that the human mind has, but at the same time it might get there by sheer volume of attempts.

Wanted to focus a bit on this. The thing with AlphaGeometry and AlphaProof is that they really treat doing math as a game, not unlike chess. For example, AlphaGeometry has a basic set of rules, it can apply them and it knows when it is done. And when it is done, you can be 100% sure that the solution is correct, because the rules of the game are known; the 28/42 score reported in the article is really four perfect scores and two zeros (six problems worth seven points each). Those systems do use LLMs, but they really are only there to suggest to the system what to try doing next. There is a very enlightening picture in the AlphaGeometry paper here: https://www.nature.com/articles/s41586-023-06747-5#Fig1

You can automatically verify correctness of code the same way. For example Lean, the language AlphaProof uses internally, can be used for general programming. In general, we call similar programming techniques formal methods. But most people don’t do this, since this is more time-consuming than normal programming, and in many cases we don’t even know how to define the goal of our code (how to define correct rendering in a game?). So this is only really done when the correctness of the program is critical; famously, the code of the automatic metro in Paris was verified this way. And so most people don’t try to make programming AI work this way.
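As a toy illustration of the formal-methods idea (not how AlphaProof or the Paris metro project actually did it), a Lean 4 sketch can pair an ordinary function with a machine-checked statement about it. The names `double` and `double_eq_two_mul` are made up for this example, and `Nat.two_mul` is assumed to be the core library lemma stating `2 * n = n + n`.

```lean
-- An ordinary program: double a natural number.
def double (n : Nat) : Nat := n + n

-- A specification the compiler checks: `double n` always equals `2 * n`.
-- If the definition above were wrong, this proof would fail to compile.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  show n + n = 2 * n
  exact (Nat.two_mul n).symm
```

The hard part in real projects is usually not the proof step itself but writing down a specification like the theorem statement in the first place, which is the "how to define the goal" problem mentioned above.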

nous@programming.dev · edit-2 7 months ago

They can write good short bits of code. But they also often produce bad and even incorrect code. The vast majority of the time I find it more effort to read and debug their code than to just write it myself to begin with, so overall it wastes more of my time than it saves.

Maybe in a couple of years they might be good enough. But it looks like their growth is starting to flatten off, so it is up for debate whether they will get there in that time.

WraithGear@lemmy.world · 7 months ago

It’s the most ok’est coder, with the attention span of a 5 year old.

PenisDuckCuck9001@lemmynsfw.com · edit-2 7 months ago

AI is excellent at completing low-effort, AI-generated Pearson programming homework while I spend all the time I saved on real projects that actually matter. My Hugging Face model is probably trained on the same dataset as their bot. It gets it correct about half the time and another 25% of the time, I just have to change a few numbers or brackets around. It takes me longer to read the instructions than it takes the AI bot to spit out the correct answer.

None of it is “good” code but it enables me to have time to write good code somewhere else.

JeeBaiChow@lemmy.world · 7 months ago

Dunno. I’d expect to have to make several attempts to coax a working snippet from the AI, then spend the rest of the time trying to figure out what it’s done and debugging the result. Faster to do it myself.

E.g. I once coded Tetris on a whim (45 min) and thought it’d be a good test for a UI/game developer, given the multidisciplinary nature of the game (user interaction, real-time engine, data structures, etc.). I asked Copilot to give it a shot and, while the basic framework was there, the code simply didn’t work as intended. I figured if we went into each of the elements separately, it would have taken me longer than if I’d done it from scratch anyway.

xmunk@sh.itjust.works · 7 months ago

No, a large part of what “good code” means is correctness. LLMs cannot properly understand a problem, so while they can produce grunt code they can’t assemble a solution to a complex problem and, IMO, it is impossible for them to overtake humans unless we get really lazy about code expressiveness. And, on that point, I think most companies are underinvesting in code infrastructure right now and developers are wasting too much time on unexpressive code.

The majority of work that senior developers do is understanding a problem and crafting a solution appropriate to it - when I’m working my typing speed usually isn’t particularly high and the main bottleneck is my brain. LLMs will always require more brain time while delivering a savings on typing.

At the moment I’d also emphasize that they’re excellent at popping out algorithms I could write in my sleep but require me to spend enough time double checking their code that it’s cheaper for me to just write it by hand to begin with.