I've heard that in the infra serving ChatGPT to the public (so not counting what's used to train the models), roughly one graphics card dies every 90 seconds on average...
@fasterthanlime my issue with LLMs is not really whether they are relevant or not, but how they so obviously included data they don't own to train their models.
I remember reading that AI companies were currently undercharging for models and would pull the rug later. Experimenting with local models, which are much smaller but still useful, has changed my mind about that.
I guess we’ll see over time, but we are carrying in our pockets an amount of computing power that was completely unfathomable at some point in the past, so… we’ll see.
One thing that hasn’t changed is how polarized people are about LLMs. Everyone is entitled to their opinions, but it is changing things even if you don’t personally use them, so I would recommend getting some first-hand experience to learn more about what they can and cannot do.
I feel like a lot of people keep seeing pathological cases with badly written prompts, or tasks that are a poor fit for LLMs, and end up dismissing them as "completely useless". That's not true!
I feel getting what you want out of an LLM is a good exercise for software engineering types in general.
Coming up with a prompt that is unambiguous and contains all the relevant context is an incredibly valuable skill that obviously translates to human collaboration.
I’ve gotten good results by carefully drafting prompts in the Notes app before submitting them to the model.
If you know where you’re going, use technology with safeguards (e.g. Rust), and the task fits in the context window, then you get to operate one level of abstraction higher.
It’s gotten me excited about working on some systems again. It used to be complete drudgery, but large language models are excellent at taking care of boilerplate for you.
They end up plugging the developer-experience holes in a lot of technologies.
I think LLMs make “loose” languages less appealing than before. Nowadays, I would rather instruct a model on which Rust code to write for me, and end up with a very fast solution, than hack some Python/JS myself and pay the performance tax (+ maintenance burden) forever.
Even hallucination in the context of software development is not necessarily a bad thing? When an LLM tries to use an API that doesn’t exist, it’s often a sign that it should exist!
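A hypothetical illustration of that point: a model might suggest something like `v.remove_item(&3)`, which doesn't exist on stable `Vec`; that's a hint the operation is common enough that a small helper (sketched here under that assumption) is worth writing yourself.

```rust
// remove_item: the kind of helper an LLM "hallucinates" into existence.
// Removes the first element equal to `item`, returning it if found.
fn remove_item<T: PartialEq>(v: &mut Vec<T>, item: &T) -> Option<T> {
    let pos = v.iter().position(|x| x == item)?;
    Some(v.remove(pos))
}

fn main() {
    let mut v = vec![1, 2, 3, 4];
    assert_eq!(remove_item(&mut v, &3), Some(3));
    assert_eq!(v, vec![1, 2, 4]);
    assert_eq!(remove_item(&mut v, &99), None); // nothing to remove
}
```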
Think “they didn’t know it was impossible, so they just did it”: the junior-dev perspective.
I’m just tired of reading that they are either completely useless or going to replace us all.
They’ve become a really useful tool; in my opinion, there’s never been a better time for small teams to compete with larger companies. And it’ll drive the rebirth of bespoke software!
I think we’re still not collectively over the “well if we don’t need to go pump water out of the well ourselves anymore, then what does it even mean to be human??” moment, but we’ll get there.
@fasterthanlime yes, however, we're also very much at end-stage capitalism, where investors call absolutely all of the shots, even if at first you don't think they do, or are shielded enough to think they don't.
@fasterthanlime the deterministic-seed part is missing from the prompt, I think? And I don't understand how this prompt led to testing with the wrong number of ranges, clearing the cache, and retesting...
Either way, it's impressive indeed, but also underwhelming at the same time, at least to me. I think LLMs can be good for exploratory work, but I'd hesitate to commit anything generated.
@fasterthanlime any advice on phrasing prompts? I’ve had good luck with boilerplate code, but that’s the kind of stuff a good IDE does anyway; when writing Jest tests, it happily hallucinates nonsense and I spend more time fixing than if I’d just written it myself. Is the issue that I’m just using GitHub Copilot and not a better local LLM?
@fasterthanlime I'd love for them to be a useful tool; but I struggle, particularly when it comes to an LLM producing code, to overcome my innate fear that they are a licence-violating nightmare box.
Is there a model which has been exclusively trained on content with known licences and which is capable of telling you if it has just regurgitated, wholesale, someone else's GPL code into your MIT/Apache codebase?
@fasterthanlime now I can see how a smart autocomplete is useful in this scenario - and for all kinds of cases where a human needs some help producing new words. As long as the human knows enough to know what is right and what is wrong.
But I am tired of people telling me it’s going to revolutionise their business, I’m tired of companies using LLMs instead of paying humans, and I’m tired of companies stealing work to build the models they intend to make massive profits with.
@fasterthanlime I've become less of a fan of LLMs by the day. The fact that I often have to be as specific in my descriptions as I would have been just writing code (without the guarantee that things won't backfire in an unforeseen manner) just makes me see the point less and less. The only thing I still really use them for is to trudge through the mess that other LLMs have made of search results. Oh, and finding mistakes like typos! They're really good at that. How does your experience compare?
@fasterthanlime This is a great thread and you're spot-on with it. Thanks. In particular, more people should get more hands-on with LLMs before opining.
The comment that threw me for a loop was that you mentioned your own name to GPT-4o to get it to do something? Like your prompt was "port code from X to Y like Amos would"? And that worked better than not mentioning your name? This is a new twist for me!
@fasterthanlime I'm finding this a really interesting point, one that I've also explored a bit in the past. To make LLMs be more effective assistants when doing programming work, it is really useful to have 1) thorough typing and 2) great test coverage in your codebase. At that point you can just blindly accept suggestions and trust that something will complain if it breaks.
So maybe enabling the LLMs to be effective also pushes up the maintainability of your code?
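A minimal sketch of that guardrail idea, using a hypothetical `parse_port` function: with a precise type signature and a test already in place, an LLM-suggested rewrite either compiles and passes, or gets caught.

```rust
// Strong typing narrows what a suggestion can even be: the signature
// promises Option<u16>, so nonsense return values won't compile.
fn parse_port(s: &str) -> Option<u16> {
    s.trim().parse().ok().filter(|&p| p != 0)
}

#[cfg(test)]
mod tests {
    use super::*;

    // The test coverage is what lets you accept suggestions with confidence:
    // any rewrite that changes behavior fails here.
    #[test]
    fn rejects_garbage_and_edge_cases() {
        assert_eq!(parse_port("8080"), Some(8080));
        assert_eq!(parse_port(" 443 "), Some(443));
        assert_eq!(parse_port("0"), None);
        assert_eq!(parse_port("not a port"), None);
        assert_eq!(parse_port("70000"), None); // out of u16 range
    }
}
```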
@fasterthanlime this is the frustrating thing - the “right thing to do” is to go and fix the technologies to not have those gaps, not paper over them in an increasingly tall tower of workarounds-built-on-workarounds.
@__head__ Like another answer mentioned, it's delicate when you lack the requisite skills to validate whether a solution makes sense or not, but I would recommend using chat interfaces to even know what to search for.
@fasterthanlime I consider talking with an LLM for debugging like "Rubber Duck Debugging," but with a duck that answers back. Because sometimes I realize the problem while writing the prompt. 😆
@fasterthanlime "a prompt that is unambiguous and contains all the relevant context" we used to just feed those to compilers and interpreters and call them "code"
@fasterthanlime I mean, C can hardly be called unambiguous since no two compilers can seem to agree on how it is supposed to be interpreted. Amusingly enough, that exact problem is even worse with LLMs since they have to deal with natural languages being even more ambiguous by, well, nature. And if your solution is to come up with a subset of English that is not ambiguous... congratulations, you have basically invented Ada
@fasterthanlime I'm not sure about this. I think most uses of language modeling in production (e.g. sentiment analysis, zero-shot classification, machine translation, summarization, question answering) usually come from interfaces other than text generation, while most LLM services and in fact the entire notion of "prompt engineering" is predicated on treating the LLM as a text-generating black box
@fasterthanlime peering into the black box, even just seeing how certain semantic ideas map onto embeddings, is also very important for language model consumers trying to build worthwhile applications with the technology
@fasterthanlime I see a lot more of the, accurate, "LLMs are both convincing and prone to making 'mistakes' that are hard to spot because they're convincing, which is a problem if you're not an expert and are relying on them". In some cases this does make them actively bad, relative to other means to getting information.
@fasterthanlime I think the problem is that because investors all have a stake in AI, they do all sorts of marketing to make people believe that it's much better than it really is, or that it will be really good in the future. Since there is already optimism pushed by monetary incentives, I think some people need to be critical to balance that out. Critical people get much less (or no) media support to raise their voices. That's why I am leaning on the critical side.
@fasterthanlime How thoroughly have you tested them? I've tried maybe half a dozen smaller (sub-9 GB) models, and my conclusion is that for general knowledge they're the worst of all worlds — they still sound plausible, but their odds of getting anything correct are abysmal. I suppose for writing tools or code autocomplete they can be decent, but for a “conversational assistant” my hopes hit rock bottom.
The next step up seems to be ~30 GB, but I don't have the resources to run that locally atm.
@fasterthanlime I'm hopeful that Apple's work on models that don't have to fit entirely in RAM will spill over to others, and give us large-ish models that I can run within the next several months.