Andreas Erben

Aug 20, 2025 • World Congress 2025

You are not my model anymore - understanding LLM model behavior

Your LLM is a shoggoth with a smiley face mask. Learn what happens when the mask slips and your application breaks.

#1about 2 minutes

Unexpected LLM behavior from hidden platform updates

A practical demonstration shows how a cloud provider's content filter update can unexpectedly block access to documents, causing application failures.

#2about 3 minutes

How LLMs generate text and learn behavior

Large language models use a transformer architecture to predict the next token based on probability, with instruction tuning and alignment shaping their final behavior.

#3about 2 minutes

The opaque and complex stack of modern LLM services

Major LLM providers operate in secrecy, and the full technology stack from model weights to the API is complex, leaving developers with limited visibility and control.

#4about 3 minutes

Managing risks from provider filters and short API lifecycles

Cloud provider content filters can change without notice, creating vulnerabilities, while the short lifecycle of model APIs requires constant adaptation.

#5about 4 minutes

Understanding LLMs as alien minds with fragile alignment

LLMs are conceptually like alien intelligences with a fragile, human-like alignment layer that can be bypassed by jailbreaks exploiting internal model circuits.

#6about 2 minutes

How model personalities and behaviors shift between versions

Different LLM versions exhibit distinct behaviors and may ignore system prompts, as shown by a comparison between GPT-4 and a newer reasoning model.

#7about 3 minutes

Using evaluations to systematically test model behavior

Systematically test model behavior using evaluations, which can be automated by generating prompt variations or using pre-built cloud and open-source frameworks.

#8about 4 minutes

Using prompt engineering to mitigate model drift

Mitigate model behavior drift by using advanced prompt engineering techniques like forcing reasoning, providing few-shot examples, and being highly explicit in instructions.

Andrew Comp
Cosio Valtellino, Italy

Intermediate

TypeScript

Cards Co

Remote

Intermediate

JavaScript

TypeScript

Name of

Remote

Intermediate

PHP

Java

+1

Analyzing the risks and architecture of current AI models

04:34 MIN

Analyzing the risks and architecture of current AI models

Opening Keynote by Sir Tim Berners-Lee

Shifting from traditional code to AI-powered logic

02:58 MIN

Shifting from traditional code to AI-powered logic

WWC24 - Ankit Patel - Unlocking the Future Breakthrough Application Performance and Capabilities with NVIDIA

Addressing the core challenges of large language models

05:18 MIN

Addressing the core challenges of large language models

Accelerating GenAI Development: Harnessing Astra DB Vector Store and Langflow for LLM-Powered Apps

The ethical risks of outdated and insecure AI models

02:19 MIN

The ethical risks of outdated and insecure AI models

AI & Ethics

Understanding the GenAI lifecycle and its operational challenges

05:39 MIN

Understanding the GenAI lifecycle and its operational challenges

LLMOps-driven fine-tuning, evaluation, and inference with NVIDIA NIM & NeMo Microservices

Addressing the key challenges of large language models

02:55 MIN

Addressing the key challenges of large language models

Large Language Models ❤️ Knowledge Graphs

The challenge of moving AI from demo to production

03:18 MIN

The challenge of moving AI from demo to production

What’s New with Google Gemini?

AI privacy concerns and prompt engineering

03:43 MIN

AI privacy concerns and prompt engineering

Coffee with Developers - Cassidy Williams -

Featured Partners

Three years of putting LLMs into Software - Lessons learned

Three years of putting LLMs into Software - Lessons learned

Simon A.T. Jiménez

about 6 months ago • World Congress 2025

Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails

Beyond the Hype: Building Trustworthy and Reliable LLM Applications with Guardrails

Alex Soto

about 6 months ago • World Congress 2025

AI: Superhero or Supervillain? How and Why with Scott Hanselman

AI: Superhero or Supervillain? How and Why with Scott Hanselman

Scott Hanselman

about 2 years ago • World Congress 2024

Prompt Injection, Poisoning & More: The Dark Side of LLMs

Prompt Injection, Poisoning & More: The Dark Side of LLMs

Keno Dreßel

about 6 months ago • World Congress 2025

Inside the Mind of an LLM

Inside the Mind of an LLM

Emanuele Fabbiani

about 6 months ago • World Congress 2025

From Traction to Production: Maturing your GenAIOps step by step

From Traction to Production: Maturing your GenAIOps step by step

Maxim Salnikov

about 6 months ago • World Congress 2025

Bringing the power of AI to your application.

Bringing the power of AI to your application.

Krzysztof Cieślak

about 2 years ago • World Congress 2024

How AI Models Get Smarter

How AI Models Get Smarter

Ankit Patel

about 7 months ago • World Congress 2025

Related Articles

View all articles

BB

Benedikt Bischof

MLops – Deploying, Maintaining And Evolving Machine Learning Models in Production

Welcome to this issue of the WeAreDevelopers Live Talk series. This article recaps an interesting talk by Bas Geerdink who gave advice on MLOps.‍About the speaker:‍Bas is a programmer, scientist, and IT manager. At ING, he is responsible for the Fast...

MLops – Deploying, Maintaining And Evolving Machine Learning Models in Production

DC

Daniel Cranney

Panel Discussion: Responsible AI in Practice - Real-World Examples and Challenges

IntroductionIn the ever-evolving landscape of artificial intelligence, the concept of "responsible AI" has emerged as a cornerstone for ethical and practical AI implementation. During the WWC24 Panel discussion, three eminent experts—Mina, Bjorn Brin...

Panel Discussion: Responsible AI in Practice - Real-World Examples and Challenges

CH

Chris Heilmann

Exploring AI: Opportunities and Risks for Developers

In today's rapidly evolving tech landscape, the integration of Artificial Intelligence (AI) in development presents both exciting opportunities and notable risks. This dynamic was the focus of a recent panel discussion featuring industry experts Kent...

Exploring AI: Opportunities and Risks for Developers

CH

Chris Heilmann

Dev Digest 116 - WWWAI?

This time, learn how to un-AI Google's search results, what's new on the web, avoid a new security hole and go back to BASICS with us. News and ArticlesWhat a week. Google, Microsoft, OpenAI and many others had their big flagship events announcing th...

Dev Digest 116 - WWWAI?

From learning to earning

Jobs that call for the skills explored in this talk.

Machine Learning Engineer (m/f/d)

evoila Frankfurt GmbH
Mainz, Germany

Senior

Keras

DevOps

Tensorflow

Machine Learning & Data Engineer

vengine GmbH
Hamburg, Germany

Junior

Intermediate

Python

Product Owner/Projektleiter (m/w/d)

relyon AG
Tübingen, Germany

Junior

Intermediate

Senior

Scrum

AI & Embedded ML Engineer (Real-Time Edge Optimization)

autonomous-teaming

Remote

GIT

Linux

PyTorch

Lead AI Governance & Platform Engineer

q.beyond AG

Remote

Senior

Kubernetes

Continuous Integration

ML Data Engineer - Object Detection & Active Learning

autonomous-teaming

Remote

NoSQL

NumPy

Pandas

Docker

ML Data Engineer - Object Detection & Active Learning

autonomous-teaming

Remote

NoSQL

NumPy

Pandas

Docker

Conversational AI & Machine Learning Engineer

Deloitte

Machine Learning

Conversational AI & Machine Learning Engineer

Deloitte

DevOps

Docker

PyTorch

Tensorflow

Kubernetes

+2