Mag. Dr. Hannah Metzler & Konstantin Hebenstreit, MSc.
Complexity Science Hub & Medical University of Vienna
Slides: https://hannahmetzler.eu/ai_skills
Source: Discover Magazine

Generative Pre-trained Transformer:
Question for you:
Words
Even though the sound of it is
something quite atrocious, if you
say it loud enough, you’ll always
sound precocious,
Supercalifragilisticexpialidocious!
Mary Poppins
→
Tokens = “Subwords”
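To illustrate how a long word breaks into subword tokens, here is a minimal sketch of a greedy longest-match splitter. The vocabulary `TOY_VOCAB` and the function `tokenize` are invented for this example. Real LLM tokenizers learn their vocabulary from data (for example with byte-pair encoding) and will split the word differently.

```python
# Toy vocabulary: invented for illustration, not taken from any real model.
TOY_VOCAB = {"super", "cali", "frag", "ilistic", "expi", "ali", "docious"}


def tokenize(word, vocab):
    """Split a word greedily into the longest pieces found in vocab.

    Characters that match no vocabulary entry become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(word[i])
            i += 1
    return tokens


word = "supercalifragilisticexpialidocious"
tokens = tokenize(word, TOY_VOCAB)
print(tokens)
# ['super', 'cali', 'frag', 'ilistic', 'expi', 'ali', 'docious']
```

The model never sees the word as a whole, only the sequence of pieces. This is one reason why tasks such as counting letters can be hard for LLMs.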

Human preference data is used for creating a reward model, which is then used to train the LLM via reinforcement learning.
Source: AWS Blog
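A minimal sketch of how human preference data can become a training signal for a reward model. It assumes a pairwise (Bradley-Terry style) loss, which is a common choice but is not stated on the slide. The function name and the example scores are hypothetical.

```python
import math


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Loss for one human comparison: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    answer higher than the rejected one, and large otherwise.
    """
    diff = reward_chosen - reward_rejected
    # log(1 + exp(-diff)) is mathematically equal to -log(sigmoid(diff)).
    return math.log1p(math.exp(-diff))


# Hypothetical reward scores for two answers to the same prompt.
loss_agree = pairwise_preference_loss(2.0, 0.0)     # model agrees with the human
loss_tie = pairwise_preference_loss(1.0, 1.0)       # model is undecided
loss_disagree = pairwise_preference_loss(0.0, 2.0)  # model disagrees

print(round(loss_agree, 3), round(loss_tie, 3), round(loss_disagree, 3))
```

Minimizing this loss over many comparisons yields a reward model. The reinforcement learning step then adjusts the LLM so that its answers score highly under that model.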
Best non-proprietary:
Source: Chatbot Arena
| Provider | General | Reasoning | Efficient | Deep Research |
|---|---|---|---|---|
| OpenAI/ GPT | GPT-4o | o3, o4-mini-high | o4-mini | Deep R. mode (any) |
| Google/ Gemini | 2.0 Flash | 2.5 Pro, 2.5 Flash | 2.0 Flash | Deep R. (2.5 Pro) |
Anthropic: Claude Sonnet 3.7 (balanced, capable of reasoning and coding)
All models (ChatGPT, Gemini, Claude)
| Capability | ChatGPT | Claude | Gemini |
|---|---|---|---|
| Web Browsing | ✅ | ~ | ✅ |
| Deep Research | ✅ | ❌ | ✅ |
| Image Generation | ✅ | ❌ | ✅ |
| Execute Code | ✅ | ~ | ✅ |
| Data analysis | ✅ | ❌ | ✅ |
| Voice input | ✅ | ~ | ✅ |
| Voice conversation | ✅ | ❌ | ❌ |
| Projects | ✅ | ✅ | ❌ |
This changes constantly!
“Automated” Chain-of-thought prompting
Important: This is already included as a general capability of models.
Kojima et al. 2023
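For illustration, zero-shot chain-of-thought prompting in the style of Kojima et al. amounts to appending a trigger phrase to the question. The sketch below only builds the prompt string. The helper `build_prompt` is hypothetical, and no model is called. Current models usually reason step by step without this phrase.

```python
# Trigger phrase proposed by Kojima et al. for zero-shot chain-of-thought.
COT_TRIGGER = "Let's think step by step."


def build_prompt(question, use_cot=True):
    """Return a question-answer prompt, optionally with the reasoning trigger."""
    prompt = f"Q: {question}\nA:"
    if use_cot:
        prompt += " " + COT_TRIGGER
    return prompt


plain_prompt = build_prompt("9.11 or 9.9, which is bigger?", use_cot=False)
cot_prompt = build_prompt("9.11 or 9.9, which is bigger?")
print(cot_prompt)
```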
Simple math example:
9.11 or 9.9, which is bigger?
Compare different models. E.g. Gemini Flash 2.0 versus 2.5
Option: try adding “short answer” to the prompt to provoke a wrong answer.
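Why might a model get this wrong? One often-suggested explanation, treated here as an assumption and not an established fact, is that "9.11" comes after "9.9" in contexts such as software version numbers. The sketch below shows that both readings are valid and give opposite answers. The function names are made up for this example.

```python
from decimal import Decimal


def larger_as_decimal(a, b):
    """Compare the strings as decimal numbers (the intended reading)."""
    return a if Decimal(a) > Decimal(b) else b


def larger_as_version(a, b):
    """Compare the strings as dotted version numbers (a competing reading)."""
    def key(s):
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b


decimal_answer = larger_as_decimal("9.11", "9.9")
version_answer = larger_as_version("9.11", "9.9")
print(decimal_answer)  # 9.9
print(version_answer)  # 9.11
```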
A father and son are in a car crash, the father dies, and the son is rushed to the hospital. The surgeon says, ‘I can’t operate, that boy is my son,’ who is the surgeon?
A young boy who has been in a car accident is rushed to the emergency room. Upon seeing him, the surgeon says, “I can operate on this boy!” How is this possible?
Models over-rely on familiar patterns and therefore make reasoning errors.
Source: Boiko et al. 2023
Source: Karpathy “How I use LLMs”
1. Ethics in using AI for research
For all individuals within the EU
Avoid entering sensitive data that identifies individuals
Turning off model training on your inputs does not imply GDPR compliance.
Before processing personal data of subjects (especially patients) consult experts (e.g. data protection officers) at your university.
Example of an alleged copyright violation:
Getty Images sued Stability AI (maker of the image generator Stable Diffusion) for training on its images.