Short summary from https://www.youtube.com/watch?v=wG8-IUtqu-s
What’s GPT?
GPT stands for Generative Pre-trained Transformer.
GPT is trained with self-supervised learning, which is a subset of unsupervised learning. ChatGPT is based on GPT plus fine-tuning via supervised learning.
Unlike traditional supervised learning, self-supervised pre-training does not need human-labeled data, so it saves a lot of labeling effort and may give better results.
The video provides an example.
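As a separate illustration (not taken from the video), here is a minimal sketch of why self-supervised learning needs no human labels: the training targets come from the text itself, because each next word serves as the answer for the words before it. The function name `next_token_pairs`, the `context_size` parameter, and the whitespace split are assumptions made for this sketch; real GPT models use subword tokenizers and much longer contexts.

```python
def next_token_pairs(text, context_size=3):
    """Turn raw text into (context, next word) training pairs.

    No human labeling is involved: the "label" for each context
    is simply the word that follows it in the text.
    """
    tokens = text.split()  # simplification: real models use subword tokens
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


pairs = next_token_pairs("GPT stands for Generative Pre-trained Transformer")
for context, target in pairs:
    print(context, "->", target)
```

Any plain text can be fed to this function, which is why pre-training can use huge amounts of unlabeled data.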
To get even better performance, ChatGPT also applies Reinforcement Learning from Human Feedback (RLHF). Whereas supervised learning requires someone to provide the correct answer, reinforcement learning only tells the model whether the result it generated is good or bad.
There is a term for these steps (supervised learning -> reinforcement learning): Alignment. But what does the model align to? It aligns to human requirements.
So far, this is how GPT-3.5 was trained.
As for GPT-4, OpenAI has released a technical report, but it does not give many details. Surprisingly, GPT-4 can read images.
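To make the difference between the two training signals concrete, here is a toy sketch. It is not how ChatGPT is actually trained (real RLHF involves a learned reward model and a far larger policy); it only contrasts "being given the correct answer" with "being told good or bad". The model is just three scores, one per candidate answer, and the names `supervised_step` and `feedback_step`, the learning rate, and the +1/-1 rewards are all assumptions for illustration. The feedback update is a simple REINFORCE-style rule.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_step(logits, target, lr=0.5):
    """Supervised learning: a human supplies the correct answer index."""
    probs = softmax(logits)
    return [x + lr * ((1.0 if i == target else 0.0) - p)
            for i, (x, p) in enumerate(zip(logits, probs))]


def feedback_step(logits, chosen, reward, lr=0.5):
    """Feedback learning: a human only scores the answer the model chose."""
    probs = softmax(logits)
    return [x + lr * reward * ((1.0 if i == chosen else 0.0) - p)
            for i, (x, p) in enumerate(zip(logits, probs))]


logits = [0.0, 0.0, 0.0]  # three candidate answers, equally likely at first

after_sup = supervised_step(logits, target=2)                # told: "answer 2 is correct"
after_bad = feedback_step(logits, chosen=0, reward=-1.0)     # told: "your answer 0 was bad"
after_good = feedback_step(logits, chosen=0, reward=1.0)     # told: "your answer 0 was good"

print(softmax(after_sup))
print(softmax(after_bad))
print(softmax(after_good))
```

In the supervised case the labeler has to know and write down the right answer. In the feedback case the labeler only has to judge what the model produced, which is usually much less work.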
How to unleash the maximum potential of ChatGPT?
- Clear and specific questions
- Give extra information
- Provide examples
- Chain of thought https://arxiv.org/abs/2205.11916
- Find a better prompt through your own experience, or have AI search for one by leveraging RL https://arxiv.org/abs/2309.03409
- Utilize third-party tools
- Break down your tasks https://arxiv.org/abs/2210.06774
- Tell ChatGPT to break down tasks and do planning
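Several of the tips above (extra information, examples, chain of thought) can be combined when assembling a prompt. The helper below is a minimal sketch: `build_prompt` and its parameters are hypothetical names, the "Background:" and "Q:/A:" layout is just one possible convention, and no API is called. The phrase "Let's think step by step." is the trigger studied in the chain-of-thought paper linked above.

```python
def build_prompt(question, context="", examples=(), think_step_by_step=False):
    """Assemble a prompt string from a few of the tips.

    context            -> "Give extra information"
    examples           -> "Provide examples" (pairs of question, answer)
    think_step_by_step -> "Chain of thought"
    """
    parts = []
    if context:
        parts.append("Background: " + context)
    for sample_q, sample_a in examples:
        parts.append("Q: " + sample_q + "\nA: " + sample_a)
    answer_prefix = "A:"
    if think_step_by_step:
        answer_prefix += " Let's think step by step."
    parts.append("Q: " + question + "\n" + answer_prefix)
    return "\n\n".join(parts)


prompt = build_prompt(
    question="A shop sells pens at 3 dollars each. How much do 4 pens cost?",
    context="Answer with a number and a unit.",
    examples=[("What is 2 + 2?", "4")],
    think_step_by_step=True,
)
print(prompt)
```

The resulting string can be pasted into ChatGPT as is. Whether each ingredient actually helps depends on the task, so it is worth trying variations.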