The LLM ecosystem has expanded significantly since ChatGPT’s initial release, with models available from Google (Gemini), Meta (Llama), Microsoft (Copilot), Anthropic (Claude), xAI (Grok), DeepSeek, and Mistral.
First things first: how do you stay up to date with the constant stream of model updates? Leaderboards such as Chatbot Arena and Scale’s SEAL leaderboard help you track the relative performance of different models.
What you should know about LLM basics:
- Context Window: Conversations with LLMs build a token stream called the context window, which acts as the model’s working memory. Starting a new chat wipes this context window, so it’s best practice to start a new chat when you move on to a different topic – the model also responds faster because it has fewer tokens to re-process on every reply (see the token-counting sketch just after this list).
- Thinking Models: Thinking models are tuned with reinforcement learning, which lets them spend extra tokens reasoning through a problem before answering. They are more effective for complex tasks like math and coding.
- Model Training: LLMs undergo pre-training (learning from large amounts of internet data) and post-training (being fine-tuned into a helpful conversational assistant).
- Model Knowledge: Because pre-training is expensive and infrequent, LLMs have a knowledge cut-off date, so their built-in knowledge is always somewhat out of date.
- Tool Use: LLMs can use tools like internet search to access information beyond their training data. Perplexity AI and ChatGPT with the search function are examples of this.
- Deep Research: Deep research combines internet search and thinking for in-depth analysis. ChatGPT’s Deep Research, Perplexity AI’s Deep Research, and Grok’s DeepSearch are examples.
- File Uploads: Uploading documents allows LLMs to reference specific information within those files.
- Python Interpreter: LLMs with access to a Python interpreter can write and execute code, enhancing their problem-solving capabilities.
- Advanced Data Analysis: ChatGPT’s Advanced Data Analysis builds on this, letting users analyze and visualize uploaded data (a sketch of that workflow follows this list).
- Multimodality: LLMs are increasingly capable of handling multiple modalities, including speech, audio, images, and video.
- Voice Interaction: Voice interaction can be “fake” (your speech is transcribed to text and the reply is read back with text-to-speech) or “true” (the model natively processes audio tokens).
- Image Input and Output: LLMs can process images by representing them as token streams. They can also generate images using tools like DALL-E.
- Video Input: Some LLMs can process video input, allowing users to, for example, point a camera at something and ask questions about what it sees.
- Quality of Life Features: Features like memory, custom instructions, and custom GPTs enhance the user experience and allow for personalization.
- Custom GPTs: Custom GPTs allow users to create specialized versions of ChatGPT for specific tasks.
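To make the context-window point concrete, here is a minimal sketch of how every turn in a chat adds tokens to the stream the model must re-read on each reply. It assumes the tiktoken tokenizer library purely for illustration; the hosted chat products use their own tokenizers and you never manage this stream by hand.

```python
# Minimal sketch: how a conversation accumulates tokens in the context window.
# Assumes `pip install tiktoken`; the specific tokenizer is an assumption here.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

conversation = [
    ("user", "Explain what a context window is."),
    ("assistant", "It's the token stream the model can attend to: its working memory."),
    ("user", "Why should I start a new chat for a new topic?"),
]

total_tokens = 0
for role, text in conversation:
    n = len(enc.encode(text))  # tokens contributed by this turn
    total_tokens += n
    print(f"{role:>9}: {n:3d} tokens (running total: {total_tokens})")

# Each new turn is appended to this stream, so long chats keep growing the
# working set the model processes on every reply; a fresh chat resets it.
```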
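Similarly, the Python interpreter and Advanced Data Analysis items boil down to the model writing and running short scripts on your behalf. The snippet below is a hypothetical example of the kind of code such a tool might generate for an uploaded spreadsheet, not the actual code any product runs; the file name and column names are made up.

```python
# Hypothetical example of the kind of script a code-interpreter tool might
# write and execute: load an uploaded CSV, summarize it, and plot one column.
# "sales.csv", "month", and "revenue" are illustrative names only.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv")        # the user's uploaded file
print(df.describe())                 # quick numeric summary shown to the user

monthly = df.groupby("month")["revenue"].sum()
monthly.plot(kind="bar", title="Revenue by month")
plt.tight_layout()
plt.savefig("revenue_by_month.png")  # chart returned in the chat
```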
Differentiation Between Models:
- ChatGPT: The incumbent and most feature-rich, with advanced data analysis, deep research, and custom GPTs.
- Claude: Known for its artifacts feature and strong performance.
- Grok: Offers advanced voice mode and a less restricted personality.
- Gemini: Google’s LLM, with varying capabilities depending on the specific model (e.g., 2.0 Pro vs. 2.0 Flash).
- Perplexity AI: Excels in internet search and offers a deep research feature.
Want to learn more about LLMs? Comment for more info.