LLMs
How to get Started with AI?
· ☕ 6 min read · 🤖 Naresh Mehta

AI has expanded into multiple territories. The pace of expansion has been exponential since 2022, when the first “free” LLMs were opened to the general public. Almost overnight, a wide range of people began using AI for a variety of purposes. The chat interfaces offered by companies such as OpenAI (ChatGPT) and Google provide the most basic form of usage. People from all walks of life, all age groups and all professions use the chat interface to tap into the power of AI. Natural Language Processing (NLP) is a game changer in that context. I work in a multi-cultural environment with customers spread across the globe speaking different languages. Just about 5 years ago, a customer email in a local language had to be translated manually, and the quality of translation left a lot to be desired. Fast forward to today and language barriers seem almost non-existent! Most of the LinkedIn job postings now list practical AI usage experience, rather than proficiency in a particular language, as a key skill. Language proficiency still counts as an added skill, though. With the development of headsets that do on-the-fly translation, I guess it will be pushed further down in priority, all thanks to NLP and LLMs.


LLM Parameters
· ☕ 8 min read · 🤖 Naresh Mehta

When we start learning about Large Language Models (LLMs), it is only natural to become interested in how the parameter count, training data size, context size, tokens, etc. affect the performance of the model. It is equally natural to ask how the existing models out there in the wild, both open and closed source, use these parameters and what their strengths and weaknesses are. It is also important to know and compare the training data sizes used in such models, so one can understand how many resources a comparable model would need in order to be trained from scratch.
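To make the link between parameters, training data size and resources a bit more concrete, here is a minimal back-of-the-envelope sketch. It is not taken from any particular model's documentation. It assumes a plain decoder-only transformer with a feed-forward hidden size of 4 × `d_model`, ignores biases, layer norms and positional embeddings, and uses the common rule of thumb of roughly 6 FLOPs per parameter per training token. The function names and the example configuration (GPT-2-small-like values, 1 billion training tokens) are illustrative assumptions only.

```python
def estimate_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter count for a decoder-only transformer (assumed layout).

    Per layer: ~4 * d_model^2 attention weights + ~8 * d_model^2 feed-forward
    weights (assuming FFN hidden size = 4 * d_model). Biases, layer norms and
    positional embeddings are ignored, so this slightly undercounts.
    """
    per_layer = 12 * d_model ** 2
    token_embedding = vocab_size * d_model
    return n_layers * per_layer + token_embedding


def estimate_training_flops(n_params: int, n_tokens: int) -> int:
    """Rule-of-thumb training compute: about 6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens


def weights_memory_gb(n_params: int, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone (2 bytes per parameter assumes fp16/bf16)."""
    return n_params * bytes_per_param / 1e9


# Illustrative, GPT-2-small-like configuration (assumed values).
n_params = estimate_params(n_layers=12, d_model=768, vocab_size=50257)
print(n_params)  # 123532032, i.e. roughly 124M parameters

# Hypothetical training set of 1 billion tokens.
print(estimate_training_flops(n_params, 10**9))
print(weights_memory_gb(n_params))
```

The estimate only covers the weights and the raw compute. Optimizer state, activations and real-world hardware efficiency add considerably on top, so treat the numbers as a lower bound for comparison rather than a budget.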


Build LLMs From Scratch
· ☕ 3 min read · 🤖 Naresh Mehta

Building Large Language Models (LLMs) from scratch is a complex and challenging task. It requires a deep understanding of the underlying mathematics and a strong foundation in computer science. In this post, we will explore the process of building an LLM from scratch and provide a step-by-step guide to help anyone get started.

LLMs are incredibly versatile, aiding in tasks such as checking grammar, composing emails, summarizing lengthy documents, and much more. They are “large” (very large), encompassing millions to billions of parameters. LLMs are a unique subset of AI. There is a very nice book, Build a Large Language Model (From Scratch) by Sebastian Raschka, which shows a practical approach to building your own LLM.