Video length is 12:28

Open and Reproducible LLM of Code

From the series: MathWorks Research Summit

Arjun Guha, Northeastern University

Large Language Models (LLMs) have gained widespread popularity since the release of OpenAI's ChatGPT, GitHub Copilot, and other models. LLMs are based on transformer models, a special case of deep learning models. Transformers are designed to track relationships in sequential data and rely on a self-attention mechanism to capture global dependencies between input and output. LLMs have revolutionized natural language processing because they can capture complex relationships between words and the nuances present in human language. Training such a neural network requires extensive data; once trained, the model can be used to predict and generate text, perform sentiment analysis, build chatbots, or write code.
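
To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the mechanism the paragraph describes. The matrices, dimensions, and random inputs are purely illustrative and not drawn from the talk.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token embeddings X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise relevance between all positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                          # each output mixes information from every position

# Toy example: a sequence of 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```

Because every output row is a weighted mix of all input rows, the mechanism captures the global dependencies across the sequence that the paragraph refers to.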

In his talk, Arjun Guha from Northeastern University dives into the history of LLMs and how they tokenize, predict, and generate code suggestions based on probability distributions. The talk focuses on the BigCode project, a collaboration of scientists working toward the open and responsible development of LLMs for code. As part of the talk, he introduces StarCoder, an open language model trained on 80+ programming languages, with open access to its model architecture, weights, and training data, enabling users to adapt and fine-tune the model for various programming languages.
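
The tokenize-then-predict loop can be sketched with the Hugging Face transformers library, where StarCoder's weights are published under the bigcode organization. This is an assumption-laden sketch, not code from the talk: the bigcode/starcoder checkpoint is large and gated behind a license agreement, and any causal code model with the same interface would behave analogously.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# bigcode/starcoder is the published StarCoder checkpoint on the Hugging Face Hub;
# it is ~15B parameters and license-gated, so a smaller causal code model can be
# substituted for experimentation.
checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")   # text -> token ids

with torch.no_grad():
    logits = model(**inputs).logits               # a score for every vocabulary token at each position

# Probability distribution over the next token, from which suggestions are sampled
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, i in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(i))!r:>12}  {p.item():.3f}")
```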

Published: 13 Mar 2025
