Videos

See AI2's full collection of videos on our YouTube channel.
  • Benchmarking Compositionality with Formal Languages

    November 1, 2023  |  Josef Valvoda
    Abstract: Recombining known primitive concepts into larger novel combinations is a quintessentially human cognitive capability. Whether large neural models in NLP can acquire this ability while learning from data is an open question. In this paper, we investigate this problem from the perspective of formal…
  • Studying Large Language Model Generalization with Influence Functions

    October 31, 2023  |  Roger Grosse
    Abstract: When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the…
  • Modular Language Models

    October 16, 2023  |  Suchin Gururangan
    Abstract: Conventional language models (LMs) are trained densely: all parameters are updated with respect to all data. We argue that dense training leads to a variety of well-documented issues with LMs, including their prohibitive training cost and unreliable downstream behavior. We then introduce a new class of LMs that…
  • Towards Cost-Efficient Use of Pre-trained Models

    October 10, 2023  |  Alan Ritter
    Abstract: Large language models are leading to many exciting breakthroughs, but this comes at a significant cost in terms of both computational and data labeling expenses. Training state-of-the-art models requires access to high-end GPUs for pre-training and inference, in addition to labeled data for fine-tuning…
  • Reliability and interactive debugging for language models

    October 6, 2023  |  Bhargavi Paranjape
    Abstract: Large language models have permeated our everyday lives and are used in critical decision-making scenarios that can affect millions of people. Despite their impressive progress, model deficiencies may result in exacerbating harmful biases or lead to catastrophic failures. In this talk, I discuss several…
  • The University of Washington eScience Institute: a Home for Data-Intensive Discovery

    Abstract: The University of Washington eScience Institute, one of the nation's first university data science institutes, grew out of the Moore-Sloan Data Science Environment effort, which was focused on identifying and tackling impediments to the broad and sustainable adoption of data-intensive discovery. With a…
  • Reliable Evaluation and High-Quality Data: Building Blocks for Helpful Question Answering Systems

    September 26, 2023  |  Ehsan Kamalloo
    Abstract: As models continue to rapidly evolve in complexity and scale, the status quo of how they are being evaluated and the quality of benchmarks has not significantly changed. This inertia leaves challenges in evaluation and data quality unaddressed, which results in the potential for erroneous conclusions…
  • Vision Without Labels

    September 13, 2023  |  Bharath Hariharan, Cornell University
    Bio: Bharath Hariharan is an assistant professor at Cornell University. He works on problems in computer vision and machine learning that defy the big data label. He did his PhD at University of California, Berkeley with Jitendra Malik. His work has been recognized with an NSF CAREER and a PAMI Young Researcher…
  • Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models

    August 31, 2023  |  Mayee Chen, PhD Student, Stanford University
    Bio: Mayee Chen is a PhD student in the Computer Science department at Stanford University advised by Professor Christopher Ré. She is interested in understanding and improving how models learn from data. Recently, she has focused on problems in data selection, data labeling, and data representations, especially…
  • From Compression to Convection: A Latent Variable Perspective

    August 30, 2023  |  Prof. Stephan Mandt, UC Irvine
    Abstract: Latent variable models have been an integral part of probabilistic machine learning, ranging from simple mixture models to variational autoencoders to powerful diffusion probabilistic models at the center of recent media attention. Perhaps less well-appreciated is the intimate connection between latent…