MScAC Talk: A deep dive into the challenges of productionizing large language models with Stephen Zhen Gou - Department of Mechanical & Industrial Engineering

Thursday, March 31, 2022
4:00pm-5:00pm

Platform: Zoom Webinar
Register here: https://mscac.utoronto.ca/talks

About this MScAC talk:

Cohere is a fast-growing AI startup based in Toronto that aspires to make the most powerful NLP models accessible to developers and businesses. Large transformer-based language models have taken over NLP in the past few years because of their elegant architectures and their ability to understand text and memorize knowledge, but more importantly because their performance scales incredibly well with model size. However, the extremely high cost of infrastructure makes it prohibitive for individuals and most organizations to train or deploy models with hundreds of billions of parameters. As a result, the life cycle of these gigantic models is truly a black box. In this talk, Stephen will provide a lens into each critical stage of productionizing such models. From data collection and training to inference and model safety, the talk will dive into the unique technical and organizational challenges posed by each stage of model production.

About Stephen Zhen Gou:

Stephen Gou is a founding engineer at Cohere AI and now manages a team working on large language model optimization and productionization. He specializes in distributed training, compression, and efficient architectures for massive transformer models. Prior to Cohere, he worked extensively on perception models for self-driving cars at Uber ATG. Before switching his career path to focus on machine learning, he also worked on rendering engines in the computer graphics industry.