Assistant Professor
University of Washington, Information School
Title: Understanding and Countering Problematic Information on Social Media Platforms Date: 31st October, 2023 Duration: 20:00-21:00 Venue: Online
Online social media platforms have brought numerous positive changes, including access to vast amounts of news and information. Yet those very opportunities have created new challenges—our information ecosystem is now rife with problematic content, ranging from misinformation and conspiracy theories to hateful and incendiary propaganda. As a social computing researcher, my work introduces computational methods and systems to understand and design defenses against such problematic online content. In this talk, I will focus on two aspects of problematic online information: 1) conspiracy theories and 2) extremist propaganda. First, leveraging data spanning millions of conspiratorial posts on Reddit, 4chan, and 8chan, I will present scalable methods to unravel who participates in online conspiratorial discussions, what causes users to join conspiratorial communities, and why they may eventually abandon them. Second, I will dive into a special type of problematic content: extremist hate groups. Merging theories from social movement research with big data analyses, I will discuss the ecosystem of extremists' communication and the roles its participants play. Finally, I will close by previewing important new opportunities to address some of these problems, including conducting social audits to defend against algorithmically generated misinformation and designing socio-technical interventions to promote meaningful credibility assessment of information.
Tanu Mitra is an Assistant Professor at the University of Washington, Information School, where she leads the Social Computing research group. She and her students study and build large-scale social computing systems to understand and counter problematic information online. Her research spans auditing online systems for misinformation and conspiratorial content, understanding digital misinformation, unraveling narratives of online extremism and hate, and building technology to foster critical thinking online. Her work employs a range of interdisciplinary methods from the fields of human-computer interaction, data mining, machine learning, and natural language processing. Dr. Mitra's work has been supported by grants from the NSF, NIH, DoD, Social Science One, and other foundations. Her research has been recognized through multiple awards and honors, including an NSF CRII award, an early-career ONR Young Investigator award, the Adamic-Glance Distinguished Young Researcher award, and the Virginia Tech College of Engineering Outstanding New Assistant Professor award, along with several best paper honorable mention awards. Dr. Mitra received her PhD in Computer Science from Georgia Tech's School of Interactive Computing.
Professor
Luddy School of Informatics, Computing, and Engineering (SICE) Indiana University, Bloomington, USA
Title: Network communities, embeddings, and the science of science Date: May 8th, 2023 Duration: 19:30-20:30 Venue: Online
In this talk I will highlight some contributions in my focus research areas: network science and the science of science. In network science, we propose a measure based on the concept of robustness that avoids the biases of Newman-Girvan modularity by design: robustness modularity is the probability of finding trivial partitions when the structure of the network is randomly perturbed. I will also discuss the effectiveness of graph embeddings, specifically node2vec, at discovering community structure, and show how an iterative procedure that alternates between embedding and edge weighting makes clusters more easily detectable. In the science of science, I have focused on the dynamics of impact and the evolution of science. The distributions of citations of papers published in the same discipline and year rescale to a universal curve once the raw citation counts are properly normalized. Nobel Laureates are an endangered species. Finally, I will present evidence of social contagion in science: active authors in a given field induce their collaborators to work in that field.
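The citation-rescaling result mentioned above can be illustrated with a toy computation. This is a sketch only: the citation records below are invented, and the normalization shown (dividing each paper's citations by the average of its field-and-year group, in the spirit of the relative citation indicator) is one simple choice among several studied in the literature.

```python
from collections import defaultdict

# Toy citation records: (field, year, raw citation count).
# Values are made up purely for illustration.
papers = [
    ("biology", 2010, 120), ("biology", 2010, 40), ("biology", 2010, 80),
    ("math", 2010, 12), ("math", 2010, 4), ("math", 2010, 8),
]

# Group raw citation counts by (field, year).
groups = defaultdict(list)
for field, year, cites in papers:
    groups[(field, year)].append(cites)

def relative_citations(field, year, cites):
    """Raw citations divided by the mean of the paper's (field, year) group."""
    group = groups[(field, year)]
    return cites / (sum(group) / len(group))

rescaled = [relative_citations(f, y, c) for f, y, c in papers]
print(rescaled)
```

After this rescaling, the biology and math papers land on identical values even though their raw citation counts differ by an order of magnitude, which is the sense in which differently cited disciplines collapse onto one curve.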
Santo Fortunato is the Director of the Indiana University Network Science Institute (IUNI) and a Professor at the Luddy School of Informatics, Computing, and Engineering of Indiana University. Previously he was a professor of complex systems at the Department of Computer Science of Aalto University, Finland. Prof. Fortunato received his PhD in Theoretical Particle Physics from the University of Bielefeld in Germany. His focus areas are network science, especially community detection in graphs, computational social science, and the science of science. His research has been published in leading journals, including Nature, Science, Nature Physics, PNAS, Physical Review Letters, Physical Review X, Reviews of Modern Physics, and Physics Reports, and has collected over 40,000 citations (Google Scholar). His single-author article "Community detection in graphs" (Physics Reports 486, 75-174, 2010) is one of the best known and most cited papers in network science. Fortunato received the 2011 Young Scientist Award for Socio- and Econophysics, a prize given by the German Physical Society, for his outstanding contributions to the physics of social systems. He is a Fellow of the Network Science Society (2022) and of the American Physical Society (2022). He is the Founding Chair of the International Conference on Computational Social Science (IC2S2), which he first organized in Helsinki in June 2015. He was Chair of Networks 2021, the largest ever event on network science and a historic merger of the NetSci and Sunbelt conferences. He is the author of the book A First Course in Network Science (Cambridge University Press, 2020), the most accessible textbook on the new science of networks.
Associate Professor
Carnegie Mellon University
Title: Is My NLP Model Working? The Answer is Harder Than You Think
Date: April 20, 2023 Duration: 19:00-20:00 Venue: Online
As natural language processing now permeates many different applications, its practical use is unquestionable. At the same time, however, NLP remains imperfect, and its errors cause everything from minor inconveniences to major PR disasters. Better understanding when our NLP models work and when they fail is critical to the efficient and reliable use of NLP in real-world scenarios. So how can we do so? In this talk I will discuss two issues: automatic evaluation of generated text and automatic fine-grained analysis of NLP system results, which are some first steps toward a science of NLP model evaluation.
Graham Neubig is an associate professor at the Language Technologies Institute of Carnegie Mellon University and CEO of Inspired Cognition. His research focuses on natural language processing, in particular multilingual NLP, natural language interfaces to computers, and machine learning methods for NLP system building and evaluation. His ultimate goal is that every person in the world should be able to communicate with each other, and with computers, in their own language. He also contributes to making NLP research more accessible through open publishing of research papers, advanced NLP course materials and video lectures, and open-source software, all of which are available on his website.
Postdoctoral Associate with Prof. Marzyeh Ghassemi at MIT CSAIL
Title: Model Editing
Date: April 11th, 2023 Duration: 3:00-4:00 PM Venue: Online
TBD
Tom Hartvigsen is a Postdoctoral Associate at MIT's Computer Science and Artificial Intelligence Laboratory, where he works with Marzyeh Ghassemi. Tom focuses on core challenges in making machine learning and data mining systems responsibly deployable in healthcare settings, mostly for time series and text. His work has appeared at top venues such as KDD, ACL, NeurIPS, and AAAI. He also ran the 2022 NeurIPS workshop on Learning from Time Series for Health and is the general chair of the 2023 Machine Learning for Health Symposium. Tom received his Ph.D. in Data Science from Worcester Polytechnic Institute in 2021, where he was advised by Elke Rundensteiner and Xiangnan Kong.
Assistant Professor
TU Munich School of Computation, Information and Technology
Title: The Myths about Overfitting Date: March 22, 2023 Venue: EE Committee Room, 3rd Floor, Block III, IIT Delhi
Overfitting is the practice of using a complex machine-learning model that perfectly fits the training data. Historically, overfitting has been considered a "bad practice" that is expected to produce predictors that perform poorly on new data. However, the recommendation has changed in recent years, as overfitted neural networks perform surprisingly well in computer vision, natural language processing, and other domains. For instance, on the ImageNet image classification benchmark (with 14 million images), the best architectures have more than a billion parameters and yet achieve 90% accuracy. This naturally raises the question of whether overfitting is a good practice or a bad one. In this talk, I will discuss the mathematical foundations behind the classical and modern views of overfitting. I will start with a brief introduction to the statistical theory that leads to the conclusion "overfitting is a bad practice". I will then discuss recent theoretical results that debunk the following myths: 1. large models with too many parameters always overfit the training data; 2. models that perfectly fit the training data cannot predict well on unseen data. These results are the basis of two promising research directions in machine learning theory: Neural Tangent Kernels, which capture the training dynamics of wide neural networks, and the Double-Descent phenomenon, a precise characterisation of the performance of overfitted models. We will finally see why the classical and modern theories of overfitting are not at odds with each other. As examples, I will briefly talk about two recent works (arXiv:2202.09054, arXiv:2210.09809), where we show how these tools can be used to gain a better understanding of modern machine/deep learning practices.
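The second myth above can be seen in miniature with overparameterized linear regression. The sketch below is illustrative only, not the analysis from the cited papers: with more features than samples, the minimum-norm least-squares solution fits the noisy training labels exactly, yet can still predict sensibly on fresh data. All dimensions and noise levels are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized regime: more features (d) than samples (n),
# so a linear model can interpolate the training data exactly.
n, d = 10, 50
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = 1.0                                 # sparse ground truth
y = X @ w_true + 0.1 * rng.standard_normal(n)    # noisy labels

# np.linalg.lstsq returns the minimum-norm interpolating solution,
# the implicit bias studied in double-descent analyses.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

train_err = np.mean((X @ w_hat - y) ** 2)
print(f"training error: {train_err:.2e}")        # essentially zero: a perfect fit

# Despite zero training error, predictions on fresh inputs are not garbage,
# which is the point myth 2 misses.
X_test = rng.standard_normal((200, d))
test_err = np.mean((X_test @ w_hat - X_test @ w_true) ** 2)
print(f"test error: {test_err:.3f}")
```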
Debarghya Ghoshdastidar is an assistant professor at the TU Munich School of Computation, Information and Technology, where he leads the research group on Theoretical Foundations of Artificial Intelligence. Previously, he obtained a bachelor's degree in electrical engineering from Jadavpur University, and a master's degree and PhD from the Indian Institute of Science. His research is at the intersection of statistics and machine learning, with a strong focus on understanding when we can learn from data, particularly beyond supervised learning. He also collaborates on applications of ML in physics and engineering. He is also a fellow at the Munich Data Science Institute (MDSI) and a member of the European Laboratory for Learning and Intelligent Systems (ELLIS).
Professor
Mohamed Bin Zayed University of Artificial Intelligence, UAE
Title: Machine learning and natural language processing in support of interactive automated tutoring for non-native writers Date: March 22, 2023 Venue: A-006, R & D Block, IIIT-Delhi
Massive on-line open courseware, adaptive learning management systems and the like are well-established, but their impact outside of a few subjects such as computer science and maths has been modest so far. The missing ingredient, and the essential difference between on-line maths or computer science courseware and language learning courseware, is objective, automated and meaningful learning-oriented assessment (LOA). LOA, to be useful, must include both a summative and a formative component, must address the appropriate linguistic tasks and skills, must be interpretable and actionable by the learner, and must be integrated into an adaptive and therefore personalised learning platform. I'll describe how we have exploited (deep) machine learning and natural language processing to develop LOA for non-native writers, building accurate automated graders benchmarked to Cambridge English exams, as well as pedagogically useful feedback models that support learners in interactively and incrementally improving their writing. Our writing assessment technology is already in use in Cambridge English courseware such as Empower and Linguaskill, and in Write & Improve (e.g., https://www.youtube.com/watch?v=LpTF_o_eyao).
Ted Briscoe was Professor of Computational Linguistics at the University of Cambridge until 2023, when he moved to MBZUAI. He has published over 150 research articles and three books in the areas of automated speech and language processing. He is co-founder and was CEO of iLexIR Ltd., a consultancy and technology provider specialising in language processing applications, and was the inaugural director of the Alta Institute, which conducts research into automated language teaching and assessment. In 2009, iLexIR spun out SwiftKey, maker of the world's most popular predictive keyboard for smartphones. In 2014, he co-founded English Language iTutoring Ltd and was its chief scientist until 2019. https://mbzuai.ac.ae/study/faculty/ted-briscoe/
Professor
IIT Gandhinagar
Title: Coresets for Machine Learning
Date: February 22, 2023
In the face of the data onslaught, smart algorithms have a big role to play. Over the last couple of decades, coresets, small and efficiently computable summaries of the data, have grown in popularity, in both theoretical and practical settings. They enable approximating large optimization problems while needing only a fraction of the resources. In this talk, we will discuss a few recent results related to creating coresets for tensor factorization and Bregman clustering. Our coresets are online in nature, i.e., for every incoming point, the algorithm makes an irrevocable decision about whether to include it in the coreset. We will also discuss some in-progress results related to creating coresets with deterministic guarantees.
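The online, irrevocable flavor described above can be sketched with a deliberately simple construction. This is not the talk's coreset for tensor factorization or Bregman clustering; it is a generic greedy streaming cover in one dimension, included only to show what "an irrevocable decision per incoming point" looks like, with a coverage guarantee in place of the talk's approximation guarantees.

```python
import random

def online_coreset(stream, radius):
    """Greedy streaming cover: each arriving point is irrevocably kept or
    discarded. A point joins the coreset only if it lies farther than
    `radius` from every point already kept, so every stream point ends up
    within `radius` of some coreset point."""
    coreset = []
    for x in stream:
        if all(abs(x - c) > radius for c in coreset):
            coreset.append(x)  # irrevocable: never removed later
    return coreset

random.seed(1)
stream = [random.uniform(0, 100) for _ in range(1000)]
C = online_coreset(stream, radius=5.0)
print(len(C), "coreset points summarize", len(stream), "stream points")

# Coverage guarantee: every stream point is within `radius` of the coreset.
assert all(min(abs(x - c) for c in C) <= 5.0 for x in stream)
```

Because kept points are pairwise more than `radius` apart, the coreset size is bounded by the range of the data divided by the radius, independent of the stream length.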
Anirban Dasgupta is currently the N. Rama Rao Chair Professor of Computer Science & Engineering at IIT Gandhinagar. Prior to being at IIT Gandhinagar, he was a Senior Scientist at Yahoo Labs Sunnyvale. Anirban works on algorithmic problems for massive data sets, large-scale machine learning, analysis of large social networks, and randomized algorithms in general. He did his undergraduate studies at IIT Kharagpur and doctoral studies at Cornell University. He has received the Google Faculty Research Award (2015), the Cisco University Award (2016), the ICDT Best Newcomer Award (2016), and the Google India AI/ML Award (2020).
Associate Professor, Georgia Tech
Title: Data Science and AI for Epidemic Response: Time-Series Forecasting and Network Interventions
Date: February 17, 2022
B. Aditya Prakash is an Associate Professor in the College of Computing at the Georgia Institute of Technology (“Georgia Tech”). He received a Ph.D. from the Computer Science Department at Carnegie Mellon University, and a B.Tech. in Computer Science and Engineering from the Indian Institute of Technology (IIT) -- Bombay. He has published one book, more than 90 papers in major venues, holds two U.S. patents and has given several tutorials at leading conferences. His work has also received multiple best-of-conference, best paper and travel awards. His research interests include Data Science, Machine Learning and AI, with emphasis on big-data problems in large real-world networks and time-series, with applications to computational epidemiology/public health, urban computing, security and the Web. Tools developed by his group have been in use in places like ORNL, CDC and Walmart. He has also received several awards such as the Facebook Faculty Award (2015 and 2021), the NSF CAREER award and was named as one of ‘AI Ten to Watch’ by IEEE. His work has won awards in multiple global data science challenges (e.g., the C3AI COVID19 Grand Challenge) and been highlighted by several media outlets/popular press like FiveThirtyEight.com too. He is a member of the infectious diseases modeling MIDAS network and core-faculty at the Center for Machine Learning (ML@GT) and the Institute for Data Engineering and Science (IDEaS) at Georgia Tech.
Aditya’s Twitter handle is @badityap.
Professor, Northwestern University
Title: People Analytics: Using Digital Exhaust from the Web to Leverage Network Insights in the Algorithmically Infused Workplace
Date: January 10, 2022
To bring the performance of people analytics in the algorithmically infused workplace up, and in line with the hype, organizations need to do more than analyze data on demographic attributes. We need to focus not only on who people are but also on who they know. The potential of social network analysis to identify "high potentials," who has good ideas, who is influential, and which teams will get work done efficiently and effectively is well established, based on decades of research. The challenge has been the collection of network data via surveys, which are time-consuming, elicit low response rates, and become obsolete quickly. This talk presents empirical examples, ranging from corporate enterprises to simulated long-duration space exploration, that demonstrate how we can leverage people analytics, and in particular relational analytics, to mine "digital exhaust," the data created by individuals every day in their digital transactions, such as emails, chats, "likes," "follows," @mentions, and file collaboration, to address challenges such as team conflict, team assembly, diversity and inclusion, succession planning, and post-merger integration.
Noshir Contractor is the Jane S. & William J. White Professor of Behavioral Sciences in the McCormick School of Engineering & Applied Science, the School of Communication and the Kellogg School of Management and Director of the Science of Networks in Communities (SONIC) Research Group at Northwestern University. He is also the President-Elect of the International Communication Association (ICA).
Professor Contractor has been at the forefront of three emerging interdisciplines: network science, computational social science and web science. He is investigating how social and knowledge networks form – and perform – in contexts including business, scientific communities, healthcare and space travel. His research has been funded continuously for over 25 years by the U.S. National Science Foundation with additional funding from the U.S. National Institutes of Health, NASA, DARPA, Army Research Laboratory and the Bill & Melinda Gates Foundation.
His book Theories of Communication Networks (co-authored with Peter Monge) received the 2003 Book of the Year award from the Organizational Communication Division of the National Communication Association and the 2021 International Communication Association’s Fellows Book Award. He is a Fellow of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), and the International Communication Association (ICA). He also received the Distinguished Scholar Award from the National Communication Association and the Lifetime Service Award from the Organizational Communication & Information Systems Division of the Academy of Management. He was selected as the recipient of the 2022 Simmel Award from the International Network for Social Network Analysis. In 2018 he received the Distinguished Alumnus Award from the Indian Institute of Technology, Madras where he received a Bachelor’s in Electrical Engineering. He received his Ph.D. from the Annenberg School of Communication at the University of Southern California.
Senior Developer Advocate, Google
Title: TensorFlow Hub for the NLP domain
Date: November 24, 2021
Gus is a Developer Advocate on the Google AI team. His main passion is making machine learning easier for developers. Lately he has been working with TensorFlow Hub, helping make ML models more accessible to everyone.
Senior Research Scientist, Snap
Title: Machine Learning on Graphs with Scarce Labels
Date: October 27, 2021
ML on graphs is a prominent and rapidly developing research area. Notably, graph neural networks (GNNs) have been proposed as a means of learning representations that utilize graph topology and node features and are optimized toward targeted tasks of interest. Several such tasks deal with the challenging problem of learning with scarce or nonexistent labeled data, common in domains like online trust and integrity. In this talk, I'll overview two recent works on learning in label-free and label-scarce settings on graph data with GNNs. The first work, published at TNNLS'21, discusses integrating traditional unsupervised and modern GNN-based approaches to graph anomaly detection in a unified framework, demonstrating marked improvements in learned representation quality for anomaly detection tasks over both traditional and modern independent baselines. The second work, published at AAAI'21, discusses a data augmentation approach for GNNs that shows large improvements in semi-supervised node classification, particularly in label-scarce settings. Together, the two works paint a promising picture for the practical use of graph ML methods in settings with limited labels, by incorporating useful priors and stretching available labeled data with augmentation strategies. I'll also briefly mention a few other exciting works our group has been up to, grounded in practically minded GNN research.
Bio: Neil Shah is a Lead Research Scientist at Snap Inc, with interests spanning data mining, machine learning and computational social science, specifically in the contexts of graph-based modeling of user behavior and misbehavior. His work has resulted in 35+ conference and journal publications, in top venues such as KDD, WSDM, ICDM, WWW, CIKM, SDM, AAAI, TKDD and more, including several best-paper awards. He has also served as an organizer, chair and senior program committee member at a number of these. He has had previous research experiences at Lawrence Livermore National Laboratory, Microsoft Research, and Twitch.tv. He earned a PhD in Computer Science in 2017 from Carnegie Mellon University's Computer Science Department, funded partially by the NSF Graduate Research Fellowship.
Principal Researcher, Microsoft Research
Title: HeteGCN: Heterogeneous Graph Convolutional Networks for Text Classification
Date: September 29, 2021
We consider the problem of learning efficient and inductive graph convolutional networks for text classification with a large number of examples and features. Existing state-of-the-art graph-embedding-based methods such as predictive text embedding (PTE) and TextGCN have shortcomings in terms of predictive performance, scalability, and inductive capability. To address these limitations, we propose a heterogeneous graph convolutional network (HeteGCN) modeling approach that unites the best aspects of PTE and TextGCN. The main idea is to learn feature embeddings and derive document embeddings using a HeteGCN architecture with different graphs used across layers. We simplify TextGCN by dissecting it into several HeteGCN models, which (a) helps to study the usefulness of the individual models and (b) offers flexibility in fusing learned embeddings from different models. In effect, the number of model parameters is reduced significantly, enabling faster training and improving performance in small labeled-training-set scenarios. Our detailed experimental studies demonstrate the efficacy of the proposed approach.
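The "different graphs across layers" idea can be sketched with plain matrix operations. Everything below is a hypothetical placeholder, not the paper's trained model: the sizes are tiny, the word-word graph is random, and the adjacency normalization used in real GCNs is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(A, H, W):
    """One graph-convolution layer: propagate features over graph A,
    transform with weights W, apply ReLU."""
    return np.maximum(A @ H @ W, 0.0)

# Tiny hypothetical corpus: 4 documents over a 6-word vocabulary.
n_docs, n_feats, n_hid, n_classes = 4, 6, 3, 2
X = rng.random((n_docs, n_feats))             # document-word counts

# The HeteGCN idea: use *different* graphs in different layers.
# Layer 1 propagates over a feature-feature (word co-occurrence) graph;
# layer 2 propagates word embeddings to documents via the document-word graph.
A_ff = rng.random((n_feats, n_feats))         # word-word graph (random stand-in)
A_df = X                                      # document-word graph

W1 = rng.standard_normal((n_feats, n_hid))    # feature-embedding weights
W2 = rng.standard_normal((n_hid, n_classes))

H_words = gcn_layer(A_ff, np.eye(n_feats), W1)   # word embeddings
logits = A_df @ H_words @ W2                      # per-document class scores
print(logits.shape)  # (4, 2): one score per class per document
```

Note how the parameters live only on the feature side (`W1`, `W2`); documents get embeddings by aggregating word embeddings, which is what makes the model inductive for unseen documents.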
Bio: I did my Bachelors in Computer Science and Engineering at the University of Mumbai. I went on to do my Masters at the Indian Institute of Technology, Kanpur (IIT Kanpur), where I wrote my thesis on Reinforcement Learning. To get some experience in building practical AI/ML systems, I spent 4 years as a Research Engineer at Yahoo! Research, Bangalore, where I worked on building large-scale information tagging, extraction, and management systems. I then did my PhD at the Indian Institute of Technology, Bombay (IIT Bombay) under Prof. Sunita Sarawagi and Prof. Saketha Nath, working on machine learning models for predicting aggregate label statistics. After my PhD, I joined Microsoft Research, where I am exploring two areas: 1) learning representations on graphs and 2) the intersection of program synthesis and machine learning.
Research Scientist, Facebook AI
Assistant Professor, UCLA
Title: Pretrained Transformers as Universal Computation Engines
Date: September 1, 2021
We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works which investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural language can improve performance and compute efficiency on non-language downstream tasks. Additionally, we perform an analysis of the architecture, comparing the performance of a randomly initialized transformer to a random LSTM. Combining the two insights, we find language-pretrained transformers can obtain strong performance on a variety of non-language tasks.
Bio: Aditya Grover is a research scientist in the Core ML team at Facebook AI Research, a visiting postdoctoral researcher at UC Berkeley, and an incoming assistant professor of computer science at UCLA. His research centers around probabilistic modeling, approximate inference, and sequential decision making, with applications at the intersection of physical sciences and sustainability. His research has been recognized with a best paper award, a best undergraduate thesis award, several research fellowships (Microsoft Research, Lieberman, Google-Simons-Berkeley), and the ACM SIGKDD Doctoral Dissertation Award. Aditya received his PhD and bachelors in computer science from Stanford and IIT Delhi respectively.
Department Head and Professor of Economics
Virginia Tech
Title: Multilayer Networks
Date: August 6, 2021
Professor and Head of Machine Intelligence Unit
ISI Kolkata
Title: Deep Learning: Basics, Applications and Our Contribution
Date: August 5, 2021
Assistant Professor
SUTD
Title: Automated Hate Speech Detection
Date: July 29, 2021
Online hate speech is an important issue that breaks the cohesiveness of online social communities and even raises public safety concerns in our societies. Motivated by this rising issue, researchers have developed many traditional machine learning and deep learning methods to automatically detect hate speech on online social platforms. This talk introduces the pressing problem of online hate speech and discusses automated hate speech detection methods.
Bio: Roy is an Assistant Professor at the Information Systems Technology and Design Pillar, Singapore University of Technology and Design, where he is a faculty member of the transformative Design and Artificial Intelligence programme. His research lies at the intersection of data mining, machine learning, social computing, and natural language processing. He leads the Social AI Studio, a research group that focuses on designing next-generation social artificial intelligence systems. He has published in top-tier venues in his field, such as WWW, IJCAI, ICDM, TKDE, ACL, and COLING, and serves on the program committees of multiple top conferences. He is currently part of the editorial board of the Social Network Analysis and Mining (SNAM) journal.
Associate Professor, UPenn
Editor in Chief, TACL
Title: Interpretability Analysis for Named Entity Recognition
Date: July 27, 2021
In this talk I will present a set of experiments designed to help us understand what neural named entity recognition systems learn and why they make the predictions we see. Specifically, we seek to understand whether systems learn name strings (Ani, Julia, Kathy) or whether they are able to identify textual contexts that constrain the semantic class of whatever word appears in that context ("My name is __"). I will present evidence that the performance of neural methods is largely driven by their ability to recognize word tokens as belonging to certain semantic classes. In a study with people, we find that in many cases people are indeed able to identify constraining contexts and figure out the class from the sentence context alone. People's recognition of constraining contexts aligns better with predictions from biLSTM-CRF models than from BERT models. I will present compelling evidence that current models do not integrate contextual clues effectively. These results indicate that NER is a challenging yet practical domain for testing machine text comprehension abilities.
Bio: Ani Nenkova is a Principal Scientist at Adobe Research and associate professor of computer and information science at the University of Pennsylvania (on leave). Her work on summarization, discourse, multi-modal emotion prediction and information extraction in the biomedical domain has been recognized with best paper awards at SIGDIAL 2010, EMNLP-CoNLL 2012, AVEC 2012 and AMIA 2021. Ani was program chair for NAACL in 2016 and currently serves as editor-in-chief for the Transactions of the Association for Computational Linguistics (TACL).
Assistant Professor
NYU
Title: Guarding Against Spurious Correlations in Natural Language Understanding
Date: July 19, 2021
While we have made great progress in natural language understanding, transferring the success from benchmark datasets to real applications has not always been smooth. Notably, models sometimes make mistakes that are confusing and unexpected to humans. In this talk, I will discuss shortcuts in NLP tasks and present our recent works on guarding against spurious correlations in natural language understanding tasks (e.g. textual entailment and paraphrase identification) from the perspectives of both robust learning algorithms and better data coverage. Motivated by the observation that our data often contains a small amount of "unbiased" examples that do not exhibit spurious correlations, we present new learning algorithms that better exploit these minority examples. On the other hand, we may want to directly augment such "unbiased" examples. While recent works along this line are promising, we show several pitfalls in the data augmentation approach.
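One common member of the family of algorithms sketched above simply upweights the scarce "unbiased" examples in the training objective. The code below is a generic reweighting sketch, not the specific method from the speaker's papers; the losses, the flag, and the 10x weight are all invented for illustration.

```python
# Toy setup: each training example carries a loss value and a flag marking
# whether it is one of the rare "unbiased" examples that do not exhibit
# the spurious correlation. (All numbers are illustrative.)
examples = [
    {"loss": 0.2, "unbiased": False},
    {"loss": 0.3, "unbiased": False},
    {"loss": 1.5, "unbiased": True},   # minority example, typically harder
]

def reweighted_loss(examples, minority_weight=10.0):
    """Weighted average loss with "unbiased" minority examples upweighted,
    so the optimizer cannot ignore them just because they are rare."""
    total, norm = 0.0, 0.0
    for ex in examples:
        w = minority_weight if ex["unbiased"] else 1.0
        total += w * ex["loss"]
        norm += w
    return total / norm

plain = sum(ex["loss"] for ex in examples) / len(examples)
weighted = reweighted_loss(examples)
print(plain, weighted)  # the reweighted objective emphasizes minority examples
```

The reweighted objective is larger than the plain average precisely because it refuses to average away the hard minority examples, which is the intuition behind exploiting them during training.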
Bio: He He is an assistant professor in the Center for Data Science and Courant Institute at New York University. Before joining NYU, she spent a year at Amazon Web Services and was a postdoc at Stanford University. She received her PhD from University of Maryland, College Park. She is broadly interested in machine learning and natural language processing. Her current research interests include text generation, dialogue systems, and robust language understanding.
Associate Professor
UCI
Title: How to Win LMs and Influence Predictions
Date: July 08, 2021
Current NLP pipelines rely significantly on finetuning large pre-trained language models. Relying on this paradigm makes such pipelines challenging to use in real-world settings since massive task-specific models are neither memory- nor inference-efficient, nor do we understand how they fare in adversarial settings. This talk will describe our attempts to address these seemingly unrelated concerns by investigating how specific short phrases in the input can control model behavior. These short phrases (which we call triggers) will help us identify model vulnerabilities and introduce new paradigms of training models. In the first part of the talk, I will focus on the adversarial setting. I will show how easy it is for adversaries to craft triggers that cause a target model to misbehave when the trigger appears in the input. I will also introduce a data poisoning technique that enables adversaries to inject arbitrary triggers into the target model. However, in the second part of the talk, I will show how these triggers can also be used to “prompt” language models to act as task-specific models, providing a negligible-memory, no-learning way to create classifiers. I will end with a comprehensive study of the interplay between prompting and finetuning, providing some guidelines for effectively performing few-shot learning with large language models.
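The adversarial half of the talk can be caricatured at toy scale. The "model" below is an invented bag-of-words scorer with hand-picked weights, and the search is brute force rather than the gradient-guided trigger search used in the actual work; it only illustrates the core idea that a short inserted phrase can flip a model's prediction.

```python
# Toy bag-of-words sentiment "model": score = sum of word weights.
# The vocabulary and weights are invented; "zoning" is an oddly weighted
# rare token an adversary can exploit.
weights = {"good": 2.0, "great": 3.0, "bad": -2.0, "awful": -3.0,
           "zoning": -10.0}

def predict(text):
    score = sum(weights.get(w, 0.0) for w in text.split())
    return "positive" if score > 0 else "negative"

victim = "good good great movie"
assert predict(victim) == "positive"

# Brute-force trigger search: find a single token whose insertion flips
# the prediction, mimicking (at toy scale) universal trigger attacks.
trigger = next(w for w in weights if predict(victim + " " + w) == "negative")
print(trigger)  # → zoning
```

In the real setting the trigger is found by gradient search over a large model's vocabulary, but the failure mode is the same: an innocuous-looking token sequence that reliably controls the output.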
Bio: Dr. Sameer Singh is an Associate Professor of Computer Science at the University of California, Irvine (UCI). He works primarily on the robustness and interpretability of machine learning, and on models that reason with text and structure for natural language processing. Sameer was a postdoctoral researcher at the University of Washington and received his Ph.D. from the University of Massachusetts, Amherst. He has received the NSF CAREER award, was selected as a DARPA Riser, and received the UCI ICS Mid-Career Excellence in Research Award as well as the Hellman and Noyce Faculty Fellowships. His group has received funding from the Allen Institute for AI, Amazon, NSF, DARPA, Adobe Research, Hasso Plattner Institute, NEC, Base 11, and FICO. Sameer has published extensively at machine learning and natural language processing venues, including conference paper awards at KDD 2016, ACL 2018, EMNLP 2019, AKBC 2020, and ACL 2020.
https://sameersingh.org/
VP & Head, AI Garage
Mastercard
Title: Machine Learning Portfolio at AI Garage and Research Problems in the Fintech World
Date: June 15, 2021
Assistant Professor
UCSB
Title: Table Based Question Answering and Zero Shot Fact Checking
Date: June 10, 2021
Associate Professor
UMich
Title: Examining the Impact of Shocks on Collaborative Crowdsourcing
Date: April 13, 2021
Professor
IIT Bombay
Title: Explainable Artificial Intelligence
Date: April 10, 2021