One day in December 2009, Sumit Gulwani met a businesswoman on a flight home from a seminar. When she learned that Gulwani held a Ph.D. in computer science and worked as a Microsoft researcher, she asked him a question: “Is there a way to merge two columns in Excel, when one column has a first name, the other the last, so that a column has both first and last names?” Gulwani couldn’t offer an answer.

Hundreds of millions of people worldwide were using spreadsheets, but most of them were non-programmers who could hardly create even small, one-off applications to support business functions. Meanwhile, spreadsheet programming was far from satisfactory.

That encounter inspired Gulwani, partner research manager at Microsoft, to delve into programming by example (PBE), a novel approach that enables non-programmers to create programs by giving examples. Once users supply a few examples of inputs and their desired outputs, the computer must find a program that transforms similar input data into the desired outputs.

Programming-by-examples (PBE)

Born in India, Gulwani went to IIT Kanpur for his undergraduate studies in computer science. He then came to the United States to study at UC Berkeley, where he obtained his Ph.D. in computer science in 2005. His dissertation on using randomized algorithms to verify and discover program properties received the Outstanding Doctoral Dissertation Award from ACM SIGPLAN.

Since joining Microsoft in 2005, Gulwani has devoted himself to shipping innovative program synthesis technologies, which rest on the idea of computers automatically writing programs. Over those 13 years, his technologies have powered software like Excel and Power BI, as well as Windows components like Cortana and PowerShell.

In 2011, Gulwani introduced algorithms for synthesizing string transformation programs from examples, in work published at the Symposium on Principles of Programming Languages (POPL). His program synthesis system can use input-output examples to create a wide range of string processing programs in spreadsheets, such as extracting bold or uppercase letters from a spreadsheet or transforming phone numbers into a uniform format.


Gulwani’s synthesis system aims to take over the role that human experts play on some Excel help forums, enabling users to solve their problems in a few seconds as opposed to a few days.

PBE is an especially useful technique for data wrangling, the process of cleaning, structuring and enriching raw data into the desired format. Typical tasks include splitting columns, extracting fields from log files or web pages, and normalizing semi-structured spreadsheets into structured tables.

Flash Fill

That technology led to the Flash Fill feature in Microsoft Excel 2013, a product used by hundreds of millions of people. Given a task, Flash Fill synthesizes millions of small programs, each 10 to 20 lines of code, that might accomplish it.

How does Flash Fill work? A Microsoft blog post on Flash Fill explains in detail: “Let’s say an Excel user has a column of Social Security numbers. However, they’re not formatted correctly–they’re 123456789, not 123-45-6789. An Excel user creates a new column adjacent to the existing column, then types in the correct example: 123-45-6789. Flash Fill immediately fills in all rows below the example with properly formatted figures. The user needs to click to accept them all.”
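The behavior in that quote can be imitated with a toy generate-and-test loop. The sketch below is not Flash Fill's actual algorithm; it assumes a deliberately tiny, hypothetical program space in which a program is just a separator plus a set of cut positions, and it keeps every program that reproduces the user's example. The names `run_program` and `synthesize` are illustrative only.

```python
from itertools import combinations


def run_program(program, text):
    """Apply a (separator, cut_positions) program to one input string."""
    separator, cuts = program
    pieces, start = [], 0
    for cut in cuts:
        pieces.append(text[start:cut])
        start = cut
    pieces.append(text[start:])
    return separator.join(pieces)


def synthesize(examples, separators="-./ ", max_cuts=3):
    """Return every program in the toy space that matches all examples.

    `examples` is a list of (input, desired_output) pairs. The space of
    separators and the limit on cuts are assumptions made for this sketch.
    """
    length = len(examples[0][0])
    consistent = []
    for separator in separators:
        for n_cuts in range(1, max_cuts + 1):
            for cuts in combinations(range(1, length), n_cuts):
                program = (separator, cuts)
                if all(run_program(program, i) == o for i, o in examples):
                    consistent.append(program)
    return consistent


# One example typed by the user, as in the Social Security number scenario.
programs = synthesize([("123456789", "123-45-6789")])
print(programs)

# Fill the remaining rows with the program that was found.
for row in ["987654321", "555001234"]:
    print(run_program(programs[0], row))
```

In this restricted space a single example is enough to pin down one program. A realistic synthesizer searches a far richer language, where many programs agree with the same example.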

Today, Excel for Windows is getting smarter at recognizing what users want to do with their data, from creating a new column of initials to extracting italic letters from a spreadsheet. Just type in the first few cells to demonstrate your intention, select the rest of the column and press Enter. Voilà!


Flash Fill-Excel Interface

In recognition of his pioneering contributions to end-user programming and intelligent tutoring systems, Gulwani was awarded the prestigious ACM SIGPLAN Robin Milner Young Researcher Award in 2014.

The ACM’s award statement reads: “Gulwani recognized the important connection between program verification and program synthesis. His research has demonstrated that imprecise human intent, in natural language and other kinds of input, can be transformed into incomplete program specifications, which can then be used to synthesize intended programs.”

Today, Gulwani is leading the PROSE research and engineering team that develops APIs for program synthesis.

Gulwani believes that traditional machine learning techniques can be leveraged in PBE to improve the effectiveness and maintainability of its main components: the search algorithm, the ranking strategy, and the user interaction model. These techniques can help address the challenges of building a practical PBE system, such as how to search for programs that are consistent with the examples provided by the user, how to infer users’ intent from as few examples as possible, and how to provide transparency to users.
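To make the search and ranking challenges concrete, here is a minimal sketch built on the name-merging question from the flight. It is an illustration under stated assumptions, not the PROSE API or Gulwani's algorithm: the tiny language (whole column, first letter of a column, literal character), the hand-written `cost` function, and every function name are hypothetical. A learned ranker could take the place of `cost`.

```python
def atoms(row):
    """Yield (part, value) pairs for each input-derived building block."""
    for k, cell in enumerate(row):
        yield ("col", k), cell          # the whole cell
        yield ("initial", k), cell[:1]  # its first letter


def evaluate(program, row):
    """Run a program (a tuple of parts) on one row of input cells."""
    out = []
    for kind, arg in program:
        if kind == "col":
            out.append(row[arg])
        elif kind == "initial":
            out.append(row[arg][:1])
        else:  # "const": a literal character
            out.append(arg)
    return "".join(out)


def search(row, target):
    """Enumerate every program whose output on `row` equals `target`."""
    if not target:
        yield ()
        return
    for part, value in atoms(row):
        if value and target.startswith(value):
            for rest in search(row, target[len(value):]):
                yield (part,) + rest
    # Fallback: emit the next character as a literal.
    for rest in search(row, target[1:]):
        yield (("const", target[0]),) + rest


def cost(program):
    """Rank programs: fewer literal characters first, then fewer parts."""
    literals = sum(1 for kind, _ in program if kind == "const")
    return (literals, len(program))


def synthesize(examples):
    """Return all programs consistent with the examples, best first."""
    first_row, first_out = examples[0]
    candidates = [
        p for p in search(first_row, first_out)
        if all(evaluate(p, row) == out for row, out in examples)
    ]
    return sorted(candidates, key=cost)


ranked = synthesize([(("Sumit", "Gulwani"), "Sumit Gulwani")])
print(len(ranked), "consistent programs; best:", ranked[0])
print(evaluate(ranked[0], ("Ada", "Lovelace")))
```

With a single example, several programs are consistent, including one that spells out the name character by character and would be useless on other rows. The ranking step is what selects the program that generalizes, which is why inferring intent from few examples depends so heavily on it.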

Gulwani envisions that the ongoing AI revolution will further drive the integration of PBE and machine learning, facilitating the creation of intelligent software in general. But will emerging technologies like deep learning genuinely disrupt the existing field of program learning? We look forward to Gulwani’s talk.

On Nov 9, 2018, Sumit Gulwani will speak at the AI Frontiers Conference in San Jose, California.


The AI Frontiers Conference brings together AI thought leaders to showcase cutting-edge research and products. Besides Sumit Gulwani, other speakers include: Ilya Sutskever (Co-founder of OpenAI), Jay Yagnik (VP of Google AI), Kai-Fu Lee (CEO of Sinovation), Mario Munich (SVP of iRobot), Quoc Le (Google Brain), Pieter Abbeel (Professor at UC Berkeley) and more.

For more information, please visit aifrontiers.com