Natural Language Interface
July 29, 2013
It's a scene played out in many a comic book, novel, television show, and movie. The protagonist, say Captain Kirk, is talking to his computer and expecting answers.
Kirk: "Computer, how many furlongs in a light year?"
There's the memorable scene in the movie, Star Trek IV: The Voyage Home, in which Scotty attempts to interact with a computer by talking into a mouse. Also, by 2001, we should have had talking computers like the HAL 9000.
Everyone seems to think that computers should accept verbal questions and give good answers; that is, computers should have a natural language user interface (NL). Some applications are now coming close to that ideal. Siri would be one example.
One reason why we're getting close to the Star Trek ideal is the phenomenal computing power that's now available at such a low cost in many consumer devices. Another reason might be the decision by NL scientists to divorce themselves from the artificial intelligence (AI) field. Many computer scientists decided to stop identifying their work as artificial intelligence when there was a backlash against AI's being over-sold to funding agencies in the last decades of the twentieth century.
NL was long in coming, since the problem goes far beyond parsing speech into a text file for further processing. The question, "List all bloggers on web sites with physics degrees," might be a problem for an NL system, since the phrase "with physics degrees" could attach to either the bloggers or the web sites, and there are probably no web sites with physics degrees.
Computer scientists at MIT's Computer Science and Artificial Intelligence Laboratory have tackled a natural language interface for the specific task of forming regular expressions. Regular expressions are a compact pattern-matching notation built into many programming languages and word processors to aid text search and replacement. This research was presented in June at the annual conference of the North American Chapter of the Association for Computational Linguistics.[1-2]
I used a regular expression in OpenOffice to remove extra linefeeds from the manuscript of one of my novels. Unless you use these expressions regularly (pun intended), it takes a while to craft the exact expression that solves your problem. As can be seen in the example in the following figure, you essentially need to be a computer scientist to craft even a simple regular expression.
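As a rough illustration of that kind of cleanup, here is a minimal Python sketch. It is not the expression used in OpenOffice for the manuscript; the function name collapse_blank_lines and the rule "keep at most one blank line" are assumptions made for the example.

```python
import re

def collapse_blank_lines(text):
    """Collapse runs of three or more newlines into a single blank line."""
    # Assumption: treat Windows line endings (\r\n) as ordinary newlines.
    text = text.replace("\r\n", "\n")
    # \n{3,} matches three or more consecutive newlines,
    # which is the same as two or more blank lines in a row.
    return re.sub(r"\n{3,}", "\n\n", text)

sample = "Chapter One\n\n\n\nIt was a dark night.\r\n\r\n\r\nThe end.\n"
print(collapse_blank_lines(sample))
```

Even here, the quantifier {3,} and the escaped newline have to be known in advance, which is the sort of detail that makes these expressions slow to write from memory.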
It's interesting to note that even computer scientists have a hard time crafting regular expressions. When Nate Kushman, an MIT graduate student, presented the paper on a natural language interface for regular expressions to a roomful of computer scientists, he asked them to write a regular expression for a simple search. After displaying the proper expression, he polled the audience to see how many had written the right expression. Just a few had.
A natural language interface for regular expressions would allow absent-minded computer scientists and non-programmers alike to do efficient search and replacement tasks in their word processors and spreadsheets. Kushman and Regina Barzilay, an associate professor of computer science and electrical engineering, used examples that they harvested from the Internet to train a computer system to generate regular expressions from natural language queries.
There are some examples where a forced syntax for a language query yields excellent results. One of these is structured query language (SQL), which is used for many database applications. Unfortunately, there isn't a good mapping between natural language and regular expressions. As an example, a regular expression for a search for three-letter words starting with an "X," as shown in the figure, doesn't have any part that means "three letters."
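One common way to write such a search in Python's re syntax is sketched below. The exact expression in the figure may differ; the pattern and the helper name find_three_letter_x_words are assumptions for illustration.

```python
import re

# \b          word boundary
# X           the literal first letter
# [A-Za-z]{2} exactly two more letters
# Nothing in the pattern says "three": the count is split into "X" plus "{2}".
THREE_LETTER_X = re.compile(r"\bX[A-Za-z]{2}\b")

def find_three_letter_x_words(text):
    """Return every three-letter word in text that starts with a capital X."""
    return THREE_LETTER_X.findall(text)

print(find_three_letter_x_words("Xen and Xylophone met XML near the Xbox, not xyz."))
# prints ['Xen', 'XML']
```

The English phrase has three parts ("three letter," "word," "starting with X"), but the pattern's parts don't line up with them one-to-one.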
Kushman and Barzilay found that it's possible to write regular expressions that map to natural language and are equivalent to the usual expressions. These expressions are not very succinct, nor are they intuitive. However, when they're found, they can be identified with their succinct counterparts using a graph. An example of the regular expression for finding three-letter words starting with an "X" is shown below. Compare with the regular expression in the first figure.
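The idea of two differently shaped expressions accepting the same words can be sketched in Python. This is only a brute-force check over short strings from a small, assumed alphabet, not the graph-based equivalence test described in the paper. The fragment patterns and the function names are assumptions for the example.

```python
import re
from itertools import product

def compositional(word):
    """Form that mirrors the English phrase: two fragments joined by AND."""
    is_three_letters = re.fullmatch(r"[A-Za-z]{3}", word) is not None  # "three letter"
    starts_with_x = re.fullmatch(r"X.*", word) is not None             # "starting with 'X'"
    return is_three_letters and starts_with_x

def succinct(word):
    """Form a programmer would usually write."""
    return re.fullmatch(r"X[A-Za-z]{2}", word) is not None

def agree_on_all_strings(alphabet="Xab1", max_len=4):
    """Check that both forms accept exactly the same strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            word = "".join(chars)
            if compositional(word) != succinct(word):
                return False
    return True

print(agree_on_all_strings())  # prints True
```

A check like this can only find disagreements within the strings it tries. Proving equivalence for all inputs needs a method that works on the structure of the expressions themselves.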
Computer: "Working... There are 46,996,813,387,000 furlongs in a light year."
Kirk: "Well, Spock, I don't think he traveled by horse."
Spock: "Agreed, Captain."
In other work on natural language processing at MIT, Barzilay, Tao Lei, Fan Long and Martin Rinard have developed a program, called an input parser, to separate the data from other information in a computer file. A text file, for example, might have information about text formatting along with the actual text.
Their parser interprets the natural language specification of the file format, something that a programmer needs to do when creating a program to read and write such files. The MIT team had a good development resource, about 180 file format examples used in the Association for Computing Machinery's International Collegiate Programming Contest. The MIT natural language interface succeeded in about 80 percent of the specifications. In the failed cases, changing just a word or two of the specification usually gave a working input parser.
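To give a sense of what such an input parser looks like, here is a hand-written Python sketch for a made-up, contest-style specification: "The first line contains an integer T, the number of test cases. Each test case starts with an integer N, followed by N lines that each contain two integers." Both the specification and the function name parse_input are hypothetical. This is not output from the MIT system.

```python
def parse_input(text):
    """Parse the hypothetical format: T test cases, each with N pairs of integers."""
    # Contest inputs are usually whitespace-delimited, so line breaks
    # can be treated the same as spaces.
    tokens = iter(text.split())
    num_cases = int(next(tokens))          # "an integer T, the number of test cases"
    cases = []
    for _ in range(num_cases):
        n = int(next(tokens))              # "each test case starts with an integer N"
        pairs = []
        for _ in range(n):                 # "N lines that each contain two integers"
            first = int(next(tokens))
            second = int(next(tokens))
            pairs.append((first, second))
        cases.append(pairs)
    return cases

print(parse_input("2\n2\n1 2\n3 4\n1\n5 6\n"))
# prints [[(1, 2), (3, 4)], [(5, 6)]]
```

Each loop and each int() call corresponds to a phrase in the English specification, which is the correspondence a natural language interface would have to recover.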
The natural language interface was efficient, taking about ten minutes of calculation on an ordinary laptop to produce the parsers for all these specifications. Luke Zettlemoyer, an assistant professor of computer science and engineering at the University of Washington, who was not a part of the natural language interface team, said that "the techniques they have developed should definitely generalize to other related programming tasks."
[Figure fragment: the phrase "starting with 'X'" maps to the regular expression X.*]
1. Larry Hardesty, "Writing programs using ordinary language," MIT Press Release, July 11, 2013.
2. Nate Kushman and Regina Barzilay, "Using Semantic Unification to Generate Regular Expressions from Natural Language," preprint available at MIT web site (PDF file).
3. Tao Lei, Fan Long, Regina Barzilay and Martin Rinard, "From Natural Language Specifications to Program Input Parsers," preprint available at MIT web site (PDF file).