The Web of Language

It's alive! New computer learns language like a human, almost

Oct 8, 2010 3:15 pm by debaron@illinois.edu
A computer at Carnegie Mellon University is reading the internet and learning from it in much the same way that humans learn language and acquire knowledge, by soaking it all up and figuring it out in their heads.
People’s brains work better some days than others, and eventually we will all run out of steam, but the creators of NELL, the Never Ending Language Learner, want it to run forever, getting better every day in every way, until it becomes the largest repository imaginable of all that’s e’er been thought or writ.

Not all brains run out of steam. In the 1953 movie “Donovan’s Brain,” the preserved brain of a millionaire uses the body of a well-meaning cognitive scientist to take revenge on the enemies who outlived him.

Since the first “electronic brains” began to appear in the late 1940s, it has been the goal of computer engineers and the occasional mad scientist to fashion machines that think and learn like people do. Or at least machines that perform functions analogous to some aspects of human thought, and which also self-correct by analyzing their mistakes and doing better next time around.

Above, Dr. Frankenstein cries, “Alive! It’s alive!” in Mel Brooks’ 1974 film, “Young Frankenstein.” Below, the New York Times announced yet another in a growing string of “electronic” or “electric” brains designed to “take the place of human minds” (July 3, 1949).

Setting out to create an infinite and immortal database is a big task: there’s a lot for NELL to learn in cyberspace, and a whole lot more that has yet to be digitized. But since NELL was activated a few months ago it has learned over 440,000 separate things with an accuracy of 74% which, to put it in terms that any Carnegie Mellon undergraduate can understand, is a C. In contrast, I have no idea how to count what I’ve learned since my own brain went on line, and no idea how many of the things that I know are actually correct, which suggests that all I’ve got on my cerebral transcript is an Incomplete.

An early infinite, immortal database: In this 1956 episode of “Science Fiction Theatre,” Dr. Lewis Milton (above) demonstrates his “mind machine,” which stores brain waves as electrical impulses and doubles as a hair dryer. The computer can then type out the uploaded brain waves on the “mindwriter” (below). The mind machine allows Milton to communicate his own uploaded spooky thoughts after he dies.

NELL’s programmers seeded it with some facts and relations so that it had something to start with, then set it loose on the internet to look for more. NELL sorts what it finds into categories like mountains, scientists, writers, reptiles, universities, web sites, or sports teams, and relations like “teamPlaysSport, bookWriter, companyProducesProduct.”
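For readers who think in code, the two kinds of knowledge described above can be pictured roughly as follows. This is only an illustrative sketch, not NELL's actual data model: the class names, the `entities_in` helper, and the sample entries are assumptions made up for this example (the category and relation names come from the paragraph above; the Frankenstein and Steelers entries are ordinary facts chosen for illustration, not drawn from NELL's knowledge base).

```python
from typing import NamedTuple


class CategoryFact(NamedTuple):
    """Hypothetical record: an entity filed under a category."""
    entity: str
    category: str


class RelationFact(NamedTuple):
    """Hypothetical record: a named relation linking two entities."""
    relation: str
    subject: str
    obj: str


# Illustrative seed facts, not NELL's real seed data.
seed_facts = [
    CategoryFact("ketchup", "condiment"),
    CategoryFact("cookies", "bakedGood"),
    RelationFact("bookWriter", "Frankenstein", "Mary Shelley"),
    RelationFact("teamPlaysSport", "Pittsburgh Steelers", "football"),
]


def entities_in(category, facts):
    """Return every entity filed under the given category."""
    return [
        f.entity
        for f in facts
        if isinstance(f, CategoryFact) and f.category == category
    ]
```

Asking this toy store for `entities_in("condiment", seed_facts)` returns only ketchup, which is about as far as a lookup table gets you before the reading and guessing begins.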

NELL also judges the facts it finds, promoting some of them to the higher category of “beliefs” if they come from a single trusted source, or if they come from multiple sources that are less reliable. According to the researchers, “More than half of the beliefs were promoted based on evidence from multiple [i.e., less reliable] sources,” making NELL more of a rumor mill than a trusted source. And once NELL promotes a fact to a belief, it stays a belief: “In our current implementation, once a candidate fact is promoted as a belief, it is never demoted,” a process that sounds more like religion than science.
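The promotion rule just described is simple enough to sketch in a few lines. What follows is a toy model under stated assumptions, not the researchers' implementation: the `Candidate` and `KnowledgeBase` names are invented here, and the threshold of two weaker sources is a guess, since the quoted description says only "multiple."

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate fact with counts of supporting sources."""
    fact: str
    trusted_sources: int = 0
    weaker_sources: int = 0


class KnowledgeBase:
    def __init__(self, min_weaker_sources=2):
        # Assumption: "multiple" less reliable sources means at least two.
        self.min_weaker_sources = min_weaker_sources
        self.beliefs = set()

    def consider(self, candidate):
        """Promote a candidate if warranted; report whether it is a belief."""
        # Once promoted, a belief is never demoted, whatever the new evidence.
        if candidate.fact in self.beliefs:
            return True
        if (candidate.trusted_sources >= 1
                or candidate.weaker_sources >= self.min_weaker_sources):
            self.beliefs.add(candidate.fact)
            return True
        return False
```

Note what is missing: there is no method for removing anything from `beliefs`. A fact that arrives later with no support at all still counts as believed if it was promoted earlier, which is the never-demoted behavior the researchers describe.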

Sometimes NELL makes mistakes: the computer incorrectly labeled “right posterior” as a body part. NELL proved smart enough to call ketchup a condiment, not a vegetable, a mislabeling that we owe to the “great [human] communicator,” Ronald Reagan. But its human handlers had to tell NELL that Klingon is not an ethnic group, despite the fact that many earthlings think it is. Alex Trebek would be happy to know that, unlike Sean Connery, NELL has no trouble classifying therapists as a “profession,” but the computer trips up on the rapists, which it thinks could possibly be “awardtrophytournament” (confidence level, 50%).

NELL knows that cookies are a “baked good,” but that caused the computer to assume that persistent cookies and internet cookies are also baked goods. But that’s not surprising, since it still hasn’t learned what metaphors are—NELL is only 87.5% confident that metaphors are “tools” (plus, according to NELL, there’s a 50-50 chance that metaphors are actually “book writers”).

Told by its programmers that Risk is a board game, NELL predicts with 91.4% confidence that security risk is also a board game. NELL knows that a number is a character, but then incorrectly classifies the plural, numbers, as a character trait (93.8% confidence). The computer is also 99.9% confident that business is an academic field, which may be reassuring to those in the b-school if not to those small business owners worrying about the continuation of the Bush tax cuts.

Most recently, NELL learned that grain products is also a “baked good” and that anti-American cleric Muqtada al Sadr is a “terrorist organization.” But the First Amendment proves a stumper: NELL with weak confidence calls the First Amendment a musical instrument, classifies the Second Amendment as a “hobby,” and is completely unwilling to confess any knowledge of the Fifth Amendment at all.

NELL’s classification of First Amendment as, possibly, a musical instrument

But NELL’s programmers weren’t at all surprised that they needed to perform some minor tweaks to get the computer back on track, since as they put it, “One might expect a nonnative reader of English to make similar mistakes.” In their view, NELL is only human.

According to Bret and Jemaine, in the distant future computers won’t need humans to tweak them

It remains to be seen exactly how life-like NELL’s language learning really is. For one thing, the computer is reading its input, while most human language learners acquire language by listening and talking. Putting our love or fear of anthropomorphic computers aside for the moment, it’s clear that while NELL may have a bigger and more accurate memory than any human, it’s still a long way from being able to parse a question like, “What has four wheels and flies?”—something children learning language find both easy and funny, but machines don’t.

What’s next in the effort to make computers more like us? It wasn’t just memory that prompted the analogy between computers and human brains. Above: even MIT’s Norbert Wiener thought computers, often called calculating machines, could be programmed with emotions as well as logic. New York Times, May 30, 1949. Below: as we learned from Star Trek, mind machines like Mr. Data can't understand jokes until their emotion chips are turned on.

"The clown can stay, but the Ferengi in the gorilla suit has to go."