Humans Fight, But Watson’s Chips Beat Quiz Champs

    February 21, 2011, by Timothy Prickett Morgan

    Like a lot of Americans, the only time I ever watch the Jeopardy! game show is if I happen to be exhausted on a Friday night, I happen to have my work done early, and I happen to feel like spending a little cash to go down to the local beer and burger joint with the wife and kids. And when we do go, we inevitably fight over who gets to sit on the benches that face the TV so we can watch the game clues and try to come up with the questions and play along.

    But last week, from Monday through Wednesday, my kids and I sat absolutely riveted to the TV at 7 p.m., watching the contest between the all-time Jeopardy! champs, Ken Jennings and Brad Rutter, who took on IBM’s Watson question-answer machine (what some people are calling a supercomputer) in a two-round, three-day championship tournament that pitted men against machine on the slippery puns of this game show, which is as old as IBM’s System/360 mainframe.

    In my conversation with IBM principal investigator David Ferrucci back in April 2009, when the “Blue J” project was launched to the world as the Watson QA system, Ferrucci told me that the machine would be processing natural language like we humans do. And in a classic example of how tough it is to actually communicate in this world, what I heard when Ferrucci said that was “audio and visual text processing,” while what Ferrucci meant was “accepting textual input and breaking sentences down into components.” Watson did not hear Jeopardy! host Alex Trebek read the clues, and it could not hear when a human contestant gave a response that was wrong, which is how it came to inadvertently repeat one. (Over the course of three days, that happened once, I believe.)

    I was a bit disappointed by this, but David Gondek, one of the researchers at IBM’s TJ Watson Research Center in Hawthorne, New York, where the three-day tournament was played on a full mock-up of the Jeopardy! set, explained to me that IBM felt the challenges in parsing sentences, coming up with responses, and creating a category and betting strategy were hard enough without adding speech recognition for the three different people Watson would have been able to hear, if it had ears. As it turns out, the moment a clue is revealed and Trebek starts reading it out loud (human players Jennings and Rutter could read it on their screens), the same clue is transmitted as a text message to Watson. The buzzer is dead for all three players until Trebek finishes reading the clue and a human operator behind the panel flips a switch that turns the clue screen border red. At that moment, the player buzzers are enabled, and if a player hits a buzzer before the clue screen turns red, that player’s buzzer is locked out for a half second, thus giving someone else the jump.
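
    That lockout rule is easy to model. Here is a minimal sketch in Java, assuming a flat half-second penalty per early press; the class and method names are mine, not anything IBM or the show’s producers have published:

        // Simplified model of the buzzer rule described above: buzzers are
        // dead until the operator enables them, and a press that jumps the
        // gun locks that player out for half a second.
        import java.util.HashMap;
        import java.util.Map;

        public class BuzzerGate {
            private static final long LOCKOUT_MS = 500;
            private long enabledAtMs = Long.MAX_VALUE;  // switch not flipped yet
            private final Map<String, Long> lockedUntilMs = new HashMap<>();

            // The human operator flips the switch; the border turns red now.
            public void enable(long nowMs) {
                enabledAtMs = nowMs;
            }

            // Returns true if this press rings in; pressing early earns a lockout.
            public boolean press(String player, long nowMs) {
                if (nowMs < enabledAtMs) {              // jumped the gun
                    lockedUntilMs.put(player, nowMs + LOCKOUT_MS);
                    return false;
                }
                return nowMs >= lockedUntilMs.getOrDefault(player, 0L);
            }
        }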

    Jennings is known to be at one with the Jeopardy! buzzer, which is one reason why he was able to win 74 straight Jeopardy! games. I happen to think that after Jennings had been playing for a while, the other human players got psyched out as they took their turns trying to take him down, probably jumping the gun as many times as they missed the buzz. Watson doesn’t have emotions, can’t be psyched out, and viciously ruled the buzzer during most of the three days of play.

    Watson’s original hardware was a few racks of BlueGene/P massively parallel supercomputing nodes, but last year, knowing that the Jeopardy! challenge would be the biggest infomercial for Power Systems that Big Blue could ever wish for, the company switched to decidedly midrange Power 750 servers. Watson’s processing nodes are four-socket Power 750 machines, which use the latest eight-core Power7 chips from IBM. There are nine machines in each rack, for a total of 90 servers and 2,880 cores (90 servers times four sockets times eight cores per socket). The cores spin at 3.55 GHz.

    The cluster has 16 TB of main memory and 4 TB of clustered storage, and it has been stuffed with some 200 million pages of text data. It is not clear whether Watson uses flash or disk drives, or what connects the server nodes together. (Gondek is not a hardware guy.) But I suspect that the machine has flash drives and uses either 40 Gb/sec InfiniBand or 10 Gigabit Ethernet switches to link the nodes; I would be using InfiniBand and its Remote Direct Memory Access (RDMA) capability if I were designing the hardware. What I can tell you is that the data is distributed across multiple nodes, and there is redundancy in the way the data is spread around so that a node failure doesn’t kill the machine as it plays.

    IBM’s researchers chose Novell’s SUSE Linux Enterprise Server 11 operating system to run on the nodes; it has slightly better performance on certain kinds of HPC work than either AIX or IBM i. The secret sauce in Watson is a set of software that IBM calls DeepQA, and the company is not saying much about precisely what this stack of code is. But I have been able to piece together a few things.

    The Apache Software Foundation was bragging that Watson makes use of the Hadoop data chunking program to organize information and make it accessible in a parallel, and therefore superfast, fashion. (Hadoop is an open source analog to Google’s proprietary MapReduce data chunking technique, implemented by geeks at Yahoo after they read a paper describing Google’s approach.)
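
    To give a flavor of the MapReduce pattern Hadoop implements, here is the canonical Hadoop word count job. Watson’s actual Hadoop jobs have not been published, so treat this as an illustration of the technique, not of IBM’s code:

        // Classic Hadoop word count: the map phase emits (word, 1) pairs in
        // parallel across the cluster, and the reduce phase sums the counts
        // for each word. This illustrates the pattern, not Watson's jobs.
        import java.io.IOException;
        import java.util.StringTokenizer;
        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.Mapper;
        import org.apache.hadoop.mapreduce.Reducer;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

        public class WordCount {
            public static class TokenizerMapper
                    extends Mapper<Object, Text, Text, IntWritable> {
                private final static IntWritable ONE = new IntWritable(1);
                private final Text word = new Text();
                public void map(Object key, Text value, Context context)
                        throws IOException, InterruptedException {
                    StringTokenizer itr = new StringTokenizer(value.toString());
                    while (itr.hasMoreTokens()) {
                        word.set(itr.nextToken());
                        context.write(word, ONE);  // emit (word, 1)
                    }
                }
            }

            public static class IntSumReducer
                    extends Reducer<Text, IntWritable, Text, IntWritable> {
                private final IntWritable result = new IntWritable();
                public void reduce(Text key, Iterable<IntWritable> values,
                        Context context) throws IOException, InterruptedException {
                    int sum = 0;
                    for (IntWritable val : values) sum += val.get();
                    result.set(sum);
                    context.write(key, result);  // emit (word, total)
                }
            }

            public static void main(String[] args) throws Exception {
                Job job = Job.getInstance(new Configuration(), "word count");
                job.setJarByClass(WordCount.class);
                job.setMapperClass(TokenizerMapper.class);
                job.setCombinerClass(IntSumReducer.class);
                job.setReducerClass(IntSumReducer.class);
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            }
        }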

    Hadoop organizes the information, but it is Apache UIMA, short for Unstructured Information Management Architecture, that allows unstructured information (text, audio, and video streams in theory, but text in the Watson example) to be analyzed and run through natural language parsing algorithms to figure out what is going on in that text. IBM started the UIMA effort in 2005, and the OmniFind semantic search engine in its DB2 data warehouses, for instance, is based on it. Since then, IBM has proposed UIMA as a standard and converted it to an open source project.
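
    The UIMA programming model is easy to sketch: the framework hands each document to an annotator’s process() method, and the annotator attaches typed annotations to spans of text. The toy annotator below, which merely flags capitalized words using the stock Annotation type, is my own illustration; Watson’s annotators are far more elaborate and are not public:

        // A minimal UIMA annotator: process() is called once per document,
        // and each regex match is recorded as an annotation over that span.
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;
        import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
        import org.apache.uima.jcas.JCas;
        import org.apache.uima.jcas.tcas.Annotation;

        public class CapitalizedWordAnnotator extends JCasAnnotator_ImplBase {
            private static final Pattern CAP_WORD = Pattern.compile("\\b[A-Z][a-z]+\\b");

            @Override
            public void process(JCas jcas) {
                Matcher m = CAP_WORD.matcher(jcas.getDocumentText());
                while (m.find()) {
                    // A real pipeline would use a type generated from its own
                    // type system descriptor rather than the base Annotation.
                    Annotation ann = new Annotation(jcas, m.start(), m.end());
                    ann.addToIndexes();
                }
            }
        }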

    UIMA has frameworks for Java and C++, and Gondek says that most of the analytic algorithms created for Watson, such as the question analysis, passage scoring, and confidence estimation routines, were written in Java. There is a mix of C and C++ for algorithms where speed is important, and Prolog is used in the question analysis. There are about a million lines of code in these routines.

    So you can’t just take Hadoop and UIMA and create your own Watson Jeopardy!-playing machine. Sorry.

    In addition to searching data in memory and parsing clues, Watson was also taught, with pattern recognition software, how to take the words in a clue and figure out what category of clue it is; meaning, is it looking for a person, place, or thing? Is it geography or a movie? These algorithms learned which kinds of words tend to appear in which kinds of categories by being fed the data from over 15,000 clue-response sets from real Jeopardy! games. Once they figure out the category, Watson sets loose hundreds of algorithms that were created, largely by trial and error, to help it best sift through its data to find the right answer for particular kinds of clues. By chewing through those 15,000 clue-response sets, Watson also learned which algorithms work best for specific Jeopardy! categories. These algorithms, which took years to develop, are what gave Watson confidence in its answers, or showed when it did not have confidence. The different algorithms are given different weights for different categories of questions, and the overall probabilities shown during the tournament are some kind of average of all these statistics.
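
    In code, that weights-per-category idea might look something like the sketch below. The Scorer interface, the weight table, and the simple weighted average are all my own invention to illustrate the shape of the thing, not DeepQA’s actual machinery:

        // Illustrative per-category weighted scoring: each algorithm grades a
        // candidate answer, and weights learned for the clue's category blend
        // the grades into a single confidence figure.
        import java.util.List;
        import java.util.Map;

        public class ConfidenceEstimator {
            interface Scorer {
                String name();
                double score(String clue, String candidate);  // returns 0.0 to 1.0
            }

            // e.g. weightsByCategory.get("U.S. CITIES").get("passage-overlap")
            private final Map<String, Map<String, Double>> weightsByCategory;

            ConfidenceEstimator(Map<String, Map<String, Double>> weightsByCategory) {
                this.weightsByCategory = weightsByCategory;
            }

            // Weighted average of all the scorers, using weights learned for
            // this category from past clue-response sets.
            double confidence(String category, String clue, String candidate,
                    List<Scorer> scorers) {
                Map<String, Double> weights =
                        weightsByCategory.getOrDefault(category, Map.of());
                double weighted = 0.0, totalWeight = 0.0;
                for (Scorer s : scorers) {
                    double w = weights.getOrDefault(s.name(), 0.0);
                    weighted += w * s.score(clue, candidate);
                    totalWeight += w;
                }
                return totalWeight == 0.0 ? 0.0 : weighted / totalWeight;
            }
        }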

    Here are the clever bits about Watson. First, the eureka moment that turned Watson into a much better player than it was originally came when the IBM Research team figured out that, unlike a normal search engine, which gives every term it is tracking down and cross-linking the same weight, Watson would have to learn how to zoom in on the important words in a clue, and do so quickly. This helps it identify the category and cut down the number of possible answers, which helps the machine come up with the answer quickly.
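
    A standard way to make rare, informative words count for more than common ones is inverse document frequency weighting. Nobody at IBM said Watson used exactly this formula, so take the snippet below as a stand-in for the general “zoom in on the important words” idea:

        // Inverse document frequency: idf(t) = ln(N / df(t)), so a term that
        // appears in few documents ("Stoker") far outweighs a term that
        // appears in nearly all of them ("the").
        import java.util.HashMap;
        import java.util.HashSet;
        import java.util.List;
        import java.util.Map;

        public class TermWeights {
            static Map<String, Double> idf(List<List<String>> docs) {
                Map<String, Integer> docFreq = new HashMap<>();
                for (List<String> doc : docs)
                    for (String term : new HashSet<>(doc))   // count each doc once
                        docFreq.merge(term, 1, Integer::sum);
                Map<String, Double> weights = new HashMap<>();
                for (Map.Entry<String, Integer> e : docFreq.entrySet())
                    weights.put(e.getKey(),
                            Math.log((double) docs.size() / e.getValue()));
                return weights;
            }
        }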

    Another key insight was to limit the data. Rather than just sucking all of the data out of the Internet (which would be vastly more than the 200 million pages Watson holds), Watson relies on Wikipedia, the Bible, the Oxford English Dictionary, and various encyclopedias that summarize a lot of different data as its information sources. Feeding it raw data, such as novels or full technical manuals, would only end up confusing the machine and making it worse at playing the game, Gondek explained to me. By restricting itself to what are essentially encyclopedic resources that have already culled down lots of data about zillions of things, Watson has something akin to CliffsNotes for the very broad domain encompassed by Jeopardy!

    If you missed the Jeopardy! challenge, you can watch it on YouTube, although you will have to hunt and peck around for the different video pieces. (I am fairly certain these are protected by copyright and may not be available when you go to view them.)

    If you want to review the clues and responses, check out the J-Archive, a community-driven site that posts the clues and responses to thousands of games. The first day’s game is number 3575 in the archive, from February 14. The second part of the first game is number 3576, from February 15, and the final day, which was a whole, normal Jeopardy! match, is number 3577.

    I kept track of the scores in real time and built a table showing how each player fared over the three days. Humanity did OK in two of the four rounds, but got whupped in the other two, as the round-by-round totals show:

                                  Ken Jennings    IBM Watson    Brad Rutter
    Round One Finals:                   $2,000        $5,000         $5,000
    Round Two Final Jeopardy:       bet $2,400      bet $947     bet $5,000
                                         right         wrong          right
    Round Two Finals:                   $4,800       $35,734        $10,400
    Round Four Final Jeopardy:      bet $1,000   bet $17,973     bet $5,600
                                         right         right          right
    Round Four Finals:                 $19,200       $41,413        $11,200
    Grand Totals:                      $24,000       $77,147        $21,600

    As Ken Jennings, who actually beat Watson in a full beta test run of the game back in January (IBM didn’t tell us that ahead of the show), put it beneath his Final Jeopardy response: “I, for one, welcome our new computer overlords.”

    It will be interesting to see how many doctors, lawyers, and middle managers welcome a Watson into their offices when the system is reprogrammed to do question-answer analysis on medicine, the law, and business.

    RELATED STORIES

    Humans $4,600, Watson $4,400 in Jeopardy! Beta Test Round

    IBM’s Watson Supercomputer to Play Jeopardy! and Challenge Humanity

    IBM Gets Hybrid with Servers, Talks Up BAO Boxes


