Years ago, artificial intelligence (AI) was one of the hottest areas in computer science. In the mid-1970s I managed a small team in our Watson Research Center doing work in AI (my first management job in IBM). When you look back, all the efforts from those days to create computers with AI capabilities pretty much fizzled, including a massive project launched in Japan in the 1980s called The Fifth Generation Computer Project. For a variety of reasons we got it all wrong: we were way off base in thinking that we could program computers to act intelligently, and we naively underestimated the kind of computing power and storage needed for such problems.
In the 1990s, we finally started making progress with a very different approach. This time around, we took a brute force approach -- relying on a computer's ability to store huge amounts of information and analyze it with vast amounts of computational power -- and discovered that this mixture, when properly focused on a problem, produced something akin to intelligence or knowledge. Deep Blue, IBM's chess-playing supercomputer, demonstrated this point by beating then-reigning world chess champion Garry Kasparov in a celebrated match in May 1997.
Since that time, analyzing or searching large amounts of information has become increasingly important and commonplace. Today, most of us use search engines as the primary mechanism for finding information on the World Wide Web, and increasingly on our PCs. Search engines rely primarily on finding specific words or phrases. It is amazing how useful these keyword-based approaches have proven to be in everyday use, but they can only go so far.
The next major frontier involves discovering the valuable knowledge that is embedded deep down in collections of information, not just in the WWW, but in the enormous amounts of unstructured information all around us that are now being digitized, including all manner of business and government documents, technical manuals, customer service reports, e-mails, voice conversations, images, videos, blogs, podcasts, and on and on. We are not only digitizing just about everything in sight, but we are now able to store, access and analyze this growing mountain of unstructured information by relying on our increasingly powerful and inexpensive computer technologies. Whole new classes of applications are now emerging to leverage all that discovered knowledge to satisfy customers, anticipate problems and quickly find solutions, and develop new business opportunities in health care, pharmaceuticals, customer services, security, and many other areas.
To make it possible to extract or discover useful knowledge, unstructured information must be analyzed to locate the basic entities and relationships of interest, which must then be structured so that powerful search technologies can efficiently find what you need, when you need it. Since there are so many types of information and so many forms that useful knowledge can take, there is (so far) no one universal analysis engine that can do it all. Rather, you need a platform on which to develop and run the variety of analysis and search engines that are needed to bridge from the unstructured to the structured worlds.
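To make the idea of bridging from unstructured to structured information a little more concrete, here is a toy sketch in Python. It is not UIMA and does not use UIMA's API; every name in it (`Annotation`, `regex_annotator`, `build_index`, the sample reports and the product names) is made up for illustration. It assumes that simple regular expressions are good enough to spot the entities of interest, which real analysis engines would not, and it leaves out relationships between entities entirely.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A typed span of text found by an analysis step (hypothetical type)."""
    kind: str   # e.g. "product" or "error_code"
    text: str   # the text the annotation covers
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


def regex_annotator(kind, pattern):
    """Build an annotator that tags every match of `pattern` with `kind`."""
    compiled = re.compile(pattern)

    def annotate(document):
        return [Annotation(kind, m.group(0), m.start(), m.end())
                for m in compiled.finditer(document)]

    return annotate


def analyze(document, annotators):
    """Run each annotator over the text and merge results in text order."""
    found = []
    for annotate in annotators:
        found.extend(annotate(document))
    return sorted(found, key=lambda a: a.start)


def build_index(documents, annotators):
    """Map (kind, normalized text) to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for ann in analyze(text, annotators):
            index[(ann.kind, ann.text.lower())].add(doc_id)
    return index


def search(index, kind, value):
    """Return the sorted ids of documents that mention the given entity."""
    return sorted(index.get((kind, value.lower()), set()))


# Made-up customer service reports standing in for unstructured information.
DOCUMENTS = {
    "report-1": "Customer reports error E1042 after installing WidgetPro.",
    "report-2": "WidgetPro upgrade went smoothly, no issues.",
    "report-3": "Error E1042 seen again on GadgetMax.",
}

# Two independent analysis steps; more can be added without touching the rest.
ANNOTATORS = [
    regex_annotator("product", r"\b(?:WidgetPro|GadgetMax)\b"),
    regex_annotator("error_code", r"\bE\d{4}\b"),
]

if __name__ == "__main__":
    index = build_index(DOCUMENTS, ANNOTATORS)
    print(search(index, "error_code", "E1042"))
    print(search(index, "product", "widgetpro"))
```

The point of the sketch is the shape rather than the details: each analysis step is a small, independent piece that only knows how to find one kind of thing, and the search side works against the structured results instead of the raw text. That separation is what lets you swap in a better analysis engine for one kind of entity without rewriting everything else.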
The Unstructured Information Management Architecture (UIMA), developed in IBM Research over the last four years, is such a platform: a software architecture and framework for supporting the development, integration and deployment of search and analysis technologies. You can learn more about UIMA in a recent issue of the IBM Systems Journal, which includes papers on UIMA’s applications in life sciences and market intelligence.
Given the sheer complexity of the subject as well as its importance to the overall IT community, UIMA is an open, collaborative initiative in which IBM is playing a leading role. The project has received significant support from DARPA, the research arm of the US Department of Defense that is probably best known for having funded the development of the Internet. Several leading universities have been participating in the project, including Carnegie Mellon, Columbia, Stanford and The University of Massachusetts (Amherst). Other organizations actively supporting UIMA include Science Applications International Corp., BBN Technologies, the Mayo Clinic and MITRE Corporation.
To encourage everyone to experiment with UIMA, the software has been available for download free of charge for a while now. This week, we are going a step further by announcing our plans to donate UIMA to the open source community. We are also announcing that we are integrating UIMA capabilities into our enterprise search platform, WebSphere Information Integrator OmniFind Edition, and that more than 15 companies are announcing plans to develop UIMA-compliant software, solutions and services. (Update: see also a very good blog story on Monday's UIMA announcement.)
A lot has changed since I first worked on artificial intelligence and knowledge-based problems thirty years ago (probably foremost for me being the fact that I am twice as old as I was then). We have learned how incredibly complicated it is to make our dumb computers appear intelligent, perhaps still the grandest challenge of all. But our technologies have advanced way beyond our expectations; we have the Internet as the most wonderful platform for innovation anyone ever created; and we are learning to collaborate with each other in tackling the toughest problems. We have a lot to do, but we are making real progress.
Very good information about UIMA. Your link is incorrect in that you need to take off the /index.htm piece.
Posted by: Bill Zobrist | August 09, 2005 at 08:55 PM