Nova Spivack - My Public Twine

Wolfram Alpha is Coming -- and It Could be as Important as Google (But It's Completely Different)

Notes:

- This article last updated on March 11, 2009.

- For follow-up, connect with me about this on Twitter here.

- See also: for more details, be sure to read the new review by Doug Lenat, creator of Cyc. He just saw the Wolfram Alpha demo and has added many useful insights.

--------------------------------------------------------------------

Introducing Wolfram Alpha

Stephen Wolfram is building something new -- and it is really impressive and significant. In fact it may be as important for the Web (and the world) as Google, but for a different purpose. It's not a "Google killer" -- it does something different. It's an "answer engine" rather than a search engine.

Stephen was kind enough to spend two hours with me last week to demo his new online service -- Wolfram Alpha (scheduled to open in May). In the course of our conversation we took a close look at Wolfram Alpha's capabilities, discussed where it might go, and what it means for the Web, and even the Semantic Web.

Stephen has not released many details of his project publicly yet, so I will respect that and not give a visual description of exactly what I saw. However, he has revealed it a bit in a recent article, and so below I will give my reactions to what I saw and what I think it means. And from that you should be able to get at least some idea of the power of this new system.

A Computational Knowledge Engine for the Web

In a nutshell, Wolfram and his team have built what he calls a "computational knowledge engine" for the Web. OK, so what does that really mean? Basically it means that you can ask it factual questions and it computes answers for you.

It doesn't simply return documents that (might) contain the answers, like Google does, and it isn't just a giant database of knowledge, like the Wikipedia. It doesn't simply parse natural language and then use that to retrieve documents, like Powerset, for example.

Instead, Wolfram Alpha actually computes the answers to a wide range of questions -- questions that have factual answers, such as "What is the location of Timbuktu?", "How many protons are in a hydrogen atom?", "What was the average rainfall in Boston last year?", "What is the 307th digit of Pi?", or "What would 80/20 vision look like?"

Think about that for a minute. It computes the answers. Wolfram Alpha doesn't simply contain huge amounts of manually entered pairs of questions and answers, nor does it search for answers in a database of facts. Instead, it understands and then computes answers to certain kinds of questions.

(Update: in fact, Wolfram Alpha doesn't merely answer questions, it also helps users to explore knowledge, data and relationships between things. It can even open up new questions -- the "answers" it provides include computed data or facts, plus relevant diagrams, graphs, and links to other related questions and sources. It can also be used to ask questions that are new explorations of relationships, data sets or systems of knowledge. It does not just provide textual answers to questions -- it helps you explore ideas and create new knowledge as well.)

How Does it Work?

Wolfram Alpha is a system for computing the answers to questions. To accomplish this it uses built-in models of fields of knowledge, complete with data and algorithms, that represent real-world knowledge.

For example, it contains formal models of much of what we know about science -- massive amounts of data about various physical laws and properties, as well as data about the physical world.

Based on this, you can ask it scientific questions and it can compute the answers for you, even if it has not been programmed explicitly to answer each question you might ask it.

But science is just one of the domains it knows about -- it also knows about technology, geography, weather, cooking, business, travel, people, music, and more.

Alpha does not answer natural language queries -- you have to ask questions in a particular syntax, or various forms of abbreviated notation. This requires a little bit of learning, but it's quite intuitive and in some cases even resembles natural language or the keywordese we're used to in Google.

The vision seems to be to create a system which can do for formal knowledge (all the formally definable systems, heuristics, algorithms, rules, methods, theorems, and facts in the world) what search engines have done for informal knowledge (all the text and documents in various forms of media).

How Does it Differ from Google?

Wolfram Alpha and Google are very different animals. Google is designed to help people find Web pages. It's basically a big lookup system, a librarian for the Web. Wolfram Alpha, on the other hand, is not at all oriented towards finding Web pages; it's for computing factual answers. It's much more like a giant calculator for computing all sorts of answers to questions that involve or require numbers. Alpha is for calculating, not for finding. So it doesn't compete with Google's core business at all. In fact, it is much more competitive with the Wikipedia than with Google.

On the other hand, while Alpha doesn't compete with Google, Google may compete with Alpha. Google is increasingly trying to answer factual questions directly -- for example unit conversions, questions about the time, the weather, the stock market, geography, etc. But in this area, Alpha has a powerful advantage: it's built on top of Wolfram's Mathematica engine, which represents decades of work and is perhaps the most powerful calculation engine ever built.

How Smart is it and Will it Take Over the World?

Wolfram Alpha is like plugging into a vast electronic brain. It provides extremely impressive and thorough answers to a wide range of questions asked in many different ways, and it computes those answers rather than merely looking them up in a big database.

In this respect it is vastly smarter than (and different from) Google. Google simply retrieves documents based on keyword searches. Google doesn't understand the question or the answer, and doesn't compute answers based on models of various fields of human knowledge.

But as intelligent as it seems, Wolfram Alpha is not HAL 9000, and it wasn't intended to be. It doesn't have a sense of self or opinions or feelings. It's not artificial intelligence in the sense of being a simulation of a human mind. Instead, it is a system that has been engineered to provide really rich knowledge about human knowledge -- it's a very powerful calculator that doesn't just work for math problems -- it works for many other kinds of questions that have unambiguous (computable) answers.

There is no risk of Wolfram Alpha becoming too smart, or taking over the world. It's good at answering factual questions; it's a computing machine, a tool -- not a mind.

One of the most surprising aspects of this project is that Wolfram has been able to keep it secret for so long. I say this because it is a monumental effort (and achievement) and almost absurdly ambitious. The project involves more than a hundred people working in stealth to create a vast system of reusable, computable knowledge, from terabytes of raw data, statistics, algorithms, data feeds, and expertise. But he appears to have done it, and kept it quiet for a long time while it was being developed.

Computation Versus Lookup

For those who are more scientifically inclined, Stephen showed me many interesting examples -- for example, Wolfram Alpha was able to solve novel numeric sequencing problems, calculus problems, and could answer questions about the human genome too. It was also able to compute answers to questions about many other kinds of topics (cooking, people, economics, etc.). Some commenters on this article have mentioned that in some cases Google appears to be able to answer questions, or at least the answers appear at the top of Google's results. So what is the Big Deal? The Big Deal is that Wolfram Alpha doesn't merely look up the answers like Google does, it computes them using at least some level of domain understanding and reasoning, plus vast amounts of data about the topic being asked about.

Computation is in many cases a better alternative to lookup. For example, you could solve math problems using lookup -- that is what a multiplication table is after all. For a small multiplication table, lookup might even be almost as computationally inexpensive as computing the answers. But imagine trying to create a lookup table of all answers to all possible multiplication problems -- an infinite multiplication table. That is a clear case where lookup is no longer a better option compared to computation.

The ability to compute the answer on a case by case basis, only when asked, is clearly more efficient than trying to enumerate and store an infinitely large multiplication table. The computation approach only requires a finite amount of data storage -- just enough to store the algorithms for solving general multiplication problems -- whereas the lookup table approach requires an infinite amount of storage -- it requires actually storing, in advance, the products of all pairs of numbers.
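The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration of the analogy, not a description of how Wolfram Alpha or Google work; the table size (factors 0 through 12) and the function names are assumptions chosen for the example.

```python
# Lookup: a finite multiplication table, precomputed for factors 0..12.
# (The limit of 12 is an arbitrary choice for this illustration.)
TABLE_LIMIT = 12
TABLE = {(a, b): a * b
         for a in range(TABLE_LIMIT + 1)
         for b in range(TABLE_LIMIT + 1)}


def multiply_by_lookup(a, b):
    """Return a*b from the stored table, or None if the pair was never stored."""
    return TABLE.get((a, b))


def multiply_by_computation(a, b):
    """Compute a*b on demand; no per-answer storage is needed."""
    return a * b


print(len(TABLE))                         # 169 stored answers
print(multiply_by_lookup(7, 8))           # 56
print(multiply_by_lookup(123, 456))       # None: this pair is outside the table
print(multiply_by_computation(123, 456))  # 56088
```

The table answers only what was stored ahead of time, and it grows with the square of the largest factor. The computing function stays the same size no matter how large the inputs get.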

(Note: If we really want to store the products of ALL pairs of numbers, it turns out this is impossible to accomplish, because there are infinitely many numbers. It would require an infinite amount of time simply to generate the data, and an infinite amount of storage to store it. In fact, just to enumerate and store all the multiplication products of the numbers between 0 and 1 would require an infinite amount of time and storage. This is because the real numbers are uncountable: there are in fact more real numbers than integers (see the work of Georg Cantor on this). However, the same problem holds even if we are speaking of integers -- it would require an infinite amount of storage to store all their multiplication products, although they at least could be enumerated, given infinite time.)

Using the above analogy, we can see why a computational system like Wolfram Alpha is ultimately a more efficient way to compute the answers to many kinds of factual questions than a lookup system like Google. Even though Google is becoming increasingly comprehensive as more information comes on-line and gets indexed, it will never know EVERYTHING. Google is effectively just a lookup table of everything that has been written and published on the Web, that Google has found. But not everything has been published yet, and furthermore Google's index is also incomplete, and always will be.

Therefore Google does and always will contain gaps. It cannot possibly index the answer to every question that matters or will matter in the future -- it doesn't contain all the questions or all the answers. If nobody has ever published a particular question-answer pair onto some Web page, then Google will not be able to index it, and won't be able to help you find the answer to that question -- UNLESS Google also is able to compute the answer like Wolfram Alpha does (an area that Google is probably working on, but most likely not to as sophisticated a level as Wolfram's Mathematica engine enables).

While Google can only provide answers that are found on some Web page (or at least in some data set it indexes), a computational knowledge engine like Wolfram Alpha can provide answers to questions it has never seen before -- provided, however, that it at least knows the necessary algorithms for answering such questions, and that it has sufficient data to compute the answers using these algorithms. This is a "big if" of course.

Wolfram Alpha substitutes computation for storage. It is simply more compact to store general algorithms for computing the answers to various types of potential factual questions than to store all possible answers to all possible factual questions. In the end, making this tradeoff in favor of computation wins, at least for subject domains where the space of possible factual questions and answers is large. A computational engine is simply more compact and extensible than a database of all questions and answers.

This tradeoff, as Mills Davis points out in the comments to this article, is also referred to as the tradeoff between time and space in computation. For very difficult computations, it may take a long time to compute the answer. If the answer were already stored in a database, retrieving it would of course be faster and more efficient. Therefore, a hybrid approach would be for a system like Wolfram Alpha to store all the answers to any questions that have already been asked of it, so that they can be provided by simple lookup in the future, rather than recalculated each time. There may also already be databases of precomputed answers to very hard problems, such as finding very large prime numbers for example. These should also be stored in the system for simple lookup, rather than having to be recomputed. I think that Wolfram Alpha is probably taking this approach. For many questions it doesn't make sense to store all the answers in advance, but certainly for some questions it is more efficient to store the answers, when you already know them, and just look them up.
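The hybrid approach described above is usually called caching or memoization. Here is a minimal Python sketch of the idea, using the standard library's functools.lru_cache. The function nth_prime and its deliberately slow trial-division method are assumptions made up for this example. Nothing here is known to reflect how Wolfram Alpha stores or reuses results.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive computation actually runs


@lru_cache(maxsize=None)
def nth_prime(n):
    """Return the n-th prime (1-indexed) by trial division; slow on purpose."""
    global call_count
    call_count += 1
    found = 0
    candidate = 1
    while found < n:
        candidate += 1
        # candidate is prime if no d in 2..sqrt(candidate) divides it
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            found += 1
    return candidate


first = nth_prime(1000)   # computed the slow way
second = nth_prime(1000)  # answered from the cache, no recomputation
print(first, second, call_count)
```

The first call pays the full cost in time. Every later call with the same argument trades a little storage for an immediate answer, which is the time-versus-space tradeoff in miniature.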

Other Competition

Where Google is a system for FINDING things that we as a civilization collectively publish, Wolfram Alpha is for COMPUTING answers to questions about what we as a civilization collectively know. It's the next step in the distribution of knowledge and intelligence around the world -- a new leap in the intelligence of our collective "Global Brain." And like any big next-step, Wolfram Alpha works in a new way -- it computes answers instead of just looking them up.

Wolfram Alpha, at its heart, is quite different from a brute force statistical search engine like Google. And it is not going to replace Google -- it is not a general search engine: You would probably not use Wolfram Alpha to shop for a new car, find blog posts about a topic, or to choose a resort for your honeymoon. It is not a system that will understand the nuances of what you consider to be the perfect romantic getaway, for example -- there is still no substitute for manual human-guided search for that. Where it appears to excel is when you want facts about something, or when you need to compute a factual answer to some set of questions about factual data.

I think the folks at Google will be surprised by Wolfram Alpha, and they will probably want to own it, but not because it risks cutting into their core search engine traffic. Instead, it will be because it opens up an entirely new field of potential traffic around questions, answers and computations that you can't do on Google today.

The services that are probably going to be most threatened by a service like Wolfram Alpha are the Wikipedia, Cyc, Metaweb's Freebase, True Knowledge, the START Project, and natural language search engines (such as Microsoft's upcoming search engine, based perhaps in part on Powerset's technology), and other services that are trying to build comprehensive factual knowledge bases.

As a side-note, my own service, Twine.com, is NOT trying to do what Wolfram Alpha is trying to do, fortunately. Instead, Twine uses the Semantic Web to help people filter the Web, organize knowledge, and track their interests. It's a very different goal. And I'm glad, because I would not want to be competing with Wolfram Alpha. It's a force to be reckoned with.

Relationship to the Semantic Web

During our discussion, after I tried and failed to poke holes in his natural language parser for a while, we turned to the question of just what this thing is, and how it relates to other approaches like the Semantic Web.

The first question was whether Wolfram Alpha could (or even should) be built using the Semantic Web in some manner, rather than (or as well as) the Mathematica engine it is currently built on. Is anything missed by not building it with the Semantic Web's languages (RDF, OWL, SPARQL, etc.)?

The answer is that there is no reason that one MUST use the Semantic Web stack to build something like Wolfram Alpha. In fact, in my opinion it would be far too difficult to try to explicitly represent everything Wolfram Alpha knows and can compute using OWL ontologies and the reasoning that they enable. It is just too wide a range of human knowledge and giant OWL ontologies are too difficult to build and curate.

It would of course at some point be beneficial to integrate with the Semantic Web so that the knowledge in Wolfram Alpha could be accessed, linked with, and reasoned with, by other semantic applications on the Web, and perhaps to make it easier to pull knowledge in from outside as well. Wolfram Alpha could probably play better with other Web services in the future by providing RDF and OWL representations of its knowledge, via a SPARQL query interface -- the basic open standards of the Semantic Web. However, for the internal knowledge representation and reasoning that takes place in Wolfram Alpha, OWL and RDF are not required, and it appears Wolfram has found a more pragmatic and efficient representation of his own.

I don't think he needs the Semantic Web INSIDE his engine, at least; it seems to be doing just fine without it. This view is in fact not different from the current mainstream approach to the Semantic Web -- as one commenter on this article pointed out, "what you do in your database is your business" -- the power of the Semantic Web is really for knowledge linking and exchange -- for linking data and reasoning across different databases. As Wolfram Alpha connects with the rest of the "linked data Web," Wolfram Alpha could benefit from providing access to its knowledge via OWL, RDF and SPARQL. But that's off in the future.
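To make "linking data across different databases" a little more concrete, here is a toy Python sketch. It represents facts as (subject, predicate, object) triples in the spirit of RDF and answers a question by joining two patterns, roughly what a SPARQL query does. Everything in it is an assumption for illustration: the two sources, the predicate names, and the helper functions are invented, and it uses plain Python sets rather than real RDF or SPARQL tooling.

```python
# Hypothetical facts as (subject, predicate, object) triples.
# Both "sources" and all identifiers are invented for this illustration.
source_a = {
    ("Timbuktu", "locatedIn", "Mali"),
    ("Boston", "locatedIn", "USA"),
}
source_b = {
    ("Mali", "continent", "Africa"),
    ("USA", "continent", "North America"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def continent_of(city, triples):
    """Join two patterns: city -> country -> continent."""
    for _, _, country in match(triples, subject=city, predicate="locatedIn"):
        for _, _, continent in match(triples, subject=country, predicate="continent"):
            return continent
    return None


linked = source_a | source_b  # "linking" here is simply the union of both sources
print(continent_of("Timbuktu", source_a))  # None: one source alone is not enough
print(continent_of("Timbuktu", linked))    # Africa
```

Neither source can answer the question alone. Once both use the same identifiers, the combined set can, which is why shared standards matter more for exchange than for what happens inside any one engine.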

It is important to note that just like OpenCyc (which has taken decades to build up a very broad knowledge base of common sense knowledge and reasoning heuristics), Wolfram Alpha is also a centrally hand-curated system. Somehow, perhaps just secretly but over a long period of time, or perhaps due to some new formulation or methodology for rapid knowledge-entry, Wolfram and his team have figured out a way to make the process of building up a broad knowledge base about the world practical where all others who have tried this have found it takes far longer than expected. The task is gargantuan -- there is just so much diverse knowledge in the world. Representing even a small area of it formally turns out to be extremely difficult and time-consuming.

It has generally not been considered feasible for any one group to hand-curate all knowledge about every subject. The centralized hand-curation of Wolfram Alpha is certainly more controllable, manageable and efficient for a project of this scale and complexity. It avoids problems of data quality and data-consistency. But it's also a potential bottleneck and most certainly a cost-center. Yet it appears to be a tradeoff that Wolfram can afford to make, and one worth making as well, from what I could see. I don't yet know how Wolfram has managed to assemble his knowledge base in less than a very long time, or even how much knowledge he and his team have really added, but at first glance it seems to be a large amount. I look forward to learning more about this aspect of the project.

Building Blocks for Knowledge Computing

Wolfram Alpha is almost more of an engineering accomplishment than a scientific one -- Wolfram has broken down the set of factual questions we might ask, and the computational models and data necessary for answering them, into basic building blocks -- a kind of basic language for knowledge computing if you will. Then, with these building blocks in hand his system is able to compute with them -- to break down questions into the basic building blocks and computations necessary to answer them, and then to actually build up computations and compute the answers on the fly.

Wolfram's team manually entered, and in some cases automatically pulled in, masses of raw factual data about various fields of knowledge, plus models and algorithms for doing computations with the data. By building all of this in a modular fashion on top of the Mathematica engine, they have built a system that is able to actually do computations over vast data sets representing real-world knowledge. More importantly, it enables anyone to easily construct their own computations -- simply by asking questions.

The scientific and philosophical underpinnings of Wolfram Alpha are similar to those of the cellular automata systems he describes in his book, "A New Kind of Science" (NKS). Just as with cellular automata (such as the famous "Game of Life" algorithm that many have seen on screensavers), a set of simple rules and data can be used to generate surprisingly diverse, even lifelike patterns. One of the observations of NKS is that incredibly rich, even unpredictable patterns, can be generated from tiny sets of simple rules and data, when they are applied to their own output over and over again.
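As a small illustration of simple rules applied to their own output over and over, here is a Python sketch of a one-dimensional "elementary" cellular automaton. It uses Rule 30, one of the rules discussed in NKS. The grid width, the wraparound edges, and the function names are assumptions chosen for this example, and the sketch has nothing to do with Wolfram Alpha's internals.

```python
def step(cells, rule=30):
    """Apply one step of an elementary cellular automaton with wraparound edges."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right  # a value from 0 to 7
        # The rule number's bits say what each of the 8 neighborhoods becomes.
        new_cells.append((rule >> neighborhood) & 1)
    return new_cells


def run(width=31, steps=15, rule=30):
    """Start from a single live cell and return every generation as a string."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = []
    for _ in range(steps):
        rows.append("".join("#" if c else "." for c in cells))
        cells = step(cells, rule)
    return rows


for row in run():
    print(row)
```

The entire "program" is the single number 30 plus a one-cell starting row, yet the printed triangle quickly becomes irregular. Passing rule=110 to run() shows a different rule with the same machinery.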

In fact, cellular automata, by using just a few simple repetitive rules, can compute anything any computer or computer program can compute, in theory at least. But actually using such systems to build real computers or useful programs (such as Web browsers) has never been practical because they are so low-level it would not be efficient (it would be like trying to build a giant computer, starting from the atomic level). 

The simplicity and elegance of cellular automata proves that anything that may be computed -- and potentially anything that may exist in nature -- can be generated from very simple building blocks and rules that interact locally with one another. There is no top-down control, there is no overarching model. Instead, from a bunch of low-level parts that interact only with other nearby parts, complex global behaviors emerge that, for example, can simulate physical systems such as fluid flow, optics, population dynamics in nature, voting behaviors, and perhaps even the very nature of space-time. This is the main point of the NKS book in fact, and Wolfram draws numerous examples from nature and cellular automata to make his case.

But with all its focus on recombining simple bits of information according to simple rules, cellular automata is not a reductionist approach to science -- in fact, it is much more focused on synthesizing complex emergent behaviors from simple elements than in reducing complexity back to simple units. The highly synthetic philosophy behind NKS is the paradigm shift at the basis of Wolfram Alpha's approach too. It is a system that is very much "bottom-up" in orientation. This is not to say that Wolfram Alpha IS a cellular automaton itself -- but rather that it is similarly based on fundamental rules and data that are recombined to form highly sophisticated structures.

Wolfram has created a set of building blocks for working with formal knowledge to generate useful computations, and in turn, by putting these computations together you can answer even more sophisticated questions and so on. It's a system for synthesizing sophisticated computations from simple computations. Of course anyone who understands computer programming will recognize this as the very essence of good software design. But the key is that instead of forcing users to write programs to do this in Mathematica, Wolfram Alpha enables them to simply ask questions in natural language and then automatically assembles the programs to compute the answers they need.

Wolfram Alpha may represent a new approach to creating an "intelligent machine" that does away with much of the manual labor of explicitly building top-down expert systems about fields of knowledge (the traditional AI approach, such as that taken by the Cyc project), while simultaneously avoiding the complexities of trying to do anything reasonable with the messy distributed knowledge on the Web (the open-standards Semantic Web approach). It's simpler than top-down AI and easier than the original vision of the Semantic Web.

Generally if someone had proposed doing this to me, I would have said it was not practical. But Wolfram seems to have figured out a way to do it. The proof is that he's done it. It works. I've seen it myself.

Questions Abound

Of course, questions abound. It remains to be seen just how smart Wolfram Alpha really is, or can be. How easily extensible is it? Will it get increasingly hard to add and maintain knowledge as more is added to it? Will it ever make mistakes? What forms of knowledge will it be able to handle in the future?

I think Wolfram would agree that it is probably never going to be able to give relationship or career advice, for example, because that is "fuzzy" -- there is often no single right answer to such questions. And I don't know how comprehensive it is, or how it will be able to keep up with all the new knowledge in the world (the knowledge in the system is exclusively added by Wolfram's team right now, which is a labor-intensive process). But Wolfram is an ambitious guy. He seems confident that he has figured out how to add new knowledge to the system at a fairly rapid pace, and he seems to be planning to make the system extremely broad.

And there is the question of bias, which we addressed as well. Is there any risk of bias in the answers the system gives because all the knowledge is entered by Wolfram's team? Those who enter the knowledge and design the formal models in the system are in a position to define the way the system thinks -- both the questions and the answers it can handle. Wolfram believes that by focusing on factual knowledge -- the kind of thing you might find in the Wikipedia or textbooks or reports -- the bias problem can be avoided. At least he is focusing the system on questions that do have only one answer -- not questions for which there might be many different opinions. Everyone generally agrees, for example, that the closing price of GOOG on a certain date is a particular dollar amount. It is not debatable. These are the kinds of questions the system addresses.

But even for some supposedly factual questions, there are potential biases in the answers one might come up with, depending on the data sources and paradigms used to compute them. Thus the choice of data sources has to be made carefully to try to reflect as non-biased a view as possible. Wolfram's strategy is to rely on widely accepted data sources like well-known scientific models, public data about factual things like the weather, geography and the stock market published by reputable organizations and government agencies, etc. But of course even this is a particular worldview and reflects certain implicit or explicit assumptions about what data sources are authoritative.

This is a system that reflects one perspective -- that of Wolfram and his team -- which probably is a close approximation of the mainstream consensus scientific worldview of our modern civilization. It is a tool -- a tool for answering questions about the world today, based on what we generally agree that we know about it. Still, this is potentially murky philosophical territory, at least for some kinds of questions. Consider global warming -- not all scientists even agree it is taking place, let alone what it signifies or where the trends are headed. Similarly in economics, based on certain assumptions and measurements we are either experiencing only mild inflation right now, or significant inflation. There is not necessarily one right answer -- there are valid alternative perspectives.

I agree with Wolfram that bias in the data choices will not be a problem, at least for a while. But even scientists don't always agree on the answers to factual questions, or what models to use to describe the world -- and this disagreement is in fact essential to progress in science. If there were only one "right" answer to any question there could never be progress, or even different points of view. Fortunately, Wolfram is designing his system to link to alternative questions and answers at least, and even to sources for more information about the answers (such as the Wikipedia for example). In this way he can provide unambiguous factual answers, yet also connect to more information and points of view about them at the same time. This is important.

It is ironic that a system like Wolfram Alpha, which is designed to answer questions factually, will probably bring up a broad range of questions that don't themselves have unambiguous factual answers -- questions about philosophy, perspective, and even public policy in the future (if it becomes very widely used). It is a system that has the potential to touch our lives as deeply as Google. Yet how widely it will be used is an open question too.

The system is beautiful, and the user interface is already quite simple and clean. In addition, answers include computationally generated diagrams and graphs -- not just text. It looks really cool. But it is also designed by and for people with IQs somewhere in the altitude of Wolfram's -- some work will need to be done dumbing it down a few hundred IQ points so as to not overwhelm the average consumer with answers that are so comprehensive that they require a graduate degree to fully understand.

It also remains to be seen how much the average consumer thirsts for answers to factual questions. I do think all consumers have a need for this kind of intelligence once in a while, but perhaps not as often as they need something like Google. But I am sure that academics, researchers, students, government employees, journalists and a broad range of professionals in all fields need a tool like this and will use it every day.

Future Potential

I think there is more potential to this system than Stephen has revealed so far. I think he has bigger ambitions for it in the long-term future. I believe it has the potential to be THE online service for computing factual answers. THE system for factual knowledge on the Web. More than that, it may eventually have the potential to learn and even to make new discoveries. We'll have to wait and see where Wolfram takes it.

Maybe Wolfram Alpha could even do a better job of retrieving documents than Google, for certain kinds of questions -- by first understanding what you really want, then computing the answer, and then giving you links to documents that relate to the answer. But even if it is never applied to document retrieval, I think it has the potential to play a leading role in all our daily lives -- it could function like a kind of expert assistant, with all the facts and computational power in the world at our fingertips.

I would expect that Wolfram Alpha will open up various APIs in the future, and then we'll begin to see some interesting new, intelligent applications emerge based on its underlying capabilities and what it knows already.

In May, Wolfram plans to open up what I believe will be a first version of Wolfram Alpha. Anyone interested in a smarter Web will find it quite interesting, I think. Meanwhile, I look forward to learning more about this project as Stephen reveals more in months to come.

One thing is certain: Wolfram Alpha is quite impressive and Stephen Wolfram deserves all the congratulations he is soon going to get.

Comments

  • Public Comments

    • 8 months ago


      Thank you Nova. Brilliant find, brilliant share, brilliantly written. I am soooo excited about what's coming and what's already here. "The future is here. It's just not evenly distributed yet." - Gibson
      • 2 weeks ago


        I absolutely love this quote! peeps all over the net have been posting quotes from wolfa's funny responses to questions. I predict there is a lot more of it to come in the future.
        Nova Spivack - My Public Twine
    • 8 months ago


      Looks like something to integrate with twine. A twine (interest area / network) is, or can include, a set of questions about facts around the subject. A variation on saved searches. A specialized item type?
      Nova Spivack - My Public Twine
    • 8 months ago


      thanks for the extremely well written note.

      It will be extremely interesting to find out about the economics of Wolfram Alpha: what kind of infrastructure makes it possible for a broad coverage system to be centrally assembled by a small (~100 did you say?) number of knowledge workers. The answer you seem to hint at appears to be some form of spontaneous self-assembly of knowledge blocks that was not possible in systems such as Cyc that started off with similar ambitions.

      Finally, I just wanted to bring to your attention that the European Commission is funding the Large Knowledge Collider (LarKC), a research project for reasoning (i.e. providing factual answers to factual questions) at web scale.

      LarKC does not have a natural language front end (although others could build one on top of it) and does not make the assumption that the knowledge resources it relies on (large stores of RDF triples) are accurate or even consistent.

      LarKC is also set up in such a way that it could make calls out to Wolfram Alpha (if Alpha allowed it) as its architecture supports what inference engine architects refer to as procedural attachments.

      Anyway, very much looking forward to the May release of Wolfram Alpha: we live in interesting times.

      thanks again for the writeup.
    • 8 months ago


      I fed your three example questions as they were (e.g. "What is the average rainfall in Seattle?") into my favorite traditional search engine and felt lucky in all three cases. SCNR. :)
      Nova Spivack - My Public Twine
    • 8 months ago


      Speaking of Bias, it appears that "..about 20 percent of all the server computers being sold in the world "are now being bought by a small handful of internet companies," including Microsoft, Google, Yahoo and Amazon." (see here: http://www.roughtype.com/archives/2009/03/the_coming_of_t.php).
      If, as you say, "Those who enter the knowledge and design the formal models in the system are in a position to both define the way the system thinks -- both the questions and the answers it can handle," then we are now entering a new era in which both the backbone (servers and so on) and the knowledge itself will be defined (owned?) not by individual users but by organizations and corporations. This does not seem to fit with the open source net/web life we envisage. So though I am highly enthusiastic about the project itself, the manner of its application remains to be monitored.
      Having said the above, kudos.
      Nova Spivack - My Public Twine
    • 8 months ago


      Interesting news.

      There's already a company doing (as near as I can tell) the exact same thing: TrueKnowledge, based in the UK. A summary of their technology is here: http://www.trueknowledge.com/technology/

      They're already in beta and you can play around with it now.
      Nova Spivack - My Public Twine
      • 8 months ago


        Yes, True Knowledge is great, but it doesn't do exactly the same thing. It's more focused on enabling anyone to add knowledge, but the sacrifice is that the sophistication of the computation and knowledge it contains is not as high as Wolfram Alpha's.
        Nova Spivack - My Public Twine
        • 8 months ago


          I'm not so sure. Their latest marketing has been centered around making it easy for people to add information, but I don't think that's really their focus.

          As for the relative sophistication, I can't speak to that having never seen Wolfram Alpha. But in principle I believe a launched product is better than an unlaunched product! :) I look forward to seeing both "in the wild."
          Nova Spivack - My Public Twine
          • 8 months ago


            I was really impressed with True Knowledge when I used it. But it's a different kind of animal. You would not use True Knowledge to compute the answers to math problems for example, nor would you use it to analyze factual data about the world from sensors, or to answer questions about the stock market. There are some areas where the systems intersect, but they were designed with very different use cases in mind.
            Nova Spivack - My Public Twine
            • 8 months ago


              Definitely in the same class as True Knowledge -- you definitely see the calculating capability in TK when you ask, for example, "how old is Steve Jobs" and the system simply knows the year that Jobs was born in and today's date and thus can accurately report his age.
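The kind of calculation described here, deriving an age from a stored birth date plus the current date, can be sketched in a few lines. This is only an illustration of the idea, not how True Knowledge or Wolfram Alpha actually implement it; the function name `age_on` is made up for this example, and the "today" value is fixed to this article's last-updated date so the result is reproducible.

```python
from datetime import date


def age_on(birth: date, today: date) -> int:
    """Return the number of completed years between birth and today."""
    years = today.year - birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years


# Stored fact: Steve Jobs was born on February 24, 1955.
jobs_birth = date(1955, 2, 24)

# "Today" is pinned to March 11, 2009 for the sake of the example.
print(age_on(jobs_birth, date(2009, 3, 11)))  # prints 54
```

The point is that the answer is never stored anywhere: only the birth date is a fact, and the age is computed when the question is asked.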
              Nova Spivack - My Public Twine
            • 7 months ago


              Here's a comment made by William Tunstall-Pedoe on the TechCrunch version of Nova's article (http://www.techcrunch.com/2009/03/08/wolfram-alpha-computes-answers-to-factual-questions-this-is-going-to-be-big/):

              I’m the founder of True Knowledge and Stephen Wolfram was good enough to give me an extended demo of Wolfram Alpha over the weekend. I’d agree that we are possibly the closest comparable in that we are also building a platform that unifies and stores structured knowledge and enables this knowledge to be accessible through question answering. We also heavily use inference to “compute” answers. Having said that, there are also considerable differences: Wolfram’s broad approach is bringing together and curating all the knowledge themselves and generating a full page response to queries using millions of lines of Mathematica code. Our approach in contrast is to enable users to build the knowledge base and to produce concise answers created by inference steps that usually involve no domain specific coding at all. There are advantages and disadvantages of both these approaches and these will also heavily affect the likely uses the systems will be put to. However, it was very exciting to see WA and I’m looking forward to the launch.
              Nova Spivack - My Public Twine
              • 7 months ago


                To me, it sounds like True Knowledge is an advanced version of Wikipedia ("open world"), kind of a semantic wiki and very similar to DBpedia, while Wolfram Alpha is the next Encyclopedia Britannica ("closed world"). Both are built on inference systems, though WA seems to provide very good reasoning mechanisms.
                Wolfram Alpha obviously will be able to give answers to "solid" knowledge but will always struggle with knowledge which is rather like "water vapor". From my perspective every knowledge base architect has to make a decision on which kind of knowledge he will focus on, since knowledge comes in different flavours - see: http://blog.semantic-web.at/2008/10/09/which-flavour-does-knowledge-have-on-the-web/
                So probably the most realistic scenario is that all those specialised engines (Google is a specialised one too, not in a horizontal sense, but in the sense of how information is processed by Google) will be connected with each other one day. Maybe this will happen sooner than many think; at least the LOD cloud has grown again: http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-05_colored.png

                This would mean for the end-user: No search for the right search engine anymore (Maybe even just one GUI for every kind of search or question). For the providers the Semantic Web could become an interesting infrastructure since not only well structured information can be harvested from there but also interesting business models can be built on top of it:

                The Semweb will enable the LONG TAIL of search services. (Or will Google or WA ever know where in your neighbour town the best pub is?)
                Nova Spivack - My Public Twine
        • 8 months ago


          I have built up a list of tricky questions to ask Wolfram Alpha when it goes public (such as How many krill do whales eat per hour? Why did Apollo XIII fail to land on the moon?) but I thought I would try some easier ones on True Knowledge. I am disappointed to say that it failed with all of them.
          • Who was pope in 1066? (understood the question perfectly. didn't know the answer though)
          • How many times has Madonna wed? (thought "has madonna wed" was an intransitive verb)
          • How many goals did Pele score in his career? (couldn't parse the question)
          • How many people died in the Black Death? (couldn't parse the question)
          • How many cannons were on the Mary Rose when she sank? (couldn't parse the question)
          Nova Spivack - My Public Twine
    • 8 months ago


      Is Wolfram actually buying tons of their own servers for this? As wildcat implies, the processing power and bandwidth they need to pull this off means that they will have to get very serious about their computing infrastructure. It just seems to me that that has never been Wolfram's strong suit - they are a software company.

      I guess they could use Amazon's AWS or a web data computing platform like http://www.80legs.com?
      Nova Spivack - My Public Twine
      • 8 months ago


        I'm not sure how they plan to scale this. The demo I saw was blazingly fast, even for sophisticated calculations. Wolfram laughed that it still wasn't fast enough for his tastes. They seem confident that it will scale and that they have a path for that.
        Nova Spivack - My Public Twine
    • 8 months ago


      Thanks for the interesting write-up. I was rather surprised by the contrast you draw between Wolfram Alpha and the Semantic Web approach. Semantic Web languages like RDF and OWL are primarily meant as interchange formats, not necessarily to be used as internal representations. Many (most?) Semantic Web applications are "Semantic Web applications" because they pull in data in RDF/OWL and serve back the results again in RDF/OWL, while internally doing whatever suits them best.

      In this way, Wolfram Alpha would be a very interesting part of the Semantic Web (assuming that ultimately it would communicate with other parties in RDF/OWL). It would enable platforms like LarKC to leverage the content and functionality of Wolfram Alpha to perform interesting tasks and to build interesting applications that would use Alpha's answers in interesting and unforeseen ways.

      Can't wait till May!
    • 8 months ago


      Have they formalized the knowledge adding process and techniques so that other human knowledge engineers could be trained to use the tools and engine for other bodies of knowledge? e.g. drug discovery data and knowledge or industry knowledge for economic and market prediction?
      Nova Spivack - My Public Twine
    • 8 months ago


      Nova, Great article! I'm curious, if you think that Wolfram and his team have discovered a more 'pragmatic' (and potentially more efficient) method for encoding knowledge - what benefit would current semantic web technologies serve?

      Also, can you give any more hints as to the method the Wolfram Alpha is using for knowledge encoding?
      Nova Spivack - My Public Twine
    • 8 months ago


      Brainboost.com did the exact same thing some years ago, until it got absorbed by Answers.com. The main page still shows the Q&A interface. If people really want factual or research information they will go to a paid expert answers site, but I look forward to testing Wolfram Alpha.
      Nova Spivack - My Public Twine
    • 8 months ago


      Yep, what Frank said. "What you do in the privacy of your own database is your own business". So long as you use common Web formats (URIs, RDF/OWL etc) to communicate, you're part of the Semantic Web project.
    • 8 months ago


      Makes sense. I worked at Wolfram Research back in the day. In some ways, this is a grander replay of how Mathematica was constructed 20 years ago by pouring in all the relevant knowledge of one field, Math, one sub-field at a time. Presumably a lot's been learned about representation through that two decade exercise.

      I remember back then a lot of testing done by running through the problems and examples of scores of math textbooks and workbooks. In time, it surfaced a lot of errors - in the answers in the books! It'd be interesting to see the range of things something like this could and would eventually "proofread".
      Nova Spivack - My Public Twine
    • 8 months ago


      This idea that very simple rules can generate great complexity looks similar to what Seth Lloyd is talking about in his book "Programming the Universe" where he describes the universe as an information processing machine (physical systems "register" information, transform it when they evolve, and end up creating amazing things like life, brains, consciousness, etc). If Wolfram|Alpha turns out to be usable to test and confirm this sort of hypothesis, then it may be interesting for its ontological implications that go way beyond competing with Google. In this sense it could be much much more important than Google...
    • 8 months ago


      Thanks for the write-up, Nova. This is fascinating.

      Almost all of us in AI have taken the lessons of Cyc, Wikipedia, Linking Open Data, etc., to mean that very large question-answering KBs need to be built by very large (often global) teams. Only with a large and diverse team could you achieve the degree of consensus, number of points of view, and depth of coverage that would be necessary to answer non-elementary questions. Nova says Wolfram's system covers not just mathematics, but also "technology, geography, weather, cooking, business, travel, people, music, and more." This is an astonishing range. Web 2.0 techniques have shown us a way to get large groups to work together cost-effectively around text, but even semantic wikis (which I think are the most promising systems yet built for large-scale collaboration around formal knowledge) haven't dreamed of approaching the kind of performance described here. To be able to independently design and cost-effectively build an interoperable set of knowledge building blocks for broad-scale knowledge, and to link them to a computational engine and NLP/diagram based I/O modules, is really really impressive. I don't know anyone who thought this was possible given the state of the art today. I definitely look forward to finding out more!
      • 8 months ago


        Thanks, yes, it seems quite hard to do, and we'll have to wait and see whether they have truly pulled it off. It looks pretty promising from what I've seen.
    • 8 months ago


      Sounds like Cyc...
      Artificial Intelligence
    • 8 months ago


      Nova Spivack - My Public Twine
    • 8 months ago


      I enjoyed your analysis of the potential pitfalls and issues that WolframAlpha might encounter in the course of attempting to represent and manage the sum of 'factual knowledge' in the world at any given point in time. This made for an excellent read; thanks.
      Nova Spivack - My Public Twine
    • 8 months ago


      Good News, Thanks for this blog.

      Let's wait until May.
      Nova Spivack - My Public Twine
    • 8 months ago


      Brilliantly written. Very simple articulation of a very complex technology. Thank you for sharing. Daya Baran
      Nova Spivack - My Public Twine
      • 8 months ago


        I am looking forward to asking Wolfram Alpha
        "Is Katmandu uphill from the Matterhorn?"
        If it has a rich enough geometrical model of the world -- not just smart relations among statement strings -- to answer that, life will really change.
        Nova Spivack - My Public Twine
    • 8 months ago


      Very well written article, Mr Nova. It gives an excellent insight into different approaches to computational search. Can this Alpha project also evolve into, say, a hybrid that does both computational search and relevance search like Google does, sometime in its beta or the near future? Do you see this possibility?
      Nova Spivack - My Public Twine
    • 8 months ago


      I'll be updating this article as I learn more, and as I have more thoughts about this. Thanks for all the great feedback, folks. Already I have made some edits based on your comments.
      Nova Spivack - My Public Twine
    • 8 months ago


      What about internationalization, other languages? What about real-time data?
      Nova Spivack - My Public Twine
    • 8 months ago


      Note to any members of the Wolfram Alpha team, or others who have seen the demo -- it would be great to assemble more examples of questions (that anyone can relate to) that Alpha can answer, for which the answers are also not quickly found in Google. Although what Alpha does is quite different from Google, it would help me to explain this to those who don't understand the difference between computing answers (which entails some level of understanding and reasoning) and simply looking up keyword matches in documents (which is just statistics and lookup -- no understanding or reasoning).

      For those who are more scientifically inclined, Stephen showed me many interesting examples -- for example, Alpha was able to solve numeric sequencing problems, calculus problems, and could answer questions about the human genome too. It was also able to answer questions about cooking, people, and many other topics. I suspect that in some cases a little bit of computation/reasoning took place and then a lookup was performed. But in most of the cases, it was not lookup, it was pure computation from what I could tell.
      Nova Spivack - My Public Twine
    • 8 months ago


      There are many ways of addressing the needs of web users, which include:
      1. Retrieval, like Google.
      2. Question answering, like Wolfram Alpha.
      3. Problem solving, like http://www.DrEliza.com

      Each of these serves a different need that is NOT served by the others. A robust web would have these and more, and some sort of "front end" to decide which engine (or engines) should receive a query. A good multiple-engine situation is where a user asks a "loaded" question that suggests that he has other problems about which he is unaware (these are usually long), or that his "question" is really a problem with a question mark at the end, e.g. "Why do I have asthma?"

      Of course there are already search engine front-ends like http://www.dogpile.com, but so far, no one is triaging input and directing it to widely varying types of responders.

      The challenges seem to be more political than technological. Google foolishly thinks that their approach can do ANYTHING, and they CAN answer some simple questions, so they aren't going to send any business elsewhere. Further, most services have a user agreement that precludes such usage. In short, don't look for progress in this area from any of the major companies.

      Similarly, I suspect that the code within Wolfram Alpha detects when the input is getting outside of its computational paradigm. Wolfram Alpha's engine probably works best with quantitative queries having phrases like "how much". I wonder if they would publish/furnish some guidance for triaging input to their site?

      Dr. Eliza is my own creation, and is able to discuss serious chronic illnesses with people and often figures things out, even where excellent doctors have been stumped. Most healthy people can't accurately pretend to be seriously ill when it gets down to answering very detailed questions, so I added bad teeth to its repertoire for demo purposes. When your dentist says that it is time to pull a tooth, you would do well to first discuss it with Dr. Eliza, who more often than not will find SOME way of saving it.

      Of course, Dr. Eliza also sometimes arrives at internal "understandings" that it can't usefully deal with certain input. Now, if I could only send users somewhere ELSE, or alternatively, if I could just no-response to let other resources deliver their best, then all users would be much better off.

      I may end up creating such a front end, if I can find services who support rather than hinder such efforts, or alternatively, I would cooperate with someone else who decides to do this. THAT would truly open things up for everyone on the web.

      Any thoughts, before I roll up my sleeves and start coding?
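A triaging front end of the kind described above could start from something as small as the following sketch. Everything in it is an assumption made for illustration: the three category labels, the function name `triage`, and especially the keyword rules, which are crude stand-ins for whatever real classification such a service would need. It does not call any real engine.

```python
import re

# Hypothetical labels for the three kinds of engine listed above.
RETRIEVAL = "retrieval"
QUESTION_ANSWERING = "question answering"
PROBLEM_SOLVING = "problem solving"


def triage(query: str) -> str:
    """Guess which kind of engine a query should be sent to."""
    q = query.lower().strip()
    # A first-person "why / how can / what should" query is treated as
    # a problem with a question mark at the end.
    personal = re.search(r"\b(i|my|me)\b", q)
    if personal and re.match(r"(why|how (can|do)|what should)\b", q):
        return PROBLEM_SOLVING
    # Quantitative or factual openers go to the answer engine.
    if re.match(r"(how (much|many|long|old|far|big)|what is|who|when|where)\b", q):
        return QUESTION_ANSWERING
    # Everything else falls back to plain document retrieval.
    return RETRIEVAL


for example in ("Why do I have asthma?",
                "What is the average rainfall in Seattle?",
                "wolfram alpha reviews"):
    print(example, "->", triage(example))
```

A real front end would also need a way to send one query to several engines at once, which this sketch leaves out.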
      Nova Spivack - My Public Twine
    • 8 months ago


      Here are two good questions for Wolfram Alpha... one it could answer (but will it?) and one it can't answer:

      Q1: How many letters are in the answer to this question?
      (Answer: Four)

      Q2: Will Wolfram Alpha answer 'no' to this question?
      Nova Spivack - My Public Twine
    • 8 months ago


      This is quite fascinating. If the launch date was 4/1, I would be worried!

      Here are a few questions and observations:

      - I would assume WolframAlpha has embedded within it the complete Mathematica engine. Thus, for math questions, Alpha serves as a natural language front-end to Mathematica. Also, I assume Alpha allows symbolic input since it can be awkward to express math concepts in language. This brings up the question of pricing. If Alpha is a free website and it gives you full access to Mathematica, that could erode sales of the package.

      - It would be interesting to test Alpha literally. I imagine one of the subject domains Alpha understands is physics. Give it an SAT subject test in physics translating diagrams into words where necessary. See how it does. It should score an 800!

      - How does Alpha handle factual questions where the information supplied is incomplete? For example, in terms of alcohol content, how many glasses of beer are equivalent to a Martini? Does it give a range of answers based on a set of stated assumptions, ask for user clarification, or simply fail?
      Nova Spivack - My Public Twine
    • 8 months ago


      <nitpicking>"In fact, it is even larger than what we consider to be an infinite amount of data. It is not storable at all." That doesn't make sense.</nitpicking>
      Nova Spivack - My Public Twine
      • 8 months ago


        I am referring to the fact that it is a transfinite quantity of information -- the set of all multiplication products of all real numbers is larger than the set of all multiplication products of all integers. Both are infinite, but the set of real-number products is larger. This is an example of a higher level of infinity. It turns out there is an infinite series of "sizes" of infinity. See the work of Cantor: http://en.wikipedia.org/wiki/Infinity
        Nova Spivack - My Public Twine
        • 8 months ago


          I know Cantor, that's why I can't make sense of these sentences:

          "Even if one simply enumerates this infinite multiplication table as a numeric series it is still an infinite amount of data. In fact, it is even larger than what we consider to be an infinite amount of data. It is not storable at all."

          If you're talking about reals, the table is not enumerable, so the first sentence is incorrect. If you're talking about integers, the second sentence is wrong. Either way, the table is not "storable".
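The distinction being argued over can be written out with cardinalities. As a sketch using standard set theory (the notation is not from the article): the products of integers are just the integers again, a countable set, while the products of reals fill out all of the reals, which Cantor's diagonal argument shows to be uncountable.

```latex
\[
  \bigl|\{\, m \cdot n : m, n \in \mathbb{Z} \,\}\bigr|
  = |\mathbb{Z}| = \aleph_0
  \;<\;
  2^{\aleph_0} = |\mathbb{R}|
  = \bigl|\{\, x \cdot y : x, y \in \mathbb{R} \,\}\bigr|
\]
```

So the integer table can be enumerated as an endless list, even though it can never be stored in full, and the real-number table cannot even be enumerated.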

          But I guess my understanding of these sentences is somewhat different from what you intended to convey. Anyway, enough nitpicking...
          Nova Spivack - My Public Twine
          • 7 months ago


            That is my point -- the table is not enumerable or storable at all. We agree. But you are correct, my wording did not make that clear. I have clarified it now. Thanks for the useful comments!
            Nova Spivack - My Public Twine
    • 8 months ago


      Thanks for this Nova,
      Having seen Tim Berners Lee at TED talking about linked data it seems that there is a natural partnership here.
      Is Tim part of this, involved, thinking of getting involved, etc.?
      Is now the time for semantic web and linking open data to come together?


      William
      PS. If there is anyone out there who is really good at the linking open data I'd love to know. I can't make head or tail of how to join in! But then I'm just a strategy guy!
    • 8 months ago


      Sorry, it's in English, and it's long (I'm not going to translate it), but it's interesting.
      So much so that I can't wait until we can try it (in May).

      Instead of making a search engine, Wolfram is building an answer-computing engine (for unambiguous facts).
      There will surely be a lot to say about it.

      It responds to "natural language", which is to say English. Too bad. Unless they have planned multilingual interfaces?
      Le petit monde de l'informatique
    • 8 months ago


      Nova,

      The major difference I see between Wolfram and Google is in: (a) the sorts of knowledge each computes, and (b) the answer forms each delivers. These are qualitative differences. Wolfram is raising the bar significantly. So, I'm really excited about the prospect of being able to do fact reasoning and answer computing across the web.

      However, I don't think explaining this by making a distinction between look-it-up and compute-it is really necessary, since computationally there is always a trade-off in "space" and "time" for different forms of knowledge computing. The time aspect is for learning or deriving the answer. The space aspect is for remembering what is known. Historically, when processors and memory were slow, small, and very expensive, programming methods favored computing everything with algorithms at execution time so as to minimize storage needs. Today, the performance and cost of computing is quite different. There is really no reason why we need to take computing cycles to find out again and again that 1 + 1 = 2. Today, since both memory and processing power are vastly cheaper, other architectures have emerged. To illustrate, when performance matters (e.g., the need for real-time results), long compute algorithms become a constraint. To overcome the latency (compute time), classes of application emerged that compiled rules to minimize state changes, and pre-computed possible results based on the knowledge at hand, in order to deliver results rapidly. Instead of computing everything at run time, these applications pay the computational cost at compile time or some other time than question-answering time (e.g., with Google and others, offline by nightly reindexing, etc.) in order to gain speed at execution time. The strategy is to compress very long (knowledge) computing path lengths into a look-up, because nothing could be faster. Google is fast at what it does because it reads and reindexes the entire internet daily. If Wolfram is to be comparably fast at what it does with more complex fact reasoning and question answering, then Wolfram Alpha will have adopted many of the same sorts of cloud computing and declarative pre-compute reasoning strategies.
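The space/time trade-off described above can be seen in a toy example. The sketch below answers the same question ("how many primes are there below n?") in two ways: by computing from scratch at question time, and by paying the cost once up front and then doing a lookup. The function names and the table limit of 100,000 are arbitrary choices for illustration, and nothing here reflects how Google or Wolfram Alpha are actually built.

```python
from bisect import bisect_left


def count_primes_below(n: int) -> int:
    """Compute at question time: trial division over every candidate."""
    count = 0
    for k in range(2, n):
        if all(k % d for d in range(2, int(k ** 0.5) + 1)):
            count += 1
    return count


def build_prime_table(limit: int) -> list:
    """Pay the cost once, ahead of time: sieve of Eratosthenes."""
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for k in range(2, int(limit ** 0.5) + 1):
        if sieve[k]:
            for multiple in range(k * k, limit, k):
                sieve[multiple] = False
    return [k for k, is_prime in enumerate(sieve) if is_prime]


# Built once (the "compile time" cost), kept in memory (the "space" cost).
PRIMES = build_prime_table(100_000)


def count_primes_below_fast(n: int) -> int:
    """Answer at question time with a binary search in the stored table."""
    return bisect_left(PRIMES, n)


print(count_primes_below(1000))       # prints 168
print(count_primes_below_fast(1000))  # prints 168
```

The lookup version only works for questions the table anticipated (here, n up to 100,000), which is the limit of any pre-compute strategy: it trades generality and memory for speed.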
      Nova Spivack - My Public Twine
    • 8 months ago


      Some other toy questions:

      "How long did the Thirty Years War last?"

      True Knowledge answers that one correctly, but fails on these:

      "How long was the Thirty Years War?"

      "How long did the Hundred Years War last?"
      Nova Spivack - My Public Twine
    • 7 months ago


      My Question "What is my gmail password?"
      Nova Spivack - My Public Twine
    • 7 months ago


      wrong :) anyway

      answer was asdf1234
      Nova Spivack - My Public Twine
    • 7 months ago


      OK. What I want to know is what happened to Hart? Those of you with a Buffy the Vampire Slayer bent will get that.

      More seriously, how will this engine determine context? Sample question: How big is Paris? Answer: Illinois? Texas? France? Hilton? Or someone else? How many questions do I need to answer to get my answer? Why not just use Google to find the answer in the first place?

      There is a wonderfully naive quality about the statements that this tool will answer questions. There doesn't seem to be any reference to context other than the question of bias.

      Here's another question of fact: Which social networking site will allow me to contact all of my friends?
      Nova Spivack - My Public Twine
      • 7 months ago


        When Wolfram Alpha is not sure of context it tries to assume a default context, but also suggests other contexts you might have intended. This is what I saw in the demo.
        Nova Spivack - My Public Twine
    • 7 months ago


      Wolfram is avant garde, but interactivity should not be taken for granted, the hyperconnected knowledge engines of the future will be treated as pervasive rather than invasive...ultimately search engines will have to consider the cognitive agents using them, and be adaptive not only to natural language, but philological and telesophic insight, to assist the creative processes of humans, to coherently reconstruct, not constrictively interfere, the question is, how will people use it, and for what? Are there some questions with biased answers?
      Nova Spivack - My Public Twine
    • 7 months ago


      Here are a couple of challenge examples for Wolfram Alpha. Both examples are online, where they can be viewed, run, and edited using a standard browser. In each case, end users should be able to update the knowledge used to answer the questions.

      Example 1: English query leading to oil industry supply chain calculations over potentially large SQL tables, with English explanations of the results at the end user level. Explanations of why expected results are absent.
      Ref: www.reengineeringllc.com/Oil_Industry_Supply_Chain_by_Kowalski_and_Walker.pdf

      Example 2: English query over diverse data sources, leading to dollar estimates of possible savings from moves towards energy independence, with English explanations of the results at the end user level. Explanations of why expected results are absent.
      Refs: www.reengineeringllc.com/EnergyIndependence1.pdf
      www.reengineeringllc.com/EnergyIndependence1Video.htm (Flash video with audio)
      Nova Spivack - My Public Twine
      • 7 months ago


        Great stuff. I think IBL (the software used to build Adrian's examples) gives us a good idea of what interacting with Alpha might be like.

        The Alpha query bar on the Alpha site now looks pretty simple, just like the Google query bar. But instead of giving you an immediate answer (as Google now does), Alpha will probably take you to another screen where you will be helped to make your question more precise. Maybe this query-formulation dialogue will be similar to what you can now test with IBL or TrueKnowledge. And then, the nice thing of course about "computed answers" as opposed to mere keyword search results is that you can ask for an explanation of the given answer. If Google gave us an explanation of its results, it would be something like "You asked about X, Y, Z and the documents I just returned talk about X, Y, Z (or some approximation of these terms)". True Knowledge, IBL and presumably Alpha, can do better than that. If you ask them a question like "Is A a B?", they will answer something like "Yes" or "No" or "I don't know", but then they will be able to explain their answer with things like "All A's are C's and all C's are B's".
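The "explain the answer" behaviour described above can be illustrated with a toy inference chain. This is a minimal sketch under stated assumptions: the fact table `IS_A` and the function names are invented for the example, and the search is a plain breadth-first walk over "all X's are Y's" facts, not the method used by True Knowledge, IBL, or Wolfram Alpha.

```python
from collections import deque

# Hypothetical knowledge base: each key is a class, each value lists the
# classes it is known to be contained in ("all A's are C's", and so on).
IS_A = {
    "A": ["C"],
    "C": ["B"],
}


def explain_is_a(a, b, facts=IS_A):
    """Return the chain of facts showing every a is a b, or None."""
    queue = deque([(a, [])])
    seen = {a}
    while queue:
        node, steps = queue.popleft()
        if node == b:
            return steps
        for parent in facts.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, steps + [f"all {node}'s are {parent}'s"]))
    return None


def answer(a, b):
    """Answer 'Is A a B?' and attach the reasoning when there is any."""
    steps = explain_is_a(a, b)
    if steps is None:
        return "I don't know"
    return "Yes, because " + " and ".join(steps)


print(answer("A", "B"))  # prints: Yes, because all A's are C's and all C's are B's
print(answer("B", "A"))  # prints: I don't know
```

Because the answer is derived step by step, the derivation itself is available as the explanation, which is exactly what a keyword-matching result list cannot offer.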

        It will be very interesting to see to what extent Wolfram Alpha turns out to be able to compute the sort of queries found in Adrian's examples. And how it will enable users to formulate such queries in natural language. I haven't plunged into Mathematica yet, but my hunch - which may well be totally wrong - was that Wolfram's approach would be more mathematical computation rather than logical deduction. Probably it doesn't make sense to make a distinction between these two. If it does make sense, then maybe the True Knowledge approach (I guess more "logical" than "mathematical") would be closer than Alpha to deal with these examples?

        I just went through the first IBL tutorial (follow instructions from https://www.reengineeringllc.com/ ) and found it excellent. Takes about a half hour.
    • 7 months ago


      What about "law" as an area in which to find factual answers? And as an area Alpha is/will be covering?

      This is a bit interesting because many legal questions need a lot of information about specifics
      (who is the testator, what is the relationship between testator and legatee, ...)
      but should be answerable from statute and case law.

      How about things like "what are the precedents for ... lots of situation specific information...?"
      Does Lexis/Nexis reason about this kind of thing or "just" try to match words and phrases?

      Of course, the most interesting legal questions are really about predictions
      of what the next court/judge/jury will decide. Alas, this will probably never be
      within Alpha's sphere of competence.

      Thanks for the excellent description, analysis, and critique. How did the guy in the NY Times,
      Saul Hansell, manage to completely miss your whole point?
    • 7 months ago


      Nova,

      Were you able to ask your own questions, or did you just watch a demo?
      • 7 months ago


        Yes, I definitely asked my own questions -- Wolfram had no way to anticipate what I asked. It was very spontaneous. Just remember the system is still in the early stages. It's good, but it's still "alpha." Heh.
    • 7 months ago


      You've been quoted and referenced: Wolfram Alpha: Next major search breakthrough? http://news.cnet.com/8301-13953_3-10191304-80.html?tag=newsEditorsPicksArea.0
    • 7 months ago


      Wolfram Alpha is an "answer engine", not a search engine. It's not a "Google killer" -- it's something different, for a different purpose. It doesn't compete with Google for document retrieval. That's not the goal. It computes answers to factual questions. Some of the media really missed the point.
    • 7 months ago


      I completely agree with Mark. The sheer implications of the range mentioned by Nova are incredible. I can hardly wait to learn more…
    • 7 months ago


      I just want to add a note on global warming: you can take a look at the Wikipedia entries to find out if there is dissent or not.

      Wikipedia might have a bias about that topic (I tend to think along that line, personally), so take it with skepticism. This is just a suggestion to _also_ take Wikipedia information into account, besides any other sources.

      That aside I think this engine is very interesting (especially the relation to NKS).
      And I think that bias is a problem that always must be addressed.
    • 7 months ago


      Well written summary.

      "dumbing it down a few hundred IQ points so as to not overwhelm the average consumer" - There are simple systems and there are complex systems, and if it can answer questions on physics, it should be complex.
      Rather raise the IQ of the users!

      Seriously, if I look at the technical interest of some teenagers around me and students at university, products such as Wolfram Alpha can motivate people to learn about the world.
    • 7 months ago


      Stephen Wolfram generously gave me a two-hour demo of Wolfram Alpha last evening, and I was quite positively impressed. As he said, it's not AI, and not aiming to be, so it shouldn't be measured by contrasting it with HAL or Cyc but with Google or Yahoo. At its heart is a formal Mathematica representation. Its inference engine is basically a large number of individually hand-engineered scripts for tapping into data which he and his team have spent the last several years gathering and "curating". For example, he has assembled tables of historical financial information about countries' GDP's and about companies' stock prices. In a small number of cases, he also connects via API to third party information, but mostly for realtime data such as a current stock price or current temperature. Rather than connecting to and relying on the current or future Semantic Web, Alpha computes its answers primarily from his own curated data to the extent possible; he sees Alpha as the home for almost all the information it needs, and will use to answer users' queries.
      In an important sense, Alpha is a logical extension of Mathematica: it extends the range of types of information for which significant power can be gained by manually, and exhaustively, enumerating a large set of cases: airplane designs, cities, currencies, etc. I.e., Alpha extends what Mathematica has done previously for things like chemical compounds, geometric surfaces, topological configurations, arithmetic series, trigonometric ratios, and equations. In the new cases, as Mathematica did in those abstract math cases, Alpha excels at not just retrieving the stored data but performing various appropriate numeric calculations on the data, and displaying the results in beautiful graphs and easily comprehended tables for the user.
      The resulting mosaic covers a large portion of the space of queries that the average person might genuinely want to ask, in the course of their day. The interface is not exactly natural language, but can be treated by the user as though it were -- just as users of browsers can treat them as though they parsed sentences even though they don't. A better way to think of it is a DWIMM ("do what I might mean"), so if you type in something like "gdp France / Germany", it calculates and returns a graph of the relative fraction of France's annual GDP to Germany's GDP, over the last 30 years or so. If you just type in "gdp", it looks up your local host and (in my case) displays the GDP of the USA over the last 30 years, plus various pieces of information about what gross domestic product is, from a mathematical formula perspective but not from a semantic one. It does not have an ontology, so what it knows about, say, GDP, or population, or stock price, is no more nor less than the equations that involve that term. One vulnerability that this engenders in Alpha is that errors in the data may go unnoticed for a long time; a positive way of saying this is that one could align Alpha's terms to an ontology and knowledge base, and use it to catch some fraction of errors as outright implausible violations of basic knowledge (e.g., Miami's population dropping by exactly a factor of ten during the month of October, 2006).
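The error-catching idea at the end of the paragraph above can be made concrete with a very small plausibility check. This is a sketch under stated assumptions, not a description of anything in Alpha or Cyc: the function name, the `max_factor` threshold, and the population figures are all made up for illustration.

```python
def flag_implausible_jumps(series, max_factor=2.0):
    """Return the indices whose value differs from the previous value
    by more than max_factor in either direction.

    Encodes one piece of "basic knowledge": a city's population does not
    change several-fold from one month to the next.
    """
    flagged = []
    for i in range(1, len(series)):
        prev, curr = series[i - 1], series[i]
        if prev <= 0 or curr <= 0:
            # Non-positive populations are implausible by themselves.
            flagged.append(i)
            continue
        ratio = max(prev, curr) / min(prev, curr)
        if ratio > max_factor:
            flagged.append(i)
    return flagged


# Made-up monthly figures with one entry off by a factor of ten.
monthly_population = [400_000, 401_000, 40_100, 402_000]
print(flag_implausible_jumps(monthly_population))  # [2, 3]
```

Both the drop and the recovery are flagged, which is typical of a single bad data point; a knowledge-based check of the kind the review describes could go further and say why the value is implausible.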
      Another example of DWIMM occurs if you type in a complicated mathematical formula, sloppily, with run-on variables, parenthesis errors, typos, etc. In those cases, Alpha does a great job of guessing what you could possibly have meant by that, something close to what you typed in which would be a nontrivial graph, and displays that graph. If you type in a string of letters that's parsable only as a chemical compound, it assumes that you want information about that compound. If you type in IL where it expects a state, it will interpret that as Illinois; where it expects a country, it will interpret that as Israel.
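The "IL" example above amounts to resolving an abbreviation by the type of value the query expects. A minimal sketch of that idea follows; the lookup table and the function name `interpret` are hypothetical and only illustrate the principle, not Alpha's internals.

```python
# Hypothetical lookup tables; a real system would hold far more entries.
ABBREVIATIONS = {
    "state": {"IL": "Illinois"},
    "country": {"IL": "Israel"},
}


def interpret(token, expected_type):
    """Resolve an abbreviation using the type the query slot expects.

    Returns None when the token is unknown for that type.
    """
    return ABBREVIATIONS.get(expected_type, {}).get(token.upper())


print(interpret("IL", "state"))    # Illinois
print(interpret("IL", "country"))  # Israel
```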
      For those who are familiar with and enamored by Mathematica's powerful theorem prover, it should be mentioned that that is, for the moment, turned off, for reasons having to do with computational cost -- i.e., response time -- and also to prevent "explosions" of less and less relevant answers from being produced. Cautiously, conditionally, at some time in the future, expect to see that theorem prover come into play.
      There are two important dimensions I want to discuss about Wolfram Alpha, besides the remarks I've already made here. (1) What sorts of queries does it not handle, and (2) When it returns information, how much does it actually "understand" of what it's displaying to you? There are two sorts of queries not (yet) handled: those where the data falls outside the mosaic I sketched above -- such as: When is the first day of Summer in Sydney this year? Do Muslims believe that Mohammed was divine? Who did Hezbollah take prisoner on April 18, 1987? Which animals have fingers? -- and those where the query requires logically reasoning out a way to combine (logically or arithmetically combine) two or more pieces of information which the system can individually fetch for you. One example of this is: "How old was Obama when Mitterrand was elected president of France?" It can tell you demographic information about Obama, if you ask, and it can tell you information about Mitterrand (including his ruleStartDate), but doesn't make or execute the plan to calculate a person's age on a certain date given his birth date, which is what is being asked for in this query. If it knows that exactly 17 people were killed in a certain attack, and if it also knows that 17 American soldiers were killed in that attack, it doesn't return that attack if you ask for ones in which there were no civilian casualties, or only American casualties. It doesn't perform that sort of deduction. If you ask "How fast does hair grow?", it can't parse or answer that query. But if you type in a speed, say "10cm/year", it gives you a long and quite interesting list of things that happen at about that speed, involving glaciers melting, tectonic shift, and... hair growing.
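The Obama/Mitterrand query above needs one small plan: fetch two dates, then compute an age on a date. The sketch below shows only that combining step. The two dates are supplied from general knowledge rather than from the review (Obama's birth date, and the date of the second round of the 1981 French presidential election), and the function name `age_on` is an illustrative choice.

```python
from datetime import date


def age_on(birth_date, on_date):
    """Return a person's age in whole years on a given date."""
    years = on_date.year - birth_date.year
    # Subtract one if the birthday had not yet occurred in that year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


obama_birth = date(1961, 8, 4)
mitterrand_elected = date(1981, 5, 10)
print(age_on(obama_birth, mitterrand_elected))  # 19
```

The arithmetic is trivial once the plan exists; the review's point is that the system did not yet form the plan of combining the two facts it could fetch individually.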
      This brings up the final issue I wanted to discuss: how much of what it returns does it understand. At one extreme is, say, Google, which responds to almost anything like a faithful puppy bringing in the morning newspaper without understanding much of anything it's fetching (recognizing words in what it returns, often leading to amusing or hair-raising inappropriate "ads" being displayed, and leading to tons of false positives and false negatives). At the other extreme is, say, Cyc, which only can answer a small fraction of user queries, but can answer ones that require common sense (not just common sense queries like "Do surgeons often operate on themselves?", but ones where the logical application of such knowledge is required to correctly disambiguate and parse the user's query containing pronouns, elisions, ambiguous words, ellipsis, and so on) and where every piece of the query and every piece of the answer is as deeply understood as, say, arithmetic. Wolfram Alpha is somewhere around the geometric mean of those two extremes. It handles a much wider range of queries than Cyc, but much narrower than Google; it understands some of what it is displaying as an answer, but only some of it -- e.g., the above example about it displaying the fact that hair grows 10cm/year if you ask for things that happen at 10cm/year but not if you ask how fast hair grows; or being able to report the number of cattle in Chicago but not (even a lower bound on) the number of mammals because it doesn't know taxonomy and reason that way. If the connection between turbulent air and plane travel isn't represented via an equation, it isn't represented at all. As with many of these sentences, I want to add "...yet", because Dr. Wolfram is very much aware of the limitations of his system, and has plans for addressing many of them as Alpha continues to develop.
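The cattle/mammals remark above describes a simple taxonomic inference: a known count for a subcategory is a lower bound on the count for any of its ancestors. Here is a sketch of that reasoning; the taxonomy fragment, the function names, and the counts are hypothetical, and the known categories are assumed to be disjoint so that their counts can be added.

```python
# Hypothetical fragment of a taxonomy: child category -> parent category.
PARENT = {"cattle": "mammal", "dog": "mammal", "mammal": "animal"}


def is_kind_of(category, ancestor, parent=PARENT):
    """Walk up the taxonomy from category and report whether ancestor is reached."""
    while category is not None:
        if category == ancestor:
            return True
        category = parent.get(category)
    return False


def lower_bound(ancestor, known_counts, parent=PARENT):
    """Sum the known counts of every category that falls under ancestor.

    Assumes the categories in known_counts are disjoint (e.g. all leaves).
    """
    return sum(
        count
        for category, count in known_counts.items()
        if is_kind_of(category, ancestor, parent)
    )


# Made-up counts: only the cattle figure contributes to the mammal bound.
print(lower_bound("mammal", {"cattle": 500, "pigeon": 9000}))  # 500
```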
      The bottom line is that there is a large range of queries it can't parse, and a large range of parsable queries it can't answer even when it can answer the constituents out of which they should be answerable, but it handles a huge range of numeric and scientific queries correctly even in its current state. And Dr. Wolfram and his team are chipping away at the natural language blocks, at the holes in the curated data repository, and at increasing the type and depth of logical combination of constituents, one by one, in priority order, just as they should. I went in to the demo concerned that this might be a competitor to Cyc, given its "hand-curate knowledge and engineer it, versus let anyone add anything" philosophy, but came out of last night's demo and discussion seeing Alpha as a complementary technology. I would invest in this, literally and figuratively. If it is not gobbled up by one of the existing industry superpowers, his company may well grow to become one of them in a small number of years, with most of us setting our default browser to be Wolfram Alpha.
      • 7 months ago


        This was a *very* helpful post, Doug. Thank you!
      • 7 months ago


        Doug,

        Thanks for the thoughtful review. I would definitely be interested in a follow up from you, at the appropriate time, with how you see Cyc and Alpha complementing each other. I expect the combination will be quite powerful.
      • 7 months ago


        URL for Doug's comment above, formatted better, on a blog: http://www.semanticuniverse.com/blogs-i-was-positively-impressed-wolfram-alpha.html
      • 7 months ago


        Is there a possibility of a future collaboration between Cyc Corp and Wolfram Alpha? It will be interesting to see two great scientific minds of our times creating a paradigm shift in not only search technology but also the entire web. I hope to see future web tools with common sense that will eventually give rise to many more inventions.
    • 7 months ago


      He's building the computer from Star Trek...

      Good news is that we will be able to make smoke come out of its ears by asking it to compute the last digit of pi...

      http://www.youtube.com/watch?v=v9kTVZiJ3Uc&feature=related
    • 7 months ago


      I wonder how far you can go with this concept of "pre-curated" data. From Doug's explanation, it seems that it is just a huge predefined, understood, structured database with some clever way to compute views on top of it on the fly, *using code* and not knowledge.
    • 7 months ago


      Interesting. It is a step from "information finding" to "knowledge elaboration". I think that it could be extremely powerful if people could continuously insert new "knowledge structures" so as to deal with different categories of questions, just as people can contribute to Wikipedia by writing articles.
    • 7 months ago


      Very exciting! But, as I take it, the choice of which knowledge is "worth" storing in the database is always man-made. So each search result shows only a part of reality: just the part that the staff member who filled the database assumed to be true. That gives me food for thought.
    • 7 months ago


      Thanks a lot Nova, a brilliant post! And thanks Doug, I've read your post on the blog, very interesting.
      I'm so excited about Wolfram Alpha, can't wait to try it.
    • smh
      7 months ago


      The Wolfram paradigm triumphs again as an approach to representation and reasoning! Even more exciting are the new interactions waiting in the wings (check out Mathematica - FE & Kernel - and put on your lateral thinking cap). If we catch this wave I think we can move forward quickly to revolutionize interactions ... hope some of you start thinking about how you can take the paradigm and apply it to the type of problem you have dreamed of solving.
    • 7 months ago


      I'd come across Alpha by some other route and am glad to have your article. Let's wait till it's released, though I suspect they have learnt from Cuil.

      As a semantic web sceptic I read with interest 'However for the internal knowledge representation and reasoning that takes place in Wolfram Alpha, OWL and RDF are not required and it appears Wolfram has found a more pragmatic and efficient representation of his own.'

      I should think Alpha would find it easier if the universe was composed of triples. But that's the point isn't it - RDF is never going to happen in a meaningful way, and even the weaker 'linked data' meme seems, from where I sit, fairly unrealistic.
    • 7 months ago


      42
    • 7 months ago


      From Eyes on Tech: Wolfram Alfa Search Engine

      Evidently Wolfram Alpha has done the following: created "contexts" or domains of information with reusable calculation modules. These modules might be pure math or simple programs.

      We often think of math as something we have invented to explain the universe based on empirical evidence, but in fact if you drop enough matches on a table you will find the number Pi, which leads to the calculated answer for a circle. This intersection between math and cellular automata in this way leads to an answer to the circumference of the earth. So by putting a natural language processor on top and grabbing the implied context(s) and deviations you could skip the math part and vary the bottom layer algorithms of the physical universe to calculate the answer. In other words somebody asks for the distance of flight from Madrid to Sydney and instead of calculating the arc via mathematical formula you start dropping sticks or some reduced mini cellular automata.
      Let's say you want to know how strong the TV signal is in a valley. First you figure out the domain, which in this case is radio waves and transmission. You get the relevant input like radio tower locations and terrain, but then you don't use Maxwell's equations; you use the fact that space is 3-dimensional and that something must spread from here to there. You include the terrain in the model and calculate and calculate and drop lower order terms.
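The match-dropping remark above refers to the classic Buffon's needle experiment, which can be simulated directly. The sketch below is only an illustration of that experiment, with hypothetical function names and an arbitrary seed; it makes no claim about how Wolfram Alpha computes anything. To avoid assuming the value of pi inside the simulation, the needle's direction is drawn by rejection sampling a point in the unit disc.

```python
import random


def random_direction_sine(rng):
    """Return |sin(theta)| for a uniformly random direction, without using pi.

    Rejection-samples a point in the unit disc; its normalised y-coordinate
    is the sine of a uniformly distributed angle.
    """
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        r2 = x * x + y * y
        if 0 < r2 <= 1:
            return abs(y) / r2 ** 0.5


def estimate_pi(drops, seed=0):
    """Buffon's needle with needle length equal to the line spacing (both 1)."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(drops):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0, 0.5)
        # Half of the needle's extent perpendicular to the lines.
        half_height = 0.5 * random_direction_sine(rng)
        if centre <= half_height:
            crossings += 1
    # With length == spacing, P(crossing) = 2 / pi, so pi ~ 2 * drops / crossings.
    return 2 * drops / crossings


print(estimate_pi(200_000))
```

The estimate converges slowly (the error shrinks roughly with the square root of the number of drops), which is one practical reason to prefer a formula when one is available.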

      So we can think of the stack the normal way we deal with stuff as:
      1) Ideas
      2) Language
      3) Physics and Empirically Observed Results (Theory)
      4) Math
      5) Cellular Automata of the Universe

      Wolfram Alpha seems to cut out the middle and deal with it this way:

      1) Ideas
      2) Language
      3) Cellular Automata of the Universe (dropping lower order terms)
      This is really way out there! This is light-years ahead of the Semantic Web.
    • 7 months ago


      Google does indeed perform some computation -- a Google web search for "1337 * 42" will give you a top result "1337 * 42 = 56154" from its calculator. Conversions of various units (length, volume, currency, etc.) can also be performed. There are further computations that are not a "simple lookup" as you state in your review.

      Admittedly, Google does not currently do some of the computations that you describe as search results from Wolfram Alpha, but I wanted to correct your assumption that Google is entirely a lookup engine.

      As you say, though, Wolfram|Alpha is not a "Google killer" and they exist in separate domains. I agree with you that Wolfram|Alpha sounds intriguing and I will be certain to try it out as soon as it goes public. I can imagine many possibilities -- I hope it doesn't disappoint.
    • 7 months ago


      How about this question: Wolfram Alpha Answers When will this recession be over?

      http://isontech.blogspot.com/2009/03/wolfram-alpha-answers-when-will-this.html
    • 7 months ago


      I've used Wolfram Alpha recently. It doesn't work. It has very limited knowledge. Basically you have to try to play this game where you guess what it might know, which are only very big and stupid questions like "What is the capital of Spain?" So you're only asking it dumb things you already knew. And then you have to be very careful with the wording. It will bring up a Wikipedia-style info sheet based on the topic it assumes you want info on. It gets very boring after about ten minutes of testing it out and is useless as far as being a real tool is concerned. If you want info slightly more complicated than that, it has about a 1% success rate. There is no technological breakthrough here. He just wasted his time compiling large amounts of data that already exist in much greater depth all over the internet and built a flimsy language recognition algorithm to access it. If we are now on the web 2.0 this is like going back to the web .0001. Nice try hyping it though.
    • 6 months ago


      Delphina, that sounds bad. If what you are saying is true, some people including me are going to be pretty disappointed. I guess since you have signed an NDA you can't post any more evidence?
      Best Regards,
      Mans

      Saw your post here too: http://isontech.blogspot.com/2009/03/wolfram-alpha-answers-when-will-this.html#comments
    • 6 months ago


      Delphina, that does not put me off at all. It even reassures me. The idea of an AI producing "meta-knowledge" that someday people will "trust" as much as they do Wikipedia today scares me. Because even if the algorithm acknowledges different "models", it is still a TOM that is inevitably going to intuit much less about the "models" in my discipline than even I do! :) I don't want HAL or the computer at the end of the Universe; none of us do. But it would be pretty cool to have a tool that could give me fast and transparent access to multiple complex data sets and could even do some computation for me. As long as I'm completely reassured that the data being fed to me is from a source I can identify and independently audit, and also that the computation itself is obviously verifiable, then I'm really looking forward to this and can think of lots of meat to feed these tigers. Hats off!
    • 6 months ago


      My hat’s off to Wolfram. I just watched the long version YouTube video and immediately applied for early access. Nova, you mentioned my site, www.facster.com, on your blog on 4/12/05. I couldn’t make the site commercially viable, so I shut it down. But I continued to work on Facster, which I intended would make all time-series statistical information searchable and “joinable” (i.e. treat all time-series like a big relational DB).

      Stephen Wolfram did me a couple steps better, envisioning the accessibility of all “computable” data, implicitly making such data both searchable and joinable. I was also working on processes to make the data “curated”. My only quibble with Wolfram’s approach is apparently he is not providing the full audit history of curating the data.

      http://www.linkedin.com/in/jackfox
    • 5 months ago


      Awesome writeup on Wolfram Alpha. Very much appreciated!!! I did a brief write-up on why I think WolframAlpha won't end up being the Google killer: http://www.sagerock.com/blog/wolframalpha-not-google-killer/
    • 5 months ago


      Hi,

      I'm just curious... why do so many people compare WolframAlpha with Google? It looks different, just as most people commenting here say. I think Wolfram Alpha will be our next "partner", just as Google (and other search engines) have been so far, for finding any information we need on the net.
    • 2 weeks ago


      Interesting. A very sobering message, especially if it is as true as one might fear.