Nova Spivack - My Public Twine

The Next Generation of Web Search -- Search 3.0

The next generation of Web search is coming sooner than expected. And with it we will see several shifts in the way people search, and the way major search engines provide search functionality to consumers.

Web 1.0, the first decade of the Web (1989 - 1999), was characterized by a distinctly desktop-like search paradigm. The overriding idea was that the Web is a collection of documents, not unlike the folder tree on the desktop, that must be searched and ranked hierarchically. Relevancy was considered to be how closely a document matched a given query string.

Web 2.0, the second decade of the Web (1999 - 2009), ushered in the beginnings of a shift towards social search. In particular, blogging tools, social bookmarking tools, social networks, social media sites, and microblogging services began to organize the Web around people and their relationships. This added the beginnings of a primitive "web of trust" to the search repertoire, enabling search engines to begin to take the social value of content (as evidenced by discussions, ratings, sharing, linking, referrals, etc.) as an additional measurement in the relevancy equation. Those items which were both most relevant on a keyword level and most relevant in the social graph (closer and/or more popular in the graph) were considered to be more relevant. Thus results could be ranked according to their social value -- how many people in the community liked them and their current activity level -- as well as by semantic relevancy measures.

In the coming third decade of the Web, Web 3.0 (2009 - 2019), there will be another shift in the search paradigm. This is a shift from the past to the present, and from the social to the personal.

Established search engines like Google rank results primarily by keyword (semantic) relevancy. Social search engines rank results primarily by activity and social value (Digg, Twine 1.0, etc.). But the new search engines of the Web 3.0 era will also take into account two additional factors when determining relevancy: timeliness, and personalization.

Google returns the same results for everyone. But why should that be the case? In fact, when two different people search for the same information, they may want to get very different kinds of results. Someone who is a novice in a field may want beginner-level information to rank higher in the results than someone who is an expert would. There may also be a desire to emphasize things that are novel over things that have been seen before, or that happened in the past -- the more timely something is, the more relevant it may be as well.

These two themes -- present and personal -- will define the next great search experience.

To accomplish this, we need to make progress on a number of fronts.

First of all, search engines need better ways to understand what content is, without having to do extensive computation. The best solution for this is to utilize metadata and the methods of the emerging semantic web.

Metadata reduces the need for computation in order to determine what content is about -- it makes that explicit and machine-understandable. To the extent that machine-understandable metadata is added or generated for the Web, it will become more precisely searchable and productive for searchers.

This applies especially to the real-time Web, where, for example, short "tweets" contain very little context to support good natural-language processing. There, a little metadata can go a long way. Of course, metadata also makes a dramatic difference in search of the larger non-real-time Web.
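
To make the point about metadata concrete, here is a minimal sketch of looking items up by explicit metadata instead of analyzing their text. The item layout (an "id", some "text", and an optional "meta" dictionary) and the function names are assumptions made for illustration; they are not taken from Twine or any particular search engine.

```python
from collections import defaultdict


def build_metadata_index(items):
    """Map each (field, value) pair to the ids of the items that declare it."""
    index = defaultdict(set)
    for item in items:
        # Items without a "meta" dictionary simply contribute nothing.
        for field, value in item.get("meta", {}).items():
            # Accept either a single value or a collection of values.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for v in values:
                index[(field, v)].add(item["id"])
    return index


def lookup(index, field, value):
    """Return the sorted ids of items whose metadata declares field == value."""
    return sorted(index.get((field, value), set()))


# Hypothetical short messages: the text alone says little about the topic.
items = [
    {"id": 1, "text": "Great keynote!", "meta": {"topic": ["semantic web"], "type": "event"}},
    {"id": 2, "text": "so good", "meta": {"topic": ["coffee"]}},
    {"id": 3, "text": "no metadata here"},
]
index = build_metadata_index(items)
```

In this sketch the message "Great keynote!" can be found under the topic "semantic web" without any natural-language processing, because the topic was stated explicitly.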

In addition to metadata, search engines need to modify their algorithms to be more personalized. Instead of a "one-size-fits-all" ranking for each query, the ranking may differ for different people depending on their varying interests and search histories.
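
As a rough illustration of personalized ranking, the sketch below blends a query-only relevancy score with how well a result's topics match a user's interests. The Result fields, the interest profile format, and the 50/50 default weight are all assumptions chosen for the example, not a description of how any real search engine works.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    base_score: float   # query-only relevancy, assumed to lie in [0, 1]
    topics: frozenset   # topic labels, e.g. taken from page metadata


def personalized_rank(results, interests, weight=0.5):
    """Re-rank results by blending base relevancy with interest affinity.

    interests maps a topic label to an affinity in [0, 1];
    weight=0 reproduces the one-size-fits-all ordering.
    """
    def score(result):
        if result.topics:
            # Average affinity over the result's topics; unknown topics count as 0.
            affinity = sum(interests.get(t, 0.0) for t in result.topics) / len(result.topics)
        else:
            affinity = 0.0
        return (1 - weight) * result.base_score + weight * affinity

    return sorted(results, key=score, reverse=True)


# Hypothetical results for the same query, seen by two different users.
results = [
    Result("example.org/advanced", 0.8, frozenset({"expert"})),
    Result("example.org/intro", 0.7, frozenset({"beginner"})),
]
novice_order = personalized_rank(results, {"beginner": 1.0})
expert_order = personalized_rank(results, {"expert": 1.0})
```

With these made-up numbers the novice sees the introductory page first and the expert sees the advanced page first, even though both typed the same query.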

Finally, to provide better search of the present, search has to become more realtime. To this end, rankings need to be developed that surface not only what just happened, but also what happened recently and is trending upwards and/or of note. Realtime search has to be more than merely listing search results chronologically. There must be effective ways to filter the noise and surface what's most important. Social graph analysis is a key tool for doing this, but powerful statistical analysis and new visualizations may also be required to make a compelling experience.
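
One simple way to go beyond a chronological listing is to score each item by both how recent its mentions are and whether its activity is rising. The sketch below is only one possible heuristic, under assumed parameters (a one-hour half-life and 15-minute comparison windows); the function name and the formula are illustrative, not an established standard.

```python
import math


def realtime_score(event_times, now, half_life=3600.0, window=900.0):
    """Score an item from the timestamps (in seconds) of its mentions."""
    # Recency: each mention counts for less the older it is (exponential decay).
    decay = math.log(2) / half_life
    recency = sum(math.exp(-decay * (now - t)) for t in event_times if t <= now)

    # Trend: mentions in the latest window relative to the window before it.
    recent = sum(1 for t in event_times if now - window < t <= now)
    previous = sum(1 for t in event_times if now - 2 * window < t <= now - window)
    trend = (recent + 1) / (previous + 1)  # add-one smoothing avoids division by zero

    return recency * trend


# Two hypothetical items with the same number of mentions.
now = 10000.0
rising = [8500.0, 9700.0, 9800.0, 9900.0]    # activity picking up
fading = [8800.0, 8900.0, 9000.0, 9900.0]    # activity dying down
```

Here the item whose mentions are accelerating scores higher than the one that is fading, although a purely chronological list would treat their latest mentions as equal.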

Comments

  • Public Comments

    • 5 months ago


      Hello, I am interested in Web 3.0.
    • 5 months ago


      I totally agree with everything said.

      For me those two themes that may define the next great search experience, i.e. "present" and "personal", are just additional options that are *relatively* easy to implement if certain groundwork has been done. I believe that the groundwork consists of establishing an improved data infrastructure and general user authentication. Once these are commonplace then so much more is possible.

      Interoperable metadata, when considered for all relevant elements of "some information" (i.e. not just the "document" or "web page") will create (is creating) the improved data infrastructure, with RDFa and microformats serving as the vanguard approach.

      Proper user authentication, which may involve loss of anonymity, will bring necessary integrity to statements submitted by individuals and allow user profiles to be used as part of the metadata associated with "some information". A tweet may only be up to 140 characters long but consider the associated metadata that could be available from the twitter user (information in a profile that can be accessed by all systems, other information submitted by that user anywhere, comments made about that user anywhere). And of course dealing with spam becomes straightforward.

      I have reasonable confidence that semweb/web3/your_name_for_it is happening and resulting in a gradually improving data infrastructure (Yahoo, Google and MS are all underlining this). I remain concerned about achieving user authentication that is deep enough to pervade all systems (and definitively connect online users to real world people). A touchy subject no doubt with some.
    • 5 months ago


      Nice post Nova...

      I'm liking the direction you're going, but I think we can get to the personalization and timeliness of information such as activity streams a whole lot sooner than what you indicate (not that we need to rush, but I believe it's better that new thought leaders emerge before the existing monolith companies seize and maintain control). You're exactly right about the coming decade (my favorite Semantic Web slide is still your infamous Web Evolution slide from 1.0 to 4.0 and onward) being about the constant improvement of the way we utilize the newfound information relevance filtering processes, but I would like (and fully expect) to see the relevance issue answered in the next year. Something to set the gold standard for the next decade, like Google did in 2000, should come in 2010, but with a whole new priority on privacy through user-controlled sharing of their own private data. Imagine more than just a "personalized search engine": a specifically requested "find/discovery/recommendation engine" which personalizes information to the authenticated role a given user has chosen to receive and share data for.

      In the meantime, we as developers must be careful and remember this will be no magic "holy grail", especially if users need to give up privacy for relevancy:
      http://www.google-watch.org/crock.html



      To GEORGE MUNROE:
      User authentication does not necessarily have to equal loss of anonymity, identities can be abstracted through roles and by building a Web-of-Trust type approach to accessibility of information. There will not, and should not, be one single solution. There are many separate approaches to federation and integration of accounts, contacts and domain-specific account information, and this is a good thing. The key to moving forward will likely be role-based access of specific pieces of information, protected by digital signatures/certificates and constantly evolving encryption. People have to remember though, that even with all this, the moment you even write down an idea on a piece of paper that idea (or piece of information) is no longer 100% safe or secure. You can store it in your home, but your home could be broken into, damaged, or it could just get lost/misplaced. Nothing changes for digital information.
    • 5 months ago


      I love that you keep trying to make something new and good.

      -Web 3.0 SE: By timeliness, do you mean how relevant something is for the current moment, how current it is? I guess you need to understand what the information is about in order to do that; you are right in saying this, but I'm not sure metadata alone is going to work. But you are for sure on the right track.

      -I guess personalized search hasn't succeeded in part because people were worried about sharing private personalization information with Google and what it would be used for. Maybe this should be taken care of.

      Random thoughts:
      -Can time periods be among the personalization factors? (maybe some people are more interested in some periods than in others)
      -In order to personalize the rankings you have to calculate rankings in real-time (true); that is not so easy to do, but I think it is certainly doable.
      -We need to develop better models for emulating the behavior of social networks.
      -We need to overcome the limitations we have in representing and handling information: the graphical ones, the time-related ones and also the interaction-related ones.
    • 5 months ago


      Nova, good point. Even though the Internet has reached more than 1 billion people, many software products and services still do not take the person into account. There are several areas that I believe can help close the gap between a concrete person and traditional software, such as cultural belonging, knowledge of languages, education and job experience, health data, social connections, and interests (from search results and communications), which together make up a person. One MSR project back in 2001 was MyLifeBits by Gordon Bell, which tried to make the computer keep digital memories about you. The idea was that when the computer captures many of your activities and stores them in a digital archive, you can easily go back through your history and extract the information you need using a timeline or search.

      Knowing the user's context is as important as knowing the user himself. For example, when you know the user's home location as well as his current location, it is possible to take personalization to the next level. Combining information about the user, the context of his environment, his social connections and the devices he uses may lead to a next level of computing with the person in mind.

      But my belief is that the next Web is not just a Web anymore, because the Web is too deeply integrated with the client, and with each newer release of clients for Facebook, etc., it seems that the next era of computing, personalization, approaches.

      The next era will touch not only the Web but also the desktop (read "Beyond the Desktop Metaphor" by Mary Czerwinski and Victor Kaptelinin to learn more about the changes to the desktop that have happened, are happening and will continue to happen). The trends in computing -- social networks, semantic desktops, semantic storage, the sensor and location platform in Windows, Core Location in Mac OS X, physical orientation awareness in devices, rich multitasking and context switching -- will at some point finally bring about a revolution in computing.

      I tend to call this era "context-aware computing", because context awareness is at the basis of all of the changes described above.

      The question is, who will lead the final move to the next era of computing?
    • 5 months ago


      Being a semantic web geek, I'm all in favour of adding metadata as, say, RDFa to webpages. However, do you think Joe Public can be trusted to do that well...if at all?
    • 5 months ago


      Nova, I'm not a developer and I know that you have said in the past that non-developers need not get involved in the semantic web, but I do have many ideas.

      Would it be possible to create a webworld that rewards the "average joe" for sharing their favorite links, books, movies, videos, websites etc.? Could we create a system with Twine that enables us to use affiliate links to track the traffic we send to our bookmarks? Twine is such an amazing website; I'm just thinking of the possibilities it has. I've only been a member for 2 days and I've already learned so much about Web 3.0.

      Thank you for creating such an amazing website!

      http://www.twine.com/user/billsaint
      • 5 months ago


        Well basically Twine lets the "average joe" share favorite things already. The affiliate links idea could be in line with things we are thinking about adding, but I'm not sure what exactly you are suggesting there...?