Search Engine News


...the search industry queries new media

Search marketing in the new media era.

August 30, 2006
 
Greg Linden on Findory and the Relevance of Collaborative Filtering
I began regularly reading Greg Linden's Geeking with Greg because of the questions he asked in Battelle's Gary Flake interview post.

I'd been under his sway for years though, without realizing it. Linden developed Amazon's recommendation engine, which analyzes past purchases and viewing behavior and makes suggestions based on purchases by others whose profiles are similar to mine.

I sent Linden my interview with the task-based relevance engine Watson - because of this line from a post he wrote: "The prize in search will go to those that help people get what they need quickly, effectively, and effortlessly."

I also asked him for an interview - I realized it was time for me to learn more about Findory, Linden's personalized news recommendation service that enables users to cut through the hundreds of thousands of news and information items to the ones that will be most immediately and personally useful.

As a result of this interview I will be testing out Findory's feed reader (I already imported my feeds as favorites - I could also choose to import the public bloglines of others!) as well as Findory's alpha web search offering.

In this interview I sought to better understand Findory and the merits of personalized recommendations, as well as to grasp something of the monumental task of being a one-person startup.

I hope you find this glimpse into the mind of a relevance pioneer useful and interesting in your work - and I suggest you familiarize yourself with Linden's brand of relevance. His and Watson's, I think, are strong examples of what relevance will be in the future.

(...and thanks to Ben Wills for suggesting that I steal Kawasaki's interview layout format. I will also be working at understanding Kawasaki's incisive questions so that I can shorten my interviews a bit :P)

Getting Started:

1) Question: how have you applied your studies at the Stanford Business School to your work at Findory?

Answer: There have been some direct applications of the coursework from entrepreneurship, accounting, and business law classes, but most of the value of Stanford Business School comes from hearing of the experiences of others and the breadth of the knowledge shared.

The influence of that can be somewhat subtle. I am more able to attack business problems that before may have seemed daunting, I am more confident when networking or negotiating. I have the advice of many on which to lean. It was an enjoyable and useful experience.

2) Question: where is Findory now in your overall vision for it? what stage would you say you're at now?

Answer: Findory is doing well. It now has products in personalized news, weblogs, video, podcasts, web search, and advertising.

Findory is a reasonably popular website as well. It has over 5M page views and 100k unique visitors per month. The site is generating a modest amount of revenue from its targeted, personalized advertising.

The vision for Findory remains the same as it was on the day it started, to help people find the information they need.

Search works when people know what they want and can specify search terms. When people do not know what they want or cannot specify a search, relevant information needs to be brought to them.

Personalization technology can learn individual interests, generate targeted recommendations, and surface useful information that might otherwise be lost. Personalization can help people get the information they need.

3) Question: how would you characterize Findory's growth at this point? who do you see as competitors and why? Findory + Bloglines would be very useful to me; do you see partnerships as a way to continue growth?

Answer: After growing at about 100% per quarter for the first two years, Findory's growth has slowed recently. I believe the primary reason for this slowing is a lack of resources for targeting a broad, mainstream audience with new marketing and new features.

Findory's primary competitors are the search giants. Google, in particular, has early features that recommend news stories (in Google News) based on individual search history and a personalized web search that shows different search results to different people. MSN also has an experimental product that recommends news stories.

There are many other sites that might be substitutes for Findory. For example, My Yahoo, Live.com, and Netvibes are customizable home pages. Bloglines and other feed readers are also configurable and customizable. The primary difference is that Findory learns and adapts from behavior. No configuration is necessary. Just read articles, and Findory changes and personalizes to your interests. This is important for mainstream audiences that do not have the tolerance for twiddling and configuring of the geek crowd.

If you would like a combination of Findory and Bloglines, you might try Findory's feed reader. It is known as Findory Favorites and is available at http://findory.com/s/

You can list all your favorite RSS feeds (or load an OPML file) and then get recommendations of interesting stories selected from your favorite feeds. Quite unusual.
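To make the OPML import concrete, here is a minimal sketch of pulling feed URLs out of an OPML file using only the Python standard library. It is not Findory's code; the function name `feed_urls` and the sample feeds on example.com are made up for illustration. The one detail taken from the OPML format itself is that feed readers store each feed's address in the `xmlUrl` attribute of an `outline` element.

```python
import xml.etree.ElementTree as ET


def feed_urls(opml_text):
    """Return every feed URL in an OPML document, in document order."""
    root = ET.fromstring(opml_text)
    # Folders are <outline> elements without xmlUrl; feeds carry xmlUrl.
    return [
        outline.attrib["xmlUrl"]
        for outline in root.iter("outline")
        if "xmlUrl" in outline.attrib
    ]


# Hypothetical export with one folder and two feeds.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.1">
  <head><title>My feeds</title></head>
  <body>
    <outline text="Search">
      <outline text="Feed A" type="rss" xmlUrl="http://example.com/a.xml"/>
    </outline>
    <outline text="Feed B" type="rss" xmlUrl="http://example.com/b.xml"/>
  </body>
</opml>"""

print(feed_urls(SAMPLE_OPML))
```

A reader like the one described would then fetch each of these feeds and rank the stories in them; the sketch stops at extracting the list.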

4) Question: do you still have plans to do overall websearch? can you map out this process for me?

Answer: Findory recently launched a new version of its personalized web search. It is currently in alpha testing.

Findory personalized web search reorders your web search results based on your search history, clickthrough history, and the behavior of other web searchers. Early analyses showed a modest but useful lift in the quality of the top search results.

More information on the personalized web search and the improvements in search quality can be found in a weblog post: New personalized web search at Findory
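For readers who want a feel for what "reordering results" can mean, below is a rough sketch, not Findory's actual method. It assumes the only signals are click counts per URL, for the individual searcher and for everyone else, and blends them with the engine's original ranking. The function name `rerank`, the weights, and the example URLs are all hypothetical.

```python
def rerank(results, my_clicks, crowd_clicks, w_me=1.0, w_crowd=0.5):
    """Reorder search results using click history.

    results      -- list of URLs in the engine's original order
    my_clicks    -- dict: URL -> clicks by this searcher
    crowd_clicks -- dict: URL -> clicks by other searchers
    """
    n = len(results)
    my_total = sum(my_clicks.values()) or 1        # avoid dividing by zero
    crowd_total = sum(crowd_clicks.values()) or 1

    def score(pos, url):
        prior = (n - pos) / n                      # 1.0 for the top result
        mine = my_clicks.get(url, 0) / my_total
        crowd = crowd_clicks.get(url, 0) / crowd_total
        return prior + w_me * mine + w_crowd * crowd

    # Ties fall back to the original order (smaller position first).
    scored = [(score(pos, url), -pos, url) for pos, url in enumerate(results)]
    scored.sort(reverse=True)
    return [url for _, _, url in scored]


results = ["a.example", "b.example", "c.example"]
personal = rerank(
    results,
    my_clicks={"c.example": 3},
    crowd_clicks={"b.example": 1, "c.example": 1},
)
print(personal)
```

With no click history at all, every score reduces to the original rank prior, so the engine's order comes back unchanged. That matches the "modest lift" framing: personalization nudges the ranking rather than replacing it.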

5) Question: how many employees now?

Answer: Just one, me. Findory is a tiny, self-funded startup.

Digging into the Findory collaborative filtering engine:

6) Question: how would you describe or characterize the mathematics of recommendation? do you factor in length of page views or how long it takes to click back?

Answer: Findory recommends interesting articles based on what you read and what others have read.

It is a little like social networking sites, the sites where you list all your friends and then share information between the network of friends.

Unlike social networking sites, everything is done implicitly and anonymously. Rather than list your friends, other like-minded readers of Findory are found for you. Rather than explicitly share, interesting things others have found are quietly and anonymously shared behind the scenes.

All the hard work is done by humans. Findory readers find all the good articles. Findory only helps readers share what they have found easily and with no effort.

Technically, the algorithms used fall into the class of social filtering algorithms, though it often can be tricky work to get those types of techniques to scale to large data.
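As an illustration of the general idea of finding like-minded readers and sharing what they read, here is a small sketch of user-based collaborative filtering on anonymous clickstream data. Linden does not disclose Findory's algorithm, so this is a textbook approach under stated assumptions: readers are opaque IDs, the only signal is which articles each one clicked, and similarity is cosine similarity on those click sets. The name `recommend` and the sample data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def recommend(reads, reader, top_n=3):
    """Suggest unread articles for `reader`.

    reads -- dict: anonymous reader ID -> set of article IDs they clicked
    """
    mine = reads.get(reader, set())
    scores = defaultdict(float)
    for other, theirs in reads.items():
        if other == reader or not mine or not theirs:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # no shared reading, so not a like-minded reader
        # Cosine similarity between two binary "did read" vectors.
        similarity = overlap / sqrt(len(mine) * len(theirs))
        for article in theirs - mine:
            scores[article] += similarity
    # Highest score first; break ties by article ID for stable output.
    return sorted(scores, key=lambda a: (-scores[a], a))[:top_n]


reads = {
    "r1": {"a", "b", "c"},
    "r2": {"a", "b", "d"},
    "r3": {"b", "c", "d", "e"},
    "r4": {"x", "y"},
}
print(recommend(reads, "r1"))
```

Here reader r1 is shown article d first, because both similar readers clicked it, and article e second. Reader r4 shares nothing with r1 and contributes nothing. The loop over all readers is the part that, as the answer notes, becomes tricky at scale.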

7) Question: Do you see the potential with Findory usage for an echo chamber, personalized insulation effect, where users end up missing important and relevant news? Will people end up reading what they WANT to read rather than what they NEED to read?

Answer: Findory works hard not to pigeonhole readers. It does not seek to show a reader only what they want to see. Findory helps readers discover a wide range of sources and articles that otherwise might have been missed.

Amusingly, of the very few complaints Findory gets, the most common are from people complaining about seeing articles from a viewpoint with which they disagree. The issue here is that Findory does not pigeonhole people, but some readers want to be pigeonholed. Opinion articles are not selected based on a particular view, with the result that people are exposed to viewpoints they might otherwise prefer to ignore.

By the way, it is interesting to compare Findory with traditional, more static front pages. Let's take Yahoo News as an example. Yahoo News shows the same front page to everyone. There are 100k+ articles available, but everyone sees the same thin slice of 20-30 articles. All the depth of information is lost.

Personalization offers a way to show different front pages to different people. Findory plucks the interesting bits and pieces out of a sea of information. Everyone sees a different slice of the news. Readers see new sources, are exposed to new viewpoints, and discover articles they otherwise would have missed.

8) Question: Is news really the best place for recommendations then? What other types of data do you base news selections on for individual users?

Answer: Readers need help finding interesting news. There are thousands of news sources, hundreds of thousands of news articles, and millions of weblogs out there. It is impossible for readers to sort out the good from the bad on their own. People need tools that make it easy to surface what is relevant to them and help them discover information they would otherwise miss.

The personalization and recommendations on Findory are mostly based on clickstream, but there is some analysis of content as well.

9) Question: what other valuable data about what I'm interested in (besides more of the same data) could you provide me? are there ways to spin this info outside of recommendations such as a graph that shows how long I spend looking at types of news on your site? That would hold up a mirror for me regarding my attention data...

Answer: Findory is not doing much of this, but I really like some of what Google is doing here. In particular, the Google Search History Trends at http://www.google.com/searchhistory/trends is useful and clever.

10) Question: what if I hooked findory up to my email account? or what if you tapped into my attention data in some way? I'd love to lift this recommendation piece out of just news and apply it across all the information I interact with daily, even desktop data. what are your thoughts here?

Answer: The goal of Findory is to help with all information overload. Findory eventually will personalize every information stream in your life, including news, search, advertising, events, e-mails, video, and music.

11) Question: what about white papers or Google Scholar articles and such? Could you envision a Findory that's customized to academics? Have you ever considered making a more targeted Findory for specific demographics?

Answer: We have had various requests to license Findory technology to build personalized news sites for narrower categories. As a tiny startup, Findory does not have the bandwidth to pursue these, but it is an interesting possibility for the future.

12) Question: Can you make Findory read my bloglines and then make suggestions from that, creating a sort of techmeme + new recommendations based on what I already read? This could also give Findory a better sense of how to structure the information you provide; for example I don't need a sports section...

Answer: Findory Favorites (http://findory.com/s/) can read in your OPML from Bloglines, but it does not use the articles you have read on Bloglines.

I agree that adding some of Findory's personalization and recommendation technology to Bloglines or other feed readers would be fantastic, especially for people who are feeling overwhelmed with the effort of trying to manually skim and filter hundreds of feeds every day.

13) Question: What about books from Amazon or other types of products? the news I'm interested in could be connected to what entertainment or products I like, right? Are you leaving product recommendations up to advertisers?

Answer: Product recommendations are too close to my previous life at Amazon. Though it is a lot of fun, I doubt I would pursue that.

Personalization has been successfully applied to e-commerce by Amazon and others. Findory is trying to go a step further. Findory seeks to personalize information. Findory will help people find the information they need by learning from what people read and recommending interesting other articles.

14) Question: as a user can I rename any of the sections? they seem so broad, and over time I think a new categorization framework will emerge for every user. Yes, no?

Answer: The categories on Findory are purposely broad. They are intended to supplement search, allowing people to narrow their focus a bit when they want to browse. The personalization should focus the page on the most interesting articles and topics.

Other startups are focusing on narrow categories, including fine-grained classification in Topix.net and the tagging in Technorati. Generally, I think narrow categories require too much work from readers to use and, when used for customizable pages, can cause pigeonholing, so narrowing the categories has not been an area of focus for Findory.

15) Question: video recommendations - what data is most useful to making video recommendations? is this harder without user votes on your site?

Answer: The video recommendations use clickstream data, the information about what Findory readers have watched. Yes, recommendations on Findory for video will continue to improve as more people watch videos using Findory.

16) Question: how have you been able to leverage Findory's recommendation algorithm for advertisers?

Answer: Just as Findory's personalization engine matches content to interested audiences, our personalized advertising matches advertisements to interested people.

The current version uses Google AdSense as the provider of the ads but targets the ads using Findory data. Unlike normal AdSense ads, the advertising is not only targeted to the content of the page, but also to the individual behavior of each reader.

Advertising is a form of content. It is useful when it is relevant. When it is not relevant, it is annoying. Too often, advertising on websites is poorly targeted and irrelevant. Findory wants to make advertising relevant and useful.

17) Question: what are your thoughts about licensing your recommendation engine?

Answer: There have been several inquiries, but supporting licensing would be distracting for the company.

The problem in front of Findory is already Google-sized. Personalizing information -- news, search, and advertising -- is already a multi-billion dollar business. Findory has its work cut out for it with its current mission.

18) Question: you focus on anonymous personalization - why?

Answer: Because we can. And I think it is a good example for others too.

Findory does not require registration or login. Readers who come to Findory can just start reading. The more they read, the more Findory targets to their interests. It just works.

Login and registration are optional. If you do not log in, you are just a random number to us. We have no idea who you are and no way to tie your browsing back to you.

Even if you do log in, the registration requires just an arbitrary login and password (e.g. "donald.duck"); still no personal information is requested.

Mandatory registration is an unnecessary and unpleasant barrier for readers. Readers just want to get the information they need. We want to help them.

19) Question: Amazon (along with eBay) was a pioneer in leveraging user generated media - book reviews; why have you left this out of Findory?

Answer: Findory does not show the full content of articles -- readers click through to read the full article -- so it would be awkward to show forums or comments next to each article. Moreover, other sites, like Yahoo News, are already pursuing this, making it unattractive for Findory to pursue.

It might be interesting here to talk about the general business relationship Findory has with content providers and its readers. Findory shows excerpts of the content from other sources. It is essentially an ad for that content. Readers benefit from discovering useful information. Content-providers benefit from getting traffic, not just any traffic, but the valuable "traffic of intent", as John Battelle calls it. Findory benefits from connecting advertisers with interested Findory readers. It is good for everyone.

20) Question: how would you describe your crawler? is it built any differently in that it stocks an index for a recommendation engine vs. a search engine?

Answer: The crawler itself is pretty straightforward. It is custom, but it is a standard multi-threaded crawler.
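For anyone unfamiliar with the term, a "standard multi-threaded crawler" fetches many sources at once from a pool of worker threads so that one slow server does not hold up the rest. The sketch below shows that pattern with Python's standard `concurrent.futures` module. It is an illustration only, not Findory's crawler: `crawl`, `fake_fetch`, and the example URLs are hypothetical, and the fetch function is passed in so the example runs without touching the network.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def crawl(urls, fetch, workers=4):
    """Fetch each distinct URL on a thread pool.

    fetch -- callable taking a URL and returning page text, or raising on failure
    Returns (pages, errors): dicts keyed by URL.
    """
    pages, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # dict.fromkeys drops duplicate URLs while keeping their order.
        futures = {pool.submit(fetch, url): url for url in dict.fromkeys(urls)}
        for future in as_completed(futures):
            url = futures[future]
            try:
                pages[url] = future.result()
            except Exception as exc:  # one bad source must not stop the crawl
                errors[url] = exc
    return pages, errors


def fake_fetch(url):
    """Stand-in for a real HTTP request (assumption for this example)."""
    if "bad" in url:
        raise IOError("unreachable")
    return f"<html>{url}</html>"


pages, errors = crawl(
    ["http://one.example/feed", "http://bad.example/feed", "http://one.example/feed"],
    fake_fetch,
)
print(sorted(pages), sorted(errors))
```

In a real crawl the stand-in would be replaced by an actual HTTP fetch with timeouts and politeness delays, which this sketch leaves out.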

21) Question: talk about the findory index - does it grow based on findory usage? how do you decide to add new sources to the index? do you allow suggestions by users? what if I want more information about trout fishing? if I search for trout fishing will you start to find sources for me?

Answer: Findory manually examines news and weblog sources before including them in our crawl. At some point, we will switch to an automated process based on usage data.

A huge percentage of weblogs out there are not of interest to a general audience: they are spam, junk, or useless. For one example, Technorati claimed there were 19.6M weblogs in October 2005, around the same time the most popular feed reader, Bloglines, said only 37k weblogs had at least 20 readers on Bloglines. Very few weblogs appear to be useful and relevant to a general audience.

Findory's crawl currently includes a few thousand sources and hundreds of thousands of articles. We constantly are adding new sources to expand our crawl.

General

22) Question: what projects are you most excited about - outside of Findory - in online personalization? I mean projects by the big players or new start ups?

Answer: I like what Google is doing. Of the search giants, only Google seems to be aggressively pursuing personalization. They already have personalized web search and recommendations in Google News. I suspect they are also quietly pursuing personalization of advertising.

23) Question: what do you miss about Amazon?

Answer: Mostly, I miss the resources I had while I was there. No doubt about it, startups are hard and lonely. At a tiny startup, you have to do everything yourself, scrounge for every machine, and fight for every bit of attention from press and consumers. At Amazon, I had powerful computing servers, played with massive data sets, could bounce ideas off talented software engineers, and could rely on system administrators, database administrators, and PR and legal teams. It was nice to have a strong team supporting and helping me.

24) Question: what has had the steepest learning curve for you in running your own startup?

Answer: I would say the hardest thing is that what I decide to work on every day is a bet on the life of the company. I have to be very careful in picking my battles and knowing where to focus my attention.

25) Question: at the library this weekend I was confounded and astounded by the lack of user/borrower data to help me make decisions on step parenting books. do you know of any projects in the public space focused on tapping into usage data to help libraries and their patrons learn from others about the books that are the strongest offerings to a given thought space?

Answer: No, but that is an interesting idea! You should mention it to Gary Price. I am sure he would love to chat about it.

Follow Up Questions:

1) Question: How many people are using the Findory API?

2) Question: Can you provide urls of interesting and clever usages that you've seen?

3) Question: How many Findory inline users do you have?

Answer: Findory's API, RSS feeds, and Findory Inline get a substantial amount of traffic, millions of hits per month. Of these, my favorites are the Findory Inline and RSS feeds that let bloggers see articles on other blogs related to what they write about.

Some good examples of using Findory Inline are on my weblog at http://glinden.blogspot.com. You can see that Findory content -- weblog articles related to what I write about on Geeking with Greg, news stories about Google, and a snippet of the personalized headlines that I see when I visit Findory -- are placed directly on my weblog, blended nicely with the style of my weblog, and shared with readers of my weblog. Fun stuff.

4) Question: To me this is approaching a Eurekster-type service where I put customized Findory search and news on my site. Is such a Findory service likely in the near term?

Answer: It's a good idea. That could be implemented using the Findory API, though I admit it would require a bit of effort to do it. Perhaps I will expand Findory Inline to offer this feature.

By the way, if you want this feature, you might also check out Google SiteSearch.

It's a cute service that is part of AdSense and does some of what you want.

5) Question: Explain why, given that there's only you on this project, creating a general audience service instead of a niche audience service is the best direction. If you had to pick a niche what would it be?

Answer: Oh, a niche is no fun at all. I like my projects to be big, hairy, and audacious. I like working on things that could benefit tens of millions of people. A niche is no fun at all.

6) Question: You say in response to question 3 "Just read articles, and Findory changes and personalizes to your interests. This is important for mainstream audiences that do not have the tolerance for twiddling and configuring of the geek crowd." However, mainstream audiences are now flocking to services like MySpace, which require a great deal of configuring and twiddling, though of the less geeky sort obviously. What incentives does MySpace provide for "twiddlers?" Would/could these incentives make sense in Findory?

Answer: I think sites like MySpace and Facebook owe much of their success to being used for dating. Sex is a powerful motivator. It will get people to do work.

Sites like My Yahoo have had a lot less success with getting people to configure and twiddle. The vast majority of people who use My Yahoo do no configuration at all; they use the default page. All those people, the mainstream who are uninterested in spending time configuring My Yahoo, would benefit from implicit personalization like Findory's.




© 2006 Search Engine Lowdown. All Rights Reserved.
All views and opinions expressed are those of the author only, protected by the First Amendment and are not representative of any company listed. All trademarks, slogans, text or logo representation used or referred to in this website are the property of their respective owners.