
Kompany Technology Archives

December 22, 2006

Kosmix gets a new UI

As part of a redesign of our frontend architecture (code-named Merlot), we just launched a new look and feel for www.kosmix.com. While there have been many cosmetic changes, we also have a new application tier serving this UI. We are very excited about this rollout as it gives us the ability to create new visualizations of results and answers. Tell us what you think of these views: what you liked, what you didn't like, and what you would like to see.

One very cool feature that we rolled out is the "Topic Overview" page. This page is a new visualization specifically designed to support Information Surfing. Since we have a categorization of the web, we are able to surface relationships between the query and other potential concepts of interest. We hope that you will find these relationships interesting and useful as you explore the web for information regarding your query.

Here is an example query for that feature: pain in the left stomach. Notice that on the topic overview page, we have identified that pain in the left stomach is related to conditions such as Appendicitis and Gastritis, procedures such as Gastric Bypass, and diagnostic tests such as Endoscopy. I don't think any other search engine provides this kind of information; elsewhere you would have to run many search queries and read many of the results to surface these relationships.

So we think this is extremely valuable and unique, and we would really love to hear how you feel about this feature. More features are coming soon, along with another blog post describing our vision for supporting information surfing.

Information Browsing enabled by Merlot

In my last post I reported the launch of Merlot, our new application frontend architecture. In this post I want to describe our vision for Merlot and Information Browsing. Topic Overview Pages is version 0.1 of our attempt to enable "Information Browsing". Traditional search results pages only give you a small sample of the wealth of information on the web for your query. More importantly, they don't provide you with the context surrounding your query. Others have tried to do this, but their efforts have been small in scope and done through a manual process.

The really cool thing about what we have done is we have figured out a way to identify relationships of interest to your query. This is to facilitate your browsing of information (rather than just web pages). Our goal is to produce the best overview page for your query given the variety of information on the web and take you closer to browsing for answers rather than searching.

Here is an example: Try the query "breast cancer". The basic information section provides you with a short overview of the common drugs, procedures, symptoms and tests, but if you look at all the related topics, you will find a list of all the things we found connected to breast cancer. I don't think you can find such a comprehensive list anywhere else on the web.

Another example is "Diabetic Foot Infections". One of the related body parts is "Achilles Tendon". At first, I thought "Oh man! A bug", but then I decided to click on it and found the first link to be "Achilles Tendon Surgery Helps Prevent Diabetic Foot Ulcers". Cool huh!!!

So chime in and let us know how you like this form of browsing and in particular if there were any interesting relationships between topics that you found on Kosmix.

January 16, 2007

Kosmix beats Google at finding cure for obscure ailment

An article today (Jan 16, 2007) talked about how a doctor had turned to Google Scholar to find a treatment for death-cap mushroom poisoning. This clearly demonstrates how the web can help medical professionals find answers to obscure problems.
However, the doctor could have saved hours of research by using a focused Health vertical search engine. Kosmix gave the answer in about 5 seconds, compared to the several hours the doctor spent on Google Scholar.

(Click on the link below to see Kosmix's results for "death cap mushrooms", and then click on "View All Related Topics" to see Silibinin as the first drug displayed)

Death Cap Mushrooms

In the words of David Utter, the author of the article, "Vertical search has been growing in importance, but it hasn't broken through to the mainstream like Google has. Hopefully more medical professionals will learn of and utilize options like Kosmix", especially for diagnosing obscure cases.

The full article can be found here:
http://www.webpronews.com/topnews/topnews/wpn-60-20070116KosmixHealthlineHadDeathCapCure.html

February 7, 2007

How is Kosmix different

We got asked this question at one of the career fairs we visited. Basically, how is Kosmix different from other search engines, and do we need one more? This is a pitch we tried, to explain in layman's terms how Kosmix Technology enables users to find better information, faster.

Kosmix is the next-generation search engine that uses its core categorization IP to provide a better search experience. We think we provide two main value adds over other players:

You don't need to know what you want:

Traditional search engines are good at finding information if you know the question (the right way to phrase the query). Kosmix algorithmically generates an overview around any topic and lets you explore. So if you are interested in going to Hawaii, you don't need to know what you can do there: we tell you what you can do, where you can go and where you can stay.

We get you the answers, not just links:

The Google/Yahoo model gives you a set of links to go look through for the information you desire. We bring answers to the first page, which means fewer clicks and fewer searches to get to the information. Examples are drugs for conditions (e.g. death cap mushrooms, lasik), common problems with cars (toyota), activities at a destination, etc.

So we have very cool technology that helps us understand the content of all the pages on the web. Using similar technology, we can understand the user query and hence make an intelligent guess at the intention. This helps us match the intent to content better, leading to a more fruitful search experience and definitely a wiser user :)

February 13, 2007

Musings with Sesh (our CTO)

Recently Sesh Seshadri (our CTO) and I got into an involved discussion about Kosmix technology. I thought I would report some snippets of it here:

• One of the grand challenges we have addressed and continue to improve on is the ability to categorize the web into millions if not hundreds of millions of categories. We are unique in that we have figured out a way to do this in a scalable manner, compared to the various human-intensive efforts.
• We have built a fully functional search engine, all the way from a full crawl to the distributed query processing frontend, in 18 months with 10 engineers. We have built this as a platform to support different kinds of web 2.0 applications.
• Another of the challenges we would like to tackle is to produce an SQL-like interface to the web as well as other data sources. We want users to be able to ask questions, test theories and inspect data using a very simple language (SQL). This would make the web an experimental platform for innovation.
• In a company like Kosmix, we have challenges in every area of computer science: File Systems, Operating Systems, Distributed Computing, Distributed and Very Large Databases, Machine Learning and Artificial Intelligence, to name a few.
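
To make the SQL idea a little more concrete, here is a minimal sketch of what "asking a question" of categorized web data could feel like. It uses Python's built-in sqlite3 module as a stand-in; the pages table, its columns and every row are invented for illustration and say nothing about how Kosmix stores or exposes its data.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, category TEXT, topic TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("http://example.com/a", "Health", "headache"),
        ("http://example.com/b", "Health", "headache"),
        ("http://example.com/c", "Travel", "hawaii"),
        ("http://example.org/d", "Health", "relaxation"),
    ],
)

# "Ask a question": how many pages does each Health topic have?
rows = conn.execute(
    "SELECT topic, COUNT(*) FROM pages "
    "WHERE category = 'Health' GROUP BY topic ORDER BY topic"
).fetchall()
print(rows)
conn.close()
```

The point of the sketch is only that a question becomes a one-line declarative query once the data has been categorized.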

Though I have been at Kosmix a while now, chatting about our technology with Sesh gave me a different way of looking at what we are doing.

April 1, 2007

Kosmix Founders win Mechanical Turk Patent

At the end of last month, the USPTO awarded patent #7197459, titled "Hybrid machine/human computing arrangement". Now you are thinking... what a strange patent title... who would come up with something like that? Well, guess who the inventors were? That's right, folks: while they were busy at Amazon, our founders Venkat Harinarayan and Anand Rajaraman (with Anand Ranganathan) invented the Mechanical Turk!

In brief, Mechanical Turk is a means for humans and computers to collaborate on accomplishing tasks. Say you had some task that required human judgment and intelligence, like "What is your opinion of the summaries of these 10 news articles". You would then set up a number of HITs (human intelligence tasks) using Mechanical Turk. The Turk would then make these tasks available to human users who have signed up. When the humans complete the tasks, they can upload their results and get compensation for the number of tasks they completed. You get to review their submissions and authorize payment to them.
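
The workflow above (set up HITs, let workers submit results, review and authorize payment) can be sketched as a toy in-memory model. This is not Amazon's API: the class names, method names and reward amounts are all hypothetical, chosen only to mirror the steps in the paragraph.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HIT:
    """One human intelligence task. All names here are hypothetical."""
    hit_id: int
    question: str
    reward_cents: int
    worker: Optional[str] = None
    answer: Optional[str] = None
    approved: bool = False


@dataclass
class ToyTurk:
    """In-memory stand-in for the marketplace between requesters and workers."""
    hits: List[HIT] = field(default_factory=list)
    earnings: Dict[str, int] = field(default_factory=dict)

    def create_hit(self, question: str, reward_cents: int) -> HIT:
        hit = HIT(len(self.hits), question, reward_cents)
        self.hits.append(hit)
        return hit

    def open_hits(self) -> List[HIT]:
        # Tasks that no worker has answered yet.
        return [h for h in self.hits if h.answer is None]

    def submit(self, hit: HIT, worker: str, answer: str) -> None:
        hit.worker, hit.answer = worker, answer

    def approve(self, hit: HIT) -> None:
        # The requester reviews a submission and authorizes payment.
        if hit.answer is None or hit.approved:
            raise ValueError("nothing to approve")
        hit.approved = True
        self.earnings[hit.worker] = self.earnings.get(hit.worker, 0) + hit.reward_cents


turk = ToyTurk()
for n in range(1, 4):
    turk.create_hit(f"Rate the summary of news article {n}", reward_cents=5)

# Two signed-up workers pick up open tasks and upload results.
turk.submit(turk.hits[0], "worker_a", "good")
turk.submit(turk.hits[1], "worker_a", "too long")
turk.submit(turk.hits[2], "worker_b", "accurate")

# The requester approves two of the three submissions.
turk.approve(turk.hits[0])
turk.approve(turk.hits[2])
print(turk.earnings)
```

Note that worker_a is paid for only one of two submissions: payment follows the requester's approval, not the upload itself.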

The Turk system shields you from the details of the users, and for all you know it's like a giant computer accomplishing your task!! Pretty cool huh!! There you go... yet another smart idea from our founders!!

July 18, 2007

Videos on Kosmix Topic Home Pages

Ever notice how it’s hard to find exactly what you’re looking for with video search?

That’s why at Kosmix we’ve incorporated videos into our Topic Home Pages, bringing video results for almost any query.

I’m not knocking YouTube or other video search. It just so happens that we have all become so used to search that, when you go looking for a video, you’re already conditioned to search using keywords. But a video usually doesn’t have many keywords associated with it, and unless you happen to guess exactly the right keyword related to what you’re looking for, you aren’t ever going to find the right video.

Truveo is a step in the right direction, with more video providers and deeper categories to search within, but you’ll still hit that invisible wall when entering more natural terms or abstract queries rather than keywords.

Let’s say you want to travel, get some inspiration, and all you know is you want “natural beauty.” Not so much inspiration in the results from Truveo.
http://truveo.com/?method=truveo.videos.getVideos&query=natural%20beauty&results=10&start=0&showRelatedItems=1&tagResults=10&channelResults=10&categoryResults=10&userResults=10

Enter "natural beauty" at Kosmix Travel, and you’ll get videos of the Dead Sea.
http://www.kosmix.com/Travel/natural_beauty-s?

In Health, try “relaxation,” and you get videos for acupuncture and meditation.
http://www.kosmix.com/Health/relaxation-s?

With Kosmix mapping your intent for abstract queries and displaying videos accordingly, video search moves closer to the experience of flipping through channels on your TV: you explore possibilities through video results rather than being constricted by keyword search.

--Matthew Krajewski

July 26, 2007

The Google Question

Rolling my eyes, I was miffed and a tad annoyed at the question my friend asked, probably because it is the elephant in the office that other web companies inevitably face:

“So why would I come search on Kosmix instead of Google?”

Sigh of exasperation. What followed was a stream of convoluted explanations trying to convince my buddy that you’ll find more of what you want using our product. My annoyance had more to do with the fact that I understand the difference than with fatigue from engaging in an imaginary uphill battle to get people to search with Kosmix rather than Google. We're more than happy to have users find us through Google, but we, like other similar companies, will inevitably get compared against them as the standard rubric of success. Our employees even stand in line waiting for their very own Google brand T-shirt, as photographed in a recent “Time” article: http://www.time.com/time/business/article/0,8599,1641232,00.html

It is a simple issue of branding, brand name, brand recognition. Google is synonymous with search and may always be, just as Coca-Cola is synonymous with refreshment, or Levi’s is synonymous with American style. Other companies can try to produce other colas, other jeans, or other search engines, but they will more than likely always come in second. Kosmix is not engaged in the battle to become synonymous with search; we’re in a relatively new field: becoming synonymous with the home page.

As our co-founder, Venky Harinarayan, notes in a recent Alt Search Engines article, “In today’s world, a consumer has to not only know what to look for, but also where to look for it – should she look for it on Google, or go to YouTube, or Flickr, or a blog search engine. What the consumer needs is a “home page” – a starting point – for her topic of interest, which brings the best of the Web into one place for her to explore.” (see the full article here: http://altsearchengines.com/2007/07/11/view-from-the-corner-office-kosmix/)

Other companies are still asking their users to search, to be active, to dig through those ten results to find some, not all, of what they may be looking for. Google Universal Search may resolve some of this, but fields of knowledge are funny beasts, and require unique habitats to truly thrive, something we do well at Kosmix by dividing the web into verticals. After understanding the autos, or health, or travel beast, we can then bring all the images, blogs, news, and unique content related to that vertical and query to a user. We thereby create the best home page for any topic on the web, defining our rich results not only by clarifying the sources but also by the categories to which those sources belong. This means not only defining a web page as a blog, but as a blog that is also a Republican pundit page. Every topic in the world doesn’t require search results; it requires a home page.

We ask our users not to be active, but be passive, sit back, relax, and let us bring the world home. We’ll redefine the home page, and become synonymous with that idea, not with search.

--Matthew Krajewski


August 9, 2007

Cyberchondriacs

Apparently, a factor in the market success of Kosmix Health has to do with the rise of the Cyberchondriacs. Cyberchondriacs are those people who turn to the web as a first resource for their health concerns. The rise of the cyberchondriac is remarkable: an increase of about 37% over two years, totaling 160 million wellness seekers in 2006.

It would be overly humble of us to stand in the background while editorial sites are revered as the true destination for cyberchondriacs. Kosmix Health is uniquely suited to the plight of the cyberchondriac because we are neither an editorial site with a limited index of information nor a broad search product. We wear the clothes of the editorial product in our UI, but an algorithm powers our product and categorizes all the deep health content on the web. With the topic home page, you aren’t reliant on a single source, nor do you have to scour search results. We’ve broken the health web down into high-level groupings such as blogs, news, images, and video, and also into unique breakdowns such as treatments and symptoms.

In a recent Reuters article, Harris Interactive noted, “Cyberchondriacs are not only using the Internet to educate themselves, many are also using it to assist in their conversation with their physicians,” and at Kosmix Health we hope to give you all the resources, clearly organized, to facilitate meaningful exchanges in that valuable time spent with your physician.

To read further on the cyberchondriacs, check out:
http://www.businessweek.com/technology/content/aug2007/tc2007081_494616.htm
http://www.reuters.com/article/technologyNews/idUSPAR10595820070801

--Matthew Krajewski

August 27, 2007

A.D.A.M. Content

Kosmix is always striving to bring the best topic home page to our users. With Kosmix RightHealth, we recently added A.D.A.M. content to quickly answer questions for a given query. For instance, for the query "Headache" (http://www.kosmix.com/Health/headache-s?), Definition and Prevention information is summarized at the top of the topic home page. Clicking through reveals more information on Definition, Prevention, Care, Home Care, and Treatments. Formerly, this type of information was provided by web modules (which are still present on our topic home pages); this integration of A.D.A.M. content, however, offers quick, reliable answers without a lot of navigation through web results. Just another way we try to be one of the best Health sites on the web!

--Matthew Krajewski

September 28, 2007

Kosmos Filesystem Release

Web search engines are required to process large volumes of data. This entails having a scalable backend storage infrastructure built on commodity hardware (such as a cluster of PCs running Linux). To address this infrastructure need, we at Kosmix have developed the Kosmos Distributed Filesystem (KFS). We have released KFS as an open-source project under the terms of the Apache 2.0 license. The initial release is KFS version 0.1 and it is currently in "alpha". The source code as well as pre-built binaries are available for download at the project site.

In a nutshell, KFS virtualizes disk storage on a cluster of machines, providing a global namespace. Files are striped across nodes in the cluster and are replicated for fault tolerance/availability. KFS includes a client library that enables user applications to read/write files stored in KFS.

KFS supports the familiar filesystem interfaces/programming model. The functionality of the KFS API is similar to the model exposed by operating systems such as Linux. To illustrate:
• When a file is created, the filename is visible in the global namespace.
• As data is written to a block of a file, it gets flushed out to the set of servers storing that block. Data written to servers can then be read by other processes.
• For writing/reading, a process can seek to any point in the file and read/write from there.
• Files can be opened for writing multiple times.
• Data can be appended to existing files by opening the file for writing in append mode.
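
For readers who want to see the "familiar model" in concrete terms, here is a small sketch of the same operations (create, write and flush, seek, reopen for writing, append) performed on an ordinary local file. These are Python's built-in file calls, used as an analogue only; they are not the KFS client API, and the file name is made up.

```python
import os
import tempfile

# Local-file analogue only: these are Python's built-in file calls,
# not the KFS client API.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example.dat")

# Creating the file makes its name visible in the namespace.
with open(path, "wb") as f:
    f.write(b"hello world")
    f.flush()  # written data is pushed out so other readers can see it
    with open(path, "rb") as other_reader:
        seen_by_other = other_reader.read()

# A file can be opened for writing again; seek to any offset and overwrite.
with open(path, "r+b") as f:
    f.seek(6)
    f.write(b"KFS!!")

# Append mode adds data to the end of the existing file.
with open(path, "ab") as f:
    f.write(b" appended")

# Seek to any point and read from there.
with open(path, "rb") as f:
    f.seek(6)
    tail = f.read()

print(seen_by_other, tail)
```

In KFS the same steps go through the client library and touch remote block servers rather than a local disk, but the programming model an application sees is meant to be the same.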

When blocks of a file are striped across nodes in the cluster, KFS stores individual blocks of a file as files in the underlying file system (such as XFS on Linux). To guard against disk corruption, checksums are computed on the blocks and verified on each read. If disk corruption is detected by checksum mismatch, the system discards the corrupted block and uses re-replication to recover lost data.
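
The checksum-and-recover idea can be sketched with a toy model in which each "server" is a dictionary holding a block and its stored checksum. CRC32 is an assumption here (the choice of checksum is not stated above), and for simplicity the sketch checks every replica on a read, which a real system need not do.

```python
import zlib
from typing import Dict, List, Optional, Tuple

# Toy model: each "server" maps a block id to (data, checksum).
# CRC32 is an assumption; the post does not name the checksum KFS uses.
Server = Dict[int, Tuple[bytes, int]]


def write_block(replicas: List[Server], block_id: int, data: bytes) -> None:
    for server in replicas:
        server[block_id] = (data, zlib.crc32(data))


def read_block(replicas: List[Server], block_id: int) -> bytes:
    good: Optional[bytes] = None
    bad: List[Server] = []
    for server in replicas:
        data, stored = server[block_id]
        if zlib.crc32(data) == stored:
            good = data
        else:
            bad.append(server)  # checksum mismatch: treat the copy as corrupt
    if good is None:
        raise IOError("all replicas of block %d are corrupt" % block_id)
    for server in bad:
        # Discard the corrupt copy and re-replicate from a good one.
        server[block_id] = (good, zlib.crc32(good))
    return good


replicas: List[Server] = [{}, {}, {}]
write_block(replicas, 0, b"block zero payload")

# Simulate disk corruption on one replica: data changes, checksum does not.
data, checksum = replicas[1][0]
replicas[1][0] = (b"block zero pAyload", checksum)

recovered = read_block(replicas, 0)
print(recovered)
```

After the read, the corrupted replica holds a fresh copy of the good data, which is the "re-replication" step in miniature.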

Each file stored in KFS is typically replicated 3-way. Depending on application needs, the degree of replication for files can be changed on-the-fly.

KFS also contains rudimentary support for block rebalancing. To help with better disk utilization across nodes, the system may periodically migrate data from over-utilized to under-utilized nodes.
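
As a rough illustration of what migrating data from over-utilized to under-utilized nodes involves, here is a greedy planner that moves one block at a time from the fullest node to the emptiest. It is a sketch of the general idea, not the KFS rebalancer; the node names, block counts and tolerance parameter are all hypothetical.

```python
from typing import Dict, List, Tuple


def plan_rebalance(used_blocks: Dict[str, int], tolerance: int = 1) -> List[Tuple[str, str]]:
    """Return (source, destination) moves, one block each, until the gap
    between the fullest and emptiest node is at most `tolerance` blocks."""
    if tolerance < 1:
        raise ValueError("tolerance must be at least 1 block")
    usage = dict(used_blocks)  # work on a copy
    moves: List[Tuple[str, str]] = []
    while True:
        fullest = max(usage, key=usage.get)
        emptiest = min(usage, key=usage.get)
        if usage[fullest] - usage[emptiest] <= tolerance:
            return moves
        usage[fullest] -= 1
        usage[emptiest] += 1
        moves.append((fullest, emptiest))


# Hypothetical cluster: node names and block counts are made up.
cluster = {"node1": 9, "node2": 2, "node3": 4}
moves = plan_rebalance(cluster)
print(len(moves), moves[0])
```

A real rebalancer also has to respect replica placement (two copies of the same block should not land on one node) and throttle the migration traffic, both of which this sketch ignores.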

The KFS client library provides support for job placement systems. For instance, a job scheduler can determine the location(s) of a byte range within a file and schedule jobs appropriately.
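
Here is a minimal sketch of the arithmetic behind such a lookup: map a byte range to the blocks it touches, then find the hosts that hold all of those blocks. The 64 MB block size, the function names and the block-to-host table are assumptions made for illustration; in KFS the locations would come from the client library rather than a hand-written dictionary.

```python
from typing import Dict, List, Set

BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size; the post does not state one


def blocks_for_range(offset: int, length: int, block_size: int = BLOCK_SIZE) -> List[int]:
    """Indices of the blocks that the byte range [offset, offset + length) touches."""
    if offset < 0 or length <= 0:
        raise ValueError("need offset >= 0 and length > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def hosts_for_range(block_hosts: Dict[int, Set[str]], offset: int, length: int) -> Set[str]:
    """Hosts that hold a replica of every block in the range (best for a local job)."""
    indices = blocks_for_range(offset, length)
    candidates = set(block_hosts[indices[0]])
    for i in indices[1:]:
        candidates &= block_hosts[i]
    return candidates


# Hypothetical placement of a three-block file, replicated 3-way.
block_hosts = {
    0: {"node1", "node2", "node3"},
    1: {"node2", "node3", "node4"},
    2: {"node1", "node3", "node4"},
}

# A range that starts in block 0 and ends in block 1.
local_hosts = hosts_for_range(block_hosts, offset=BLOCK_SIZE - 10, length=100)
print(sorted(local_hosts))
```

A scheduler could then prefer one of the returned hosts so that the job reads its input from local disk instead of over the network.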

KFS is implemented in C++. In addition to C++ applications, KFS also contains support for Java (via JNI) and Python applications.

To enable a large class of applications to evaluate KFS, we have integrated KFS as the backing store for other open-source projects:
• Hadoop: Hadoop is an open-source project that provides a Map/Reduce implementation. It contains a Filesystem API that allows alternate implementations to be used as the backing store. Currently, the choices for a backing store are the local filesystem, HDFS, and S3. As a new alternative to these choices, KFS is integrated with Hadoop using Hadoop's Filesystem API. This allows existing Hadoop Map/Reduce applications to use KFS seamlessly. That is, by changing some Hadoop configuration parameters, KFS can be used as the backing store. We have submitted the necessary "glue" code to the Hadoop code-base; it will be included in the next Hadoop release.
• Hypertable: Hypertable is an open-source project (being developed at Zvents Inc.) that provides a Bigtable-like interface. KFS is integrated with Hypertable as the backing store.

We are releasing KFS with the intent of providing useful storage infrastructure software. It is our hope that KFS will meet the storage needs of various projects. We would be happy to work with anyone interested in using KFS. Please try out KFS, give us your feedback on what works and what you would like to see added, and possibly contribute to KFS!

June 30, 2008

Why MeeHive Should Be On Your Radar

By: Nicky.

In my last blog post, I detailed in a rather cryptic fashion the concept of a ‘Personalized News Dial Tone.’ I explained that in much the same way that your phone instantly connects you to the people in your life, Kosmix is working on a product that instantly connects you to all of your news interests that change around you every moment of the day.

Since it’s Monday morning and I have 2.5 cups of coffee (read: personality) running blissfully through my veins, now seems the perfect time to tell you a bit more about what we’ve been doing with this product - which we’ve named MeeHive.

Why MeeHive? Well, think of a Bee Hive, a place that is so full of frenzied activity that it literally buzzes, and then imagine that we offered you your very own hive where you could collect stories that interest you. That would make you buzz, wouldn’t it?

We debuted MeeHive at last month’s Under the Radar conference held at the Microsoft Campus in Mountain View, CA. Under the Radar is dedicated to showcasing the industry’s up-and-coming players – the startups who are developing some of the freshest and most creative products out there.

Sesh, our fearless CTO, presented MeeHive as part of the ‘Graduate Circle,’ a forum for established companies like Kosmix to discuss how they got to be where they are and how they are continuing to innovate.

During his well-received presentation, Sesh described how in a world of ‘pull’ models, where you search the web high and low to get the information you want delivered to you, MeeHive is a news ‘push’ model – delivering fresh information to you all the time so that you don’t have to go looking for it.

He noted that with MeeHive’s ability to leverage the Kosmix user base and deliver uber-relevant results, it is well-positioned for success. Of course, we know that getting MeeHive to where we want it is a marathon, not a sprint, so we’ll be spending the summer building the most robust product we can in preparation for a launch not too far down the road. In the meantime, sign up for our beta and we’ll keep you posted on developments.

About Kompany Technology

This page contains an archive of all entries posted to Kosmix Blog in the Kompany Technology category. They are listed from oldest to newest.
