
September 2007 Archives

September 19, 2007

Beta Launch

We’ve been ecstatic over here at our Mountain View office to see the official launch of three of our beta products: RightAutos, RightTrips, and our flagship product, RightHealth.

This launch represents years of development work on algorithmic categorization. RightHealth already receives more than 2.5 million visits and serves 9 million searches a month. With the internet increasingly becoming the starting point for information, it comes as no surprise that more and more people with aches, pains, or sniffles come to RightHealth to get health answers. And with 50 million uninsured Americans, the web has become an invaluable resource for attempts at self-diagnosis.

Without the tireless work and talents of the Kosmix team, this launch would have been pretty anticlimactic. Our UI designer, Brian Tharp, provided the intelligent and sophisticated design to bring our topic home pages to the next level. Mohan Gummalam and the rest of the Application Development team ran with the design and made sure it integrated into our architecture. Nagraj Kulkarni, as usual, went the extra mile to see that the production cluster made it to the live site. Saumil Mehta, our talented product manager for Health, seemed to live at the office to make sure each and every detail was accounted for in the new RightHealth product. In fact, the entire Kosmix staff really pulled together to ensure the launch of our beta products, and it would unfortunately take multiple blog posts to detail each and every person's important contribution.

With our categorization technology and our goal of creating an unofficial homepage for any topic on the web, getting information for travel, autos, and health queries just got easier.

--Matthew Krajewski

Burning Man

Our very own Tina Nanez made the recent pilgrimage to the playa for the most otherworldly festival on the planet: Burning Man. Burning Man is an eight-day festival in the Black Rock Desert of Nevada that ends on Labor Day weekend. Those who descend on the playa participate in creative expression, community building, and self-reliance. Tina was one of some 50,000 people in attendance this past Labor Day weekend.

Before she went, I stumbled across a great perspective on Burning Man in Condé Nast’s new business imprint, “Portfolio.” It detailed how businessmen have been attending Burning Man, typically seen as a radical event. Past attendees include Amazon C.E.O. Jeff Bezos, Google co-founders Larry Page and Sergey Brin, and Google C.E.O. Eric Schmidt. One executive related that he went because the experience opened him up to innovation and creativity, and he believed he came back a better executive. Read more at: http://www.portfolio.com/executives/features/2007/08/27/Executives-at-Burning-Man

However, I sat down with Tina to hear what the experience was like first hand, and immediately got the sense that it was a wonderland of creativity that must have been like candy for the mind. In particular, she related how one night a large truck drove onto the playa and stopped in the very middle, suddenly opening up to expose hundreds of lightsabers while blasting the theme to Star Wars. Bicycles descended on the truck, glow sticks and streamers trailing behind them, nobody able to get there fast enough to engage in Jedi play-fighting. She said that seeing the kids' faces as everyone was suddenly in a galaxy far, far away was unforgettable.

September 28, 2007

Kosmos Filesystem Release

Web search engines are required to process large volumes of data. This entails having a scalable backend storage infrastructure built on commodity hardware (such as a cluster of PCs running Linux). To address this infrastructure need, we at Kosmix have developed the Kosmos Distributed Filesystem (KFS). We have released KFS as an open-source project under the terms of the Apache 2.0 license. The initial release is KFS version 0.1, and it is currently in "alpha". The source code and pre-built binaries are available for download at the project site.

In a nutshell, KFS virtualizes disk storage on a cluster of machines providing a global namespace. Files are striped across nodes in the cluster and are replicated for fault tolerance/availability. KFS consists of a client library that enables user applications to read/write files stored in KFS.
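As a rough illustration of striping combined with replication, the sketch below splits a file into fixed-size blocks and assigns each block to several distinct nodes. The block size, the node names, and the round-robin placement policy are all assumptions made for the example; they are not a description of how KFS actually places data.

```python
import math

def place_blocks(file_size, block_size, nodes, replicas=3):
    """Map each block index to `replicas` distinct nodes.

    Round-robin placement is a hypothetical policy chosen for simplicity.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    placement = {}
    for block in range(num_blocks):
        # Start each block on the next node so blocks spread across the cluster.
        placement[block] = [nodes[(block + r) % len(nodes)] for r in range(replicas)]
    return placement

nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
layout = place_blocks(file_size=250, block_size=100, nodes=nodes)
for block, holders in layout.items():
    print(block, holders)
```

Because every block lives on more than one node, losing a single node in this toy layout never removes the only copy of a block.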

KFS supports the familiar filesystem interfaces/programming model. The functionality of the KFS API is similar to the model exposed by operating systems such as Linux. To illustrate:
• When a file is created, the filename is visible in the global namespace.
• As data is written to a block of a file, it gets flushed out to the set of servers storing that block. Once data has been written to the servers, it can be read by other processes.
• For writing/reading, a process can seek to any point in the file and read/write from there.
• Files can be opened for writing multiple times.
• Data can be appended to existing files by opening the file for writing in append mode.
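The points above describe the same model as ordinary local file I/O. The sketch below walks through that model using Python's built-in file API on a local temporary file. It does not use the KFS client library, and the file name is made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.dat")  # hypothetical file name

# Creating the file makes its name visible in the namespace.
with open(path, "wb") as f:
    f.write(b"hello world")
    f.flush()  # push buffered data out so other readers can see it

# A reader can seek to any offset and read from there.
with open(path, "rb") as f:
    f.seek(6)
    tail = f.read()

# Opening in append mode adds data to the end of an existing file.
with open(path, "ab") as f:
    f.write(b"!")

with open(path, "rb") as f:
    contents = f.read()

print(tail, contents)
```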

When blocks of a file are striped across nodes in the cluster, KFS stores the individual blocks of the file as files in the underlying file system (such as XFS on Linux). To guard against disk corruption, checksums are computed on the blocks and verified on each read. If disk corruption is detected through a checksum mismatch, the system discards the corrupted block and uses re-replication to recover the lost data.
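The verify-on-read idea can be sketched in a few lines. This is only an illustration: the use of CRC-32, the in-memory replica list, and the function names are assumptions for the example, and re-replication is simulated by copying a good replica over a bad one.

```python
import zlib

def store_block(data):
    """Return a replica record: the data plus its checksum (CRC-32 is an assumption)."""
    return {"data": data, "checksum": zlib.crc32(data)}

def read_block(replicas):
    """Return data from the first replica whose checksum verifies.

    Replicas that fail verification are discarded and rewritten from a good
    copy, standing in for re-replication.
    """
    good = None
    bad = []
    for i, rep in enumerate(replicas):
        if zlib.crc32(rep["data"]) == rep["checksum"]:
            if good is None:
                good = rep
        else:
            bad.append(i)
    if good is None:
        raise IOError("all replicas are corrupt")
    for i in bad:
        replicas[i] = dict(good)  # replace the corrupt replica with a good copy
    return good["data"]

replicas = [store_block(b"payload") for _ in range(3)]
replicas[0]["data"] = b"pay1oad"  # simulate on-disk corruption of one replica
data = read_block(replicas)
print(data, replicas[0]["data"])
```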

Each file stored in KFS is typically replicated 3-way. Depending on application needs, the degree of replication for files can be changed on-the-fly.

KFS also contains rudimentary support for block rebalancing. To help with better disk utilization across nodes, the system may periodically migrate data from over-utilized to under-utilized nodes.
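One simple way to picture rebalancing is a greedy loop that moves one block at a time from the fullest node to the emptiest one. The thresholds, the uniform node capacity, and the greedy policy below are assumptions for this sketch, not the policy KFS uses.

```python
def rebalance(usage, capacity, high=0.8, low=0.5):
    """Plan block moves from over-utilized to under-utilized nodes.

    `usage` maps node -> number of blocks stored; every node is assumed to
    hold at most `capacity` blocks. Moves continue while the fullest node is
    above `high` utilization and the emptiest node is below `low`.
    Returns (moves, final_usage), where each move is (source, destination).
    """
    usage = dict(usage)  # work on a copy so the caller's dict is unchanged
    moves = []
    while True:
        src = max(usage, key=usage.get)
        dst = min(usage, key=usage.get)
        if usage[src] / capacity <= high or usage[dst] / capacity >= low:
            break
        usage[src] -= 1
        usage[dst] += 1
        moves.append((src, dst))
    return moves, usage

# Hypothetical cluster state: blocks stored per node.
moves, final = rebalance({"node-a": 95, "node-b": 40, "node-c": 60}, capacity=100)
print(len(moves), final)
```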

The KFS client library provides support for job placement systems. For instance, a job scheduler can determine the location(s) of a byte range within a file and schedule jobs appropriately.
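To make the scheduling idea concrete, the sketch below maps a byte range onto the blocks it overlaps and picks the node that holds the most of them. The block map, block size, and function names are hypothetical; a real scheduler would obtain block locations from the KFS client library instead.

```python
BLOCK_SIZE = 100  # bytes; deliberately tiny, an assumption for this example

# Hypothetical block map: block index -> nodes holding a replica.
block_map = {
    0: ["node-a", "node-b", "node-c"],
    1: ["node-b", "node-c", "node-d"],
    2: ["node-c", "node-d", "node-a"],
}

def locations(offset, length, block_map, block_size=BLOCK_SIZE):
    """Return {block index: nodes} for every block overlapping the byte range."""
    if length <= 0:
        return {}
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return {b: block_map[b] for b in range(first, last + 1)}

def pick_node(offset, length, block_map):
    """Choose the node holding the most blocks of the range, so reads stay local."""
    counts = {}
    for nodes in locations(offset, length, block_map).values():
        for node in nodes:
            counts[node] = counts.get(node, 0) + 1
    # Sorting first makes ties resolve to the alphabetically first node.
    return max(sorted(counts), key=counts.get)

print(locations(150, 100, block_map))
print(pick_node(150, 100, block_map))
```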

KFS is implemented in C++. In addition to C++ applications, KFS also contains support for Java (via JNI) and Python applications.

To enable a large class of applications to evaluate KFS, we have integrated KFS as the backing store for other open-source projects:
• Hadoop: Hadoop is an open-source project that provides a Map/Reduce implementation. It contains a Filesystem API that allows alternate implementations to be used as the backing store. Currently, the choices for a backing store are the local filesystem, HDFS, and the S3 infrastructure. As a new alternative to these choices, KFS is integrated with Hadoop using Hadoop's Filesystem API. This allows existing Hadoop Map/Reduce applications to use KFS seamlessly: by changing some Hadoop configuration parameters, KFS can be used as the backing store. We have submitted the necessary "glue" code to the Hadoop code-base; it will be included in the next Hadoop release.
• Hypertable: Hypertable is an open-source project (being developed at Zvents Inc.) that provides a Bigtable-style interface. KFS is integrated with Hypertable as the backing store.

We are releasing KFS with the intent of providing useful storage infrastructure software. It is our hope that KFS will meet the storage needs of various projects. We would be happy to work with anyone interested in using KFS. Please try out KFS and give us your feedback on what works and what you would like to see added, and consider contributing to KFS!


About September 2007

This page contains all entries posted to Kosmix Blog in September 2007. They are listed from oldest to newest.
