Musings with Sesh (our CTO)
Recently Sesh Seshadri (our CTO) and I got into an involved discussion about Kosmix technology. I thought I would report some snippets of it here:
One of the grand challenges we have addressed, and continue to improve on, is the ability to categorize the web into millions, if not hundreds of millions, of categories. What sets us apart from the various human-intensive efforts is that we have figured out a way to do this in a scalable manner.
We have built a fully functional search engine, all the way from a full crawl to the distributed query-processing frontend, in 18 months with 10 engineers. We have built this as a platform to support different kinds of Web 2.0 applications.
Another challenge we would like to tackle is to produce an SQL-like interface to the web as well as to other data sources. We want users to be able to ask questions, test theories and inspect data using a very simple language (SQL). This would make the web an experimental platform for innovation.
In a company like Kosmix, we have challenges in every area of computer science: File Systems, Operating Systems, Distributed Computing, Distributed and Very Large Databases, Machine Learning and Artificial Intelligence, to name a few.
Though I have been at Kosmix a while now, chatting about our technology with Sesh gave me a different way of looking at what we are doing.
Tags: Categorization, Grand Challenges, Kosmix, Search Engine

February 14th, 2007 at 2:06 am
Sounds amazing. An NLP approach to it would make it an awesome product.
February 20th, 2007 at 3:28 pm
The idea of SQL-style queries on the web is interesting (not surprising given Sesh’s background in Databases). Just curious whether it is along the lines of WebSQL (more search oriented: structural queries on the document-anchor model of web graphs) or traditional SQL (manual/automated schema generation + information extraction + queries on these schemas, for each vertical).