1. New Frontiers in Information Retrieval and Web Search (Dr. Gao)

As the Web and information retrieval systems grow rapidly in scale, traditional search paradigms are becoming increasingly inadequate for addressing the problem of information overload. We plan to explore one or more new frontiers in information retrieval and Web search, leveraging data mining techniques and human computation. One example project is to facilitate cross-domain exploratory faceted search for online communities such as Craigslist. Other related projects include rank-based evaluation systems and recommendation systems.
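To make the idea of faceted search concrete, the following is a minimal sketch in Python, not the project's actual design. The listing records, the field names (`category`, `city`, `condition`), and the helper functions are all hypothetical and chosen only for illustration. It shows the two basic operations of a faceted interface: narrowing results by the selected facet values, and counting the remaining values of other facets so the user can decide where to explore next.

```python
from collections import Counter

# Hypothetical classified listings; the fields are illustrative assumptions.
LISTINGS = [
    {"title": "Road bike", "category": "bikes", "city": "Austin", "condition": "used"},
    {"title": "Mountain bike", "category": "bikes", "city": "San Marcos", "condition": "new"},
    {"title": "Sofa", "category": "furniture", "city": "Austin", "condition": "used"},
    {"title": "Desk", "category": "furniture", "city": "Austin", "condition": "new"},
]


def filter_listings(listings, selected):
    """Keep only the listings that match every selected facet value."""
    return [
        item
        for item in listings
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(listings, facets):
    """Count how many listings fall under each value of each facet."""
    return {
        facet: Counter(item[facet] for item in listings if facet in item)
        for facet in facets
    }


# Narrow to one city, then summarize the remaining facets for further exploration.
results = filter_listings(LISTINGS, {"city": "Austin"})
counts = facet_counts(results, ["category", "condition"])
```

A cross-domain system would additionally need to align facets whose names and values differ between categories, which this sketch does not attempt.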

2. Exploration of New Paradigms in Multimedia Information (Image/Video) Retrieval (Dr. Lu)

Image and video retrieval is currently one of the most active research topics in information retrieval, since it has become one of the most popular services offered by search engines such as Bing, Google and Yahoo!. However, most existing techniques rely mainly on textual information, because text-based search techniques are mature while the visual content of images is difficult or expensive to exploit. Hence, how to represent an image as a "text" document becomes a key and interesting problem. With the popularity of 3D cameras, 3D TVs and 3D displays, more and more consumer 3D images will be generated and shared on the Internet in the near future, and 3D image understanding and retrieval is becoming a new and active topic. This project explores these new paradigms in multimedia information retrieval: 1) build compact and descriptive visual elements for images that function similarly to textual words; 2) explore techniques to improve 2D image understanding by incorporating 3D information, and develop new 3D image retrieval algorithms.
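One common way to treat an image as a "text" document is the bag-of-visual-words model, sketched below in Python. This is an illustration of that general technique, not necessarily the representation this project will develop. The two-dimensional descriptors and the three-entry codebook are hypothetical toy values; in practice descriptors are high-dimensional local features and the codebook is learned by clustering them. Each descriptor is mapped to its nearest codebook entry (its "visual word"), and the image is summarized by the word counts.

```python
import math
from collections import Counter

# Hypothetical codebook: each entry stands for the centroid of a cluster
# of local image descriptors (toy 2-D values for illustration only).
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda i: math.dist(descriptor, codebook[i]),
    )


def bag_of_visual_words(descriptors, codebook):
    """Build a histogram of visual-word counts for one image."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts[i] for i in range(len(codebook))]


# Toy descriptors assumed to come from a single image.
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]
histogram = bag_of_visual_words(descriptors, CODEBOOK)
```

Once images are reduced to such histograms, standard text retrieval machinery such as inverted indexes and term weighting can be applied to them.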

3. WEAvE: Web Exploration and Analytic Engine (Dr. Ngu)

In this project, we investigate a new paradigm for the retrieval and discovery of Deep Web sources based on a rich service class description. This will lead to the development of an automated utility for integrating and accessing a large number of dynamically changing Deep Web sources. Today's search engines can index and retrieve surface Web sources, which are static HTML pages on the Web, but they cannot retrieve content in Deep Web sources, which are HTML pages generated dynamically from searchable databases or documents. The continuous proliferation of Deep Web sources poses an even greater challenge to their effective use by net citizens. The project will explore using machine learning techniques to generate a service class description for a class of Deep Web sources. It will also explore how to merge data extracted from different Deep Web sources.