1. Light up the Web (Dr. Gao)

In the real world, we visit the same places (classrooms, malls, country clubs, gyms) for similar purposes and interests. We meet, talk, exchange ideas, and develop relationships. On the Web, we also visit various places: pages, images, videos, etc. However, we do not meet, because the Web is dark and we cannot see each other. This project aims to light up the Web so that Web users can meet at random places. In this initial development, we focus on allowing users who are visiting the same page to see and chat with each other. It is reasonable to assume that such users share a very specific, momentary, situational interest. People have long-term as well as short-term interests. We speculate that these interests follow a power-law relationship, with very few long-term interests and very many short-term interests. While plenty of chat and social networking sites cater to long-term interests, this project focuses on enabling immediate communication between users with short-term, situational interests.

2. Conference Ranking (Dr. Gao)

Ranking is ubiquitous in our society. HITS and PageRank are classical algorithms used to rank webpages for search engines based on a circular definition of quality (importance, popularity, prestige, goodness): a page is good if it is linked to by other good pages. In this project, we apply the main ideas of these algorithms, with the necessary modifications and tuning, to rank CS conferences and journals. Though similar work based on citation networks exists, this project will experiment with different data and assumptions.
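As a rough illustration of the circular definition of quality, the following Python sketch runs a basic PageRank power iteration on a tiny citation graph. The venue names, the graph, the damping factor of 0.85, and the function name `pagerank` are all assumptions made for this example; it is not the modified algorithm the project will develop.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Basic PageRank by power iteration.

    links maps each node to the list of nodes it cites (hypothetical data).
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start from a uniform distribution over all nodes.
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            if targets:
                # A node splits its damped score evenly among the nodes it cites.
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A node with no outgoing citations spreads its score over all nodes.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: "VenueA cites VenueB and VenueC", and so on.
citations = {
    "VenueA": ["VenueB", "VenueC"],
    "VenueB": ["VenueC"],
    "VenueC": ["VenueA"],
    "VenueD": ["VenueC"],
}

scores = pagerank(citations)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy graph the most-cited venue ends up first, and the venue that nobody cites ends up last with only the teleport share.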

3. Comparison-based Evaluation (Dr. Gao)

Often we need to evaluate a set of entities (paper submissions, movies, restaurants, products) and obtain their true ratings (average ratings from the population). By the law of large numbers, average ratings from large samples serve this purpose well. In practice, however, evaluation data are often sparse, and each entity may receive only a small number of ratings. In this case, average ratings can differ significantly from true ratings because the reviewers of each entity hold different standards and are unevenly distributed. Based on the observation that relative preferences are more trustworthy than absolute ratings, in this project we investigate comparison-based evaluation.
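The toy Python sketch below illustrates the observation about relative preferences. The reviewers, items, and scores are invented for illustration, and the simple wins-minus-losses count is only a stand-in for whatever comparison-based method the project will investigate. A harsh reviewer and a lenient reviewer each rate two of three items; the averages and the pairwise comparisons then produce opposite orderings.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sparse ratings on a 1-5 scale. The harsh reviewer scores
# everything low and the lenient reviewer scores everything high.
ratings = {
    "harsh_reviewer": {"X": 3, "Y": 2},
    "lenient_reviewer": {"Y": 5, "Z": 4},
}


def average_ratings(ratings):
    """Mean absolute rating per item, ignoring who gave it."""
    totals, counts = defaultdict(float), defaultdict(int)
    for scores in ratings.values():
        for item, score in scores.items():
            totals[item] += score
            counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def net_wins(ratings):
    """Wins minus losses per item, comparing items only within one reviewer."""
    net = defaultdict(int)
    for scores in ratings.values():
        for item in scores:
            net[item] += 0  # make sure every rated item appears in the result
        for a, b in combinations(sorted(scores), 2):
            if scores[a] > scores[b]:
                net[a] += 1
                net[b] -= 1
            elif scores[b] > scores[a]:
                net[b] += 1
                net[a] -= 1
    return dict(net)


avg = average_ratings(ratings)
net = net_wins(ratings)

by_average = sorted(avg, key=avg.get, reverse=True)
by_comparison = sorted(net, key=net.get, reverse=True)
print("by average rating:", by_average)
print("by pairwise comparison:", by_comparison)
```

Averaging places Z first because only the lenient reviewer rated it, whereas the within-reviewer comparisons (X over Y, Y over Z) yield the order X, Y, Z regardless of each reviewer's personal scale.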

4. Exploration of New Paradigms in Multimedia Information (Image/Video) Retrieval (Dr. Lu)

Image and video retrieval is currently one of the hottest research topics in information retrieval, since it has become one of the most popular services in many search engines, such as Bing, Google and Yahoo!. However, most existing techniques rely mainly on textual information. This is because text-based search techniques are mature, while the visual content of images is difficult or expensive to exploit. Hence, how to represent an image as a "text" document becomes a key and interesting problem. With the popularity of 3D cameras, 3D TVs and 3D displays, more and more consumer 3D images will be generated and shared on the Internet in the near future, and 3D image understanding and retrieval is becoming a new, hot topic. This project explores these new paradigms in multimedia information retrieval: 1) build compact and descriptive visual elements for images that function similarly to textual words; 2) explore techniques to improve 2D image understanding by incorporating 3D information, and develop new 3D image retrieval algorithms.

5. WEAvE: Web Exploration and Analytic Engine (Dr. Ngu)

In this project, we investigate a new paradigm for the retrieval and discovery of Deep Web sources based on a rich service class description. This will lead to the development of an automated utility for integrating and accessing a large number of dynamically changing Deep Web sources. Today's search engines can index and retrieve surface Web sources, which are static HTML pages on the Web. But they cannot retrieve content in Deep Web sources, which are HTML pages generated dynamically from searchable databases or documents. The continuous proliferation of Deep Web sources poses an even greater challenge to their effective use by net citizens. The project will explore using machine learning techniques to generate a service class description for a class of Deep Web sources. It will also explore how to merge data extracted from different Deep Web sources.