About Me

My name is Steven Bauer, and I will be going into my junior year at the University of Pittsburgh this fall. I am majoring in Computer Science, and I plan on going to graduate school to get my master's degree in CS. I am working under Dr. Ngu for the summer in the REUIR program.

My Project

My partner Paris Nelson and I are continuing Dr. Ngu's Deep Web Search research from last year. Our goal is, given a list of web service sites in the Flight, People, or Hotel search domains, to automatically submit the user's requested query to each of the websites and extract the relevant information. Once we have extracted the information, we are to display it to the user in an easy-to-digest manner.

Expectations

This project has a pretty lofty goal considering how much websites differ from one another, even within the same domain. Paris and I are pretty excited about embarking on this project due to its groundbreaking nature. Research has shown that the deep web holds a large amount of data that is currently inaccessible to search engines.

Contributions

Whew! It has been a busy and crazy couple of months. Paris and I have come a very long way from where we started at the beginning of June. We have created an automated submission technique that uses a list of user-generated partial terms to discover which input boxes are relevant to a user's query. It ensures that all boxes are in the same form on the website and is able to brute force missing required input boxes. Once we have discovered all of the text boxes, we save this information to our database so that on future queries to the same site we can skip the matching. We have created three extraction techniques: one for tables, an annotated pattern extractor, and our patented (maybe someday) repeated data extractor. Between these three methods of extraction, we are able to get meaningful information from a large number of websites. With future work our software will achieve even higher levels of precision. I can't wait to see where next year's REU takes it!
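To give a rough idea of the input matching step, here is a minimal sketch in Python. It is not our actual code (the project itself was written in Java and PHP), and the names PARTIAL_TERMS, FormInputCollector, and match_inputs, along with the example terms, are made up for illustration. It only shows matching partial terms against text box names and requiring that every match come from the same form. The brute forcing of missing required boxes and the database caching are left out.

```python
from html.parser import HTMLParser

# Hypothetical partial terms for a flight search; the real list is user generated.
PARTIAL_TERMS = {
    "origin": ["from", "orig"],
    "destination": ["dest", "arriv"],
}


class FormInputCollector(HTMLParser):
    """Collect the names of text inputs, grouped by the form they belong to."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one list of input names per <form>
        self._current = None   # inputs of the form currently being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = []
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            # An <input> without a type attribute defaults to a text box.
            if attrs.get("type", "text") == "text" and attrs.get("name"):
                self._current.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def match_inputs(html, partial_terms):
    """Return {field: input_name} for the first form that covers every field.

    Returns None when no single form contains a match for all fields.
    """
    collector = FormInputCollector()
    collector.feed(html)
    for names in collector.forms:
        mapping = {}
        for field, terms in partial_terms.items():
            for name in names:
                if any(term in name.lower() for term in terms):
                    mapping[field] = name
                    break
        if len(mapping) == len(partial_terms):
            return mapping
    return None
```

Given a page with a newsletter form and a search form, match_inputs skips the newsletter form because it does not contain a box for every field, and returns the mapping for the search form only.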

Conclusion

This REU has been a great learning experience for me. Not only have I learned basic PHP, JavaScript, and MySQL and picked up a few new tricks in Java, but I have also learned what it is like to be a researcher. This REU has helped me solidify my decision to go to graduate school and get my master's, if not a PhD, in Computer Science. I have made some great friends, gotten to see some cool places, and I am very glad I was given this fantastic opportunity to take part in this program.

 

Links To Our Work

Our final presentation slides:

Our finished poster:

A link to our source code:


Last Updated (Tuesday, 31 July 2012 02:04)