Joe Cabrera

Texas A&M University
Computer Science, Class of 2011

While pursuing my degree in Computer Science at Texas A&M, I decided I would like to do more research in the field of Information Retrieval (IR). I am mostly interested in IR and machine learning topics such as search, natural language processing, and recommender systems. I am also a strong supporter of open source software and a regular contributor to the Debian project. My language experience spans the spectrum, but I mostly program in C++, Java, or Python. This summer I will be pursuing research in Multimedia Retrieval with my mentor, Dr. Yijuan (Lucy) Lu, an assistant professor at Texas State University. A description of my research follows.



Locating Partial Duplicate Songs Using Beat-Chroma Aligned Features: A Codebook Approach


Partial duplicate songs are often modified in ways that make it hard to identify the original songs. They can be modified by altering the volume levels, timing, or amplification, or by layering other songs on top of one another. These songs are commonly known as remixes. It is often difficult to detect these partially altered songs in a timely manner.

We propose using a codebook of common musical features to quickly identify a partially duplicated song. Using the Million Song Dataset, we build a codebook of chroma features. Then, using this codebook, we encode an original song as a new query. The encoded song is then checked against existing partial duplicate songs to determine whether a match can be found.
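
The codebook idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used in the project: it assumes plain k-means for building the codebook, a normalized histogram of codeword counts as the song encoding, and cosine similarity for matching. Random arrays stand in for chroma features from the Million Song Dataset, and the names build_codebook, assign, encode, and similarity are hypothetical.

```python
import numpy as np

def assign(chroma, codebook):
    """Index of the nearest codeword for every chroma frame."""
    # Squared Euclidean distance from every frame to every codeword.
    d = ((chroma[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def build_codebook(chroma, k=8, iters=20, seed=0):
    """Plain k-means (an assumed choice) over 12-dimensional chroma frames."""
    rng = np.random.default_rng(seed)
    centroids = chroma[rng.choice(len(chroma), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(chroma, centroids)
        for j in range(k):
            members = chroma[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def encode(chroma, codebook):
    """Encode a song as a normalized histogram of codeword usage."""
    labels = assign(chroma, codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def similarity(h1, h2):
    """Cosine similarity between two codeword histograms."""
    return float(h1 @ h2 / (np.linalg.norm(h1) * np.linalg.norm(h2)))

# Synthetic stand-ins: rows are frames, columns are the 12 pitch classes.
rng = np.random.default_rng(1)
training = rng.random((500, 12))          # placeholder for dataset features
codebook = build_codebook(training)

original = rng.random((100, 12))          # placeholder query song
# Placeholder "remix": an excerpt of the original with small perturbations.
remix = original[20:80] + 0.01 * rng.standard_normal((60, 12))

score = similarity(encode(original, codebook), encode(remix, codebook))
```

A histogram encoding discards the order of frames, which makes it tolerant of cuts and rearrangements but blind to timing; a real system would likely need a richer encoding and a tuned match threshold for score.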


In case you're still curious...

I will also be journaling my work on my blog. Check it regularly for daily and weekly updates.