Events

Wednesday, May 2, 2012
Start: 05/02/2012 3:00 pm
End: 05/02/2012 4:00 pm

Apache Pig is a platform, built on Hadoop, for analyzing large datasets. The goal of this project is to create a tool that lets users harness Pig to write powerful data-cleaning scripts without the steep learning curve of Pig Latin or MapReduce. The presentation will show a set of Pig extensions (user-defined functions) aimed at data cleaning, along with Piglet, an easy-to-operate GUI program that lets the user work with those extensions.

Committee
Dr. Yanqing Zhang (chair)
Dr. Raj Sunderraman
