Wednesday, August 24, 2011
10:30 AM - 11:20 AM
Level: Technical - Intermediate
Hadoop is an inexpensive, scalable, and redundant way of storing large volumes of data. But getting data out of Hadoop is tough because it lacks a built-in query language. And because users experience high latency (up to several minutes per query), Hadoop is not appropriate for ad hoc querying, reporting, and business analysis with traditional tools.
The first step in overcoming Hadoop's constraints is connecting to Hive, a data warehouse infrastructure built on top of Hadoop that provides the relational structure necessary for scheduled reporting on large datasets stored in Hadoop files. Hive also provides a simple query language called HiveQL, which is based on SQL and enables users familiar with SQL to query this data.
But to really unlock the power of Hadoop, you must be able to efficiently extract data stored across multiple nodes (often tens or hundreds) with a user-friendly ETL (extract, transform, and load) tool. You can then move your Hadoop data into a relational data mart or warehouse, where you can use BI tools for analysis.
Attendees will learn how an IT person without Java programming skills can:
- Integrate with Hadoop and Hive to bring ETL, data warehousing and BI applications to the tasks of analyzing Big Data
- Provide key data integration and transformation functionality to Hadoop data
- Manage and control Hadoop jobs using a graphical interface
- Integrate Hadoop data with data from other sources to drive compelling reporting and analytics for today's massive volumes of data
Ian Fyfe is responsible for driving adoption of Pentaho's BI technologies, focusing on Pentaho's customer base and community to ensure their needs are being met and exceeded, and providing input on high-level product strategy and roadmap development. Ian brings extensive experience in the Business Intelligence and Data Warehouse industry, including at Jaspersoft, PeopleSoft, Epiphany, Informix, and Business Objects.