Jasper Hunts for ‘Big Data’ Trophies with Talend’s Hooks for Hadoop
April 3, 2012 Alex Woodie
Open source business intelligence software developer Jaspersoft has added new tools for accessing “big data” sets through Jaspersoft ETL, the extract, transform, and load (ETL) software that it OEMs from Talend. The latest release of the company’s tools includes better support for the Apache Hadoop big data framework.
One of the latest big trends in IT, big data refers to the problems organizations face as they attempt to store, manage, and process huge amounts of data. While the lower and upper limits vary depending on the application, many midsized businesses will struggle when faced with tens or hundreds of terabytes of data.
In response to the growing volume, velocity, and variability of data, various new techniques and technologies have been devised to help make sense of the data. The open source Hadoop batch processing framework is perhaps the most popular big data technology at the moment. But there are many others both in use and in development, as the big data trend continues to spread.
Now Jaspersoft is throwing its hat into the big data ring and giving its customers better tools to build and manage Hadoop clusters. With the recent release of Jaspersoft version 4.5, the company has added several new big data capabilities in Jaspersoft ETL, including a connector for the Hadoop Distributed File System (HDFS) that enables customers to extract data from Hadoop, and load data into it, in batch or streaming mode.
Jaspersoft ETL also gets connectors for HBase, for loading data into Hadoop’s column-oriented database; for Pig (Pig Latin generation) and Hive (HiveQL generation), for processing Hadoop data in place; and for Sqoop, for building direct Hadoop-to-database links without coding.
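To give a rough sense of what "HiveQL generation" means in practice, the sketch below builds a HiveQL aggregation query as a string from a few job parameters. The point of generating a query is that the work runs inside the Hadoop cluster, so the data is processed in place rather than pulled out first. This is only an illustration: the function `build_hiveql`, the table `web_logs`, and the columns `country` and `bytes_sent` are hypothetical, and nothing here reflects how Jaspersoft ETL or Talend actually generates code.

```python
def build_hiveql(table, group_by, metric, min_count=0):
    """Return a HiveQL aggregation query as a string.

    Hypothetical helper for illustration only; it is not part of
    Jaspersoft ETL or Talend. Identifiers are inserted as given,
    with no validation or quoting.
    """
    query = (
        f"SELECT {group_by}, COUNT(*) AS n, SUM({metric}) AS total\n"
        f"FROM {table}\n"
        f"GROUP BY {group_by}"
    )
    # Optionally keep only groups with at least min_count rows.
    if min_count > 0:
        query += f"\nHAVING COUNT(*) >= {min_count}"
    return query


if __name__ == "__main__":
    # Assumed table and column names, chosen only for the example.
    print(build_hiveql("web_logs", "country", "bytes_sent", min_count=100))
```

A graphical ETL tool would typically assemble a query like this from components dragged onto a job design and then submit it to Hive, which is why the connector can be used without hand-written code.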
The Hadoop connectors were the result of a renewal of the OEM relationship between Jaspersoft and Talend. Prior to the renewal, the partnership gave Jaspersoft access to 450 Talend connectors, including DB2/400 connectors.
Jaspersoft says the addition of native big data capabilities in Jaspersoft ETL allows data from any source to be included in Hadoop clusters, which will support a broader set of big data use cases for business intelligence analysts.
“Big data sets can traditionally take data scientists hours to collect, load, and transform, but today’s data scientists are demanding options to explore data faster,” Karl Van den Bergh, Jaspersoft’s vice president of product and alliances, says in a press release.