
Pentaho

Pentaho® is a prominent open source reporting project. Note, though, that reporting is only a small part of Pentaho’s overall vision of providing a soup-to-nuts, open source Business Intelligence suite. Pentaho encompasses several open source projects: reporting (Pentaho Reporting), data integration and ETL (Kettle), data mining (Weka), and OLAP analysis (Mondrian).

Components

Pentaho’s website can make it difficult to figure out what is free and open source and what is a commercial product. To clarify, the open-source Community Edition of the Pentaho BI Suite includes the following components for reporting.

  1. Classic Engine – formerly known as “JFree Report”, is a collection of Java classes and APIs that execute Pentaho’s XML-based reports. The Classic Engine runs report designs against data sources, and renders report output in HTML, PDF, Excel, and other output formats. You can embed the Classic Engine inside your Java applications. You don’t need the Classic Engine if you use the BI Server.
  2. Pentaho Report Designer – a WYSIWYG tool that lets you create reports using a graphical user interface, as opposed to creating reports by directly creating and manipulating XML. These reports can then be run by the Classic Engine or the Pentaho BI Server. The Pentaho Report Designer is a stand-alone, installed client tool, and is not available as an Eclipse or NetBeans plug-in.
  3. Pentaho BI Server – a J2EE application that provides an infrastructure for multiple users to run reports and OLAP cubes through a web-based user interface. It is most commonly deployed on top of Apache Tomcat, but can use any J2EE application server. At the core of the BI Server are the Classic Engine and the Mondrian ROLAP Engine (which run the reports and OLAP cubes respectively), plus a host of server capabilities including authentication, user management, logging, email notification, web services, and report scheduling. The BI Server includes two web-based user interfaces:
    • Pentaho User Console – end-users can log in, browse reports, run them, view report results in HTML or PDF, or download report results in other formats. It also includes a built-in ad-hoc report design wizard that lets users create their own simple reports against pre-defined data sources.
    • Pentaho Administrator Console – administrators and developers can deploy reports, manage users, set up security access privileges, and deploy workflows.
    • Note that in release 3.5, the community edition of Pentaho BI Server doesn’t appear to integrate with Weka or Kettle, the data mining and data integration components of the Pentaho BI Suite. Also, the open-source Community Edition does not include the Pentaho Dashboard Designer, Pentaho Analyzer or Pentaho Enterprise Console.
  4. Pentaho Design Studio – an Eclipse plug-in that lets you create XML-based Action Sequence documents (XACTION files). Think of Action Sequence files as lists of instructions that you deploy to the BI Server to control its behavior. Action Sequence documents can be used to run data queries, prompt BI Server users for input parameters, run one or more reports in succession, and execute JavaScript. For example, an Action Sequence document can instruct the server to prompt the user for parameter values, run a report with those parameter values, and then send an email notification to specified users. Note that the Pentaho Design Studio is not used to create the reports themselves – use the Pentaho Report Designer for that. The Pentaho Design Studio is not required if you just want to deploy a report to the BI Server and let your users run it, schedule it, and immediately view results. Action Sequences are only needed if your workflows are more complicated.

Although it is still listed on the Pentaho Reporting website, the Pentaho Flow Engine (an alternative to the Classic Engine) is not applicable. Development on it has been at a standstill since summer 2007.



General Impressions

In general, our impression is that Pentaho has a great vision, but has only just started to execute.


On the plus side, Pentaho offers a solid, full-featured report server for free, with functionality that is roughly equivalent to the JasperServer (also free). (Note that BIRT does not provide a free, open-source report server.) Comparing the open source editions of the Pentaho BI Server and the JasperServer, we find that Pentaho has a few more bells and whistles, such as the ability to instantly translate the User Console’s user interface into French, German, Spanish, or Japanese – but in our non-scientific testing the JasperServer performed better and was more stable. There is one area, however, where Pentaho’s BI Server stands head and shoulders above JasperServer: it includes a free ad-hoc report design wizard. With it, end users are guided through the process of creating their own simple reports and deploying them to the Server. For Jasper Reports and BIRT, this type of functionality is available, but it is not open source and you have to pay for it.


Pentaho’s goal of providing an all-encompassing, integrated BI suite is very broad, but as a result the reporting functionality is not particularly deep. Functionality that report developers take for granted in BIRT and Jasper Reports – HTML pagination, report parameters along with user prompting, side-by-side report components, cross-tabs, and robust charting – either don’t exist in Pentaho, are superficially developed, or are brand new. As a result, it is frustrating and sometimes impossible to create real-world reports, most of which have complexity that Pentaho is only just starting to address.


Pentaho Report Designer

The Pentaho Report Designer is a WYSIWYG tool that lets you create reports using a graphical user interface, as opposed to creating reports by directly creating and manipulating XML. These reports can then be run by the Classic Engine or the Pentaho BI Server.


The Pentaho Report Designer has changed a lot in recent releases. The 3.5 version, released in late 2009, at last added much-needed reporting functionality: report parameters (including dynamic cascading parameters) and cross-tabs (although cross-tabs are still “experimental” and unsupported). We are encouraged by the rapid progress in the Pentaho Report Designer and are hopeful that its functionality will be more fully built out soon.


The Pentaho Report Designer is in the “pixel perfect” school of report design. As in Jasper (and unlike in BIRT), users specify precisely where each report element is to be displayed. This gives users fine-grained control over the look of a report, but also limits the report’s ability to adapt to different-sized displays. For example, if you want a report to look good when printed on an 8.5”x11” sheet of paper, then the report will only be as wide as a sheet of paper even when displayed on a widescreen monitor with far more horizontal screen real estate.


Like Jasper, Pentaho is very dependent on sub-reports. If you want to use multiple data sources, have side-by-side report components, or re-use the results of a query within a different section of a report, you need to use sub-reports. While sub-reports are great for re-using report pieces across many different reports, requiring sub-reports for the above use cases adds unnecessary difficulty and complexity to the report design process:

  • You need to gracefully hand parameters and sometimes query data between the master report and sub-report (and sub-sub-report, etc).
  • From within the Report Designer, you cannot actually see how the sub-report will show up in the master report – instead a “sub-report” icon is displayed.
  • Report Developers need to manually manage the dependency between the master report and sub-report files.
  • Too many sub-reports can result in very poor performance because each sub-report opens its own database connection, thread, and queries. So, for example, if you have a sub-report within a group section that expands into 70 different groups, then the sub-report will run 70 times, opening up a new database connection each time.
  • Sub-reports need to be precisely designed so that their size fits exactly into the space provided by the master report.
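
The performance point above (a sub-report in a group section running once per group) can be illustrated with a small sketch. This is not Pentaho code: it uses Python's built-in sqlite3 module, a hypothetical `orders` table, and made-up function names, and it counts query round trips on one connection rather than modeling the separate connections the review describes. It only shows why 70 groups turn into 70 queries, and how a single grouped query returns the same totals.

```python
import sqlite3


def build_demo_db(n_groups=70, rows_per_group=3):
    """Create an in-memory database with a hypothetical 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region_id INTEGER, amount REAL)")
    rows = [
        (g, 10.0 * (i + 1))
        for g in range(n_groups)
        for i in range(rows_per_group)
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def totals_subreport_style(conn, group_ids):
    """Run one query per group, as a sub-report inside a group band would."""
    totals = {}
    queries = 0
    for gid in group_ids:
        (total,) = conn.execute(
            "SELECT SUM(amount) FROM orders WHERE region_id = ?", (gid,)
        ).fetchone()
        totals[gid] = total
        queries += 1
    return totals, queries


def totals_single_query(conn):
    """Fetch every group's total with one grouped query."""
    cursor = conn.execute(
        "SELECT region_id, SUM(amount) FROM orders "
        "GROUP BY region_id ORDER BY region_id"
    )
    return dict(cursor.fetchall()), 1


conn = build_demo_db()
per_group, n_sub = totals_subreport_style(conn, range(70))
single, n_single = totals_single_query(conn)
print(n_sub, n_single, per_group == single)  # 70 1 True
```

Where the report layout allows it, pushing the aggregation into the master query in this way avoids the per-group cost. Whether that is possible in a given Pentaho report depends on the layout, which is the review's point about sub-reports being required for many use cases.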


Strengths and Weaknesses of the Pentaho Report Designer


Below are some of the strengths and weaknesses of the Pentaho Report Designer, as compared to BIRT and Jasper’s report designers. Note that some of these strengths and weaknesses are really due to the behavior of the underlying Classic Engine, not the Pentaho Report Designer itself. However, we include them here because the report developer is most likely to encounter them.


Pentaho Report Designer’s strengths

  • Pentaho does not require that reports be compiled prior to running (unlike Jasper but like BIRT).
  • Pentaho reports are in XML format, and thus can be effectively put under revision control.
  • The Pentaho Report Designer has the best built-in report design “wizard.” It’s great for getting beginning users started, even though the vast majority of real-world reports cannot be created by this wizard, as it only creates highly regular reports with up to 4 levels of grouping.
  • With Pentaho, you can create “row-banded” reports, with alternating colors for each report row, by simply checking a box. Much easier than either BIRT or Jasper!
  • Pentaho supports new visualizations/charts that are coming into vogue as a result of Edward Tufte’s work: sparklines (bar, line, pie) and survey-scale charts (see Figure 2 for sparkline and survey-scale chart examples).

pentaho-001

pentaho-002

FIGURE 2: Pentaho sparkline chart and survey-scale chart examples.

Pentaho Report Designer’s weaknesses

  • Pentaho requires that the report query do the “heavy lifting” for grouping, filtering, sorting, and aggregates. If the data does not arrive in the report in the proper form, Pentaho has less ability than BIRT to manipulate it further. The report developer is responsible for ensuring that groups in the report design are in the same order as the data groups returned by the query, or else unpredictable behavior might occur.
  • Creating a chart involves filling in a very large dialog box (see the screenshot below). This is sufficient to create many types of charts, but does not offer as many levers as BIRT for customizing the chart’s contents, look, and behavior. There is no chart preview – you can’t see the results of what you created until you leave the Chart Dialog. Most chart properties (title, legend, etc.) cannot be parameterized. And you can’t make charts interactive, such as letting people drill down on chart elements or see more detail when the mouse hovers over them.

pentaho-003

  • Pentaho’s charts are not “native” – they are rendered by the separate JFreeChart library.
  • Pentaho does not support “newspaper layouts” with multiple columns (BIRT doesn’t, Jasper does), and does not yet support vertical text.
  • Pentaho’s expression syntax is OpenFormula, which is based on Excel formulas. While this is easy for developers to use and understand, it is often too limiting for real-world reports.
  • In our experience, the Pentaho Designer is more unstable and more likely to crash than either BIRT’s or Jasper’s Designers.
  • Cross-tabs are still “experimental” – the 3.5 release introduced cross-tabs, but they are not truly supported or encouraged. You need to key in “CTRL-ALT-O” to enable the feature in the Pentaho Report Designer. As such, they are not as full-featured, customizable, or stable as BIRT cross-tabs.
  • In general, conditional formatting is not really supported.
  • Pentaho is largely undocumented, especially for the Community Edition. The online documentation that does exist is often confusing, out-of-date, and full of broken hyperlinks. Thus, we strongly recommend that anyone using Pentaho’s designer purchase the only good book we know of on the subject: Pentaho Reporting 3.5 for Java Developers, by Will Gorman.
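
The first weakness above, that groups in the report design must match the order of the data returned by the query, can be illustrated with a short sketch. It is not Pentaho code. It assumes a banded engine starts a new group whenever the grouping value changes from one row to the next, which is an assumption about the mechanism rather than something stated in this review. Python's `itertools.groupby` behaves that way, so it serves as a stand-in; the `banded_groups` helper and the sample rows are hypothetical.

```python
from itertools import groupby


def banded_groups(rows, key):
    """Start a new group each time the key changes between consecutive rows."""
    return [
        (value, [row["amount"] for row in group])
        for value, group in groupby(rows, key=lambda row: row[key])
    ]


# Rows as they might arrive from a query with no ORDER BY on the group column.
rows = [
    {"region": "East", "amount": 10},
    {"region": "West", "amount": 20},
    {"region": "East", "amount": 30},
]

# Unsorted input: "East" is split into two separate groups.
unsorted_groups = banded_groups(rows, "region")

# Sorted input (the equivalent of ORDER BY region): one group per region.
sorted_groups = banded_groups(sorted(rows, key=lambda r: r["region"]), "region")

print(unsorted_groups)  # [('East', [10]), ('West', [20]), ('East', [30])]
print(sorted_groups)    # [('East', [10, 30]), ('West', [20])]
```

In practice this means adding an ORDER BY clause to the report query that lists the grouping columns in the same order as the groups in the report design.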


Conclusion

Pentaho has a compelling long-term vision, but right now we believe that BIRT has the superior report designer, although Pentaho’s direction looks promising.




Update

Each of the products reviewed has had a major release recently, including Jaspersoft 4, Pentaho BI 4 and BIRT 3.7. Innovent will be releasing an update to our comparison in February 2012 so please check back.


See Also:

BIRT vs Jasper

OSS Reporting Comparison Matrix


Pentaho is a registered trademark of Pentaho, Inc.