Choosing the Right Visualisation Tool for your Task – Reblog

This is a reblog of DataMarket blog.

We’re frequently asked: What is the best tool to visualize data?

There is obviously no single answer to that question. It depends on the task at hand, and what you want to achieve.

Here’s an attempt to categorize these tasks and point to some of the tools we’ve found to be useful to complete them:

The right tool for the task

Simple one-off charts

The most common tool for simple charting is clearly Excel. It is possible to make near-perfect charts of most chart types using Excel – if you know what you’re doing. Many Excel defaults are sub-optimal, some of the chart types they offer are simply for show and have no practical application. 3D cone shaped “bars” anyone? And Excel makes no attempt at guiding a novice user to the best chart for what she wants to achieve. Here are three alternatives we’ve found useful:

  • Tableau is fast becoming the number one tool for many data visualization professionals. It’s client software (Windows only) that’s available for $999 and gives you a user-friendly way to create well crafted visualizations on top of data that can be imported from all of the most common data file formats. Common charting in Tableau is straight-forward, while some of the more advanced functionality may be less so. Then again, Tableau enables you to create pretty elaborate interactive data applications that can be published online and work on all common browser types, including tablets and mobile handsets. For the non-programmer that sees data visualization as an important part of his job, Tableau is probably the tool for you.
  • DataGraph is a little-known tool that deserves a lot more attention. A very different beast, DataGraph is a Mac-only application ($90 on the AppStore) originally designed to create proper charts for scientific publications, but has become a powerful tool to create a wide variety of charts for any occasion. Nothing we’ve tested comes close to DataGraph when creating crystal-clear, beautiful charts that are also done “right” as far as most of the information visualization literature is concerned. The workflow and interface may take a while to get the grips of, and some of the more advanced functionality may lie hidden even from an avid user for months of usage, but a wide range of samples, aggressive development and an active user community make DataGraph a really interesting solution for professional charting. If you are looking for a tool to create beautiful, yet easy to understand, static charts DataGraph may be your tool of choice. And if your medium is print, DataGraph outshines any other application on the market.
    • The best way to see samples of DataGraph’s capabilities is to download the free trial and browse the samples/templates on the application’s startup screen.
  • R is an open-source programing environment for statistical computing and graphics. A super powerful tool, R takes some programming skills to even get started, but is becoming a standard tool for any self-respecting “data scientist”. An interpreted, command line controlled environment, R does a lot more than graphics as it enables all sorts of crunching and statistical computing, even with enormous data sets. In fact we’d say that the graphics are indeed a little bit of a weak spot of R. Not to complain about the data presentation from the information visualization standpoint, most of the charts that R creates would not be considered refined and therefore needs polishing in other software such as Adobe Illustrator to be ready for publication. Not to be missed if working with R is the ggplot2 package that helps overcome some of the thornier of making charts and graphs for R look proper. If you can program, and need a powerful tool to do graphical analysis, R is your tool, but be prepared to spend significant time to make your outcome look good enough for publication, either in R or by exporting the graphics to another piece of software for touch-up.
    • The R Graphical Manual holds an enormous collection of browsable samples of graphics created using R – and the code and data used to make a lot of them.

Videos and custom high-resolution graphics

If you are creating data visualization videos or high-resolution data graphics, Processing is your tool. Processing is an open source integrated development environment (IDE) that uses a simplified version of Java as its programming language and is especially geared towards developing visual applications.

Processing is great for rapid development of custom data visualization applications that can either be run directly from the IDE, compiled into stand-alone applications or published as Java Applets for publishing on the web.

Java Applets are less than optimal for web publication (ok, they simply suck for a variety of reasons), but a complementary open-source project – Processing.js – has ported Processing to JavaScript using the canvas element for rendering the visuals (canvas is a way to render and control bitmap rendering in modern web browsers using JavaScript). This is a far superior way to take processing work online, and strongly recommended in favor to the Applet.

The area where we have found that Processing really shines as a data visualization tool, is in creating videos. It comes with a video class called MovieMaker that allows you to compose videos programmatically, frame-by-frame. Each frame may well require some serious crunching and take a long time to calculate before it is appended to a growing video file. The results can be quite stunning. Many of the best known data visualization videos are made using this method, including:

Many other great examples showing the power of Processing – and for a lot more than just videos – can be found inProcessing.org’s Exhibition Archives.

As can be seen from these examples Processing is obviously also great for rendering static, high-resolution bitmap visualizations.

So if data driven videos, or high-resolution graphics are your thing, and you’re not afraid of programming, we recommend Processing.

Charts for the Web

There are plenty – dozens, if not hundreds – of programming libraries that allow you to add charts to your web sites. Frankly, most of them are sh*t. Some of the more flashy ones use Flash or even Silverlight for their graphics, and there are strong reasons for not depending on browser plugins for delivering your graphics.

We believe we have tested most of the libraries out there, and there are only two we feel comfortable recommending, each has its pros and cons depending on what you are looking for:

  • Highcharts is a JavaScript charting library that renders vector based, interactive charts in SVG (or VML for older versions of Internet Explorer). It is free for non-commercial use, and commercial licenses start at $80. It is a flexible and well designed library that includes all the most common chart types with plenty of customization and interactivity options. Interestingly enough even though Highcharts is a commercial solution, the source code is available to developers that want to make their own modifications or additions. With plenty of examples, good documentation and active user forums, Highcharts is a great choice for most development projects that need charting.
  • gRaphaël is another JavaScript charting library built on top of Raphaël (see below). Like HighCharts, gRaphaël renders SVG graphics on modern browsers, falling back to VML for IE <9. While holding a lot of promise, gRaphaël is not a very mature library and with limited capabilities, few chart types, even fewer examples and pretty much non-existent documentation. It is however available under proper open source licenses and could serve as a base for great things for those that want to extend these humble beginnings.

Other libraries and solutions that may be worth checking out are the popular commercial solution amCharts, Google’s hosted Chart Tools and jQuery library Flot.

Special Requirements and Custom Visualizations

If you want full control of the look, feel and interactivity of your charts, or if you want to create a custom data visualization for the web from scratch, the out-of-the box libraries mentioned above will not suffice.

In fact – you’ll be surprised how soon you run into limitations that will force you to compromise on your design. Seemingly simple preferences such as “I don’t want drop shadows on the lines in my line chart”, or “I want to control what happens when a user clicks the X-axis” and you may already be stretching it with your chosen library. But consider yourself warned: The compromises may well be worth it. You may not have the time and resources to spend diving deeper, let alone writing yet-another-charting-tool™

However, if you are not one to compromise on your standards, or if you want to take it up a notch and follow the lead of some of the wonderful and engaging data journalism happening at the likes of the NY Times and The Guardian, you’re looking for something that a charting library is simply not designed to do.

The tool for you will probably be one of the following:

  • Raphaël, gRaphaël’s (see above) big brother. Raphaël is a powerful JavaScript library to work with vector graphics. It renders SVG graphics for modern browsers and falls back to VML for Internet Explorer 6, 7 and 8. It comes with a range of good looking samples and decent documentation. Raphaël is open source, and any developer should be able to hit the ground running with it to develop nice looking things quite fast. We don’t recommend Raphaël for the advanced charting part, but for entirely custom data visualizations or small data apps it may very well be the right tool for the task.
  • Protovis is an open source JavaScript visualization toolkit. Rather than simply controlling at a low level the lines and areas that are to be drawn, Protovis allows the developer to specify how data should be encoded in marks – such as bars, dots and lines – to represent it. This approach allows inheritance and scales that enable a developer to construct custom charts types and layouts that can easily take in new data without the need to write any additional code. Protovis natively uses SVG to render graphics, but a couple of efforts have been made to enable VML rendering making Protovis an option for older versions of Internet Explorer that still account for a significant proportion of traffic on the web.Protovis is originally written by Mike Bostock (now data scientist at Square) and Jeffrey Heer of the Stanford Visualization Group. Their architectural approach is ingenious, but it also takes a bit of an effort to wrap your head around, so be prepared for somewhat of a learning curve. Luckily there are plenty of complete and well-written examples and decent documentation. Once you get going, you will be amazed at the flexibility and power that the Protovis approach provides.
  • D3.js or “D3″ for short is in many ways the successor of Protovis. In fact Protovis is no longer under active development by the original team due to the fact that its primary developer – Mike Bostock – is now working on D3 instead.D3 builds on many of the concepts of Protovis. The main difference is that instead of having an intermediate representation that separates the rendering of the SVG (or HTML) from the programming interface, D3 binds the data directly to the DOM representation. If you don’t understand what that means – don’t worry, you don’t have to. But it has a couple of consequences that may or may not make D3 more attractive for your needs.

    The first one is that it – almost without exception – makes rendering faster and thereby animations and smooth transitions from one state to another more feasible. The second is that it will only work on browsers that support SVG so that you will be leaving Internet Explorer 7 and 8 users behind – and due to the deep DOM integration, enabling VML rendering for D3 is a far bigger task than for Protovis – and one that nobody has embarked on yet.

After thorough research of the available options, we chose Protovis as the base for building out DataMarket’s visualization capabilities with an eye on D3 as our future solution when modern browsers finally saturate the market. We see that horizon about 2 years from now.

 

Advertisements