This is a reblog of DataMarket blog.
We’re frequently asked: What is the best tool to visualize data?
There is obviously no single answer to that question. It depends on the task at hand, and what you want to achieve.
Here’s an attempt to categorize these tasks and point to some of the tools we’ve found to be useful to complete them:
The right tool for the task
Simple one-off charts
The most common tool for simple charting is clearly Excel. It is possible to make near-perfect charts of most chart types using Excel – if you know what you’re doing. Many Excel defaults are sub-optimal, some of the chart types they offer are simply for show and have no practical application. 3D cone shaped “bars” anyone? And Excel makes no attempt at guiding a novice user to the best chart for what she wants to achieve. Here are three alternatives we’ve found useful:
- Tableau is fast becoming the number one tool for many data visualization professionals. It’s client software (Windows only) that’s available for $999 and gives you a user-friendly way to create well crafted visualizations on top of data that can be imported from all of the most common data file formats. Common charting in Tableau is straight-forward, while some of the more advanced functionality may be less so. Then again, Tableau enables you to create pretty elaborate interactive data applications that can be published online and work on all common browser types, including tablets and mobile handsets. For the non-programmer that sees data visualization as an important part of his job, Tableau is probably the tool for you.
- Tableau’s visual gallery is a great way to see what the program is capable of.
- DataGraph is a little-known tool that deserves a lot more attention. A very different beast, DataGraph is a Mac-only application ($90 on the AppStore) originally designed to create proper charts for scientific publications, but has become a powerful tool to create a wide variety of charts for any occasion. Nothing we’ve tested comes close to DataGraph when creating crystal-clear, beautiful charts that are also done “right” as far as most of the information visualization literature is concerned. The workflow and interface may take a while to get the grips of, and some of the more advanced functionality may lie hidden even from an avid user for months of usage, but a wide range of samples, aggressive development and an active user community make DataGraph a really interesting solution for professional charting. If you are looking for a tool to create beautiful, yet easy to understand, static charts DataGraph may be your tool of choice. And if your medium is print, DataGraph outshines any other application on the market.
- The best way to see samples of DataGraph’s capabilities is to download the free trial and browse the samples/templates on the application’s startup screen.
- R is an open-source programing environment for statistical computing and graphics. A super powerful tool, R takes some programming skills to even get started, but is becoming a standard tool for any self-respecting “data scientist”. An interpreted, command line controlled environment, R does a lot more than graphics as it enables all sorts of crunching and statistical computing, even with enormous data sets. In fact we’d say that the graphics are indeed a little bit of a weak spot of R. Not to complain about the data presentation from the information visualization standpoint, most of the charts that R creates would not be considered refined and therefore needs polishing in other software such as Adobe Illustrator to be ready for publication. Not to be missed if working with R is the ggplot2 package that helps overcome some of the thornier of making charts and graphs for R look proper. If you can program, and need a powerful tool to do graphical analysis, R is your tool, but be prepared to spend significant time to make your outcome look good enough for publication, either in R or by exporting the graphics to another piece of software for touch-up.
- The R Graphical Manual holds an enormous collection of browsable samples of graphics created using R – and the code and data used to make a lot of them.
Videos and custom high-resolution graphics
If you are creating data visualization videos or high-resolution data graphics, Processing is your tool. Processing is an open source integrated development environment (IDE) that uses a simplified version of Java as its programming language and is especially geared towards developing visual applications.
Processing is great for rapid development of custom data visualization applications that can either be run directly from the IDE, compiled into stand-alone applications or published as Java Applets for publishing on the web.
The area where we have found that Processing really shines as a data visualization tool, is in creating videos. It comes with a video class called MovieMaker that allows you to compose videos programmatically, frame-by-frame. Each frame may well require some serious crunching and take a long time to calculate before it is appended to a growing video file. The results can be quite stunning. Many of the best known data visualization videos are made using this method, including:
- Aaron Koblin’s Flight Patterns
- Jer Thorp’s Kepler Exoplanet Candidates
- DataMarket’s own Earthquakes and Eruptions
Many other great examples showing the power of Processing – and for a lot more than just videos – can be found inProcessing.org’s Exhibition Archives.
As can be seen from these examples Processing is obviously also great for rendering static, high-resolution bitmap visualizations.
So if data driven videos, or high-resolution graphics are your thing, and you’re not afraid of programming, we recommend Processing.
Charts for the Web
There are plenty – dozens, if not hundreds – of programming libraries that allow you to add charts to your web sites. Frankly, most of them are sh*t. Some of the more flashy ones use Flash or even Silverlight for their graphics, and there are strong reasons for not depending on browser plugins for delivering your graphics.
We believe we have tested most of the libraries out there, and there are only two we feel comfortable recommending, each has its pros and cons depending on what you are looking for:
- Take a look at the HighCharts Demo gallery for an idea of HighChart’s capabilities (and limitations)
- You will find samples of gRaphaël charts on theproject’s home page
Special Requirements and Custom Visualizations
If you want full control of the look, feel and interactivity of your charts, or if you want to create a custom data visualization for the web from scratch, the out-of-the box libraries mentioned above will not suffice.
In fact – you’ll be surprised how soon you run into limitations that will force you to compromise on your design. Seemingly simple preferences such as “I don’t want drop shadows on the lines in my line chart”, or “I want to control what happens when a user clicks the X-axis” and you may already be stretching it with your chosen library. But consider yourself warned: The compromises may well be worth it. You may not have the time and resources to spend diving deeper, let alone writing yet-another-charting-tool™
However, if you are not one to compromise on your standards, or if you want to take it up a notch and follow the lead of some of the wonderful and engaging data journalism happening at the likes of the NY Times and The Guardian, you’re looking for something that a charting library is simply not designed to do.
The tool for you will probably be one of the following:
- Take a look at the demos on the Raphaël project pagefor an idea of its capabilities
- The wide range of examples available on the project’s web site will certainly testify to this flexibilitiy.
- D3.js or “D3″ for short is in many ways the successor of Protovis. In fact Protovis is no longer under active development by the original team due to the fact that its primary developer – Mike Bostock – is now working on D3 instead.D3 builds on many of the concepts of Protovis. The main difference is that instead of having an intermediate representation that separates the rendering of the SVG (or HTML) from the programming interface, D3 binds the data directly to the DOM representation. If you don’t understand what that means – don’t worry, you don’t have to. But it has a couple of consequences that may or may not make D3 more attractive for your needs.
The first one is that it – almost without exception – makes rendering faster and thereby animations and smooth transitions from one state to another more feasible. The second is that it will only work on browsers that support SVG so that you will be leaving Internet Explorer 7 and 8 users behind – and due to the deep DOM integration, enabling VML rendering for D3 is a far bigger task than for Protovis – and one that nobody has embarked on yet.
- That said, many of the examples on the D3 website are simply mind-blowing
After thorough research of the available options, we chose Protovis as the base for building out DataMarket’s visualization capabilities with an eye on D3 as our future solution when modern browsers finally saturate the market. We see that horizon about 2 years from now.