Data Comparisons using Subsets


Data comparisons are commonly visualized in Spotfire. These comparisons can be done using an imported categorical column (such as comparing regions in sales data), using tagged data (such as comparing outliers), or using Calculated Columns or expressions that use case or if statements. However, there are many situations in which you cannot compare different subsets in the same visualization. Until now, comparisons have worked only across Data Columns and values. What if you wanted to include filtering or marking in your comparison? You may want to compare:

• Filtered to unfiltered data
• Filtered to all data
• Marked data to all data

TIBCO Spotfire 5.5 introduces the concept of ‘Subsets’, which lets you show multiple subsets of your data in a single visualization. Let’s start by looking at the default options. In the properties dialog of a Bar Chart, you will see a new section called ‘Subsets’. If you open that section, there are three default subsets that you can select, and the ‘Current filtering’ option is checked by default. This means that all visualizations start off behaving just as they used to, working with the active filtering scheme specified in the visualization’s properties.

What happens when we want to compare various Subsets? Let’s take a look at a few example use cases.

Suppose we want to use a Parallel Coordinate Plot to show our customers’ buying patterns across the six departments we offer (Electronics, Furniture, Garden, Groceries, Clothing, and Toys).
 

Now suppose we want to filter down to a specific insight. Previously, doing so meant we could visualize only the data that remained after filtering. In 5.5, we can set up a ‘Subset’ in the visualization properties panel. We can select both the ‘Current filtering’ and ‘Not in current filtering’ options.
 

We can then use the keyword ‘(Subsets)’ in our axis expression. In this example, let’s choose the Color axis and color the ‘Not in current filtering’ subset light gray.
 

We can now perform filtering and see how our filtered data compares with our filtered-out data.
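To make the two subsets concrete, here is a minimal Python sketch of the idea. It is not Spotfire code, since Spotfire computes these subsets for you. The rows, the field names, and the `passes_filter` predicate are all made-up stand-ins for a data table and the filter panel settings.

```python
# Conceptual sketch only: Spotfire builds these subsets itself.
# The rows and the filter predicate below are hypothetical examples.

rows = [
    {"customer": "c1", "department": "Toys", "amount": 20.0},
    {"customer": "c2", "department": "Garden", "amount": 35.0},
    {"customer": "c3", "department": "Toys", "amount": 5.0},
    {"customer": "c4", "department": "Electronics", "amount": 120.0},
]


def passes_filter(row):
    """Stand-in for the active filtering scheme: keep purchases of 20 or more."""
    return row["amount"] >= 20.0


def split_by_filtering(rows, predicate):
    """Place each row into exactly one of the two complementary subsets."""
    subsets = {"Current filtering": [], "Not in current filtering": []}
    for row in rows:
        key = "Current filtering" if predicate(row) else "Not in current filtering"
        subsets[key].append(row)
    return subsets


subsets = split_by_filtering(rows, passes_filter)
for name, members in subsets.items():
    print(name, [r["customer"] for r in members])
```

Every row lands in exactly one of the two subsets, which is why coloring by subset can show the filtered-out rows in gray behind the filtered ones.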
 

Let’s take a look at another example.

Assume we are using Spotfire to analyze data across two groups, a and b.

We can create a Line Chart which shows the average quantity for each group each day. This is achieved simply by setting the Color axis to the Group Column.
 

Now, what if we want to analyze the individuals in each group one at a time and compare them to the group averages? To do this, we can choose to Subset by ‘All data’ and ‘Current filtering’.

We can then Color by both Subsets and the Group column. After filtering to a specific individual, we can see that individual’s quantities in comparison to the two groups.
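The sketch below illustrates, in plain Python rather than Spotfire, what coloring by both Subsets and Group amounts to. The data, the field names, and the `average_lines` helper are hypothetical. The point to notice is that ‘All data’ and ‘Current filtering’ overlap, so a filtered row contributes to both.

```python
from collections import defaultdict

# Hypothetical rows: one quantity per individual per day.
rows = [
    {"day": 1, "group": "a", "individual": "a1", "quantity": 10},
    {"day": 1, "group": "a", "individual": "a2", "quantity": 20},
    {"day": 1, "group": "b", "individual": "b1", "quantity": 40},
    {"day": 2, "group": "a", "individual": "a1", "quantity": 30},
    {"day": 2, "group": "a", "individual": "a2", "quantity": 10},
    {"day": 2, "group": "b", "individual": "b1", "quantity": 60},
]


def average_lines(rows, selected_individual):
    """Average quantity per (subset, group, day).

    'All data' contains every row; 'Current filtering' contains only the
    rows for the selected individual, so the two subsets overlap.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum, count]
    for row in rows:
        memberships = ["All data"]
        if row["individual"] == selected_individual:
            memberships.append("Current filtering")
        for subset in memberships:
            key = (subset, row["group"], row["day"])
            totals[key][0] += row["quantity"]
            totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


lines = average_lines(rows, selected_individual="a1")
for key in sorted(lines):
    print(key, lines[key])
```

Each distinct (subset, group) pair corresponds to one line in the chart: the two ‘All data’ lines give the group averages, and the single ‘Current filtering’ line traces the selected individual.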

Subsets can also be created using Custom Expressions. This enables dimension-free data exploration, because we can compare groupings that span different Columns. For example, let’s assume we are looking at some transactional sales data.

Similar to the previous example, we may want to look at sales over time for various categories. For example, how do the sales for the ‘Department/Specialty’ category compare to those of a single Merchant, like REI? Rather than using filtering, we can create a custom expression that defines the subset.

This allows us not only to compare values from a column against each other (which you could previously do with an if statement on something like the Color axis), but also to compare values across multiple columns.
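As a rough analogy in Python (this is not Spotfire expression syntax, and it does not reproduce the expression used in the example), an expression-based subset can be thought of as a boolean test applied to each row. The transactions, the field names, and the `sales_by_subset` helper below are assumptions made for illustration, including the assumption that REI transactions fall under ‘Department/Specialty’.

```python
from collections import defaultdict

# Hypothetical transactions; names and amounts are illustrative only.
transactions = [
    {"month": "01", "category": "Department/Specialty", "merchant": "REI", "amount": 100.0},
    {"month": "01", "category": "Department/Specialty", "merchant": "Merchant X", "amount": 50.0},
    {"month": "01", "category": "Groceries", "merchant": "Merchant Y", "amount": 80.0},
    {"month": "02", "category": "Department/Specialty", "merchant": "REI", "amount": 70.0},
]

# Each subset is a predicate over a row. The two predicates test different
# columns, so a single row may belong to both subsets or to neither.
subset_definitions = {
    "Department/Specialty": lambda t: t["category"] == "Department/Specialty",
    "REI": lambda t: t["merchant"] == "REI",
}


def sales_by_subset(transactions, definitions):
    """Sum sales per (subset, month) for every subset a row belongs to."""
    totals = defaultdict(float)
    for t in transactions:
        for name, belongs in definitions.items():
            if belongs(t):
                totals[(name, t["month"])] += t["amount"]
    return dict(totals)


totals = sales_by_subset(transactions, subset_definitions)
for key in sorted(totals):
    print(key, totals[key])
```

Because the subsets are defined independently of any single column, the category total and the single-merchant total can sit side by side in the same chart even though one contains the other.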

 

If you are interested in learning more about Subsets or other features of TIBCO Spotfire 5.5, please watch our new FREE 5.0 to 5.5 Delta Training Jumpstart at http://spottrain.tibco.com/sln/course/view.php?id=134. When prompted, you can log in using the 'Proceed without an account' button. Also, keep watching our Spotfire Demo page at http://spotfire.tibco.com/demos as we begin to roll out 5.5 feature demos.