Predictive Analytics with TIBCO Spotfire and S+ or R


Last week we discussed an important new concept in TIBCO Spotfire version 3.1, scenario analysis. This week we will continue learning about new 3.1 features by discussing another type of analytics: Predictive Analytics.

With TIBCO Spotfire version 3.1, you can communicate directly with S+ or R to execute scripts and functions. You can then return the results to Spotfire as rows, columns, or tables, or as values for properties such as Document Properties, Data Table Properties, and Column Properties.

For this example, we will look at the interest rate differential (IRD) (http://www.investopedia.com/terms/i/interest-rate-differential.asp) between the US and Canada from February 1976 to June 1996, and we will try to predict the IRD for future months.



 
The first thing we need to do is create an S+ or R script that takes the data in and predicts the IRD five months ahead.
There are a variety of ways to do time series predictions; two of the common ones are Spline Analysis and the ARMA process (http://en.wikipedia.org/wiki/Time_series). For this tip we will use the ARMA process. If you would like additional information or training on other models, please consider taking our S+ or R training courses.
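As a rough illustration of the ARMA approach, here is a minimal R sketch. It is not the script used in this tip: the simulated series, the variable names, and the ARMA(1,1) order are all assumptions made for the example, and it relies on `arima()` and `predict()` from R's built-in stats package (the equivalent S+ calls may differ).

```r
# Hypothetical stand-in for the IRD column: 245 simulated monthly values
# (February 1976 to June 1996 spans 245 months).
set.seed(1)
ird <- as.numeric(arima.sim(model = list(ar = 0.7, ma = 0.3), n = 245))

# Fit an ARMA(1,1) model. In arima(), order = c(p, d, q), so d = 0
# (no differencing) gives a plain ARMA(p, q) model.
# The (1, 1) order is an assumption chosen for illustration only.
fit <- arima(ird, order = c(1, 0, 1))

# Forecast the next 5 months.
forecast <- predict(fit, n.ahead = 5)
predicted <- as.numeric(forecast$pred)

print(predicted)
```

With real data you would replace the simulated `ird` vector with the IRD column passed in from Spotfire, and you would normally compare several model orders before settling on one.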

To execute the S+ or R script in Spotfire, we have to create a Data Function. A Data Function is basically an S+ or R script or function that is stored in the Spotfire Library so that it can be used in Spotfire analysis files.

We create the Data Function by going to Tools > Register Data Functions and filling in the required information. The first part is filling in the script itself:


 
Then we need to define the input parameters, which are the inputs the script expects. In this case, the script expects to receive a Data Table.
 

We then define the output via the output tab. In this case, we expect the output to be a Data Table called result.

 

When you are done, click the Save button to save the Data Function in the Spotfire Library.

Next, you need to add the Data Function to your document. To do this, go to the Tools menu, click the Data Functions tool, and select the desired function.


 

Once you click OK, you can define what to send into the Data Function as input and where to send its output. These are called Input and Output Handlers.
The Input Handler will be both the date and Interest Rate Differential columns from the Data Table.



 

The Output Handler will be a new Data Table called result.


 


Once we click OK, the calculation runs and returns a new Data Table called result. It includes the original date column, a value column for the Interest Rate Differential, and a column called valueType, which tells us whether the row is an original row or a new, predicted row.
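The shape of that result table can be sketched in R as follows. This only illustrates the layout described above: the input rows, the forecast numbers, and the labels "original" and "predicted" are made-up placeholders, not values from the actual analysis.

```r
# Hypothetical input table with the two columns sent by the Input Handler.
input <- data.frame(
  date  = seq(as.Date("1996-01-01"), by = "month", length.out = 6),
  value = c(1.20, 1.10, 1.30, 1.25, 1.40, 1.35)
)

# Placeholder forecasts; in a real script these would come from the model.
new_values <- c(1.36, 1.37)

# Dates for the predicted rows: the months following the last observed month.
future_dates <- seq(max(input$date), by = "month",
                    length.out = length(new_values) + 1)[-1]

# Stack the original and predicted rows, flagging each with valueType.
result <- rbind(
  data.frame(date = input$date,   value = input$value, valueType = "original"),
  data.frame(date = future_dates, value = new_values,  valueType = "predicted")
)

print(result)
```

Keeping the original and predicted rows in one table with a flag column is what makes it easy to color a single line chart by valueType.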


 
We can then create a line chart showing the date on the x-axis and the average value on the y-axis, colored by valueType, so you can see the predicted months at the end.



 
To take this example further, it is likely that a consumer may want to adjust how many months to predict. They may also want to adjust some variables in the algorithm. With TIBCO Spotfire 3.1, we can implement this by exposing variables like the length of prediction (in months) as properties where the consumer can pass in the value.
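One way to picture this on the script side is a function whose tunable values are arguments. The sketch below is an assumption-laden illustration: the function name `predict_ird` and its arguments `months`, `ar_order`, and `ma_order` are hypothetical, and in Spotfire such values would be supplied through the Data Function's input parameters rather than hard-coded in a call.

```r
# Hypothetical helper: forecast `months` steps ahead with an ARMA model.
# `ar_order` and `ma_order` stand in for "variables in the algorithm"
# that a consumer might want to adjust.
predict_ird <- function(values, months = 5, ar_order = 1, ma_order = 1) {
  stopifnot(is.numeric(values), months >= 1)
  fit <- arima(values, order = c(ar_order, 0, ma_order))
  as.numeric(predict(fit, n.ahead = months)$pred)
}

# Simulated stand-in for the IRD series (not real data).
set.seed(42)
demo <- as.numeric(arima.sim(model = list(ar = 0.7, ma = 0.3), n = 245))

# A consumer asking for 3 months instead of the default 5.
three_months <- predict_ird(demo, months = 3)
print(three_months)
```

Changing the `months` value and re-running the function is the script-level counterpart of a consumer changing the property value in the analysis.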


See the example below, where the consumer has set the number of months to predict to 3, and the result Data Table and Line Chart are updated to show three months of predictions.


 

If you would like more information on Data Functions and how to create them, please consider taking our 3.1 Delta Training Course SP1631. For more information on the specific algorithms and statistics we used, please consider taking our S+ or R training courses.