Computing Kendall's Tau parameter for time series data

Introduction.

In this post, the Kendall's Tau parameter is computed for the annual peak flow data for Dog Branch at St. Paul, Arkansas.  The data came from the USGS website, and the computation was performed in Microsoft Excel.

Examining the Kendall's Tau parameter for time series data helps indicate whether a trend exists and how strong it is.  If the parameter is close to 1, a strong rising trend exists.  If it is close to -1, a strong falling trend exists.  If it is close to zero, neither a rising nor a falling trend exists.


First Step - Visualize the Data.

A good first step in determining a trend is to visually examine the time series data.  This is shown below.




From the chart above, no trend is apparent.  In the next steps, the Kendall's Tau parameter will be computed to see if that impression holds.

Next Step - How is the Kendall's Tau Parameter Computed?

To compute the Kendall's Tau parameter, we need the number of concordant pairs (C) and the number of discordant pairs (D).  Once these are counted, we can use the following equation to compute the Kendall's Tau parameter.
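The usual form of this equation is tau = (C - D) / (C + D).  The short Python sketch below assumes that form, which is consistent with the values reported later in this post (C = 107.5, D = 102.5, tau = 0.0238).  The function name kendall_tau is only illustrative and is not part of the Excel analysis.

```python
def kendall_tau(concordant, discordant):
    """Kendall's Tau from pair counts, assuming tau = (C - D) / (C + D)."""
    total = concordant + discordant
    if total == 0:
        raise ValueError("at least one pair is required")
    return (concordant - discordant) / total


# Limiting cases that match the interpretation given in the introduction.
print(kendall_tau(10, 0))   # every pair concordant: strong rising trend
print(kendall_tau(0, 10))   # every pair discordant: strong falling trend
print(kendall_tau(5, 5))    # balanced counts: no trend
```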


How Do We Figure Out Concordant and Discordant Pairs?

An easy way to think about this is to remember that we are looking for a trend in the data, in this case a rising trend.  Starting with the first year, 1961, each later year with a value greater than the 1961 value of 370 forms a concordant pair with 1961, and each later year with a value less than 370 forms a discordant pair.  I used the conditional formatting feature in Excel to highlight the values that are greater than 370.  There were four values greater than 370 (concordant pairs), which means that the remaining 16 of the 20 later values are less than 370 (discordant pairs).



Once 1961 is done, we move to 1962 and do the same analysis for the values from 1963 to 1981.  Notice that there are now 19 pairs, one fewer than in the 1961 analysis.  With each subsequent year, the total number of pairs decreases by one, since we only look forward from that year.  For 1962, the number of concordant pairs (values greater than 170) is 10, and the number of discordant pairs (values less than 170) is 9.


These two examples (1961 and 1962) should give you an understanding of what concordant and discordant pairs are in time series data.
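The same forward-looking count can be sketched in Python.  This is only an illustration of the procedure described above: the function name is hypothetical, and the sample flows are made up (only the first two values, 370 and 170, echo the post), so the counts are not the Dog Branch results.

```python
def count_pairs_forward(values):
    """For each year except the last, compare its value with every later value.

    Returns one (concordant, discordant) tuple per year.  Ties are ignored here.
    """
    rows = []
    for i, current in enumerate(values[:-1]):
        later = values[i + 1:]
        concordant = sum(1 for v in later if v > current)
        discordant = sum(1 for v in later if v < current)
        rows.append((concordant, discordant))
    return rows


# Hypothetical five-year series, NOT the actual Dog Branch record.
flows = [370, 170, 400, 90, 250]
rows = count_pairs_forward(flows)
for year_index, (c, d) in enumerate(rows):
    print(year_index, c, d)
```

As in the spreadsheet, each row has one fewer pair than the row before it, and the last year gets no row because there is nothing after it to compare.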


Final Number of Pairs

The total number of concordant and discordant pairs is summarized below.  Notice that the analysis of pairs is not done for the year 1981, since there are no later values to compare it with.  Also, the value for the year 1976 is 83 cfs.  Since the year 1980 had the same value, I assigned one-half to concordant and one-half to discordant for a quick computation.  Further below, I show a more thorough computation, but the answer does not change significantly.
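The half-and-half shortcut for ties can be sketched as follows.  The function name and the sample flows are hypothetical and chosen only so that one tie appears; they are not the Dog Branch data.

```python
def count_pairs_split_ties(values):
    """Total concordant and discordant counts, splitting each tie 0.5 / 0.5."""
    concordant = 0.0
    discordant = 0.0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            if values[j] > values[i]:
                concordant += 1
            elif values[j] < values[i]:
                discordant += 1
            else:
                # Tied pair: half goes to each count (the quick shortcut).
                concordant += 0.5
                discordant += 0.5
    return concordant, discordant


# Hypothetical series with one tie (the two 83 values).
flows = [120, 83, 200, 60, 83]
totals = count_pairs_split_ties(flows)
print(totals)
```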



The total number of concordant pairs was 107.5, and the total number of discordant pairs was 102.5.  Substituting these values into the Kendall's Tau equation gives the following result.



Solving the equation above gives a value of 0.0238, which is very close to zero and agrees with our visual inspection of the time series data that there is no trend.
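As a quick check on the arithmetic, the sketch below assumes the equation has the form tau = (C - D) / (C + D) and uses the 21 observations (1961 to 1981) with the totals reported above.  With 21 years, the forward-looking counts should cover 20 + 19 + ... + 1 pairs in all.

```python
n = 21                                   # observations, 1961 through 1981
total_pairs = sum(range(1, n))           # 20 + 19 + ... + 1

concordant = 107.5                       # totals reported in this post
discordant = 102.5

print(total_pairs)                       # 210
print(concordant + discordant)           # 210.0, so every pair is accounted for

tau = (concordant - discordant) / (concordant + discordant)
print(round(tau, 4))                     # 0.0238
```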


Accounting for ties in the computation of Kendall's Tau Parameter

In the above example, I assigned 0.5 to concordant and 0.5 to discordant for the one tie that was present to get a quick computation of the Kendall's Tau Parameter.  Below, I show a more thorough computation for dealing with ties.  The equation is as follows:


Note that n is the number of observations (21).  The number of tied values in the date group is zero, while the number of tied values in the flow group is 2.  Excluding the tied pair leaves 107 concordant pairs and 102 discordant pairs.  Solving these equations gives the following:


The 0.0239 value is very similar to the 0.0238 value that was computed previously.  It again confirms the lack of a trend in the flow data.
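The tie-corrected value can be reproduced with a short Python sketch.  It assumes the equation used is the common tau-b form, tau_b = (C - D) / sqrt((n0 - n1)(n0 - n2)), where n0 = n(n - 1)/2 and n1 and n2 are the sums of t(t - 1)/2 over the tied groups in the date and flow data respectively.  That assumption is consistent with the 0.0239 result; the function and argument names are illustrative only.

```python
import math


def kendall_tau_b(concordant, discordant, n, tie_sizes_x, tie_sizes_y):
    """Tie-corrected Kendall's Tau, assuming the tau-b form.

    tie_sizes_x / tie_sizes_y list the size of each group of tied values
    in the two variables (here: dates and flows).
    """
    n0 = n * (n - 1) / 2
    n1 = sum(t * (t - 1) / 2 for t in tie_sizes_x)
    n2 = sum(u * (u - 1) / 2 for u in tie_sizes_y)
    return (concordant - discordant) / math.sqrt((n0 - n1) * (n0 - n2))


# Values from this post: 21 years, no tied dates, one group of 2 tied flows.
tau_b = kendall_tau_b(107, 102, n=21, tie_sizes_x=[], tie_sizes_y=[2])
print(round(tau_b, 4))   # 0.0239
```

If SciPy is available, scipy.stats.kendalltau applied to the year and flow columns also computes the tau-b variant by default and could serve as an independent check on the spreadsheet.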
