I don’t know why, but, occasionally, people may have to do minimal or even considerable statistical calculations using Calc, the OpenOffice.org‘s spreadsheet part of the Office Productivity Suite. Although, I wouldn’t recommend doing so since there are other, better software packages to complete simple and complex statistical analyses, such as **R**, a statistical programming language and environment with excellent analysis and graph capabilities. R can be found at one of its many mirrors.

However, for those who need or have to (or … whatever), here are some very very basic info on using statistical functions in Calc. Calc has a considerable variety of functions documented in the openoffice wiki. of statistical (and other types of) functions that you can use. There is also OOo_Stat, an external add-in, that you can include to assist you with your analyses. More on that later on.

MInd you, I am not going to provide a full “Statistics using Calc” guide here, I just want to suggest a few statistical functions for basic descriptive and inferential statistics.

**A. Basic Descriptive Statistics**

*I. Central Tendency*

**AVERAGE**returns the mean average of its list of arguments**MEDIAN**returns the median value (the value at the 50th percentile) of its respective list of arguments**MODE**returns the most frequent value of its list of arguments

There are other functions, like **GEOMEAN**, **HARMEAN**, and **TRIMMEAN** which give the geometric mean, the harmonic mean and the trimmed mean (excluding a portion of highest and lowest values) of their arguments.

*II. Variation*

**KURT**estimates the kurtosis of the distribution of values (ie, how flat or how tall the distribution is)**SKEW**estimates the asymmetry of the distribution (ie, how different from the symmetrical normal distribution it is)**STDEV**estimates the standard deviation of the distribution**VAR**estimates the variance (ie, the square of the standard deviation) of its list of arguments

Note, that there are functions **STDEVP** and **VARPP** for the standard deviation and variance of the population where the sample has come from.

**B. Some (not all) inferential statistics**

*I. Simple tests*

**CHITEST**estimates the chisquare test of goodness-of-fit**FTEST**estimates the likelihood that two samples have the same variance.**TTEST estimates Student’s t-test of one or two samples**

*II. Correlations, and simple regression*

**CORREL**estimates the Pearson product-moment correlation coefficient between two arrays of values**PEARSON**is identical to CORREL**INTERCEPT**estimates the intercept*b_0*of the regression*y = b0 + b1*x***SLOPE**estimates the slope*b1*of the same regression equation**STEYX**estimates the residual standard error for the regression**RSQ**estimates the R^2, the square of the correlation coefficient between*x*and*y*>**LINEST**is a different kind of story. It provides a linear estimation for the data at hand. Basically it summarizes the regression analysis. It is an array function which means it must be called like this {=LINEST(arg1;arg2;arg3;arg4)}.

Data in the above example come from the R program and its *cars* dataset which “… give the speed of cars and the distances taken to stop” according to the help files. R help files cites Ezekiel’s 1930 *Methods of Correlation Analysis* published by Wiley as the source of the data and D. R. McNeil’s *Interactive Data Analysis* also published by Wiley in 1977) as a reference source for the data.

The output of the LINEST function consists of eight cells between F18 (upper left) and G22 (lower right) cells. Unfortunately no explanation is provided. I have tried to write in the adjacent cells some labels about each cell in the output of LINEST. Please note, that the third line of output (F21:G21) shows only the F-criterion for the fitting of data and the denominator degrees of freedom. In statistical parlance, *F(1,48) = 89.57, p < 1.49e-012*, which can easily be verified by using another Calc function FDIST.

There is additional output above the F18:G22 range of cells, basically the output of the CORREL, PEARSON, RSQ, INTERCEPT, SLOPE, STEYX commands from top to bottom, which all verify the output of the LINEST function.

**C. OOo_STAT**

An add-in macro for Calc, it comes with its own instructions on how to install it (you must enable macros) and then appears either as and additional top-level menu or as an item under Tools. OOo_STAT will not appear in Writer. It includes functions for graphics and advanced inferential statistics not available through other Calc functions. It is still in 0.5 version (which would make it a beta) but works pretty well.

After install, OOoStat gives three choices: Applications, Basic Stats, and Multivariate Statistics as shown below

The Applications menu allows for visualization of data via histograms and other manipulations.

The Basic Stats deals with one- and two- way ANOVA, Correlation Coefficients and Multiple Regression.

The Multivariate Analysis menu involves Principal Components Analysis, Correspondence Analysis, Multivariate Analysis of Variance (MANOVA), analyses which may be better completed using a dedicated statistical analysis package.

All in all, Calc provides a rather extensive set of tools (either through functions or through the OOo_Stat addin) which may assist users in their statistical analysis. However, beware, the user should always be in control of the analysis tools and not the other way round.

Enjoy,

I.

While I found your information from “A. Basic Descriptive Statistics” reasonably informative and helpful, your opening statements and tone are at best condescending, and generally insulting. Since the article is about statistical analysis in OpenOffice Calc, suggesting that a person is stupid for using Calc is highly unprofessional and completely in appropriate. If you prefer another software package that is fine. However, such should not be your opening for this topic, nor should it be presented as treating those using Calc as inferior. Please take the time to re-work you opening so that your skill and knowledge shine cleanly instead of through the fog of resentment it currently generates.

Mike,

I’d like to know what you found “… at best condescending, and generally insulting”. Yes, Calc (as well as Excel or any other spreadsheet program) was not meant to be used for statistics work. There are better, more accurate, more analytical programs for such work.

If you object to the phrase ” … for those who need or have to (or … whatever), “, then I am sorry that you felt in such a way. I never said that the people using Calc are inferior, only that there are other options for their statistical work.

Thank you, however, for you kind words about my skill and knowledge. I appreciate it.

I

I agree with you that for professional statistical analysis or for research, more specific statistical packages such as R should be used. But for simple descriptive statistics or for educational purposes Calc may be appropriate.

I have been written a set of macros for a class that I teach in descriptive statistics, I’ve called it ODStatistics and it can be downloaded from:

http://sourceforge.net/projects/odstatistics/files/odstatistics-0.3.17.zip/download

Walter,

Amazing macro and very very detailed work. This should be recommended highly and yes, I have already download it and used it.

I may say that it works perfectly with LibreOffice.

I

I use LibreOffice and OooStat to teach statistical functions in my course at a business school. Your page is very helpful.

I’m glad to have followed through Walter’s link. Very helpful, once again!

Shankar

I’m glad you like it. I use it myself with in my “Descriptive Statistics” class. Of course if you have any suggestions or feature requests please do enter them at the project’s site: http://sourceforge.net/tracker/?group_id=277622&atid=1178847