SUMMARY
This article reviews Scilab, a free MATLAB-like programming system. Scilab can be used for data analysis and applied numerical work in both research and teaching. Scilab is an interesting alternative to some commercial programming environments. Copyright (c) 2001 John Wiley & Sons, Ltd.
1. INTRODUCTION
Scilab is a general-purpose programming environment that is similar to Matlab, Gauss and Octave, which were described in Anderson (1992), Rust (1993), Cribari-Neto (1997) and Cribari-Neto and Jensen (1997). Its core consists of a high-level matrix programming language, an interpreter, and a fairly standard set of numerical functions. Scilab has built-in support for linear algebra (customized LAPACK), linear programming, constrained quadratic and non-linear programming, data exchange with Matlab, optimal control and signal processing. Scilab can produce publication-quality graphics and animated graphics. It can perform symbolic computations by interfacing with Maple. In addition to the built-in functions, there are a few packages, written by Scilab users, that are available free of charge from the Internet.
Scilab has been developed at INRIA (Unité de recherche de Rocquencourt, Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France; home page at www-rocq.inria.fr/scilab/). The source code and precompiled binaries can be downloaded free of charge from the INRIA web site. Unlike Octave, Scilab is not licensed under the GNU General Public License (GPL). INRIA holds the copyright, but it allows users to distribute and modify the source code provided that they notify the authors and clearly state the original source of the software. INRIA does not provide commercial support for Scilab, but online documentation and bulletin boards exist and contain enough information for beginners as well as for advanced users.
I found that the main strengths of Scilab are its price, the availability of its source code, the free additional packages available on the Internet, the ease of extending the system with Scilab, C and Fortran libraries, and its precision and robustness. Scilab's main weaknesses are its unclear future development path, its slow linear algebra routines, and the lack of several standard econometric procedures.
1.1. Supported Hardware and Software Platforms
The latest stable version, which is the one reviewed here, is version 2.5. It was released in December 1999. INRIA does not officially state any commitment to future development, but bug fixes and new functions are regularly posted on the Internet. New function libraries are mostly written by Scilab users and are usually free of charge. Scilab is a stable platform: during my extensive experimentation the program did not crash at all.1
INRIA provides binaries for the hardware and software platforms listed in Table I. Porting to other systems should be easy, since source code is available. Scilab is also included in Debian and SuSE Linux distributions. Scilab has a relatively small footprint. After startup, it occupies approximately 4.5 MB of RAM. Hence the minimal computer that one can buy at the time of this writing is sufficient to run it.
2. MAIN FEATURES
2.1. Native Data Structures
Scilab supports many high-level data structures. The principal objects are two-dimensional matrices, which can be either dense or sparse. Matrices can be real or complex, and the built-in linear algebra routines can handle both types. Scilab supports a significantly smaller set of operations on sparse matrices than Matlab does. For example, a routine to compute the condition number of a sparse matrix is not included. Also, Krylov methods for iterative solution of sparse linear systems are not implemented. Unlike Matlab, Scilab provides only three classes of test matrices: magic squares, Franck matrices, and inverse Hilbert matrices.
In addition to matrices Scilab recognizes polynomials, rational functions, string matrices, lists and dynamic multivariate linear systems. It can perform symbolic operations and linear algebra operations on matrices of these objects. General data structures are implemented by using lists. (In Scilab, multi-dimensional arrays are constructed using lists.) Scilab enables users to overload operators. This convenient feature, for example, allows users to make compositions of dynamic multivariate linear systems by using simple arithmetic operators.
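The idea of operator overloading on non-numeric objects can be illustrated with a small sketch. It is written in Python rather than Scilab, it is only an analogy to Scilab's polynomial objects and not a description of Scilab's implementation, and the class name Polynomial is hypothetical.

```python
class Polynomial:
    """Polynomial with coefficients stored from the constant term upward."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

    def __add__(self, other):
        # Pad the shorter coefficient list with zeros, then add term by term.
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return Polynomial([x + y for x, y in zip(a, b)])

    def __mul__(self, other):
        # Polynomial product is the convolution of the coefficient lists.
        out = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, x in enumerate(self.coeffs):
            for j, y in enumerate(other.coeffs):
                out[i + j] += x * y
        return Polynomial(out)

    def __call__(self, x):
        # Evaluate with Horner's rule.
        result = 0
        for c in reversed(self.coeffs):
            result = result * x + c
        return result

    def __repr__(self):
        return f"Polynomial({self.coeffs})"


s = Polynomial([0, 1])      # the polynomial s
one = Polynomial([1])       # the constant 1
p = (s + one) * (s + one)   # ordinary arithmetic operators build (s + 1)^2
print(p)
print(p(2))
```

Once the operators are defined, composite objects are built with ordinary arithmetic expressions, which is the convenience the overloading mechanism provides.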
2.2. The Scilab Programming Language
The Scilab programming language shares many similarities with the Matlab and Octave languages. Below is an example of a Scilab program that runs a textbook version of OLS and prints the results.
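As a rough illustration of the computation involved, the textbook OLS steps (coefficients from the normal equations, standard errors from the residual variance, and R-squared) can be sketched as follows. The sketch is written in Python rather than Scilab, it is not the program referred to above, and the function names (ols, inverse, matmul, transpose) and the sample data are hypothetical.

```python
import math


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(y, x_columns):
    """Textbook OLS with an intercept: returns (coefficients, std. errors, R^2)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in x_columns] for i in range(n)]
    k = len(X[0])
    Xt = transpose(X)
    XtX_inv = inverse(matmul(Xt, X))
    Xty = matmul(Xt, [[v] for v in y])
    beta = [row[0] for row in matmul(XtX_inv, Xty)]   # (X'X)^{-1} X'y
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ybar = sum(y) / n
    tss = sum((yi - ybar) ** 2 for yi in y)
    sigma2 = rss / (n - k)                            # unbiased error variance
    se = [math.sqrt(sigma2 * XtX_inv[j][j]) for j in range(k)]
    return beta, se, 1.0 - rss / tss


# Hypothetical sample data: one regressor plus an intercept.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
beta, se, r2 = ols(y, [x])
print("coefficients:", beta)
print("std. errors: ", se)
print("R-squared:   ", r2)
```

In a matrix language such as Scilab the same steps reduce to a few lines of matrix expressions; the explicit helper functions here only stand in for built-in matrix operations.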
The authors provide a dictionary of differences between Scilab and Matlab which helps in manual translation of programs between the two languages. They are also developing an automatic translator for Matlab 4.X files. The translator is in beta-testing at the time of this writing. In my experience, it translates large files very slowly and sometimes behaves unpredictably.
Scilab treats functions as variables and not as files, hence a single file can contain more than one function. The functions have to be explicitly loaded into the working environment before they can be executed. Debugging in Scilab is particularly elegant and is implemented through a set of debug modes that separate debug variables from main variables in a simple way. Users enter a debug mode by issuing the pause command. In the debug mode, users can access all the variables that exist at the main level. The variables created while debugging can be returned to the original level if users so choose, but are otherwise discarded after debugging. This structure of debug modes, which is fully recursive, enables users to test their programs efficiently. The Scilab interpreter is quite standard. Line editing is Emacs-like, but not quite as powerful as that implemented in Octave.
2.3. Available Toolboxes
Scilab is distributed with many built-in libraries that are of interest to applied econometricians. The list includes linear algebra (interface to LAPACK), optimal control, optimization for linear matrix inequalities, signal processing, simulation, differential equations, optimization (differentiable and non-differentiable, LQ solver), Scicos (an interactive environment for modeling and simulation of hybrid systems), Metanet (network analysis and optimization), symbolic capabilities and parallel Scilab (in development).
There are many free user-contributed toolboxes on the Web. Those that are particularly interesting to applied econometricians include: an artificial neural network toolbox, an image manipulation toolbox, an interface to the FSQP optimization tool, a hidden Markov models toolbox, a Matlab-compatible plotting library, PROSTAT (a basic statistics function toolbox), STIXBOX (another statistics toolbox) and a time series plotting toolbox.
Because Scilab can interface with Fortran and C libraries, applied econometricians should have few problems customizing it to their needs.
3. QUALITY OF OUTPUT
3.1. Graphics
Scilab can produce publication-quality two- and three-dimensional and animated graphics. The graphics routines are mostly incompatible with those in Matlab, Octave and Gauss. The graphical output can be exported to GIF, Postscript, Windows Metafile and XFIG formats. The quality of graphics and their overall 'feel' are best demonstrated by an example in Figure 1.
3.2. Non-graphical output
The non-graphical (text) output is intuitive and easy to understand. Scilab supports copy and paste operations, so all text output can easily be copied into other programs. The diary option allows users to save entire sessions in text files for later use. Matrices, polynomials and rational functions can also be saved as TeX files by means of the texprint and pol2tex commands.
4. DOCUMENTATION
4.1. On-line Help
There is online help that adequately describes all Scilab commands. Online help appears in a separate window, and an arbitrary number of help windows can be open at any moment. This convenient feature keeps help messages from cluttering the main Scilab window. Searching through help files is straightforward with the apropos command. Most help files include examples of the commands in question. Many examples of command usage can also be obtained by running commands with empty argument lists.
4.2. The Manuals
INRIA provides two Scilab manuals in English: Introduction to Scilab (ITS) and Scilab Manual (SM). The former is targeted at new users, and the latter is a comprehensive reference text. Both manuals are available from the Scilab home page in PDF, Postscript and HTML formats, but no printed versions are distributed. In addition, there is a book on Scilab that describes its use in industry and science (Bunks et al., 1999).
ITS is well suited to new users. It includes a sample Scilab session that covers the most important features of the language. Working through the session can help a novice to begin using Scilab productively in less than an hour. SM is quite detailed and includes descriptions of all built-in functions, their calling parameters and examples. Descriptions of Scilab functions in the PDF and HTML versions of SM contain hyperlinks to related topics. The subject index also contains hyperlinks. These two features allow readers to quickly access the parts of the manual that interest them.
5. USE IN ECONOMETRICS
5.1. Econometric Tools
Scilab has some useful built-in statistical functions. The most important routines are leastsq for non-linear least squares estimation, datafit for non-linear GMM estimation, optim for non-linear optimization of unconstrained functions and of functions whose variables are constrained between constant lower and upper bound vectors, armax for estimation of multivariate ARMA processes, arsimul for simulation of multivariate ARMA processes, yulewalk, which implements least squares filter design, and kalm, which computes the Kalman update and error variance.
The absence of constrained non-linear optimization is a weak point of the routines datafit and leastsq, since certain standard statistical tests (e.g. some likelihood ratio tests) cannot be performed without it.2 Neither datafit nor leastsq provides the usual diagnostic tests, error estimates or goodness-of-fit measures. Users must add this functionality in their own programs.
Scilab has built-in functions to compute cumulative distribution functions and a comprehensive random number generator, which is invoked through the function grand. It can generate random draws from the following distributions: beta, binomial, chi-square, uniform, exponential, F, gamma, multivariate normal, multinomial, negative binomial, noncentral chi-square, noncentral F and Poisson. The function can also generate successive states of a Markov chain and random permutations of vectors.
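The last two capabilities, generating successive states of a Markov chain and random permutations, can be sketched as follows. The sketch uses Python's standard random module rather than Scilab's grand, it makes no claim about how grand is implemented, and the function names and the transition matrix are hypothetical.

```python
import random


def simulate_markov_chain(transition, start, steps, rng):
    """Return `steps` successive states of a finite-state Markov chain.

    transition[i][j] is the probability of moving from state i to state j.
    """
    state = start
    path = []
    for _ in range(steps):
        u = rng.random()
        cumulative = 0.0
        # Fallback to the last state guards against rounding in the row sum.
        next_state = len(transition[state]) - 1
        for j, p in enumerate(transition[state]):
            cumulative += p
            if u < cumulative:
                next_state = j
                break
        state = next_state
        path.append(state)
    return path


def random_permutation(values, rng):
    """Return a randomly permuted copy of `values`."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(12345)        # fixed seed for reproducibility
P = [[0.9, 0.1],                  # hypothetical two-state transition matrix
     [0.5, 0.5]]
path = simulate_markov_chain(P, 0, 10, rng)
perm = random_permutation([1, 2, 3, 4, 5], rng)
print(path)
print(perm)
```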
Two statistical toolboxes that are found on the INRIA web site are PROSTAT and STIXBOX. PROSTAT includes support for basic statistics. STIXBOX adds non-parametric statistics, density estimation, bootstrap and jackknife, logistic regression, and eleven test datasets.
In summary, Scilab and the existing packages provide basic statistical functionality. However, more econometric packages need to be ported to Scilab if it is to become a tool for doing applied econometrics.
5.2. Speed of Computation
To assess the speed of computation, I measured execution times of several simple tasks. The goal was to test the speed of the core linear algebra routines, random number generator, fast Fourier transform, and the interpreter. The benchmark is not meant to be a detailed comparison, but rather one that gives the general idea of Scilab's speed. Execution times are compared for Scilab 2.5, Matlab 5.3 and Octave 2.0.15.
The tests were run on a Dell PC with a CPU speed of 400 MHz and 256 MB of RAM, running Windows NT 4.0, service pack 5. Execution times in seconds are listed in Table II. We see that Matlab performs standard numerical tasks faster, and the fast Fourier transform slower, than Scilab and Octave. Matlab has a faster linear algebra engine than Scilab and Octave because it uses an optimized BLAS library. We also see that Scilab's interpreter is by far the slowest of the three.
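The general shape of such a timing exercise can be sketched as follows. This is a Python sketch, not the benchmark used for Table II: the two tasks (a naive matrix product and an interpreted loop), the problem sizes and the helper names are assumptions chosen only to show how a linear algebra task and an interpreter task might be timed separately.

```python
import random
import time


def time_task(task, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        best = min(best, time.perf_counter() - start)
    return best


def make_matrix(n, rng):
    return [[rng.random() for _ in range(n)] for _ in range(n)]


def matmul(a, b):
    """Naive matrix product, standing in for a core linear algebra routine."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def loop_sum(n):
    """Explicit loop, standing in for an interpreter speed test."""
    total = 0
    for i in range(n):
        total += i
    return total


rng = random.Random(0)
a = make_matrix(40, rng)
b = make_matrix(40, rng)
timings = {
    "matrix product (40x40)": time_task(lambda: matmul(a, b)),
    "loop of 100000 additions": time_task(lambda: loop_sum(100000)),
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s")
```

Taking the best of several runs reduces the influence of other processes on the measurement.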
5.3. Numerical Accuracy
Numerical accuracy of basic econometric procedures was measured using the StRD data sets, published by the National Institute of Standards and Technology (NIST).3 Following Vinod (2000), who tested the numerical accuracy of Gauss, I used the number of accurate digits (NAD) as the relevant measure of numerical accuracy.
I used this approach to test the accuracy of Scilab 2.5, Matlab 5.3 and Octave 2.0.15 in computing summary statistics, linear regressions, and analysis of variance. The conclusion of these comparisons is that all three packages offer very similar numerical accuracy.
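The NAD measure is not defined formally in the text. A common definition, assumed here, is the negative base-10 logarithm of the relative error between a computed value and the certified value. The Python sketch below follows that assumption; the function name, the cap of 15 digits for exact agreement and the use of absolute error when the certified value is zero are choices made for this illustration.

```python
import math


def number_of_accurate_digits(computed, certified, max_digits=15.0):
    """Approximate number of accurate digits of `computed` against `certified`.

    Assumed definition: -log10(|computed - certified| / |certified|),
    using the absolute error when the certified value is zero.
    The result is clamped to the range [0, max_digits].
    """
    error = abs(computed - certified)
    if error == 0.0:
        return max_digits              # exact agreement in double precision
    if certified != 0.0:
        error /= abs(certified)
    return max(0.0, min(max_digits, -math.log10(error)))


# Hypothetical values: a computed estimate compared with a certified one.
print(number_of_accurate_digits(1.0001, 1.0))
print(number_of_accurate_digits(2.0, 2.0))
```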
5.4. Data Input and Data Management
Scilab stores data in its own binary format, and it also recognizes a few other file formats. Matlab 4.X compatible binary files are read with the command mtlb_load and written with mtlb_save. Scilab reads Excel text files with excel2sci, and it recognizes sound files in the WAV format.
Recoding raw data, handling missing values, and grouping variables into records are common tasks for an applied econometrician. In Scilab, users must write scripts to perform these operations, as little native support for data management exists in Scilab.
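The kind of script a user would have to write for these tasks can be sketched as follows. The sketch is in Python rather than Scilab, and the sentinel value -999, the variable names and the helper functions are hypothetical.

```python
def recode_missing(rows, sentinel=-999.0):
    """Replace a numeric missing-value code with None."""
    return [[None if v == sentinel else v for v in row] for row in rows]


def drop_incomplete(rows):
    """Listwise deletion: keep only rows without missing values."""
    return [row for row in rows if all(v is not None for v in row)]


def to_records(names, rows):
    """Group the variables of each observation into a named record."""
    return [dict(zip(names, row)) for row in rows]


# Hypothetical raw data: income and age, with -999 marking a missing value.
raw = [
    [35000.0, 41.0],
    [-999.0, 29.0],
    [52000.0, -999.0],
    [47000.0, 55.0],
]
clean = drop_incomplete(recode_missing(raw))
records = to_records(["income", "age"], clean)
print(records)
```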
6. FINAL REMARKS
In conclusion, Scilab is a free, robust, general-purpose package that can be used successfully in applied econometrics. It is primarily used by engineers, but its similarity to Matlab and Octave, its overall quality and its free availability should encourage econometricians to use it in teaching and research. One must add that a dedicated Scilab econometric package would significantly increase the appeal of Scilab to applied econometricians.
ACKNOWLEDGEMENTS
I would like to thank James MacKinnon and the members of the Scilab group for providing useful comments and suggestions.
1 Scilab requires manual setting of stack size to work properly. This feature can be inconvenient if one loads many data sets and has to deal with resetting the stack size.
2 There is a free user-contributed package that provides an interface to FSQP. FSQP is a free optimization package and can help to circumvent this problem.
3 StRD data sets and algorithms can be found on the StRD home page: http://www.itl.nist.gov/div898/strd/
REFERENCES
Anderson RG. 1992. The Gauss programming system: A review. Journal of Applied Econometrics 7: 215-219.
Bunks C, Chancelier J, Delebecque F, Goursat M, Nikoukhah R, Steer S. 1999. Engineering and Scientific Computing with Scilab. Birkhäuser.
Cribari-Neto F. 1997. Econometric programming environments: Gauss, Ox and S-plus. Journal of Applied Econometrics 12: 77-89.
Cribari-Neto F, Jensen MJ. 1997. Matlab as an econometric programming environment. Journal of Applied Econometrics 12: 735-744.
Cribari-Neto F, Zarkos SG. 1999. R: Yet another econometric programming environment. Journal of Applied Econometrics 14: 319-329.
INRIA. 1998. Introduction to Scilab. INRIA, Meta2 Project/ENPC Cergrene SCILAB, Domaine de Voluceau-Rocquencourt-B P 105-78153 Le Chesnay Cedex (France).
Rogers J, Filliben J, Gill L, Guthrie W, Lagergren E, Vangel M. 1998. Statistical reference data sets for testing the numerical accuracy of statistical software. Technical report, National Institute of Standards and Technology.
Rust J. 1993. Gauss and Matlab: A comparison. Journal of Applied Econometrics 8: 307-324.
Vinod HD. 2000. Review of Gauss for Windows, including its numerical accuracy. Journal of Applied Econometrics 15: 211-220.
MICO MRKAIC*
Fuqua School of Business, Duke University, Durham, NC 27708-0120, USA
* Correspondence to: Professor M. Mrkaic, Fuqua School of Business, Duke University, Durham NC 27708-0120, USA.
Received 12 January 2001
Copyright Wiley Periodicals Inc. Jul/Aug 2001
