What is R?

posted Jul 2, 2019

The R programming language is an open source scripting language for analytics and data visualization. R acts as a free alternative to traditional statistical packages such as SPSS, SAS, and Stata. It is released under the Free Software Foundation's GNU General Public License, which allows users to freely distribute, study, change, and improve the software. These advantages over other statistical software encourage the growing use of R in cutting-edge social science research.

Why R?

The "R" name is derived from the first letter of the names of its two developers, Ross Ihaka and Robert Gentleman, who were associated with the University of Auckland at the time. R is a free implementation of the S programming language, which was originally created and distributed by Bell Labs. Most code written in S will run successfully in the R environment.

There are plenty of tools available on the market for data analysis. The picture below depicts the learning curve compared to the business capability each tool offers. Excel and Power BI are simple to learn but don't offer outstanding business capability, especially in terms of modeling. In the middle, you can see Python and SAS. SAS is a click-and-run software tool for running statistical analyses for business, but it is not free. Python, however, is a language with a steady learning curve. Python is a fantastic tool for deploying machine learning and AI but lacks communication features. With an identical learning curve, R is a good trade-off between implementation and data analysis.

Who uses R?

Stack Overflow is the largest, most trusted online community for developers to learn and share their programming knowledge. Lately, the percentage of question views for R has increased sharply compared to other languages. This trend is highly correlated with the booming age of data science and reflects the demand for the R language in data science.

Data scientists use either R or Python. Their job is to understand the data, manipulate it, and expose the best approach. Data scientists are not necessarily programmers. R is probably the language of choice for non-programmers involved in data science.

Data scientists are heavy users of machine learning. The best algorithms for machine learning can be implemented with R. Packages like Keras and TensorFlow allow the development of machine learning algorithms from R.

Where to obtain R?

Installation files for Windows, Mac, and Linux can be found at the website for the Comprehensive R Archive Network (CRAN). There is no cost for downloading and using R. Once the basic software is installed, you can start using R right away from the command line.
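As a rough illustration of what a first session at the R command line might look like, here is a minimal sketch. It uses only base R and the built-in `mtcars` dataset, which was picked here purely as an example and is not mentioned elsewhere in this post. The variable names `mpg_mean` and `fit` are likewise just illustrative.

```r
# Load a small example dataset that ships with base R (no packages needed)
data(mtcars)

# Descriptive statistics for fuel economy (miles per gallon)
mpg_mean <- mean(mtcars$mpg)
print(summary(mtcars$mpg))

# Fit a simple linear model: fuel economy as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)

# Show the estimated intercept and slope
print(coef(fit))
```

You can type these lines one at a time at the R prompt, or save them to a file and run the file with `Rscript`.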

To make your work more productive, install the RStudio IDE. Get it here.

