Introduction to some of the most promising programming languages for Data Science and Cloud Development
As of 2020, there are about 700 programming languages available [1]. Some are applied only in specific domains, while others are valued for working well across a wide range of applications. Over the past decade, the use of software has grown almost steadily, and new languages have been developed to meet the demand. In this article, we explore some of the most widely used programming languages, as well as potential new stars, in Data Science and Cloud Development.
Deciding to learn a relatively new programming language in our spare time can be a risky investment, since we can't be sure how the job market will perceive it in the next few years. On the other hand, newer programming languages have in most cases been carefully designed to make the most of recent advances in technology, potentially giving us an edge in the long run. Some of the key advantages of newer programming languages can therefore be:
- Hardware optimization (GPUs, multi-core CPU systems).
- Improved Networking.
- More concise code.
- Type Inference.
- Easier containerization and cloud support.
According to the 2020 Stack Overflow Developer Survey [2], the following programming languages were the ones most loved by developers in 2020 (Figure 1). As part of this article, we are going to consider 5 of them.
Additionally, according to the same survey, these were the 10 highest-paying programming languages in 2020 (Figure 2).
Programming languages like Python and R are now quite popular in fields such as Data Science, Machine Learning and general-purpose numerical computing thanks to their ease of use. However, these languages were not originally designed to run on highly scalable systems, which can make them difficult to use for large enterprise solutions. To overcome this problem, Julia was created by a group of researchers at the Massachusetts Institute of Technology (MIT). Some of the key features of Julia are:
- Optimized for working with parallel and distributed systems.
- Built-in package manager.
- Support for calling C functions.
- Dynamic Typing.
To facilitate adoption, many Data Science and Machine Learning libraries have already been implemented, such as ScikitLearn.jl, TextAnalysis.jl and StatsModels.jl. Julia can also be used in a traditional Jupyter Notebook. If you are interested in finding out more about Julia for data science, this YouTube course is a great place to start.
As Figure 3 shows, Google searches for Julia have increased overall over the past few years.
Go is nowadays one of the most promising systems programming languages. It was developed by Google to make it easier to scale both applications and their development. Some of the key characteristics of Go are:
- Designed for Cloud-Native Development. In fact, mainstream tools such as Docker and Kubernetes have been developed using Go.
- Memory management (unlike languages such as C and C++, Go has a built-in garbage collector).
- Excellent Concurrency support.
After peaking around 2014, Google searches for Go have remained fairly consistent over the years. Go is currently one of the most popular programming languages on cloud platforms such as Google Cloud Platform and Microsoft Azure.
If you are interested in coding Machine Learning algorithms in Go, GoLearn is a great place to start.
Python is these days the most popular programming language for Data Science and Machine Learning tasks. It was first released in 1991 by Guido van Rossum, and its popularity has kept growing since then (Figure 5).
Some of the most popular Python libraries for Data Science and Machine Learning are:
As mentioned before, one of the key issues associated with Python is its poor scalability. To try to overcome this problem, tools such as Cython and Numba have been developed to reach C-like performance levels while still coding in Python.
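As a rough illustration of the Numba approach, the sketch below decorates an ordinary Python loop so that Numba can compile it to machine code. The function name `sum_of_squares` and the no-op fallback decorator are assumptions made for this example, not something taken from the survey or from a specific project, and any actual speed-up depends on the workload.

```python
# Illustrative sketch only: sum_of_squares and the fallback decorator
# are hypothetical names chosen for this example.
try:
    # njit asks Numba to compile the decorated function to machine code
    from numba import njit
except ImportError:
    # Assumption: if Numba is not installed, run the plain Python version
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Return 0^2 + 1^2 + ... + (n - 1)^2 using an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # With Numba available, the first call also pays the compilation cost;
    # later calls reuse the compiled version.
    print(sum_of_squares(10))
```

Explicit numeric loops like this one are the kind of code where a JIT compiler tends to help most, since the interpreter overhead per iteration is removed.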
Scala is currently considered one of the best programming languages for functional programming (although it still supports object-oriented programming). In terms of search popularity, Scala seems to have peaked on Google around 2018–2019 (Figure 6).
Some of the key advantages of using Scala are:
- Scala is a statically typed language.
- Generally much faster than programming languages such as Python.
- Compatibility with Java.
- Ability to combine both functional and object-oriented programming.
One of the main reasons for the popularity of Scala is Apache Spark (a data-processing engine written in Scala). Apache Spark is in fact one of the most popular big data tools for fast processing of large amounts of data, and it integrates with Hadoop.
[1] How Many Computer Programming Languages Are There? Career Karma, Trent Fowler. Accessed at: https://careerkarma.com/blog/how-many-coding-languages-are-there/
[2] 2020 Developer Survey, Stack Overflow. Accessed at: https://insights.stackoverflow.com/survey/2020
[3] Google Trends. Accessed at: https://trends.google.com/trends/?geo=US