“Today a new language is overtaking French as the most popular language taught in primary school. Its name is Python… 6 out of 10 parents want their kids to learn Python”, Joel Clark.
Well, when I attended school, I learnt BASIC… But I must confess, I do share the excitement surrounding the Python language.
I recently took part in a webinar organised by Risk.net and FINCAD, where we discussed the advantages and challenges of using Python to develop quantitative trading applications. The panel included experts from various corners of the industry. Besides myself, the panellists were:
- Joel Clark, contributing editor, Risk.net (Moderator)
- Gary Collier, CTO, Man Group Alpha Technology
- Per Eriksson, senior executive, enterprise risk and valuation solutions, FINCAD
- Ronnie Shah, head of US quantitative research and quantitative investment solutions, Deutsche Bank
The webinar was a success, with over 500 participants. Since Python is on everyone’s mind, I wanted to highlight some interesting questions and thoughts from our discussion. The audio of the webinar is available here.
Why has Python become an increasingly popular programming language in financial markets?
One of the major advantages of Python is the ease with which it connects different systems to data feeds and databases, processes data, and outputs results to user and trading applications.
My first experience with Python came in 2012, when Bank of America Merrill Lynch, where I worked as a front-office quant strategist, introduced the Quartz system, developed in Python. Quartz was intended to be the bank-wide solution for sharing data and trading risks. The motivation was that insufficient centralization and aggregation of positions and risks across trading books (traditionally separated by geography and asset class) was one of the key weaknesses shared by large investment banks during and in the aftermath of the 2008 financial crisis. As a result, Quartz and its Python-based analytics were conceived as a bridge connecting different parts of analytics, data centres, and development teams. A daunting task for any large organization employing hundreds of developers and users!
Fast forward to today, and Python is widely used by major financial institutions to build tools that connect different parts of analytics and increase collaboration within a firm. Over time, people have also started to do more core development in Python, in addition to using it as a glue language.
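To make the "glue" role concrete, here is a minimal sketch using only the Python standard library. It reads a small data feed, stores the positions in a database, and outputs an aggregated risk report. The feed contents, the table layout, and the function names (`load_feed`, `store_positions`, `aggregate_risk`) are all made up for illustration; this is not how Quartz or any particular bank system works.

```python
import csv
import io
import json
import sqlite3

# Hypothetical data feed. In practice this might be a vendor file or an API response.
FEED = """book,asset_class,region,delta
RATES_LDN,rates,EMEA,120.5
RATES_NYC,rates,AMER,-80.0
EQ_LDN,equities,EMEA,45.25
EQ_HKG,equities,APAC,-10.75
"""


def load_feed(text):
    """Parse the CSV feed into (book, asset_class, region, delta) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["book"], row["asset_class"], row["region"], float(row["delta"]))
        for row in reader
    ]


def store_positions(conn, rows):
    """Write the parsed positions into a (hypothetical) positions table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS positions "
        "(book TEXT, asset_class TEXT, region TEXT, delta REAL)"
    )
    conn.executemany("INSERT INTO positions VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def aggregate_risk(conn, by):
    """Sum delta across all books, grouped by asset class or by region."""
    if by not in {"asset_class", "region"}:
        raise ValueError(f"unsupported grouping: {by}")
    cursor = conn.execute(
        f"SELECT {by}, SUM(delta) FROM positions GROUP BY {by} ORDER BY {by}"
    )
    return {key: total for key, total in cursor}


# An in-memory database stands in for a firm-wide store.
conn = sqlite3.connect(":memory:")
store_positions(conn, load_feed(FEED))

report = {
    "by_asset_class": aggregate_risk(conn, "asset_class"),
    "by_region": aggregate_risk(conn, "region"),
}
print(json.dumps(report, indent=2))
```

The point of the sketch is that feed parsing, storage, aggregation, and reporting all fit in one short script, which is what makes Python convenient for connecting systems that were built separately.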
New development in Python has benefited from a rich ecosystem with a huge number of libraries for data analytics and visualization. For example, Man AHL illustrated how they benefited from moving both research and production code to Python.
Summarising their paper and our panel, Python has become increasingly popular because:
- Python enhances the communication between different teams.
- Python provides an advanced ecosystem with packages for numerical and statistical analysis, data handling and visualization.
- Python is easy to learn and flexible to apply, and it is actually fun to program in. Having spent many years doing quantitative modelling in C++ and Matlab, I fully support this view.
How does Python compare with other languages for data analysis?
Since data analytics is currently one of the key drivers across all industries, including finance and investment management, choosing the right ecosystem for development may have a crucial impact on business development and success.
At present, three development tools are widely used for data analytics:
- Python along with pandas for tabular data structures and multiple packages for data analysis (statsmodels for statistical analysis, matplotlib for data visualization, scikit-learn for machine learning, etc.). The advantage is that Python provides a free and open-source solution with plentiful resources for data fetching, processing, and visualization. Python can be easily deployed on either a PC or a server to build scalable firm-wide solutions.
- Matlab has traditionally been widely used in academia and research labs, but it comes at a heavy cost for commercial firms. Matlab has numerous packages for data processing, analysis, and visualization; however, each package is sold separately. Personally, I have used Matlab a lot, including its object-oriented programming capabilities. While I value some of its capabilities, the major drawback of Matlab, apart from its licensing cost, is that deploying Matlab-based analytics is problematic and incurs separate fees. Matlab applications can be compiled and deployed on a server, but the deployment process appears complex and poorly documented, and it may be costly if external consultants are needed. In my opinion, insufficient portability and scalability are major obstacles to developing firm-wide solutions in Matlab.
- R along with its multiple packages for statistical data analytics. While R is free and has many packages for various statistical analyses, deploying R across a firm-wide platform may not be as efficient. In my opinion, the R language is suitable only for developing stand-alone tools for statistical analysis. In fact, JupyterLab makes it possible to use R functionality within the Python ecosystem.
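To give a flavour of the Python option in the list above, here is a minimal sketch of a typical tabular workflow with pandas and NumPy. The tickers `AAA` and `BBB` and their prices are synthetic, and the 252-trading-day annualisation factor is a common convention that I am assuming here, not something fixed by the libraries.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices for two hypothetical tickers (illustrative data only).
dates = pd.date_range("2024-01-01", periods=6, freq="B")  # business days
prices = pd.DataFrame(
    {
        "AAA": [100.0, 101.0, 99.0, 102.0, 103.0, 101.0],
        "BBB": [50.0, 50.5, 51.0, 50.0, 52.0, 53.0],
    },
    index=dates,
)

# Simple daily returns. The first row has no previous price, so it is dropped.
returns = prices.pct_change().dropna()

# Per-ticker summary statistics. 252 trading days per year is an assumption.
TRADING_DAYS = 252
summary = pd.DataFrame(
    {
        "mean_daily_return": returns.mean(),
        "daily_volatility": returns.std(),
        "annualised_volatility": returns.std() * np.sqrt(TRADING_DAYS),
    }
)

# Pairwise correlation of daily returns.
correlation = returns.corr()

print(summary.round(4))
print(correlation.round(4))
```

The same `returns` table could then be passed to statsmodels for a regression or to matplotlib for a chart, which is what makes the ecosystem convenient: each package works with the same tabular data structure.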
How long would it take to convert Matlab production code to Python?
Given the advantages of Python over Matlab, most firms would now use Python to start any new development from scratch. But what about converting legacy code and systems?
Gary Collier gave one example of AHL converting a fairly complicated trading system for single stock equities to Python within 8-9 months.
In fact, my friend Saeed Amen has just written a short overview paper on moving from Matlab to Python. The transition is feasible… While there will be short-term costs, the long-term benefit is to have a firm-wide solution developed in one multi-purpose language that everyone can understand and contribute to.
Python everywhere?
To conclude, the top figure shows the share of questions about various programming languages asked each month on Stack Overflow, which is the largest online community for developers. We clearly see the growing trend for Python against all other major programming languages. Perhaps soon enough Python will overtake all other languages, not only those taught in primary school but also those employed everywhere else…