Sylvain Corlay

  • Scientific software developer, Quant researcher, formerly quant at Bloomberg, and adjunct at Columbia and NYU
  • Founded QuantStack in 2016
  • Core developer of Project Jupyter: mostly focused on widgets.
  • Data visualization in the browser: bqplot, pythreejs, ipyleaflet
  • C++ scientific computing: xtensor, xeus

A piece of career advice for newcomers

My 2 cents

Get involved in open source early in your career

  • Contribute to a large project that you are already using.
  • Start a small project on your own, that may be useful for others.

In both cases, start small.

You can have a lot of impact

  • A simple CSS fix in Jupyter can impact tens of thousands of users.
  • Complexity ramps up smoothly, from drop-by contributions to becoming a core contributor to a major package.
  • A dimension for growth other than climbing the corporate ladder.

Finally, you will find some of the most inclusive and diverse communities:

  • Jupyter
  • SciPy
  • nteract
  • Django
  • Sage

A piece of advice for employers

My 2 cents
  • Encourage employees to contribute to the software they rely upon in their work.
  • In their free time, let them contribute to anything unrelated!
  • This gives your engineers a new dimension to grow in: a career perspective other than becoming a manager.
  • It helps you recruit people and shows off the quality of your tech teams.

Roads and bridges

  • Open source is not for crazy hippies and closed-source software is not evil.
  • Open source is the backbone of scientific computing.
    • compilers, interpreters, OSs, scientific libraries, package managers.
    • You cannot afford to redo all of that yourself.
  • Open source is common infrastructure used by everyone

Roads and bridges

Open up your phone.
Your social media,
your news, your
medical records, your
bank: they are all using
free and public code.

-- Nadia Eghbal

What happens when infrastructure breaks?

The case of OpenSSL

Having worked with the Department of Defense, Marquess saw how critical OpenSSL was, not just to their software, but to other industries around the world, from enterprise to aeronautics to health care. Until that moment, he had “always assumed, (as had the rest of the world) that the OpenSSL team was large, active, and well resourced.” In reality, OpenSSL wasn’t even able to support one person’s work.

Now what happens when NumPy / Jupyter / SciPy / pandas / conda break?
  • Only a handful of developers work on pandas full time.
  • No full-time developer is working on NumPy at the moment.


The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. We accomplish this mission through our educational programs and events as well as through fiscal sponsorship of open source data science projects. We aim to increase collaboration and communication within the scientific computing community.

Our action

  • Work with the NumFOCUS leadership on the creation of an EU branch or a separate EU entity.

  • Creation of the PyData Paris meetup.

  • What next?

The language war

  • What happens when you say "Java" at a C++ conference?
  • What happens when you say "MATLAB or R" at a Python conference?

The language war

There is no reason for R, Julia and Python to be in competition. They have very similar communities.

Duplication of effort hurts sustainability and interoperability.

  • Package management
  • IDEs, developer tools
  • In-browser data visualization tools
  • Data structures

All these require collaboration beyond language boundaries.

My advisor had a heavily customized awk/sed/bash workflow to manage job submissions and postprocessing of C codes for supercomputing runs... So I used her scripts to run my jobs, and on top of that added my own layer of Perl, plus a hefty amount of Gnuplot, IDL and Mathematica.

- Fernando Perez



Raise your hand if you

  • Have heard of IPython or Jupyter
  • Have used IPython or Jupyter
  • Have used the Jupyter notebook
  • Have used interactive widgets in the notebook

IPython: Interactive Python, 2001


Jupyter Team


and 500+ more contributors



Jupyter notebook

  • Interactive browser-based computing environment
  • Exploratory data science, ML, visualization, analysis, stats
  • Reproducible document format:
    • Code
    • Narrative text (markdown)
    • Equations (LaTeX)
    • Images, visualizations
  • Over 50 programming languages
  • Everything open-source (BSD license)
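As a rough illustration of the document format listed above, the sketch below builds a minimal notebook as plain JSON using only the Python standard library. The top-level keys and cell fields follow the nbformat version 4 layout as I understand it; the cell contents and the helper names `markdown_cell`, `code_cell` and `build_notebook` are made up for this example.

```python
import json


def markdown_cell(text):
    """Narrative cell: markdown text, which may embed LaTeX between $ signs."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Code cell that has not been executed yet: no outputs, no execution count."""
    return {
        "cell_type": "code",
        "metadata": {},
        "source": source,
        "execution_count": None,
        "outputs": [],
    }


def build_notebook(cells):
    """Wrap a list of cells in the top-level notebook structure (nbformat 4)."""
    return {
        "nbformat": 4,
        "nbformat_minor": 2,
        "metadata": {
            # Assumed kernel description; any installed kernel could be named here.
            "kernelspec": {
                "name": "python3",
                "display_name": "Python 3",
                "language": "python",
            }
        },
        "cells": cells,
    }


notebook = build_notebook(
    [
        markdown_cell(r"Euler's identity: $e^{i\pi} + 1 = 0$"),
        code_cell("print('hello from a notebook')"),
    ]
)

# A notebook file (.ipynb) is this structure serialized as JSON text.
notebook_json = json.dumps(notebook, indent=1)
```

Writing `notebook_json` to a file with an `.ipynb` extension should give a document that the notebook can open; outputs such as images would be stored in the `outputs` list of each code cell after execution.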

The Jupyter Notebook




... more than 70 different kernels


2015 IBM survey: 3M users


Measuring Adoption

Approximately 600k notebooks on GitHub


Commercial Offering

  • Google DataLab
  • Microsoft Azure
  • AWS
  • CoCalc (SageMathCloud)

Other Large-Scale Deployments

Generally using the JupyterHub multi-user server

  • Educational:
    • UC Berkeley
    • Cal Poly
    • U. Sheffield
    • ETH Zurich
  • NERSC (National Energy Research Scientific Computing Center)
  • San Diego Supercomputer Center
  • Minnesota, CU Boulder, Compute Canada
  • CERN
  • Wikimedia Foundation
  • Danish e-Infrastructure Cooperation

Enabling Reproducible Science


Enabling Reproducible Journalism


More Than Notebooks





The HTTP of Scientific Computing

A specification matters a great deal more than any implementation.

Well-specified

  • Communication protocols
  • File formats
  • Serialization schemas

have turned Jupyter into the HTTP of scientific computing for

  • Exploratory analysis
  • Data visualization
  • in a language-agnostic fashion
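To make the idea of a specified communication protocol concrete, here is a minimal sketch of what a code-execution request looks like as data, using only the Python standard library. The four-part structure (header, parent header, metadata, content) and the field names follow the Jupyter messaging specification as I understand it; the helper `make_message`, the user name and the protocol version string are assumptions for the example, and the transport and message signing are left out.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed protocol version string, for illustration only.
PROTOCOL_VERSION = "5.3"


def make_message(msg_type, content, session, username="demo"):
    """Build a message as a plain dict; nothing here depends on the kernel's language."""
    header = {
        "msg_id": uuid.uuid4().hex,      # unique per message
        "session": session,              # unique per client session
        "username": username,
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": msg_type,
        "version": PROTOCOL_VERSION,
    }
    return {
        "header": header,
        "parent_header": {},  # empty for a request; replies copy the request header here
        "metadata": {},
        "content": content,
    }


session_id = uuid.uuid4().hex

# The code is an opaque string: it could be Python, R, Julia or C++,
# depending on which kernel receives the request.
request = make_message(
    "execute_request",
    {
        "code": "1 + 1",
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    },
    session_id,
)

# Each of the four parts is serialized to JSON before being sent.
wire_parts = [
    json.dumps(request[key])
    for key in ("header", "parent_header", "metadata", "content")
]
```

Because a frontend only ever produces and consumes messages of this shape, the same notebook interface can drive any kernel that implements the specification.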

What is missing?

The interface to the REPL has been abstracted out by Jupyter

Interoperability between the languages of scientific computing is still poor

  • Data structures for data science

  • The scalability gap

Tutorials, talks, development and industry training

Thank you!