The Swiss Data Science Center (SDSC, datascience.ch) is a national center between EPFL and ETH Zurich, whose mission is to accelerate the use of data science and machine learning techniques broadly within academic disciplines of the ETH Domain and the Swiss academic community at large. It aims to federate data providers, data and computer scientists, and subject-matter experts around a cutting-edge analytics platform offering domain-specific “Insights-as-a-Service” while addressing security and privacy issues inherent to the field of data science. The SDSC is composed of a large multi-disciplinary team of data & computer scientists and experts in relevant domains, distributed between our offices in Lausanne and Zurich. The unique synergy that the center enables among the institutions of the ETH Domain and between academic and industrial stakeholders in both data science and across carefully selected domains is expected to foster scientific breakthroughs with significant societal impact.
Many data science projects today struggle to be efficient and reproducible. It is difficult to identify available data, and even more difficult to share it; those who share data are often not recognized for their contribution; it is a challenge to keep track of data versions; it is hard to see what code and data were used by whom to produce what results. Renku (datascience.ch/renku/, renkulab.io/) is an open collaborative platform developed by the SDSC to address these problems. Renku provides a knowledge infrastructure that seamlessly integrates interactive sessions (such as Jupyter and RStudio), automatic provenance tracking (which results were produced by whom and when), GitLab CI/CD, and version control for code, data and containerised environments. The key strength of Renku is its knowledge graph, which captures the provenance of the analysis process by connecting versioned research objects, thus ensuring computational reproducibility. Renku makes it possible to have greater trust in results and to acknowledge the contributions of all those involved, regardless of whether their contribution was to implement the solution, provide the data, or ask the right questions.
EPFL and ETH Zurich are seeking enthusiastic and experienced candidates with scientific IT expertise and a proven track record in and around data science and analytics on large-scale distributed platforms, services and applications, to join their national R&D center, the Swiss Data Science Center. The ideal candidate will become part of the Swiss Data Science Center and will act as an enabler of data science activities within the research community of the ETH Domain, Swiss universities, and industry. In this role, you will:
We offer you a stimulating, startup-like, cross-disciplinary environment in a world-class research center that is part of two leading universities. In this dynamic position, you will make full use of your data science engineering and research skills and creativity to develop novel solutions for real cutting-edge questions. You will push forward the capabilities and performance of the team, contribute to decision-making about the direction of the SDSC platforms and investigate available technology options. You will work in a data science setting alongside leading domain and computer science experts from the ETH Domain as well as industry. We have excellent ties to research groups worldwide, both academic and industrial. You will get access to state-of-the-art infrastructure and resources.