Call for Proposals to Contribute to the LSST:UK Component of Rubin Data Release Processing

Issued by George Beckett (LSST:UK Science Centre, Data Facility Liaison); Wil O’Mullane (Rubin Observatory, AD Data Management); Mike Watson (LSST:UK Consortium, Board Chair) on behalf of the LSST:UK Consortium


The call was updated on Monday 6th November 2023 to include details of the grade profiles for the funding award from STFC. The new information is underlined in the text below for convenience.

The LSST:UK Science Centre Phase C funding, awarded by STFC, includes support for a team to work on the UK contribution to LSST Data Release Processing. The LSST:UK Consortium is now ready to allocate that funding for the three-year period from 1st April 2024 to the end of Phase C on 31st March 2027, with a likelihood of continued support for the selected team during Rubin operations.

Members of Consortium institutions are invited to submit proposals to address one or more of the roles defined below, as part of a work-package team that will collaborate closely with peers in the Rubin Observatory Data Management Operations team.

Details of how to submit proposals are provided in this document. The deadline for submissions is 4pm GMT on Tuesday 21st November 2023.

Background

Data Release Processing (DRP) is the process of turning telescope observations into science-ready data products, which are packaged into Data Releases, typically annually. DRP is a computationally intensive and complex task that will be shared across three facilities: the US Data Facility at SLAC, California; the French Data Facility at CC-IN2P3 in Lyon; and the UK Data Facility.

The LSST:UK Science Centre has been contributing to DRP preparations since March 2022. During this period, a team from Edinburgh, Lancaster, RAL, and Sheffield has helped to develop a multi-site technology platform that is now ready to be tested and refined into a processing solution for the Rubin Observatory operational phase.

The Rubin Observatory proposes to use technologies and techniques that have been honed within the high-energy physics community to support distributed data processing for the Large Hadron Collider. In particular, the proposed solution relies heavily on the distributed-data-management system, Rucio; the File Transfer Service; and the PanDA workflow management system.

The Opportunity

LSST:UK has funding for five roles (within Phase C Work Package 4 of the current award). These roles are a mix of full-time and half-time appointments, defined below, which have been developed in collaboration with DRP experts from the Rubin Observatory as well as the US and French Data Facility Liaisons.

The successful candidates will be part of the Rubin Data Management Operations team, with shared responsibility for DRP as a whole, plus an expectation of specific interest and expertise to ensure the UK delivers on its commitment to DRP.

Full details of the LSST:UK in-kind contribution to Data Release Processing can be found in the mature draft of the UK Data Facility Operational Plan.  

LSST:UK has secured funding for 5.0 FTE of effort from April 2024 to March 2027, with the following grade profile based on standard UK University salary scales: 2.5 FTE at Grade 7; 1.5 FTE at Grade 8; and 1.0 FTE at Grade 9. An indicative assignment of grades to roles is included below, though this is not a prescription. The selection panel may base its final recommendation on a different distribution of grades to roles, but it is expected to stay within the effort levels available at each grade.

The team roles are, with reference to the Operational Plan, defined as follows:

  • Production Scientist (0.5 FTE, indicative Grade 8) – as part of Data Production, to contribute to the science validation of DRP and help to ensure that Data Release Products are fit for purpose. Specifically, the Production Scientist will:

    • Oversee UK processing activities, to identify processing problems, off-spec data products, and underperforming processing stages.

    • Undertake a real-time (that is, during production) assessment of the astronomical validity of the data release processing undertaken at the UK Data Facility.

    • Coordinate with other Production Scientists (that is, at US and French Data Facilities) regarding data issues that are not specific to UK DRP.

    • Understand the scientific intent of the processing pipeline and be able to troubleshoot issues with elements of the pipeline configuration.

  • Processing Scientist (0.5 FTE, indicative Grade 7) – as part of Data Facilities, to be responsible for the completion of DRP in the UK and ensure the timely delivery of data products to maintain processing momentum. The Processing Scientist will be responsible for:

    • Coordinating the curation of datasets via the Data Butler as well as other required database platforms for DRP.

    • Monitoring day-to-day progress with the UK contribution to DRP and measuring real-time DRP performance.

    • Undertaking basic validation of UK processing activities, to identify and coordinate resolution of day-to-day processing problems, off-spec data products, or under-performing processing stages.

    • Preparing resource estimates and forecasts for IRIS provisioning.

    • Supporting the QA team by providing suitable sample products and data to enable them to validate UK processing contributions.

    • Liaising with the Data Wrangler, Workflow Manager and Operations Support to prepare for each new Data Release campaign, confirming the software and hardware configurations to be used and collating feedback from UK pre-Data Release testing and validation activities.

    • Seeing problems through to resolution.

  • Data Wrangler (Data Curation – Rucio) (1.0 FTE, indicative Grade 8) – as part of Data Facilities, to configure, operate, and maintain the data distribution, staging, and curation required for DRP (for example, using Rucio). The Data Wrangler will be responsible for:

    • Contributing to the overall operation of Rucio across the campaign.

    • Overseeing any bulk-download/transfer operations and auditing the location and availability of data assets, in line with campaign requirements.

    • Configuring UK endpoints and interfaces to Rucio, to allow UK infrastructure to effectively contribute to DRP.

    • Problem-solving for Rucio workflows and configuration elements.

    • Liaising with UK (storage and transport) infrastructure providers to enable effective use of Rucio.

  • Workflow Manager (Workload Management – PanDA) (1.0 FTE, indicative Grade 9) – as part of Data Facilities, to configure, operate, and maintain the (compute) processing workflow required for DRP. It is expected that the processing workflow will be implemented using PanDA. The Workflow Manager will be responsible for:

    • Contributing to the overall operation of PanDA across the campaign.

    • Configuring UK endpoints and interfaces to PanDA, to allow UK infrastructure to effectively contribute to DRP.

    • Installing and maintaining Pipeline software and dependencies on UK infrastructure.

    • Setting up and monitoring UK elements of the Data Butler service.

    • Problem-solving for PanDA workflows and configuration elements.

    • Liaising with UK (computing) infrastructure providers to monitor and improve effectiveness and efficiency of PanDA configuration in the UK.

  • DRP Operations Support (1.0 FTE, indicative Grade 7) – as part of Data Facilities, a Research Software Engineer who will have a thorough understanding of the whole UK contribution to DRP and be able to troubleshoot and contribute to processing tasks as required. Operations Support will be responsible for:

    • Addressing issues with the configuration and operation of the DRP environment.

    • Liaising with network providers (transit and endpoints) regarding efficient transport of data products to processing sites.

    • Liaising with the Rubin Infrastructure Team regarding implementation, configuration, and optimisation of the DRP environment.

The DRP team is completed by the UK Data Facility Liaison (George Beckett, University of Edinburgh) and 1.0 FTE of Fabric Administrator effort (indicative Grade 7). The Fabric Administrator effort will be divided between the IRIS sites hosting the DRP hardware and is, hence, not allocated as part of the current call.
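As a quick consistency check, the indicative grades can be tallied against the funded grade profile. The sketch below assumes that the 5.0 FTE profile covers the Fabric Administrator effort as well as the five roles in this call; the figures suggest this, but the call does not state it outright. The names `ROLES`, `FUNDED_PROFILE`, and `fte_by_grade` are illustrative only.

```python
from collections import defaultdict

# Indicative (role, FTE, grade) assignments as listed in this call.
ROLES = [
    ("Production Scientist", 0.5, 8),
    ("Processing Scientist", 0.5, 7),
    ("Data Wrangler", 1.0, 8),
    ("Workflow Manager", 1.0, 9),
    ("DRP Operations Support", 1.0, 7),
    # Assumption: Fabric Administrator effort counts towards the 5.0 FTE,
    # although it is not allocated through this call.
    ("Fabric Administrator", 1.0, 7),
]

# Funded grade profile stated in the call (FTE per grade).
FUNDED_PROFILE = {7: 2.5, 8: 1.5, 9: 1.0}


def fte_by_grade(roles):
    """Sum FTE per grade for a list of (name, fte, grade) tuples."""
    totals = defaultdict(float)
    for _name, fte, grade in roles:
        totals[grade] += fte
    return dict(totals)


totals = fte_by_grade(ROLES)
matches_profile = totals == FUNDED_PROFILE
print(totals, matches_profile)
```

A proposal that redistributes grades between roles could reuse the same tally to confirm that it stays within the effort available at each grade.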

Selection Process and Timeline

Institutions wishing to apply for one or more of these roles should submit an application to the LSST:UK Consortium Board Chair, Professor Mike Watson (mgw@leicester.ac.uk), by 4pm GMT on Tuesday 21st November 2023. Applications should include a brief (up to half a page) explanation of why the institution would be a suitable host for the role(s), together with a short curriculum vitae (or equivalent description of expertise) for each named individual proposed to take them on and for the institutional Principal Investigator. The Principal Investigator will be funded (typically as Directly Allocated staff) at a level of 0.1 FTE per 1.0 FTE of Directly Incurred staff funding, to contribute to the leadership of the UK DRP activity. Travel funding totalling £2,000 per staff-year of effort and other directly incurred costs totalling £2,000 per staff-year of contributed effort will also be funded as part of the award.
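To illustrate how these rates scale with the effort requested, the following sketch estimates the associated funding for a given level of Directly Incurred effort. It assumes that a staff-year means 1.0 FTE for one year and that the effort runs for the full three-year period; the function `associated_funding` and its field names are hypothetical and do not represent an official costing method.

```python
def associated_funding(di_fte, years=3.0):
    """Estimate PI effort and non-staff costs for a level of DI effort.

    Assumptions (not stated in the call): one staff-year is 1.0 FTE for
    one year, and the effort runs for the whole period given by `years`.
    """
    staff_years = di_fte * years
    return {
        # Principal Investigator: 0.1 FTE per 1.0 FTE of DI staff.
        "pi_fte": round(0.1 * di_fte, 3),
        # Travel: GBP 2,000 per staff-year of effort.
        "travel_gbp": 2000 * staff_years,
        # Other directly incurred costs: GBP 2,000 per staff-year.
        "other_di_gbp": 2000 * staff_years,
    }


# Example: an institution hosting one full-time role for three years.
example = associated_funding(1.0)
print(example)
```

Under these assumptions, a single full-time role over three years corresponds to 0.1 FTE of Principal Investigator effort, £6,000 of travel funding, and £6,000 of other directly incurred costs.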

Applicants may propose to split the full-time roles outlined above (for example, to maximise the potential to use institutional skillsets and experience), though no individual contribution should be less than 0.5 FTE.

Proposals will be considered by a panel appointed by the LSST:UK Consortium Board.

The panel will make a recommendation to the LSST:UK Board, which will confirm the successful candidates. The Board will also invite one of the institutional Principal Investigators to assume the role of Work Package leader.

These roles are expected to commence on 1st April 2024, and funding for the successful candidates will be awarded from the STFC Astronomy Programme. Successful candidates will be contacted with details of the process leading to the announcement of these grants.

Further Information

Any questions regarding the call may be directed to the Chair of the LSST:UK Consortium, Professor Mike Watson <mgw@leicester.ac.uk>. Questions regarding technical aspects of DRP and the proposed UK contributing roles may be directed to the UK Data Facility Liaison, George Beckett <George.beckett@ed.ac.uk>.

Monday 30th October 2023

If you require this document in an alternative format, please contact the LSST:UK Project Managers lusc_pm@mlist.is.ed.ac.uk or phone +44 131 651 3577