...

  • Experiments with Lasair-ZTF

  • Use cases

    • Nuclear transient research currently migrating to Google Colab notebook

      • Difficulties are: filtering on colour, streams clogged with old objects, not having full light curve history (this is the biggest problem)

      • Intend to do citizen science with light curves. The documentation for this was helpful.

    • Lensed transients

  • Email alerts could be more human readable

  • A thumbnail for each object on streaming page would be helpful

  • Testing spectra in advance would be good.

  • Email alerts and web interface are what

  • Roy stated that the light curves are coming, so the major problem identified will be solved. Like/dislike for an object is the tail of the tiger and could mean Lasair in effect building marshals. As yet we do not have the facility to push data into notebooks, but will let you know when this is ready.

  • Eric Q to Matt - Regarding the boundaries with marshals, how do you keep track of candidates?

    • Matt answer - luckily the numbers are not big, so not a problem yet. Have been using a note in Google Colab. Also, within the research group we have taken tentative steps towards building our own marshal.

  • Cosimo Q to Roy - Agree a colour curve would be fantastic. What about using Gaussian processes to interpolate the light curves in order to produce colours (with uncertainties)?

    • Roy - happy to look at the Gaussian process code and to include this in the Lasair pipeline.

    • Cosimo will forward the code to Roy.

  • Jakob Q to Roy - you are all using Google Colab, but we have encountered Python limitations

    • Roy answer - Ken will talk about this later.

    • Ken - do not know how Google Colab behaves when it requires C binaries other than NumPy. Have not yet installed our own C code. We should explore this.

    • Gareth answer - aware that Nublado will be running on the RSP next door, so if Google Colab is not suitable we can consider integration with the RSP.
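
A minimal sketch of the Gaussian-process idea raised by Cosimo above: fit each band separately, predict both on a common time grid, and subtract to get a colour with an uncertainty. This is not the code Cosimo offered to forward. The squared-exponential kernel, the fixed amplitude and length scale, the assumption that the two bands are independent, and all the numbers are illustrative assumptions only.

```python
import numpy as np


def rbf_kernel(t1, t2, amplitude, length_scale):
    """Squared-exponential covariance between two sets of times (days)."""
    dt = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (dt / length_scale) ** 2)


def gp_predict(t_obs, mag_obs, mag_err, t_new, amplitude=1.0, length_scale=10.0):
    """GP posterior mean and standard deviation of the magnitude at t_new.

    Hyperparameters are fixed here (an assumption); in practice they
    would be fitted to the data.
    """
    # Use the weighted mean magnitude as a constant prior mean.
    mean_mag = np.average(mag_obs, weights=1.0 / mag_err**2)
    K = rbf_kernel(t_obs, t_obs, amplitude, length_scale) + np.diag(mag_err**2)
    K_s = rbf_kernel(t_new, t_obs, amplitude, length_scale)
    K_ss = rbf_kernel(t_new, t_new, amplitude, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, mag_obs - mean_mag))
    mean = mean_mag + K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def colour_curve(g_band, r_band, t_new):
    """g - r colour and its uncertainty on a common time grid.

    Each band is a (times, magnitudes, errors) tuple. The two GP fits are
    treated as independent, so the uncertainties add in quadrature.
    """
    g_mean, g_std = gp_predict(*g_band, t_new)
    r_mean, r_std = gp_predict(*r_band, t_new)
    return g_mean - r_mean, np.hypot(g_std, r_std)


if __name__ == "__main__":
    # Made-up light curve: the two bands are sampled on different nights.
    g_band = (np.array([0.0, 4.0, 8.0, 12.0]),
              np.array([18.9, 18.4, 18.6, 19.1]),
              np.full(4, 0.05))
    r_band = (np.array([1.0, 5.0, 9.0, 13.0]),
              np.array([18.6, 18.2, 18.3, 18.7]),
              np.full(4, 0.05))
    grid = np.linspace(1.0, 12.0, 5)
    colour, sigma = colour_curve(g_band, r_band, grid)
    for t, c, s in zip(grid, colour, sigma):
        print(f"t = {t:5.2f} d   g-r = {c:+.3f} +/- {s:.3f}")
```

The point of interpolating is that the bands are rarely observed at the same time, so a colour cannot be formed by simple subtraction of detections.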


Differences between ZTF and LSST (Eric Bellm)

(need slides from Eric - have messaged)

  • ZTF is on a telescope that is almost 75 years old

  • Slide 3 shows physical differences with Rubin. Only advantage for ZTF is in field of view. Total number of exposures per night will be similar on both telescopes.

  • Major difference is that ZTF only processes the data once, i.e. as live data comes off the telescope.

  • Q Julien on chat - Cutouts to be transmitted for LSST are template and difference. Why choose template over science? The science one is probably what I look at first to judge quality.

    • A: The reason is lost to pre-history. The aim was to get 2 out of 3. Aim to go to 3 cut-outs.

  • Question: (Roy W asked) Can Lasair get away with using the LSST cutout service instead of saving them in our own database? How many can we fetch per day?

    • Answer: Do not think Lasair should use the US cutout service. Capacity limits are not clear at the moment, but cut-outs won’t become available until after the underlying images, which means a 3-day delay (in line with Government requirements).

  • Q Jakob - Regarding upper limits on forced photometry, will there always be 12-month limits on alerts, with varying limits for forced photometry?

    • A: Not built rigorously in pipelines, so remains conceptual. Should have a history of every time we observe a region, and provide an upper limit from the noise estimate where we don’t have forced photometry. This can change from night to night.

    • A: Details of workload management for PPDB are to be confirmed, though it is a concern. Several ways to think about this. Outcome of the Broker Workshop was an action on the project to offer a database export of PPDB (world-readable) for those who require significant information from the DB.

  • Davy asked to clarify that forced photometry in alerts is one epoch behind the actual alert.

    • Eric confirmed this was the case. Getting triggering DSRs, but will. Eric will make a note and hope anomaly can be achieved without significant effort.

...

  • DMTN-118 says what Rubin is providing.

  • In addition to ZTF, we have added, e.g., different ways of looking at the position, timings, first detection, …. Not interested in periodic data.

  • Difficult to change the set of features, since it would need to be rebuilt in the relational DB. What can be added to the LSST list?

  • There are potential Sherlock attributes.

  • External annotators: From Lasair, a query pushes out a Kafka stream of candidates to the external annotator, which runs its annotation on them. Results are sent back from the external annotator and ingested into Lasair.
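
The external-annotator round trip described above can be sketched as follows. This is a toy illustration only: an in-memory `queue.Queue` stands in for the Kafka stream, and the field names (`objectId`, `classification`), the function names, the object identifiers and the toy classifier are all hypothetical, not Lasair's actual schema or API.

```python
import json
import queue


def publish_candidates(stream, rows):
    """Lasair side: push each row returned by a query onto the stream."""
    for row in rows:
        stream.put(json.dumps(row))


def run_annotator(stream, classify):
    """External side: consume candidates and produce one annotation each."""
    annotations = []
    while True:
        try:
            message = stream.get_nowait()
        except queue.Empty:
            break  # stream drained; a real consumer would keep polling
        candidate = json.loads(message)
        annotations.append({
            "objectId": candidate["objectId"],
            "classification": classify(candidate),
        })
    return annotations


def ingest_annotations(store, annotations):
    """Lasair side: take the returned annotations back in, keyed by object."""
    for annotation in annotations:
        store[annotation["objectId"]] = annotation


def toy_classifier(candidate):
    """Made-up rule purely for illustration."""
    return "bright" if candidate["mag"] < 18.0 else "faint"


if __name__ == "__main__":
    stream = queue.Queue()
    store = {}
    publish_candidates(stream, [
        {"objectId": "obj-001", "mag": 17.2},
        {"objectId": "obj-002", "mag": 19.5},
    ])
    ingest_annotations(store, run_annotator(stream, toy_classifier))
    print(store)
```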

Databases and storage: SQL, Cassandra, CephFS (Ken Smith)

  • Nic Wolf (Antares team) joined the session to accompany Ken.

  • Galera/MariaDB/Cassandra in Lasair (need slides from Ken - messaged him)

  • DB dump is not sustainable in the future due to sheer number of rows etc. Galera offers replication where all nodes are equal. Reading/writing tasks can be distributed to the different nodes.

  • Galera is well integrated with MariaDB but not so well with MySQL, hence the MariaDB choice.

  • Many of the Galera tools are free including the cluster control interface.

  • Detections originally stored as files on CephFS but worried this will not scale.

  • Cassandra is better at writes than reads. Widely adopted. Other brokers use it, e.g. Antares, as does core LSST (see DMTN-184).

  • NoSQL - not only SQL according to Cassandra authors.

  • Have been operating since Feb 21 hence the problem with light curves (see Matt Nicholl talk) prior to this. This will be solved.

  • Decided to use Cassandra since over the 10 years there will be 30 billion by the end. Relational DB clunky at that level.

  • For Cassandra no advantage of SSD over spinning disk so may devote the SSD to the relational DB.

...

  • (need slides - messaged him)

  • Lasair team has introduced REST APIs using Django REST Framework (looks to be the de facto standard)

  • Effectively, these are machine-readable versions of functions provided (interactive) on webpages

    • /api/cone -

    • /api/query

  • Python wrapper “lasair” (available via pip install) to help use the API.

  • Plan to add support for querying Cassandra directly

  • Also have Jupyter Notebook examples, hosted on Google Colab

    • Need user account on Lasair to access

    • Ken provided live demo of cone-search notebook.

  • API throttled, based on different levels of token

    • Action taken in response to a user who was submitting thousands of queries and putting strain on the service (now using the more efficient watch-list approach).

      • Anonymous use limited to 10 calls per hour, 1000 rows per query, …

  • Eric B Q on zoom chat - “ how are you handling auth for the public kafka service?”
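
The token-based throttling noted above could look something like the sliding-window sketch below. It is an illustration under stated assumptions, not the Lasair implementation: only the anonymous figure of 10 calls per hour comes from the notes, while the "registered" tier, its limit, and the class and function names are hypothetical.

```python
import time
from collections import defaultdict, deque

# Only the anonymous limit is taken from the notes; the other tier is hypothetical.
CALLS_PER_HOUR = {"anonymous": 10, "registered": 100}


class SlidingWindowThrottle:
    """Allow at most limits[tier] calls per token within a rolling window."""

    def __init__(self, limits, window_seconds=3600.0, clock=time.monotonic):
        self.limits = limits
        self.window = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.calls = defaultdict(deque)  # token -> timestamps of recent calls

    def allow(self, token, tier):
        """Record the call and return True, or return False if over the limit."""
        now = self.clock()
        recent = self.calls[token]
        # Forget calls that have dropped out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limits[tier]:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    throttle = SlidingWindowThrottle(CALLS_PER_HOUR)
    results = [throttle.allow("no-token", "anonymous") for _ in range(12)]
    print(results)
```

A scheme like this rejects the excess calls rather than queueing them, which pushes heavy users towards bulk mechanisms such as the watch-list approach mentioned above.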

...

  • Different interfaces – need to prioritise

    • Webpage

    • Scripts (Python, primarily)

    • Other projects (website)

    • IDAC/RSP interface – opportunity in UK, as IDAC is next to Lasair broker, but needs requirement analysis and design work

    • Topcat – need TAP

    • Personal storage – e.g., MyDB, VOSpace, …

  • Eric asked how Kafka authentication is being handled.

    • Roy noted hope not overwhelmed with requests [for credentials]

  • Ken noted experience of writing TAP service, which could be useful (Guy Rickson and Thomas Marquat (sp?)).

  • Jakob asked if an annotation is the same as a classification

    • Roy noted one type of annotation, but other classifications were possible.

  • Julien asked about usage split between REST and Kafka. (Q in zoom chat “how is the usage split between the REST API and Kafka? as an example in Fink, most of users use the API, and very few Kafka.”)

    • Roy noted that people are very conservative and stick to interfaces they know.

    • Julien hopes, over time, people will become familiar with new technologies, which will have benefits for users.

    • Andy notes absence of user tools is a barrier to uptake of Kafka. Generally speaking Kafka and Cassandra are not targeted at end users: they are behind the science, meaning astronomy is unusual.

    • Dave noted size of data to be handled. Kafka can handle a stream which would cause APIs to fall over.

    • Ken noted it is not intended to expose Cassandra to end users. Vision is to have a wrapper (web page, for example) to hide Cassandra. PanSTARRS does something similar, for cone-search query.

1730: Discussion