Lasair Tech Review: Andy's version of key Qs

These are mostly Roy’s key questions, with some additions and a little rephrasing. Ideally we should turn each of these into a decision, an activity in Cycle 2, or something we punt downstream to Cycle 3 or later.

Testing
- Should we make our own simulated alert stream? No for now
Sherlock
- Should we incorporate classification by RAPID or similar?
  - No but make sure infrastructure can handle additional classifiers
- Can we get degree of confidence in classification? No for now
- Should we make it simpler to add new catalogues to Sherlock?
  - No, we should not design Sherlock 3.0 for users to add catalogues. It is not trivial; catalogues can be proposed to us for inclusion.
Science Driver Issues
- Should we engage Science Working Group or others in prioritisation of functionality?
  - E.g., should we ignore solar system objects in Lasair? (size..)
    - Claim: solar system (SS) could be low effort, high return. Nobody else doing it?
- Is there a science case for stellar transients in Lasair? (size..)
  - Survey the community: which science needs alerts and which needs data releases?
  - Stephen will do survey
- Should we have separate DBs for different science?
  - Understand how to have multiple databases/schema/website
- What science areas are other Community Brokers planning to address?
  - Dave Y and Stephen volunteered to try to find out.
Science Platform
- Use Firefly as GUI? Or just embody ideas? ADQL, form-based query
  - Not now but keep it in mind
- Should we incorporate Nublado into Lasair?
  - Yes, when it's ready. Small difference from JupyterHub. Will provide LSST images.
- Should we set up a TAP service for Lasair? Could put TOPCAT in front.
  - Yes as spinoff from LSP (Stelios)
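
For orientation, here is a minimal sketch of the kind of ADQL cone-search query a TAP client such as TOPCAT might send. The table name `objects` and the column names `ra` and `decl` are assumptions for illustration, not the actual Lasair schema, and `cone_search_adql` is a hypothetical helper.

```python
def cone_search_adql(table, ra_deg, dec_deg, radius_deg, limit=10):
    """Build an ADQL cone-search query string.

    Assumes the table has position columns named `ra` and `decl`
    (hypothetical names, in degrees, ICRS).
    """
    return (
        f"SELECT TOP {int(limit)} * FROM {table} "
        f"WHERE 1 = CONTAINS(POINT('ICRS', ra, decl), "
        f"CIRCLE('ICRS', {float(ra_deg)}, {float(dec_deg)}, {float(radius_deg)}))"
    )


if __name__ == "__main__":
    # 0.01 degree cone around (150, 2) on a hypothetical `objects` table
    print(cone_search_adql("objects", 150, 2, 0.01))
```

The string only illustrates the query shape; submitting it would need a real TAP endpoint, which is still to be decided.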
Queries
- Do we restrict queries, and if so how? Vizier-type form? Parse SQL like WSA, VSA?
  - Yes: Form as default, freeform for advanced users with review/optimise process
  - Must also consider API/Jupyter queries
  - Review and propose the new system (Gareth and Andy)
  - 3 layers: Form + SQL + Jupyter
- Do we offer filtering on the Kafka stream to users? Build queries in KSQL?
- If so, do we also keep the close connection between static and streaming queries?
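
As one possible starting point for the "freeform for advanced users" layer, the sketch below shows a deliberately conservative check that accepts only a single SELECT statement. It is an assumption about how restriction could work, not the WSA/VSA approach; the deny-list, the function name `is_allowed_query` and the table name `objects` are hypothetical, and a production system would more likely use a proper SQL parser.

```python
import re

# Hypothetical deny-list of keywords that modify data or schema.
FORBIDDEN = {"insert", "update", "delete", "drop", "alter", "create",
             "grant", "truncate", "replace"}


def is_allowed_query(sql: str) -> bool:
    """Return True only for a single SELECT statement with no forbidden keywords.

    Conservative by design: a forbidden word inside a string literal
    also causes rejection.
    """
    stripped = sql.strip().rstrip(";").strip()
    if not stripped or ";" in stripped:
        return False  # empty input, or more than one statement
    # Tokenise into lowercase identifiers/keywords (underscores kept together).
    words = re.findall(r"[a-z_]+", stripped.lower())
    if not words or words[0] != "select":
        return False
    return not any(word in FORBIDDEN for word in words)


if __name__ == "__main__":
    print(is_allowed_query("SELECT objectId FROM objects WHERE ncand > 5;"))
    print(is_allowed_query("SELECT 1; DROP TABLE objects"))
```

The same check could sit in front of API and Jupyter queries, so that all three layers share one gate before the review/optimise step.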
Relational database
- Are light curves in there? Is forced phot in there?
- Is there just one RDBMS or several? (see Sci Drivers)
  - Should Sources be kept in blob storage, leaving a lean relational database for Objects?
- Can we make a decision now on Cassandra or other NoSQL?
Hardware
- Do we plan to use SSDs or not?
- Can we get OpenStack nodes with SSDs? How many, how much?
Watchlists and user data
- Measure scalability of watchlists.
- How easy is it to add a new catalogue to Sherlock?
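
A minimal sketch of how watchlist scalability might be measured: time a brute-force cone match of random alert positions against watchlists of increasing size. This is an assumed baseline only, not how Lasair matches watchlists; all function names are hypothetical, and the random positions are uniform in RA and Dec rather than uniform on the sky.

```python
import math
import random
import time


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def match_watchlist(alert, watchlist, radius_deg):
    """Brute-force cone match: watchlist positions within radius_deg of the alert."""
    ra, dec = alert
    return [w for w in watchlist
            if angular_sep_deg(ra, dec, w[0], w[1]) <= radius_deg]


def time_matching(n_watch, n_alerts=50, radius_deg=0.01, seed=0):
    """Seconds taken to match n_alerts random alerts against n_watch random positions."""
    rng = random.Random(seed)

    def rand_pos():
        return (rng.uniform(0, 360), rng.uniform(-90, 90))

    watchlist = [rand_pos() for _ in range(n_watch)]
    alerts = [rand_pos() for _ in range(n_alerts)]
    start = time.perf_counter()
    for alert in alerts:
        match_watchlist(alert, watchlist, radius_deg)
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1_000, 5_000):
        print(n, f"{time_matching(n):.3f} s")
```

Brute force scales linearly with watchlist size per alert, so it gives a reference curve against which an indexed approach could be compared.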
Kafka Inside or not
- Should we base the architecture around Kafka rather than HTTP or scp?
- Should we make a test version with Kafka inside?
Blob storage
- Are images in there? Or just the whole Avro packet as received?
- Blob store implementation as CephFS or Swift or Cassandra?
Service resilience
- What are the changes we should make to improve resilience of Lasair?