
These are mostly Roy’s key questions, with some additions and a little rephrasing. Ideally we should turn each of these into a decision, an activity in Cycle 2, or something we punt downstream to Cycle 3 or later.

Testing
- Should we make our own simulated alert stream? No for now
Sherlock
- Should we incorporate classification by RAPID or similar?
  - No but make sure infrastructure can handle additional classifiers
- Can we get degree of confidence in classification? No for now
- Should we make it simpler to add new catalogues to Sherlock?
  - Yes but not urgent. Not so easy. Each catalog is handmade code
  - No, we should not design Sherlock 3.0 for users to add catalogues. It is not trivial; catalogues can be proposed to us for inclusion.
Science Driver Issues
- Should we engage the Science Working Group or others in prioritisation of functionality?
  - E.g., should we ignore solar system objects in Lasair? (size..)
    - Claim: solar system (SS) could be low effort, high return. Nobody else doing it?
- Is there a science case for stellar transients in Lasair? (size..)
  - Survey the community: what science is alerts and what is data releases?
  - Stephen will do survey
- Should we have separate DBs for different science?
  - Understand how to have multiple databases/schema/website
- What science areas are other Community Brokers planning to address?
  - Dave Y and Stephen volunteered to try to find out.
Science Platform
- Use Firefly as GUI? Or just embody ideas? ADQL, form based query
  - Not now but keep it in mind
- Should we incorporate Nublado into Lasair?
  - Yes, when it's ready. Small difference from JupyterHub. Will provide LSST images.
- Should we set up a TAP service for Lasair? Could put Topcat in front.
  - Yes as spinoff from LSP (Stelios)
Queries
- Do we restrict queries, and if so how? Vizier-type form? Parse SQL like WSA, VSA?
  - Yes: Form as default, freeform for advanced users with review/optimise process
  - Must also consider API/Jupyter queries
  - Review and propose the new system (Gareth and Andy)
  - 3 layers: Form + SQL + Jupyter
- Do we offer filtering on the Kafka stream to users? Build queries in KSQL?
- If so, do we also keep the close connection between static and streaming queries?
  - Yes if possible
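
The "form as default" layer could be sketched roughly as below: a small whitelist maps form fields to SQL conditions, so that users never type SQL unless they move to the freeform layer. This is only an illustration under stated assumptions. The `objects` table, its columns (`mag`, `ndet`), the form field names, and the use of SQLite as a stand-in database are all hypothetical and are not taken from the Lasair schema.

```python
import sqlite3

# Hypothetical whitelist: form field -> (column, SQL operator).
# Only these fields can ever reach the database.
ALLOWED_FILTERS = {
    "min_mag": ("mag", ">="),
    "max_mag": ("mag", "<="),
    "min_ndet": ("ndet", ">="),
}


def build_query(form, limit=100):
    """Turn form fields into a parameterised SELECT on a hypothetical objects table."""
    clauses, params = [], []
    for field, value in form.items():
        if field not in ALLOWED_FILTERS:
            raise ValueError(f"unsupported form field: {field}")
        column, op = ALLOWED_FILTERS[field]
        clauses.append(f"{column} {op} ?")  # values are bound, never interpolated
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1"
    sql = f"SELECT object_id FROM objects WHERE {where} ORDER BY object_id LIMIT ?"
    params.append(int(limit))
    return sql, params


def run_demo():
    """Run one form query against a tiny in-memory table (illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (object_id TEXT, mag REAL, ndet INTEGER)")
    conn.executemany(
        "INSERT INTO objects VALUES (?, ?, ?)",
        [("obj1", 17.5, 12), ("obj2", 19.2, 3), ("obj3", 18.1, 8)],
    )
    sql, params = build_query({"max_mag": 18.5, "min_ndet": 5})
    return [row[0] for row in conn.execute(sql, params)]
```

Because the form layer always produces a bounded, parameterised query, only the freeform SQL layer would need the review/optimise process mentioned above.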
Relational database
- Are light curves in there? Is forced photometry in there?
- Should Sources be kept in blob storage, leaving a lean relational database for Objects?
  - Ken to build a “blob store” with NoSQL and compare with CephFS
  - Need to build a data-mining API to this store, with a query mechanism
- Is there just one RDBMS or several? (see Science Driver Issues)
- Can we make a decision now on Cassandra or other NoSQL?
  - Not for relational database
- Survey science group for features of light curves
- Build set of representative queries that need to be fast with testrig
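
A test rig for representative queries might start out as something like the sketch below: a named set of queries, each run a few times, with the best wall-clock time recorded. Everything specific here is an assumption made for illustration: SQLite stands in for the real RDBMS, and the `objects` table, its columns, and the three example queries are hypothetical rather than an agreed list.

```python
import sqlite3
import time


def time_queries(conn, queries, repeats=3):
    """Return the best wall-clock time in seconds for each named query."""
    results = {}
    for name, sql in queries.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            conn.execute(sql).fetchall()
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def build_demo_db(n_rows=10000):
    """Fill an in-memory stand-in database with synthetic rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects "
        "(object_id INTEGER PRIMARY KEY, ra REAL, decl REAL, ndet INTEGER)"
    )
    rows = ((i, (i * 0.036) % 360.0, (i % 180) - 90.0, i % 50) for i in range(n_rows))
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)
    return conn


# Hypothetical "representative" queries; the real set would come from the science survey.
QUERIES = {
    "by_id": "SELECT * FROM objects WHERE object_id = 1234",
    "sky_box": "SELECT object_id FROM objects "
               "WHERE ra BETWEEN 10 AND 11 AND decl BETWEEN -5 AND 5",
    "many_detections": "SELECT COUNT(*) FROM objects WHERE ndet > 40",
}

if __name__ == "__main__":
    for name, seconds in time_queries(build_demo_db(), QUERIES).items():
        print(f"{name}: {seconds * 1000:.2f} ms")
```

Keeping the query set as named entries makes it easy to rerun the same list against different schemas or hardware and compare the numbers.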
Hardware
- Do we plan to use SSDs or not?
  - For relational db, not for blob store
- Can we get OpenStack nodes with SSD? How many, how much? Or even spinning disk?
  - Tell Mark, George about implementation plans (Gareth)
  - IRIS/Cambridge doing this for SKA
Watchlists and user data
- Measure scalability of watchlists (Roy)
- Subsets of the Million Quasar catalog from Vizier, with source radius = 2 arcsec, matched against 2 million objects (instructions)
  - 50 sources, upload 0.3 sec, crossmatch 1 sec
  - 1000 sources, upload 2 sec, crossmatch 4 sec
  - 10000 sources, upload 19 sec, crossmatch 47 sec
  - 100000 sources, upload 270 sec, crossmatch 530 sec
  - 1000000 sources, browser cannot paste it
- How easy is it to add a new catalog to Sherlock?
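
To read the watchlist timings above as a scaling measurement, it helps to convert them to cost per source. The short sketch below does that arithmetic for the 1000, 10000 and 100000 source runs, using only the numbers recorded in these notes. The function name is made up for this illustration.

```python
# Timings copied from the notes: (sources, upload seconds, crossmatch seconds).
TIMINGS = [
    (1_000, 2, 4),
    (10_000, 19, 47),
    (100_000, 270, 530),
]


def per_source_ms(timings):
    """Return (sources, upload ms per source, crossmatch ms per source) for each row."""
    return [(n, 1000.0 * up / n, 1000.0 * xm / n) for n, up, xm in timings]


for n, up_ms, xm_ms in per_source_ms(TIMINGS):
    print(f"{n:>7} sources: upload {up_ms:.1f} ms/source, crossmatch {xm_ms:.1f} ms/source")
```

On these three runs the crossmatch cost rises from 4.0 to 5.3 ms per source and the upload cost from about 2 to 2.7 ms per source, so both are close to linear in watchlist size, with some extra overhead at the largest size.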
“Kafka Inside” or not
- Should we base architecture around Kafka vs http, scp?
- Should we make a test version with Kafka inside?
  - Yes. Two versions of Sherlock server. DaveM will do Kafka version. Roy will do http version.
  - DaveM, Gareth, Roy will meet for technical meeting
Blob storage
- Are images in there? Or just the whole AVRO packet as received?
- Blob store implementation as CephFS or SWIFT or Cassandra?
Service resilience
- What are the changes we should make to improve resilience of Lasair?
  - LSST Targets for rebuild from backup
  - Resilience plan will be in Community Broker proposal
    - Database replication and hot spare
    - Easy re-deployment by containers and Kubernetes
