33 queries

– Now reduced to 18 queries by removing what seem to be repeats.

– Removed ORDER BY from all queries. In a realtime pipeline the order is always time.

– Added LIMIT 10 where necessary, to all queries for the purposes of timing.

– Many queries have pointless joins with the candidates table that can be done without (see below).

– Those that have “real” joins with the candidates I have approximated as object queries. A good example is looking for lightcurves that have detections in a specific narrow time interval.

– Number 4 is asking for lightcurves of 47 objects, which is more suitable for jupyter/API interface

– Should we simply make subqueries impossible, as it is so easy to abuse?

  • 1-- query on objects
    SELECT objectId,ncand
    FROM objects
    WHERE ncand > 20
    AND (latestgmag-maggmean)>2 LIMIT 10

    • Executed in 0.01 seconds

  • 2-- specific object
    SELECT objects.ncand, objects.ramean, objects.decmean, objects.jdmin, objects.maggmean,candidates.magpsf ,candidates.fid
    FROM candidates, objects
    WHERE objects.objectId = candidates.objectId
    AND candidates.objectId ="ZTF17aabdaaq"

    • Executed in 0.03 seconds

  • 3-- specific object
    SELECT *
    FROM objects objectId
    WHERE objectId="ZTF18acmzpbf"

    • Executed in 0.00 seconds

  • 4-- triple join with list of objectIds
    SELECT objects.objectId, candidates.jd - 2400000.5 AS ago, candidates.magpsf, candidates.sigmapsf, candidates.fid
    FROM candidates, objects, sherlock_classifications
    WHERE objects.objectId = candidates.objectId
    AND objects.primaryId = sherlock_classifications.transient_object_id
    AND objects.objectId in ('ZTF18aavqmxr', 'ZTF18aaiugdp', 'ZTF18aaxddex', 'ZTF18aawcbyu', 'ZTF18aavlgby', 'ZTF18aavfhlv', 'ZTF18aainceg', 'ZTF18aaveejk', 'ZTF18aauesbd', 'ZTF18aavelyy', 'ZTF18aavqlnk', 'ZTF18aavefdz', 'ZTF18aasnjav', 'ZTF18aaxjkcm', 'ZTF18aaudxvh', 'ZTF18aauqlxd', 'ZTF18aaxnbfs', 'ZTF18aaxpkty', 'ZTF18aaxumvv', 'ZTF18aaxvbtx', 'ZTF17aadsntf', 'ZTF18aayeepw', 'ZTF18aavghok', 'ZTF18aayotqj', 'ZTF18aaywedv', 'ZTF18aaymybb', 'ZTF18aazmegj', 'ZTF17aabumwe', 'ZTF18aazvjuy', 'ZTF18aazwddf', 'ZTF17aabupib', 'ZTF18aazwdak', 'ZTF17aaapvpp', 'ZTF17aaarqox', 'ZTF18abbmbys', 'ZTF18abcbqxj', 'ZTF18abcbvxc', 'ZTF18abckoxz', 'ZTF18abctttf', 'ZTF18abehhbp', 'ZTF17aabuxdi', 'ZTF18abjhbss', 'ZTF18abjlwzq', 'ZTF18absrlhv', 'ZTF18abbmenz', 'ZTF18acbwaxk', 'ZTF19aaqtpet')

    • Wants the light curves of all these objects. Takes more than 5 minutes. Is not using any information from the Sherlock table even though it is joined in. How about a lightcurve fetch in Jupyter:

    • objects = ['ZTF18aavqmxr', 'ZTF18aaiugdp', 'ZTF18aaxddex', 'ZTF18aawcbyu', 'ZTF18aavlgby', 'ZTF18aavfhlv', 'ZTF18aainceg', 'ZTF18aaveejk', 'ZTF18aauesbd', 'ZTF18aavelyy', 'ZTF18aavqlnk', 'ZTF18aavefdz', 'ZTF18aasnjav', 'ZTF18aaxjkcm', 'ZTF18aaudxvh', 'ZTF18aauqlxd', 'ZTF18aaxnbfs', 'ZTF18aaxpkty', 'ZTF18aaxumvv', 'ZTF18aaxvbtx', 'ZTF17aadsntf', 'ZTF18aayeepw', 'ZTF18aavghok', 'ZTF18aayotqj', 'ZTF18aaywedv', 'ZTF18aaymybb', 'ZTF18aazmegj', 'ZTF17aabumwe', 'ZTF18aazvjuy', 'ZTF18aazwddf', 'ZTF17aabupib', 'ZTF18aazwdak', 'ZTF17aaapvpp', 'ZTF17aaarqox', 'ZTF18abbmbys', 'ZTF18abcbqxj', 'ZTF18abcbvxc', 'ZTF18abckoxz', 'ZTF18abctttf', 'ZTF18abehhbp', 'ZTF17aabuxdi', 'ZTF18abjhbss', 'ZTF18abjlwzq', 'ZTF18absrlhv', 'ZTF18abbmenz', 'ZTF18acbwaxk', 'ZTF19aaqtpet']
      for all object in objects: print(lightcurve(object))

  • 5-- objects AND sherlock
    SELECT objects.objectId, objects.ramean, objects.decmean, objects.jdmin - 2400000.5 AS mjdmin, objects.jdmax - 2400000.5 AS mjdmax, objects.magrmin, latestrmag, sherlock_classifications.classification
    FROM objects, sherlock_classifications
    WHERE objects.primaryId = sherlock_classifications.transient_object_id
    AND objects.objectid='ZTF19abhpvfq'

    • executed in 0.03 seconds

  • 6-- query on candidates with specific position and time
    SELECT candid, mjd, ra, decl, magpsf, fid
    FROM candidates c
    WHERE c.ra BETWEEN 191.8212890625 AND 199.9951171875
    AND c.decl BETWEEN -27.826574830853872
    AND -13.82353192939064
    AND c.jd BETWEEN 2458581.5 AND 2458588.5
    AND c.drb >= 0.75

    • executed in 72 seconds, empty set

    • This alternate objects query gets the empty set in 16 seconds:

    • SELECT objectId, ramean, decmean
      FROM objects
      WHERE objects.ramean BETWEEN 191.8212890625 AND 199.9951171875
      AND objects.decmean BETWEEN -27.826574830853872 AND -13.82353192939064
      AND objects.jdmin < 2458581.5 AND objects.jdmax > 2458588.5
      AND objects.ncandgp > 2

Empty set (15.91 sec)

  • 7-- candidates and objects joined
    SELECT DISTINCT o.objectId
    FROM objects o, candidates c
    WHERE o.sherlock_classification IN ('NT')
    AND o.jdmin > JDNOW() - 100
    AND o.jdmin < JDNOW() - 0
    AND o.ncand > 3
    AND c.objectId = o.objectId
    AND (c.jd > JDNOW() - 100AND c.jd < JDNOW() - 0)
    AND c.magpsf < 20 AND c.drb >= 0.75 LIMIT 10

    • Takes more than 5 minutes. How about this instead which is 4 seconds

    • SELECT o.objectId FROM objects o WHERE o.sherlock_classification IN ('NT')
      AND o.jdmin > JDNOW() - 100 AND o.jdmin < JDNOW() - 0
      AND o.ncandgp > 3
      AND (o.maggmin < 20 OR o.maggmin < 20) LIMIT 10

  • 8-- objects and candidates joined
    SELECT objects.objectId, comments.content
    FROM objects, comments
    WHERE objects.objectId = comments.objectId
    AND comments.content LIKE '% iPTF16geu%'

    • Empty set in 0.05 seconds. Because the comments from Lasair Bot are TNS object names not PTF object names.

  • 9-- distinct objects AND candidates join
    SELECT DISTINCT o.objectId, o.ramean, o.decmean, o.jdmin - 2400000.5 AS mjdmin, o.jdmax - 2400000.5 AS mjdmax, o.magrmin, latestrmag, sherlock_classification,
    IF(c.distpsnr1 < 2 AND c.sgscore1 > 0.49, 'Within 2arcsec of PS1 star', 'Not Near PS1 star') score
    FROM objects o, candidates c
    WHERE o.sherlock_classification NOT IN ( 'AGN', 'BS')AND o.jdmin > JDNOW() - 4
    AND o.ncand > 3 AND c.objectId = o.objectId AND (c.jd > JDNOW() - 4) AND c.magpsf < 20 AND c.drb >= 0.75

    • executed in 7 seconds

  • 10-- TNS AND objects
    SELECT objectId,latestrmag, latestgmag,jdnow()-jdmax as delta, content
    FROM objects NATURAL JOIN comments
    WHERE content LIKE '%TNS%'
    AND jdmin > JDNOW() - 30 AND jdmin < JDNOW() - 0
    AND (jdmin-jdmax>1 or jdmin-jdmax<-1)
    AND ncand > 2
    AND (maggmin>latestgmag-0.3 OR magrmin>latestrmag-0.3)
    AND (latestgmag<17.2 or latestrmag<17.2)
    AND DECMEAN>-10
    AND (sherlock_classification='SN'or sherlock_classification='ORPHAN')

    • executed in 0.5 seconds

  • 11-- TNS AND objects
    SELECT objects.objectId, objects.latestrmag, objects.latestgmag, objects.jdmax - jdnow() AS delta, objects.sherlock_classification, comments.content
    FROM objects, comments
    WHERE objects.objectId = comments.objectId AND comments.content LIKE '%TNS%' LIMIT 10

    • executed in 0.0 seconds

  • 12-- subquery
    SELECT * FROM objects
    WHERE ncand >= 5 AND objectId in
    (SELECT objectId FROM candidates WHERE drb >= 0.75 AND sgscore1 < 0.4 AND jdstarthist > 2458563.7)

    • Takes a long time because of the massive subquery! Burns up temp storgae on database node!

    • This query does the same and runs in 2.4 seconds:

    • SELECT * FROM objects
      WHERE objects.ncandgp >= 5 AND objects.sgscore1 < 0.4 and objects.jdmin > 2458563.7
      LIMIT 10

  • 13-- subquery
    SELECT distinct objectId, ramean, decmean, mjdmin, mjdmax, magrmin, latestrmag, sherlock_classification, ncand
    FROM
    (SELECT o.objectId, c.jd as mjd, c.fid, c.magpsf, o.jdmin as mjdmin, jdmax as mjdmax,
    o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification,
    o.ramean, o.decmean, o.ncand FROM objects o NATURAL JOIN candidates c
    WHERE c.magpsf < 21
    AND c.jd > JDNOW() - 20
    AND o.jdmin > JDNOW() - 20
    AND o.jdmax > JDNOW() - 3
    AND o.ncand >= 3
    AND abs(glatmean) > 10.0
    AND o.sherlock_classification != 'VS'
    AND o.sherlock_classification != 'AGN'
    AND o.sherlock_classification != 'CV'
    AND o.sherlock_classification != 'BS'
    AND o.sherlock_classification != 'NT'
    AND c.rb >= 0.4 AND c.nbad = 0 AND c.fwhm <= 5 AND c.elong <= 1.2 AND abs(c.magdiff) <= 0.2 AND c.isdiffpos = 't'
    AND c.distnr > 5 ) temp ORDER BY ncand asc, mjdmax desc, mjdmin desc, magrmin desc

    • Even the subquery takes more than 5 minutes.

    • I wonder about the distnr condition, distnr is “distance to nearest source in reference image PSF-catalog within 30 arcsec [pixels]”

    • How about this instead which takes 24 seconds:

    • SELECT o.objectId, o.jdmin as mjdmin, jdmax as mjdmax,
      o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification,
      o.ramean, o.decmean, o.ncand FROM objects o
      WHERE (o.maggmin < 21 OR o.magrmin < 21)
      AND o.jdmin > JDNOW() - 20 AND o.jdmax > JDNOW() - 3
      AND o.ncandgp >= 3 AND abs(o.glatmean) > 10.0
      AND o.sherlock_classification != 'VS'
      AND o.sherlock_classification != 'AGN'
      AND o.sherlock_classification != 'CV'
      AND o.sherlock_classification != 'BS'
      AND o.sherlock_classification != 'NT'
      LIMIT 10

  • 14-- subquery
    SELECT distinct objectId, ramean, decmean, mjdmin, mjdmax, magrmin, latestrmag, sherlock_classification, ncand,separationArcsec
    FROM
    (SELECT o.objectId, c.jd as mjd, c.fid, c.magpsf, o.jdmin as mjdmin, jdmax as mjdmax,
    o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification,
    o.ramean, o.decmean, o.ncand, sc.separationArcsec FROM sherlock_crossmatches sc
    JOIN objects o ON sc.transient_object_id = o.primaryid NATURAL JOIN candidates c
    WHERE c.magpsf < 21
    AND c.jd > JDNOW() - 20
    AND o.jdmin > JDNOW() - 20
    AND o.jdmax > JDNOW() - 5
    AND o.ncand >= 2
    AND o.sherlock_classification NOT IN ('VS' , 'AGN', 'CV', 'BS','NT')
    AND c.rb >= 0.4 AND c.nbad = 0 AND c.fwhm <= 5 AND c.elong <= 1.2 AND abs(c.magdiff) <= 0.
    AND c.isdiffpos = 't'
    AND (sc.rank = 1 OR sc.rank=2)
    AND sc.catalogue_object_type = 'galaxy'
    AND o.sherlock_classification = 'SN'
    AND sc.separationArcsec > 0.5
    AND sc.separationArcsec < 100 ) temp

    • Why does this need to be in a subquery??

    • How about this one that takes 10 seconds:

    • SELECT o.objectId, o.jdmin as mjdmin, o.jdmax as mjdmax, o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification, o.ramean, o.decmean, o.ncand, sc.separationArcsec
      FROM sherlock_crossmatches sc JOIN objects o ON sc.transient_object_id = o.primaryid
      WHERE (o.maggmin < 21 OR o.magrmin < 21) AND o.jdmin > JDNOW() - 20 AND o.jdmax > JDNOW() - 5 AND o.ncandgp >= 2
      AND o.sherlock_classification NOT IN ('VS' , 'AGN', 'CV', 'BS','NT')
      AND (sc.rank = 1 OR sc.rank=2) AND sc.catalogue_object_type = 'galaxy'
      AND o.sherlock_classification = 'SN' AND sc.separationArcsec > 0.5 AND sc.separationArcsec < 100 LIMIT 10

  • 15-- triple join
    SELECT objects.objectId, jdnow()-jdmax as delta, comments.content
    FROM objects, comments, sherlock_classifications
    WHERE objects.objectId = comments.objectId AND objects.primaryId = sherlock_classifications.transient_object_id
    AND objects.jdmin>jdnow()-30 AND objects.jdmin<jdnow()-0
    AND (objects.jdmin-objects.jdmax>0.1 or objects.jdmin-objects.jdmax<-0.1)
    AND ncand>2 AND ncand<20 AND (maggmin>latestgmag-0.5 or magrmin>latestrmag-0.5) AND (latestgmag<17.3 or latestrmag<17.3)
    AND comments.content LIKE '%TNS%' AND comments.content LIKE '%AT20%'AND decmean>-20
    AND sherlock_classifications.classification NOT IN ("VS" , "AGN", "CV", "BS", "NT")

    • Executes in 1.6 seconds

  • 16-- quadruple join
    SELECT distinct objects.objectId, ramean, decmean, jdmin-2400000.5 AS mjdmin, jdmax-2400000.5 AS mjdmax, magrmin, latestrmag, sherlock_classifications.classification, ncand,
    (sherlock_crossmatches.distance*1000)TAN((sherlock_crossmatches.separationArcsec/3600)(PI()/180)) as sep
    FROM candidates, objects, sherlock_classifications, sherlock_crossmatches
    WHERE objects.objectId = candidates.objectId
    AND objects.primaryId = sherlock_classifications.transient_object_id
    AND objects.primaryId = sherlock_crossmatches.transient_object_id
    AND objects.jdmax > JDNOW() - 15.00000
    AND candidates.jd > JDNOW() - 15.00000
    AND candidates.magpsf < 21.5
    AND objects.ncand >= 2 AND objects.ncand < 5
    AND objects.latestgmag - objects.latestrmag < 0.2
    AND sherlock_classifications.classification NOT IN ("VS" , "AGN", "CV", "BS")
    AND candidates.drb >= 0.6
    AND objects.latestgmag - candidates.sgmag1 < 0.5
    AND candidates.isdiffpos = "t"
    AND (sherlock_crossmatches.rank = 1)
    AND sherlock_crossmatches.catalogue_object_type = "galaxy"
    AND sherlock_classifications.classification = "SN"

    • Executes in 43 seconds

    • This version avoids the join with the candidates table and takes only 0.7 seconds:

    • SELECT distinct objects.objectId, ramean, decmean, jdmin-2400000.5 AS mjdmin, jdmax-2400000.5 AS mjdmax, magrmin, latestrmag, sherlock_classifications.classification, ncand,
      (sherlock_crossmatches.distance*1000)TAN((sherlock_crossmatches.separationArcsec/3600)(PI()/180)) as sep
      FROM objects, sherlock_classifications, sherlock_crossmatches
      WHERE objects.primaryId = sherlock_classifications.transient_object_id
      AND objects.primaryId = sherlock_crossmatches.transient_object_id
      AND objects.jdmax > JDNOW() - 15.00000
      AND (objects.maggmin < 21 OR objects.magrmin < 21)
      AND objects.ncandgp >= 2 AND objects.ncand < 5
      AND objects.latestgmag - objects.latestrmag < 0.2
      AND sherlock_classifications.classification NOT IN ("VS" , "AGN", "CV", "BS")
      AND objects.latestgmag - objects.sgmag1 < 0.5
      AND (sherlock_crossmatches.rank = 1)
      AND sherlock_crossmatches.catalogue_object_type = "galaxy"
      AND sherlock_classifications.classification = "SN"

  • 17-- watchlist using sql
    SELECT jd,ra,decl FROM candidates
    WHERE candidates.jd > JDNOW() - 10.00000
    AND ( ((ra BETWEEN 190.44 AND 198.67) AND (decl BETWEEN -78.32 AND -76.49))
    OR ((ra BETWEEN 241.49 AND 243.95) AND (decl BETWEEN -40.41 AND -36.83))
    OR ((ra BETWEEN 239.60 AND 241.14) AND (decl BETWEEN -43.12 AND -40.71))
    OR ((ra BETWEEN 233.92 AND 237.04) AND (decl BETWEEN -35.91 AND -32.10))
    OR ((ra BETWEEN 190 AND 244 ) AND (decl BETWEEN -79 AND -32))
    OR ((ra BETWEEN 250.49 AND 251.90) AND (decl BETWEEN -22.89 AND -20.24))
    OR ((ra BETWEEN 244.63 AND 251.85) AND (decl BETWEEN -2614 AND -22.09))
    OR ((ra BETWEEN 244 AND 252 ) AND (decl BETWEEN -27 AND -20))
    OR ((ra BETWEEN 82.99 AND 86.21 ) AND (decl BETWEEN -10.22 AND -4.57))
    OR ((ra BETWEEN 85.06 AND 87.16 ) AND (decl BETWEEN -2.66 AND 0.93))
    OR ((ra BETWEEN 82 AND 88 ) AND (decl BETWEEN -11 AND 1))
    OR ((ra BETWEEN 50.66 AND 57.26 ) AND (decl BETWEEN 29.62 AND 33.16))
    OR ((ra BETWEEN 276.80 AND 277.84) AND (decl BETWEEN -0.55 AND 1.86))
    OR ((ra BETWEEN 61.85 AND 73.27 ) AND (decl BETWEEN 21.00 AND 30.02)))

    • Executes in 14 seconds

  • 18-- noncandidates
    SELECT objects.objectId, objects.ramean, objects.decmean, objects.jdmin - 2400000.5 AS mjdmin, objects.jdmax - 2400000.5 AS mjdmax, objects.magrmin, latestrmag, sherlock_classifications.classification,
    IF(candidates.distpsnr1 < 2 AND candidates.sgscore1 > 0.49, "Within 2 arcsec of PS1 star", "Not Near PS1 star") score
    FROM candidates, noncandidates, objects, sherlock_classifications
    WHERE objects.objectId = candidates.objectId
    AND objects.objectId = noncandidates.objectId
    AND objects.primaryId = sherlock_classifications.transient_object_id

    • Takes many minutes and does not finish … does not actually need either candidates or noncandidates table, this executes in 0.0 seconds:

  • SELECT objects.objectId, objects.ramean, objects.decmean, objects.jdmin - 2400000.5 AS mjdmin, objects.jdmax - 2400000.5 AS mjdmax, objects.magrmin, latestrmag, sherlock_classifications.classification,
    IF(objects.distpsnr1 < 2 AND objects.sgscore1 > 0.49, "Within 2 arcsec of PS1 star", "Not Near PS1 star") score
    FROM objects, sherlock_classifications
    WHERE objects.primaryId = sherlock_classifications.transient_object_id
    LIMIT 10

If you require this document in an alternative format, please contact the LSST:UK Project Managers lusc_pm@mlist.is.ed.ac.uk or phone +44 131 651 3577