33 queries
– Now reduced to 18 queries by removing what seem to be repeats.
– Removed ORDER BY from all queries. In a realtime pipeline the order is always time.
– Added LIMIT 10 where necessary, to all queries for the purposes of timing.
– Many queries have pointless joins with the candidates table that can be done without (see below).
– Those that have “real” joins with the candidates I have approximated as object queries. A good example is looking for lightcurves that have detections in a specific narrow time interval.
– Number 4 is asking for lightcurves of 47 objects, which is more suitable for jupyter/API interface
– Should we simply make subqueries impossible, as it is so easy to abuse?
1-- query on objects
SELECT objectId,ncand
FROM objects
WHERE ncand > 20
AND (latestgmag-maggmean)>2 LIMIT 10
Executed in 0.01 seconds
2-- specific object
SELECT objects.ncand, objects.ramean, objects.decmean, objects.jdmin, objects.maggmean,candidates.magpsf ,candidates.fid
FROM candidates, objects
WHERE objects.objectId = candidates.objectId
AND candidates.objectId ="ZTF17aabdaaq"
Executed in 0.03 seconds
3-- specific object
SELECT *
FROM objects objectId
WHERE objectId="ZTF18acmzpbf"
Executed in 0.00 seconds
4-- triple join with list of objectIds
SELECT objects.objectId, candidates.jd - 2400000.5 AS ago, candidates.magpsf, candidates.sigmapsf, candidates.fid
FROM candidates, objects, sherlock_classifications
WHERE objects.objectId = candidates.objectId
AND objects.primaryId = sherlock_classifications.transient_object_id
AND objects.objectId in ('ZTF18aavqmxr', 'ZTF18aaiugdp', 'ZTF18aaxddex', 'ZTF18aawcbyu', 'ZTF18aavlgby', 'ZTF18aavfhlv', 'ZTF18aainceg', 'ZTF18aaveejk', 'ZTF18aauesbd', 'ZTF18aavelyy', 'ZTF18aavqlnk', 'ZTF18aavefdz', 'ZTF18aasnjav', 'ZTF18aaxjkcm', 'ZTF18aaudxvh', 'ZTF18aauqlxd', 'ZTF18aaxnbfs', 'ZTF18aaxpkty', 'ZTF18aaxumvv', 'ZTF18aaxvbtx', 'ZTF17aadsntf', 'ZTF18aayeepw', 'ZTF18aavghok', 'ZTF18aayotqj', 'ZTF18aaywedv', 'ZTF18aaymybb', 'ZTF18aazmegj', 'ZTF17aabumwe', 'ZTF18aazvjuy', 'ZTF18aazwddf', 'ZTF17aabupib', 'ZTF18aazwdak', 'ZTF17aaapvpp', 'ZTF17aaarqox', 'ZTF18abbmbys', 'ZTF18abcbqxj', 'ZTF18abcbvxc', 'ZTF18abckoxz', 'ZTF18abctttf', 'ZTF18abehhbp', 'ZTF17aabuxdi', 'ZTF18abjhbss', 'ZTF18abjlwzq', 'ZTF18absrlhv', 'ZTF18abbmenz', 'ZTF18acbwaxk', 'ZTF19aaqtpet')
Wants the light curves of all these objects. Takes more than 5 minutes. Is not using any information from the Sherlock table even though it is joined in. How about a lightcurve fetch in Jupyter:
objects = ['ZTF18aavqmxr', 'ZTF18aaiugdp', 'ZTF18aaxddex', 'ZTF18aawcbyu', 'ZTF18aavlgby', 'ZTF18aavfhlv', 'ZTF18aainceg', 'ZTF18aaveejk', 'ZTF18aauesbd', 'ZTF18aavelyy', 'ZTF18aavqlnk', 'ZTF18aavefdz', 'ZTF18aasnjav', 'ZTF18aaxjkcm', 'ZTF18aaudxvh', 'ZTF18aauqlxd', 'ZTF18aaxnbfs', 'ZTF18aaxpkty', 'ZTF18aaxumvv', 'ZTF18aaxvbtx', 'ZTF17aadsntf', 'ZTF18aayeepw', 'ZTF18aavghok', 'ZTF18aayotqj', 'ZTF18aaywedv', 'ZTF18aaymybb', 'ZTF18aazmegj', 'ZTF17aabumwe', 'ZTF18aazvjuy', 'ZTF18aazwddf', 'ZTF17aabupib', 'ZTF18aazwdak', 'ZTF17aaapvpp', 'ZTF17aaarqox', 'ZTF18abbmbys', 'ZTF18abcbqxj', 'ZTF18abcbvxc', 'ZTF18abckoxz', 'ZTF18abctttf', 'ZTF18abehhbp', 'ZTF17aabuxdi', 'ZTF18abjhbss', 'ZTF18abjlwzq', 'ZTF18absrlhv', 'ZTF18abbmenz', 'ZTF18acbwaxk', 'ZTF19aaqtpet']
for all object in objects: print(lightcurve(object))
5-- objects AND sherlock
SELECT objects.objectId, objects.ramean, objects.decmean, objects.jdmin - 2400000.5 AS mjdmin, objects.jdmax - 2400000.5 AS mjdmax, objects.magrmin, latestrmag, sherlock_classifications.classification
FROM objects, sherlock_classifications
WHERE objects.primaryId = sherlock_classifications.transient_object_id
AND objects.objectid='ZTF19abhpvfq'
executed in 0.03 seconds
6-- query on candidates with specific position and time
SELECT candid, mjd, ra, decl, magpsf, fid
FROM candidates c
WHERE c.ra BETWEEN 191.8212890625 AND 199.9951171875
AND c.decl BETWEEN -27.826574830853872
AND -13.82353192939064
AND c.jd BETWEEN 2458581.5 AND 2458588.5
AND c.drb >= 0.75
executed in 72 seconds, empty set
This alternate objects query gets the empty set in 16 seconds:
SELECT objectId, ramean, decmean
FROM objects
WHERE objects.ramean BETWEEN 191.8212890625 AND 199.9951171875
AND objects.decmean BETWEEN -27.826574830853872 AND -13.82353192939064
AND objects.jdmin < 2458581.5 AND objects.jdmax > 2458588.5
AND objects.ncandgp > 2
Empty set (15.91 sec)
7-- candidates and objects joined
SELECT DISTINCT o.objectId
FROM objects o, candidates c
WHERE o.sherlock_classification IN ('NT')
AND o.jdmin > JDNOW() - 100
AND o.jdmin < JDNOW() - 0
AND o.ncand > 3
AND c.objectId = o.objectId
AND (c.jd > JDNOW() - 100AND c.jd < JDNOW() - 0)
AND c.magpsf < 20 AND c.drb >= 0.75 LIMIT 10
Takes more than 5 minutes. How about this instead which is 4 seconds
SELECT o.objectId FROM objects o WHERE o.sherlock_classification IN ('NT')
AND o.jdmin > JDNOW() - 100 AND o.jdmin < JDNOW() - 0
AND o.ncandgp > 3
AND (o.maggmin < 20 OR o.maggmin < 20) LIMIT 10
8-- objects and candidates joined
SELECT objects.objectId, comments.content
FROM objects, comments
WHERE objects.objectId = comments.objectId
AND comments.content LIKE '% iPTF16geu%'
Empty set in 0.05 seconds. Because the comments from Lasair Bot are TNS object names not PTF object names.
9-- distinct objects AND candidates join
SELECT DISTINCT o.objectId, o.ramean, o.decmean, o.jdmin - 2400000.5 AS mjdmin, o.jdmax - 2400000.5 AS mjdmax, o.magrmin, latestrmag, sherlock_classification,
IF(c.distpsnr1 < 2 AND c.sgscore1 > 0.49, 'Within 2arcsec of PS1 star', 'Not Near PS1 star') score
FROM objects o, candidates c
WHERE o.sherlock_classification NOT IN ( 'AGN', 'BS')AND o.jdmin > JDNOW() - 4
AND o.ncand > 3 AND c.objectId = o.objectId AND (c.jd > JDNOW() - 4) AND c.magpsf < 20 AND c.drb >= 0.75
executed in 7 seconds
10-- TNS AND objects
SELECT objectId,latestrmag, latestgmag,jdnow()-jdmax as delta, content
FROM objects NATURAL JOIN comments
WHERE content LIKE '%TNS%'
AND jdmin > JDNOW() - 30 AND jdmin < JDNOW() - 0
AND (jdmin-jdmax>1 or jdmin-jdmax<-1)
AND ncand > 2
AND (maggmin>latestgmag-0.3 OR magrmin>latestrmag-0.3)
AND (latestgmag<17.2 or latestrmag<17.2)
AND DECMEAN>-10
AND (sherlock_classification='SN'or sherlock_classification='ORPHAN')
executed in 0.5 seconds
11-- TNS AND objects
SELECT objects.objectId, objects.latestrmag, objects.latestgmag, objects.jdmax - jdnow() AS delta, objects.sherlock_classification, comments.content
FROM objects, comments
WHERE objects.objectId = comments.objectId AND comments.content LIKE '%TNS%' LIMIT 10
executed in 0.0 seconds
12-- subquery
SELECT * FROM objects
WHERE ncand >= 5 AND objectId in
(SELECT objectId FROM candidates WHERE drb >= 0.75 AND sgscore1 < 0.4 AND jdstarthist > 2458563.7)
Takes a long time because of the massive subquery! Burns up temp storgae on database node!
This query does the same and runs in 2.4 seconds:
SELECT * FROM objects
WHERE objects.ncandgp >= 5 AND objects.sgscore1 < 0.4 and objects.jdmin > 2458563.7
LIMIT 10
13-- subquery
SELECT distinct objectId, ramean, decmean, mjdmin, mjdmax, magrmin, latestrmag, sherlock_classification, ncand
FROM
(SELECT o.objectId, c.jd as mjd, c.fid, c.magpsf, o.jdmin as mjdmin, jdmax as mjdmax,
o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification,
o.ramean, o.decmean, o.ncand FROM objects o NATURAL JOIN candidates c
WHERE c.magpsf < 21
AND c.jd > JDNOW() - 20
AND o.jdmin > JDNOW() - 20
AND o.jdmax > JDNOW() - 3
AND o.ncand >= 3
AND abs(glatmean) > 10.0
AND o.sherlock_classification != 'VS'
AND o.sherlock_classification != 'AGN'
AND o.sherlock_classification != 'CV'
AND o.sherlock_classification != 'BS'
AND o.sherlock_classification != 'NT'
AND c.rb >= 0.4 AND c.nbad = 0 AND c.fwhm <= 5 AND c.elong <= 1.2 AND abs(c.magdiff) <= 0.2 AND c.isdiffpos = 't'
AND c.distnr > 5 ) temp ORDER BY ncand asc, mjdmax desc, mjdmin desc, magrmin desc
Even the subquery takes more than 5 minutes.
I wonder about the
distnr
condition,distnr
is “distance to nearest source in reference image PSF-catalog within 30 arcsec [pixels]”How about this instead which takes 24 seconds:
SELECT o.objectId, o.jdmin as mjdmin, jdmax as mjdmax,
o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification,
o.ramean, o.decmean, o.ncand FROM objects o
WHERE (o.maggmin < 21 OR o.magrmin < 21)
AND o.jdmin > JDNOW() - 20 AND o.jdmax > JDNOW() - 3
AND o.ncandgp >= 3 AND abs(o.glatmean) > 10.0
AND o.sherlock_classification != 'VS'
AND o.sherlock_classification != 'AGN'
AND o.sherlock_classification != 'CV'
AND o.sherlock_classification != 'BS'
AND o.sherlock_classification != 'NT'
LIMIT 10
14-- subquery
SELECT distinct objectId, ramean, decmean, mjdmin, mjdmax, magrmin, latestrmag, sherlock_classification, ncand,separationArcsec
FROM
(SELECT o.objectId, c.jd as mjd, c.fid, c.magpsf, o.jdmin as mjdmin, jdmax as mjdmax,
o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification,
o.ramean, o.decmean, o.ncand, sc.separationArcsec FROM sherlock_crossmatches sc
JOIN objects o ON sc.transient_object_id = o.primaryid NATURAL JOIN candidates c
WHERE c.magpsf < 21
AND c.jd > JDNOW() - 20
AND o.jdmin > JDNOW() - 20
AND o.jdmax > JDNOW() - 5
AND o.ncand >= 2
AND o.sherlock_classification NOT IN ('VS' , 'AGN', 'CV', 'BS','NT')
AND c.rb >= 0.4 AND c.nbad = 0 AND c.fwhm <= 5 AND c.elong <= 1.2 AND abs(c.magdiff) <= 0.
AND c.isdiffpos = 't'
AND (sc.rank = 1 OR sc.rank=2)
AND sc.catalogue_object_type = 'galaxy'
AND o.sherlock_classification = 'SN'
AND sc.separationArcsec > 0.5
AND sc.separationArcsec < 100 ) temp
Why does this need to be in a subquery??
How about this one that takes 10 seconds:
SELECT o.objectId, o.jdmin as mjdmin, o.jdmax as mjdmax, o.maggmin, o.maggmax, o.magrmin, o.magrmax, o.latestrmag, o.sherlock_classification, o.ramean, o.decmean, o.ncand, sc.separationArcsec
FROM sherlock_crossmatches sc JOIN objects o ON sc.transient_object_id = o.primaryid
WHERE (o.maggmin < 21 OR o.magrmin < 21) AND o.jdmin > JDNOW() - 20 AND o.jdmax > JDNOW() - 5 AND o.ncandgp >= 2
AND o.sherlock_classification NOT IN ('VS' , 'AGN', 'CV', 'BS','NT')
AND (sc.rank = 1 OR sc.rank=2) AND sc.catalogue_object_type = 'galaxy'
AND o.sherlock_classification = 'SN' AND sc.separationArcsec > 0.5 AND sc.separationArcsec < 100 LIMIT 10
15-- triple join
SELECT objects.objectId, jdnow()-jdmax as delta, comments.content
FROM objects, comments, sherlock_classifications
WHERE objects.objectId = comments.objectId AND objects.primaryId = sherlock_classifications.transient_object_id
AND objects.jdmin>jdnow()-30 AND objects.jdmin<jdnow()-0
AND (objects.jdmin-objects.jdmax>0.1 or objects.jdmin-objects.jdmax<-0.1)
AND ncand>2 AND ncand<20 AND (maggmin>latestgmag-0.5 or magrmin>latestrmag-0.5) AND (latestgmag<17.3 or latestrmag<17.3)
AND comments.content LIKE '%TNS%' AND comments.content LIKE '%AT20%'AND decmean>-20
AND sherlock_classifications.classification NOT IN ("VS" , "AGN", "CV", "BS", "NT")
Executes in 1.6 seconds
16-- quadruple join
SELECT distinct objects.objectId, ramean, decmean, jdmin-2400000.5 AS mjdmin, jdmax-2400000.5 AS mjdmax, magrmin, latestrmag, sherlock_classifications.classification, ncand,
(sherlock_crossmatches.distance*1000)TAN((sherlock_crossmatches.separationArcsec/3600)(PI()/180)) as sep
FROM candidates, objects, sherlock_classifications, sherlock_crossmatches
WHERE objects.objectId = candidates.objectId
AND objects.primaryId = sherlock_classifications.transient_object_id
AND objects.primaryId = sherlock_crossmatches.transient_object_id
AND objects.jdmax > JDNOW() - 15.00000
AND candidates.jd > JDNOW() - 15.00000
AND candidates.magpsf < 21.5
AND objects.ncand >= 2 AND objects.ncand < 5
AND objects.latestgmag - objects.latestrmag < 0.2
AND sherlock_classifications.classification NOT IN ("VS" , "AGN", "CV", "BS")
AND candidates.drb >= 0.6
AND objects.latestgmag - candidates.sgmag1 < 0.5
AND candidates.isdiffpos = "t"
AND (sherlock_crossmatches.rank = 1)
AND sherlock_crossmatches.catalogue_object_type = "galaxy"
AND sherlock_classifications.classification = "SN"
Executes in 43 seconds
This version avoids the join with the candidates table and takes only 0.7 seconds:
SELECT distinct objects.objectId, ramean, decmean, jdmin-2400000.5 AS mjdmin, jdmax-2400000.5 AS mjdmax, magrmin, latestrmag, sherlock_classifications.classification, ncand,
(sherlock_crossmatches.distance*1000)TAN((sherlock_crossmatches.separationArcsec/3600)(PI()/180)) as sep
FROM objects, sherlock_classifications, sherlock_crossmatches
WHERE objects.primaryId = sherlock_classifications.transient_object_id
AND objects.primaryId = sherlock_crossmatches.transient_object_id
AND objects.jdmax > JDNOW() - 15.00000
AND (objects.maggmin < 21 OR objects.magrmin < 21)
AND objects.ncandgp >= 2 AND objects.ncand < 5
AND objects.latestgmag - objects.latestrmag < 0.2
AND sherlock_classifications.classification NOT IN ("VS" , "AGN", "CV", "BS")
AND objects.latestgmag - objects.sgmag1 < 0.5
AND (sherlock_crossmatches.rank = 1)
AND sherlock_crossmatches.catalogue_object_type = "galaxy"
AND sherlock_classifications.classification = "SN"
17-- watchlist using sql
SELECT jd,ra,decl FROM candidates
WHERE candidates.jd > JDNOW() - 10.00000
AND ( ((ra BETWEEN 190.44 AND 198.67) AND (decl BETWEEN -78.32 AND -76.49))
OR ((ra BETWEEN 241.49 AND 243.95) AND (decl BETWEEN -40.41 AND -36.83))
OR ((ra BETWEEN 239.60 AND 241.14) AND (decl BETWEEN -43.12 AND -40.71))
OR ((ra BETWEEN 233.92 AND 237.04) AND (decl BETWEEN -35.91 AND -32.10))
OR ((ra BETWEEN 190 AND 244 ) AND (decl BETWEEN -79 AND -32))
OR ((ra BETWEEN 250.49 AND 251.90) AND (decl BETWEEN -22.89 AND -20.24))
OR ((ra BETWEEN 244.63 AND 251.85) AND (decl BETWEEN -2614 AND -22.09))
OR ((ra BETWEEN 244 AND 252 ) AND (decl BETWEEN -27 AND -20))
OR ((ra BETWEEN 82.99 AND 86.21 ) AND (decl BETWEEN -10.22 AND -4.57))
OR ((ra BETWEEN 85.06 AND 87.16 ) AND (decl BETWEEN -2.66 AND 0.93))
OR ((ra BETWEEN 82 AND 88 ) AND (decl BETWEEN -11 AND 1))
OR ((ra BETWEEN 50.66 AND 57.26 ) AND (decl BETWEEN 29.62 AND 33.16))
OR ((ra BETWEEN 276.80 AND 277.84) AND (decl BETWEEN -0.55 AND 1.86))
OR ((ra BETWEEN 61.85 AND 73.27 ) AND (decl BETWEEN 21.00 AND 30.02)))
Executes in 14 seconds
18-- noncandidates
SELECT objects.objectId, objects.ramean, objects.decmean, objects.jdmin - 2400000.5 AS mjdmin, objects.jdmax - 2400000.5 AS mjdmax, objects.magrmin, latestrmag, sherlock_classifications.classification,
IF(candidates.distpsnr1 < 2 AND candidates.sgscore1 > 0.49, "Within 2 arcsec of PS1 star", "Not Near PS1 star") score
FROM candidates, noncandidates, objects, sherlock_classifications
WHERE objects.objectId = candidates.objectId
AND objects.objectId = noncandidates.objectId
AND objects.primaryId = sherlock_classifications.transient_object_id
Takes many minutes and does not finish … does not actually need either candidates or noncandidates table, this executes in 0.0 seconds:
SELECT objects.objectId, objects.ramean, objects.decmean, objects.jdmin - 2400000.5 AS mjdmin, objects.jdmax - 2400000.5 AS mjdmax, objects.magrmin, latestrmag, sherlock_classifications.classification,
IF(objects.distpsnr1 < 2 AND objects.sgscore1 > 0.49, "Within 2 arcsec of PS1 star", "Not Near PS1 star") score
FROM objects, sherlock_classifications
WHERE objects.primaryId = sherlock_classifications.transient_object_id
LIMIT 10
If you require this document in an alternative format, please contact the LSST:UK Project Managers lusc_pm@mlist.is.ed.ac.uk or phone +44 131 651 3577