SA13 master catalogue

Preparation of Legacy Survey data

The catalogue comes from dmu0_LegacySurvey.

In the catalogue, we keep:

  • The identifier (it's unique in the catalogue);
  • The position;
  • The stellarity;
  • The aperture fluxes. Are these aperture corrected?
  • The kron magnitude to be used as total magnitude (no “auto” magnitude is provided).

We don't know when the maps have been observed. We will use the year of the reference paper.

In [1]:
from herschelhelp_internal import git_version
print("This notebook was run with herschelhelp_internal version: \n{}".format(git_version()))
This notebook was run with herschelhelp_internal version: 
04829ed (Thu Nov 2 16:57:19 2017 +0000)
In [2]:
%matplotlib inline
#%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt
plt.rc('figure', figsize=(10, 6))
plt.style.use('ggplot')

from collections import OrderedDict
import os

from astropy import units as u
from astropy import visualization as vis
from astropy.coordinates import SkyCoord
from astropy.table import Column, Table
import numpy as np

from herschelhelp_internal.flagging import  gaia_flag_column
from herschelhelp_internal.masterlist import nb_astcor_diag_plot, nb_plot_mag_ap_evol, \
    nb_plot_mag_vs_apcor, remove_duplicates
from herschelhelp_internal.utils import astrometric_correction, mag_to_flux, aperture_correction, flux_to_mag
In [3]:
OUT_DIR =  os.environ.get('TMP_DIR', "./data_tmp")
try:
    os.makedirs(OUT_DIR)
except FileExistsError:
    pass

RA_COL = "legacy_ra"
DEC_COL = "legacy_dec"
In [4]:
# Pritine LS catalogue
orig_legacy = Table.read("../../dmu0/dmu0_LegacySurvey/data/LegacySurvey-dr4_SA13.fits")
WARNING: UnitsWarning: '1/deg^2' did not parse as fits unit: Numeric factor not supported by FITS [astropy.units.core]
WARNING: UnitsWarning: 'nanomaggy' did not parse as fits unit: At col 0, Unit 'nanomaggy' not supported by the FITS standard.  [astropy.units.core]
WARNING: UnitsWarning: '1/nanomaggy^2' did not parse as fits unit: Numeric factor not supported by FITS [astropy.units.core]
WARNING: UnitsWarning: '1/arcsec^2' did not parse as fits unit: Numeric factor not supported by FITS [astropy.units.core]

I - Aperture correction

To compute aperture correction we need to dertermine two parametres: the target aperture and the range of magnitudes for the stars that will be used to compute the correction.

Target aperture: To determine the target aperture, we simulate a curve of growth using the provided apertures and draw two figures:

  • The evolution of the magnitudes of the objects by plotting on the same plot aperture number vs the mean magnitude.
  • The mean gain (loss when negative) of magnitude is each aperture compared to the previous (except for the first of course).

As target aperture, we should use the smallest (i.e. less noisy) aperture for which most of the flux is captures.

Magnitude range: To know what limits in aperture to use when doing the aperture correction, we plot for each magnitude bin the correction that is computed and its RMS. We should then use the wide limits (to use more stars) where the correction is stable and with few dispersion.

In [5]:
bands = ["g", "r", "z"]
apertures      = [0,      1,   2,   3,    4,   5,   6,   7] 
aperture_sizes = [0.5, 0.75, 1.0, 1.5,  2.0, 3.5, 5.0, 7.0] #arcsec aperture sizes

flux = {}
flux_errors ={}
magnitudes = {}
flux_errors ={}
magnitude_errors = {}
stellarities = {}

flux_to_mag_vect = np.vectorize(flux_to_mag)

for band in bands:
    flux[band] = np.transpose(np.array( orig_legacy["apflux_{}".format(band)], dtype=np.float )) 
    flux_errors[band] = np.transpose(np.array( orig_legacy["apflux_ivar_{}".format(band)], dtype=np.float  ))
    
    magnitudes[band], magnitude_errors[band] = flux_to_mag_vect(flux[band] * 3.631e-6 ,flux_errors[band] * 3.631e-6)
    
    stellarities[band] = np.full(len(orig_legacy),0., dtype='float32')
    stellarities[band][np.array( orig_legacy["type"]) == "PSF" ] = 1.
    
    # Some sources have an infinite magnitude
    mask = np.isinf(magnitudes[band])
    magnitudes[band][mask] = np.nan
    magnitude_errors[band][mask] = np.nan
    

    
    
mag_corr = {}
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: invalid value encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: divide by zero encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:80: RuntimeWarning: invalid value encountered in double_scalars
  errors = 2.5 / np.log(10) * errors_on_fluxes / fluxes

I.a - g band

In [6]:
nb_plot_mag_ap_evol(magnitudes['g'], stellarities['g'], labels=apertures)

We will use aperture 5 as target.

In [7]:
nb_plot_mag_vs_apcor(magnitudes['g'][4], 
                     magnitudes['g'][5], 
                     stellarities['g'])

We will use magnitudes between 17.0 and 18.5

In [8]:
# Aperture correction
mag_corr['g'], num, std = aperture_correction(
    magnitudes['g'][4], magnitudes['g'][5], 
    stellarities['g'],
    mag_min=17.0, mag_max=18.5)
print("Aperture correction for g band:")
print("Correction: {}".format(mag_corr['g']))
print("Number of source used: {}".format(num))
print("RMS: {}".format(std))
Aperture correction for g band:
Correction: -0.15662539731054892
Number of source used: 49
RMS: 0.06093891487424599

I.b - r band

In [9]:
nb_plot_mag_ap_evol(magnitudes['r'], stellarities['r'], labels=apertures)

We will use aperture 5 as target.

In [10]:
nb_plot_mag_vs_apcor(magnitudes['r'][4], 
                     magnitudes['r'][5], 
                     stellarities['r'])

We use magnitudes between 16.5 and 18.

In [11]:
# Aperture correction
mag_corr['r'], num, std = aperture_correction(
    magnitudes['r'][4], magnitudes['r'][5], 
    stellarities['r'],
    mag_min=16.5, mag_max=18.5)
print("Aperture correction for r band:")
print("Correction: {}".format(mag_corr['r']))
print("Number of source used: {}".format(num))
print("RMS: {}".format(std))
Aperture correction for r band:
Correction: -0.06083346555517011
Number of source used: 108
RMS: 0.011034948483842937

I.c - z band

In [12]:
nb_plot_mag_ap_evol(magnitudes['z'], stellarities['z'], labels=apertures)

We will use aperture 5 as target.

In [13]:
nb_plot_mag_vs_apcor(magnitudes['z'][4], 
                     magnitudes['z'][4], 
                     stellarities['z'])

We use magnitudes between 16.0 and 17.5.

In [14]:
# Aperture correction
mag_corr['z'], num, std = aperture_correction(
    magnitudes['z'][4], magnitudes['z'][5], 
    stellarities['z'],
    mag_min=16.0, mag_max=17.5)
print("Aperture correction for z band:")
print("Correction: {}".format(mag_corr['z']))
print("Number of source used: {}".format(num))
print("RMS: {}".format(std))
Aperture correction for z band:
Correction: -0.03418313894711389
Number of source used: 95
RMS: 0.011261883161465066

II - Stellarity

Legacy Survey does not provide a 0 to 1 stellarity so we replace items flagged as PSF accpording to the following table:

\begin{equation*} P(star) = \frac{ \prod_{i} P(star)_i }{ \prod_{i} P(star)_i + \prod_{i} P(galaxy)_i } \end{equation*}

where $i$ is the band, and with using the same probabilities as UKDISS:

HSC flag UKIDSS flag Meaning P(star) P(galaxy) P(noise) P(saturated)
-9 Saturated 0.0 0.0 5.0 95.0
-3 Probable galaxy 25.0 70.0 5.0 0.0
-2 Probable star 70.0 25.0 5.0 0.0
0 -1 Star 90.0 5.0 5.0 0.0
0 Noise 5.0 5.0 90.0 0.0
1 +1 Galaxy 5.0 90.0 5.0 0.0
In [15]:
stellarities['g'][np.isclose(stellarities['g'], 1.)] = 0.9
stellarities['g'][np.isclose(stellarities['g'], 0.)] = 0.05
In [16]:
orig_legacy.add_column(Column(data=stellarities['g'], name="stellarity")) #Stelarites computed earlier

II - Column selection

In [17]:
imported_columns = OrderedDict({
        "objid": "legacy_id",
        "ra": "legacy_ra",
        "dec": "legacy_dec",
        "flux_g": "f_bass_g",
        "flux_ivar_g": "ferr_bass_g",
        "apflux_g": "f_ap_bass_g",
        "apflux_ivar_g": "ferr_ap_bass_g",
        "flux_r": "f_bass_r",
        "flux_ivar_r": "ferr_bass_r",
        "apflux_r": "f_ap_bass_r",
        "apflux_ivar_r": "ferr_ap_bass_r",
        "flux_z": "f_bass_z",
        "flux_ivar_z": "ferr_bass_z",
        "apflux_z": "f_ap_bass_z",
        "apflux_ivar_z": "ferr_ap_bass_z",
        "stellarity": "legacy_stellarity"
    })


catalogue = orig_legacy[list(imported_columns)]
for column in imported_columns:
    catalogue[column].name = imported_columns[column]

epoch = 2017

# Clean table metadata
catalogue.meta = None
In [18]:
# Adding flux and band-flag columns
for col in catalogue.colnames:
    if col.startswith('f_'):
        
        errcol = "ferr{}".format(col[1:])
        
        #First we take aperture 4 if it is an aperture flux
        if 'ap' in col:
            catalogue[col] = catalogue[col][:, 4] 
            catalogue[errcol] = catalogue[errcol][:, 4] 
            
        #Convert nanomaggies to uJy
        # 1 nanomaggy = 1.e-9 maggy
        # 1 maggy = 3631 Jy
        # 1 nanomaggy = 3.631×10-6 Jy
        catalogue[col] = catalogue[col] * 3.631 #* 1.e9
        catalogue[errcol] = (1/np.sqrt(catalogue[errcol])) * 3.631 #* 1.e9
        catalogue[col].unit = u.microjansky
        catalogue[errcol].unit = u.microjansky
        
        mag, magerror = flux_to_mag(np.array(catalogue[col])* 1.e-6, np.array(catalogue[errcol])* 1.e-6)
        mag[mag == np.inf] = np.nan #The very low fluxes yield infinite mags
        magerror[magerror == np.inf] = np.nan
        
        if 'ap' in col:
            mag += mag_corr[col[-1]]
            catalogue[col],catalogue[errcol] = mag_to_flux(mag,magerror)
        
        # Add magnitudes
        catalogue.add_column(Column(mag , name="m{}".format(col[1:])))
        catalogue.add_column(Column(magerror , name="m{}".format(errcol[1:])))
        
        
        # Band-flag column
        if 'ap' not in col:
            catalogue.add_column(Column(np.zeros(len(catalogue), dtype=bool), name="flag{}".format(col[1:])))
        
# TODO: Set to True the flag columns for fluxes that should not be used for SED fitting.
/opt/anaconda3/envs/herschelhelp_internal/lib/python3.6/site-packages/ipykernel/__main__.py:17: RuntimeWarning: divide by zero encountered in true_divide
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: divide by zero encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:76: RuntimeWarning: invalid value encountered in log10
  magnitudes = 2.5 * (23 - np.log10(fluxes)) - 48.6
/opt/herschelhelp_internal/herschelhelp_internal/utils.py:80: RuntimeWarning: divide by zero encountered in true_divide
  errors = 2.5 / np.log(10) * errors_on_fluxes / fluxes
In [19]:
catalogue[:10].show_in_notebook()
Out[19]:
<Table length=10>
idxlegacy_idlegacy_ralegacy_decf_bass_gferr_bass_gf_ap_bass_gferr_ap_bass_gf_bass_rferr_bass_rf_ap_bass_rferr_ap_bass_rf_bass_zferr_bass_zf_ap_bass_zferr_ap_bass_zlegacy_stellaritym_bass_gmerr_bass_gflag_bass_gm_ap_bass_gmerr_ap_bass_gm_bass_rmerr_bass_rflag_bass_rm_ap_bass_rmerr_ap_bass_rm_bass_zmerr_bass_zflag_bass_zm_ap_bass_zmerr_ap_bass_z
degdeguJyuJyuJyuJyuJyuJy
00197.80688986942.6230990601-0.3220460.206981nannan-0.1552130.352215nannan32.09020.5952353.33756e-051.29737e-060.9nan-0.697808Falsenan-1.02523nan-2.46378Falsenan-0.60160220.13410.0201391False20.09140.0422046
11197.81085241542.62340163580.1016110.204789nannan0.1831730.3529295.95851e-073.51655e-0714.64580.5634931.45733e-051.21578e-060.926.38272.18822Falsenan-0.6665925.74292.09195False24.46220.64077120.98570.0417733False20.99110.0905776
22197.8104563642.62455279050.2700650.301339nannan0.3613060.4623542.08765e-073.51655e-0722.13420.8238521.68652e-051.28366e-060.0525.32131.21147Falsenan-1.6323625.00531.38939False25.60091.8288720.53730.0404119False20.83250.0826386
33197.80996265842.62216426660.3722590.2050021.07892e-063.30958e-070.07043470.3520064.98561e-073.51656e-0710.43870.5571729.22235e-061.18359e-060.924.97290.597911False23.81750.33304926.78055.4261False24.65570.76581621.35340.0579516False21.48790.139342
44197.98327362842.62318179982.479610.2782922.92783e-062.51719e-074.986350.4220894.62524e-063.18239e-0711.33430.7574371.10271e-051.29737e-060.0522.9140.121855False22.73360.093345522.15550.0919065False22.23720.074703921.2640.0725563False21.29380.12774
55197.98292211142.62485642660.5544530.207611.22263e-062.51719e-071.42970.3392051.80439e-063.18238e-075.990230.5330036.30741e-061.29738e-060.924.54030.406545False23.68180.22353523.51190.257598False23.25920.1914921.95640.0966073False21.90040.223326
66197.78471376842.624203482383.30940.5365929.48396e-053.30958e-07137.450.5760820.0001391683.14384e-07172.9250.7911050.0001759391.29737e-060.919.09830.00699318False18.95750.0037888518.55460.00455056False18.54110.002452718.30540.00496708False18.28660.00800619
77197.92144336442.62498746380.0infnannan6.713040.5877266.49972e-064.73811e-0710.38310.7605748.33476e-061.29737e-060.05nannanFalsenannan21.83270.0950561False21.86780.079147121.35920.0795314False21.59780.169004
88197.92228078642.6236523680.0infnannan2.717940.5783332.15447e-064.73814e-078.255150.7863719.24578e-061.29737e-060.05nannanFalsenannan22.81440.231027False23.06660.23877621.60820.103425False21.48510.152351
99197.85839081842.6240674463.912730.2897374.50884e-063.30958e-077.548490.4462487.61231e-063.51655e-079.37770.7796371.27619e-051.29737e-060.0522.41880.0803986False22.26480.079695121.70530.0641861False21.69620.050156221.46980.0902652False21.13520.110376

III - Removal of duplicated sources

We remove duplicated objects from the input catalogues.

In [20]:
SORT_COLS = [
        'merr_ap_bass_g', 'merr_ap_bass_r', 'merr_ap_bass_z']
FLAG_NAME = 'legacy_flag_cleaned'

nb_orig_sources = len(catalogue)

catalogue = remove_duplicates(
    catalogue, RA_COL, DEC_COL, 
    sort_col= SORT_COLS,
    flag_name=FLAG_NAME)

nb_sources = len(catalogue)

print("The initial catalogue had {} sources.".format(nb_orig_sources))
print("The cleaned catalogue has {} sources ({} removed).".format(nb_sources, nb_orig_sources - nb_sources))
print("The cleaned catalogue has {} sources flagged as having been cleaned".format(np.sum(catalogue[FLAG_NAME])))
The initial catalogue had 9367 sources.
The cleaned catalogue has 9258 sources (109 removed).
The cleaned catalogue has 109 sources flagged as having been cleaned

III - Astrometry correction

We match the astrometry to the Gaia one. We limit the Gaia catalogue to sources with a g band flux between the 30th and the 70th percentile. Some quick tests show that this give the lower dispersion in the results.

In [21]:
gaia = Table.read("../../dmu0/dmu0_GAIA/data/GAIA_SA13.fits")
gaia_coords = SkyCoord(gaia['ra'], gaia['dec'])
In [22]:
nb_astcor_diag_plot(catalogue[RA_COL], catalogue[DEC_COL], 
                    gaia_coords.ra, gaia_coords.dec)
In [23]:
delta_ra, delta_dec =  astrometric_correction(
    SkyCoord(catalogue[RA_COL], catalogue[DEC_COL]),
    gaia_coords
)

print("RA correction: {}".format(delta_ra))
print("Dec correction: {}".format(delta_dec))
RA correction: -0.0027332342597219395 arcsec
Dec correction: -0.0011730408644439194 arcsec
In [24]:
catalogue[RA_COL].unit = u.deg
catalogue[DEC_COL].unit = u.deg
catalogue[RA_COL] = catalogue[RA_COL] +  delta_ra.to(u.deg)
catalogue[DEC_COL] = catalogue[DEC_COL] +  delta_dec.to(u.deg)
In [25]:
nb_astcor_diag_plot(catalogue[RA_COL], catalogue[DEC_COL], 
                    gaia_coords.ra, gaia_coords.dec)

IV - Flagging Gaia objects

In [26]:
catalogue.add_column(
    gaia_flag_column(SkyCoord(catalogue[RA_COL], catalogue[DEC_COL]), epoch, gaia)
)
In [27]:
GAIA_FLAG_NAME = "legacy_flag_gaia"

catalogue['flag_gaia'].name = GAIA_FLAG_NAME
print("{} sources flagged.".format(np.sum(catalogue[GAIA_FLAG_NAME] > 0)))
569 sources flagged.

V - Saving to disk

In [28]:
catalogue.write("{}/LegacySurvey.fits".format(OUT_DIR), overwrite=True)