Project Description

Collaborative Research
A Comprehensive Multi-wavelength Picture of Black Hole Growth Over Cosmic Time

Introduction to Current Understanding of Cosmic Black Hole Growth

Most galaxies have a supermassive black hole at their centers, ranging in mass from 10^5 to 10^10 M☉. The growth of these black holes, particularly in massive galaxies, represents an important aspect of cosmic evolution, as black hole growth is predicted to regulate star formation and bulge growth through various feedback mechanisms (e.g. Kormendy & Ho 2013). That is, supermassive black holes may be the linchpin in the evolution of massive galaxies. Some 10^9 M☉ black holes lie at z~6, very early in the development of galaxies (Fan 2006, Mortlock et al. 2011, Venemans et al. 2013), implying they either had to grow at super-Eddington rates (Haiman 2013, Madau et al. 2014) or start from larger seed black holes (Lodato & Natarajan 2006, Volonteri & Bellovary 2012). Most of the known high-redshift (z>5) quasars were found in optical surveys, especially the Sloan Digital Sky Survey (SDSS; Fan et al. 2003, Richards et al. 2004, Schneider et al. 2005), meaning they are UV-bright and thus unobscured[1] (Güver & Özel 2009).

Yet popular theories of quasar activation and massive galaxy evolution, based on gas-rich mergers, imply an extended stage of obscured black hole growth (Hopkins et al. 2006). During that phase, the most characteristic signatures should be luminous X-ray and infrared (IR) emission. If most black hole growth takes place in heavily obscured regions, which hide optical/UV signatures of accretion, then optically-selected, unobscured, Type I quasars do not tell the whole story. Instead, the obscured phase must be sampled via multi-wavelength surveys, with X-ray and IR recovering AGN missed by optical surveys.

A complete sample of active galaxies, across a broad range of obscuration, luminosity, and redshift, will allow us to build a consistent picture of how black holes grow over cosmic time, affect their host galaxy’s gas and dust, and produce the present-day mass distribution of dormant black holes in the centers of galaxies.

Previous Work and What Is Missing: Over the past decade, X-ray+multi-wavelength surveys have advanced our understanding of the growth of supermassive black holes over cosmic time. Starting with the GOODS pencil-beam survey (GOODS is the name of the HST+Spitzer survey of the CDF-South and -North fields; Brandt et al. 2001, Giacconi et al. 2002, Dickinson et al. 2003, Giavalisco et al. 2004), we learned that ~75% of black hole growth happens in obscured systems, according to a simple multi-wavelength population synthesis model (Treister et al. 2004, 2005, 2006; see §IIb for details). The small volume of the GOODS survey limited these conclusions to moderate luminosity AGN (L_X < 10^44 erg/s), while the order-of-magnitude larger COSMOS field (Cappelluti et al. 2007, Elvis et al. 2009, Civano et al. 2015a,b, Marchesi et al. 2015) started to probe quasar luminosities. However, fully sampling the AGN luminosity function over the past 12 billion years requires much larger numbers of quasars, including obscured quasars at z>2, which is unachievable with pencil-beam fields. We will combine the well-sampled GOODS and COSMOS fields with a large-volume survey of Stripe 82X (~15 times larger than COSMOS; LaMassa et al. 2013a,b, 2015), which contains >300 rare luminous quasars (L_X > 10^45 erg/s, or L_bol > 10^46 erg/s) and hundreds of AGN and quasars at z>2 (Fig. 1a), to comprise a statistical sample of supermassive black holes spanning ~12 billion years of cosmic history.

The Multi-wavelength “Wedding Cake” Survey: The Stripe 82X+COSMOS+GOODS fields comprise a “wedding cake” tiered survey, where each cake layer probes a different area and flux limit, and thus a different luminosity-redshift range (Fig. 1). Collectively, the three cake layers fully sample the AGN luminosity function out to high redshift (z>3). In addition to X-ray coverage, each field contains a compelling suite of radio-through-UV multiwavelength data from which we will generate an empirical library of AGN spectral energy distributions (SEDs) as a function of luminosity, redshift, obscuring column density along the line of sight, and host galaxy type (see §IIa for details). The IR and X-ray data, in particular, enable a comprehensive accounting of emission from black hole accretion at the nucleus of the host galaxy. Table 1 lists source numbers at selected wavelengths from the three surveys.

These data also provide independent constraints at each wavelength on the comprehensive new population synthesis model proposed in §IIb.

Science Goals:

  • Obtain spectra for unidentified AGN and construct SEDs for all AGN in the GOODS, COSMOS and Stripe 82X surveys, using complementary selection techniques to minimize population biases, and using both spectroscopic and photometric redshifts to maximize statistics.
  • Constrain the key parameters of black hole growth over the past ~12 billion years, including the energy output as a function of wavelength and redshift, by developing a comprehensive population synthesis model that agrees with the GOODS+COSMOS+Stripe82X data.
  • Investigate the interplay between star formation and black hole growth, especially in massive galaxies, using measures of stellar mass, star-formation rate, galaxy morphology, and clustering of both AGN and galaxies.

Below we describe in detail how we will achieve these science goals.

IIa. Toward Unbiased AGN Identification and Characterization

AGN can be identified by their blue/UV excess, optical narrow-emission-line ratios, X-ray luminosity, strong radio emission and/or strong IR emission. Each selection method is biased, but combining complementary selection methods comes much closer to an unbiased survey (Fig. 2a). This is why extensive multi-wavelength data are essential, and X-ray and IR data are especially critical because they are the least biased against obscured AGN.

Table 1: Number of Sources at Each Wavelength
(# sources)
Individual Galaxies (after cross-match)

Stripe82X   31.3     6,000     3,000        ~1,700       ~6,000    ~3,000
COSMOS       2.0     3,500     4,100(c)     ~8,500(d)    ~8,000    ~4,000
GOODS        0.1       800     1,900         1,000       ~2,000    ~1,000
Total              ~10,000    ~9,000       ~11,000      ~16,000    ~8,000

The three fields are observable from both hemispheres and the combination allows year-round observing. Further details of available data sets are given in §IIa. (b) Number of spectra expected to be available for final analysis. (c) 4100 sources published (Lee et al. 2014); 8000 in deeper private catalog. (d) Private catalog of JVLA data for COSMOS (Smolcic et al., in prep).

Our wedding-cake survey spans the full area-depth range of current X-ray surveys (Fig. 1b), with sensitivity to AGN at z~5 or more (COSMOS and GOODS are limited only by volume), and it has the most extensive multi-wavelength data available, including far-IR imaging from Herschel and Spitzer (Table 2). The longer wavelength data are well matched to the X-ray depth; furthermore, we note that, for higher redshifts (z>2), the X-ray K correction is very favorable, so issues of absorption, even from Compton-thick AGN (N_H > 10^24 cm^-2), are less of a concern (the emission observed in the 2-8 keV Chandra band is rest-frame 6-24 keV for a z~2 AGN, comparable to the NuSTAR band for z~0). A cartoon of the field layouts is shown in Figure 3; all three survey layers are accessible from both hemispheres. We describe our AGN selection techniques below.
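The favorable K correction quoted above follows from the redshift scaling of photon energy, E_rest = E_obs (1 + z). A minimal sketch of that arithmetic is given below; the function name is ours and purely illustrative.

```python
def rest_frame_band(e_lo_obs_kev, e_hi_obs_kev, z):
    """Return the rest-frame energy range (keV) sampled by an observed band."""
    factor = 1.0 + z  # photon energies scale as (1 + z)
    return e_lo_obs_kev * factor, e_hi_obs_kev * factor


# Observed-frame Chandra hard band for a z = 2 AGN.
lo, hi = rest_frame_band(2.0, 8.0, 2.0)
print(f"observed 2-8 keV at z=2 samples rest-frame {lo:.0f}-{hi:.0f} keV")
```

At z = 2 the observed 2-8 keV band therefore samples 6-24 keV in the rest frame, where photoelectric absorption is much weaker.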

X-ray: X-ray samples are heavily dominated by AGN, so X-ray selection is the most efficient method. The COSMOS and GOODS survey layers are substantially complete, with ~95% of 4300 X-ray-selected sources having spectroscopic or photometric redshifts (Table 1) and thus known luminosities (L_X > 10^43 erg/s). The newer Stripe 82X survey has 6200 X-ray sources over 31.3 deg^2 (Table 1), 92% complete in spectroscopic or photometric redshifts; the optical and infrared spectroscopy proposed here will increase the completeness of Stripe 82X to a level similar to that of the other cake layers. We have proposed for additional XMM imaging to reach 70 deg^2 (toward the ultimate goal of 100 deg^2 of X-ray coverage in Stripe 82), but the present sample is more than large enough for the work proposed here.


Figure 2. Left: Different AGN selection criteria yield overlapping but not identical sets of AGN, and each method misses some AGN; we do a complete census combining all these samples in the context of a comprehensive population synthesis model (§IIb). Hard X-ray selection is unbiased apart from the most heavily obscured AGN. UV-optical surveys miss obscured AGN. Narrow-line diagnostics find unobscured narrow-line regions and are best calibrated at low redshift (z<0.5). Radio selection preferentially picks out large-scale jets and lobes. IR selection detects all AGN, but light from accretion can be confused with starlight and most detections are galaxies. Far-IR data have high flux limits because big PSFs mean high confusion limits. Right: Cartoon showing various evolutionary paths for galaxies through mergers (at high mass) or secular growth (Hickox et al. 2009, adapted).

[Table 2 missing]

*We list only Stripe 82X fully, as this is the newest and least familiar survey; information about GOODS and COSMOS can be found at

IR: In obscured AGN, absorbed radiation is re-radiated in the IR, so in principle all AGN (both obscured and unobscured) can be detected in the IR, although it can be inefficient because galaxies dominate IR samples. In addition, traditional IR selection techniques (e.g., Stern et al. 2013, Donley et al. 2012) are biased towards strongly accreting AGN in moderate mass galaxies (Mendez et al. 2013). But because of the Herschel data in all three fields, this bias can be overcome by comparing the amount of warm dust (heated by the AGN) relative to cold dust (located in the diffuse ISM), allowing efficient identification of AGN within star-forming galaxies (Kirkpatrick et al. 2013, 2015; Fig. 3). For example, the ratio of 250µm to 24µm flux is lower where AGN heating raises the dust temperature, while the 8 to 3.6 µm flux ratio separates thermal stellar radiation from power-law AGN radiation. Importantly, this color-color selection is sensitive to intrinsically luminous but obscured AGN and to less strongly accreting AGN in star-forming hosts. Furthermore, we can account quantitatively for the contribution of the host galaxy to the bolometric luminosity, thus measuring each more accurately. One of us has already applied this IR selection to the GOODS and COSMOS fields, identifying 1100 galaxies that harbor buried AGN, many of which (interestingly!) do not overlap with the X-ray selected AGN.
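The color-color idea described above can be sketched as a simple box cut in two flux ratios. This is only an illustration: the published selection (Kirkpatrick et al. 2013, 2015) uses its own calibrated boundaries, which are not reproduced here, so both thresholds are left as required inputs and the numbers in the usage lines are hypothetical.

```python
import math


def is_ir_agn_candidate(s250, s24, s8, s3p6, max_log_far_mid, min_log_8_3p6):
    """Flag a source as an IR AGN candidate from two flux ratios.

    Fluxes may be in any common unit (only ratios are used).
    max_log_far_mid: hypothetical upper limit on log10(S250/S24);
        warm, AGN-heated dust lowers this ratio.
    min_log_8_3p6: hypothetical lower limit on log10(S8/S3.6);
        a power-law AGN continuum raises this ratio above the stellar value.
    """
    log_far_mid = math.log10(s250 / s24)
    log_8_3p6 = math.log10(s8 / s3p6)
    return log_far_mid < max_log_far_mid and log_8_3p6 > min_log_8_3p6


# Hypothetical fluxes (mJy) and hypothetical thresholds, for illustration only.
print(is_ir_agn_candidate(20.0, 1.0, 0.5, 0.1, 1.5, 0.3))   # warm dust, red IRAC color
print(is_ir_agn_candidate(60.0, 0.5, 0.08, 0.1, 1.5, 0.3))  # cold dust, stellar IRAC color
```

A real implementation would replace the box with the calibrated dividing lines and propagate flux uncertainties and upper limits.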

Optical spectroscopy: A third common technique for identifying AGN is the so-called BPT diagram (Fig. 4a; Baldwin, Phillips & Terlevich 1981), which separates AGN from star-forming galaxies using narrow-line ratios because an AGN continuum produces higher ionization lines. We have, or will have, Keck MOSFIRE and Subaru FMOS spectra, allowing for full coverage of the key BPT lines. The BPT diagnostic at high redshift has known biases against AGN with low Eddington ratios and AGN in galaxies with high specific star-formation rates, due to host galaxy contamination of the optical emission lines (Coil et al. 2015, Trump et al. 2015), and BPT dividing lines are not yet well calibrated at high redshift (Kewley et al. 2013, Melendez et al. 2014, Kartaltepe et al. 2015, Azadi et al. 2016). By combining with IR and X-ray selection, we can offset these biases.
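For concreteness, a BPT classification can be sketched with the widely used low-redshift dividing lines of Kauffmann et al. (2003) and Kewley et al. (2001). Those two curves are our assumption for illustration; they are not taken from this proposal and, as noted above, they are not expected to hold unchanged at high redshift.

```python
import math


def kauffmann03(log_nii_ha):
    """Empirical star-forming boundary (low-z calibration), valid for x < 0.05."""
    return 0.61 / (log_nii_ha - 0.05) + 1.3


def kewley01(log_nii_ha):
    """Theoretical maximum-starburst boundary (low-z calibration), valid for x < 0.47."""
    return 0.61 / (log_nii_ha - 0.47) + 1.19


def bpt_class(nii_ha, oiii_hb):
    """Classify a galaxy from the [NII]/Halpha and [OIII]/Hbeta flux ratios."""
    x = math.log10(nii_ha)
    y = math.log10(oiii_hb)
    if x < 0.05 and y < kauffmann03(x):
        return "star-forming"
    if x < 0.47 and y < kewley01(x):
        return "composite"
    return "AGN"


print(bpt_class(10 ** -0.5, 10 ** -0.3))  # low ratios in both lines
print(bpt_class(10 ** -0.2, 1.0))         # between the two curves
print(bpt_class(1.0, 10.0))               # high-ionization lines
```

Recalibrating the dividing lines at high redshift would amount to replacing the two curve functions while keeping the same classification logic.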

Additional diagnostics: The highest priority candidates for spectroscopic follow-up are heavily obscured quasars—to date a poorly studied population—which will have very red colors and large X-ray-to-optical ratios (see Fig. 4b; also, Fiore et al. 2009, Treister et al. 2009b, Brusa et al. 2010, Civano et al. 2012, LaMassa et al. 2016). We have institutional access to dozens of nights on some of the world’s best facilities, including Keck (LRIS, DEIMOS, MOSFIRE, NIRSpec), VLT (VIMOS, FORS2, SINFONI, ISAAC, MUSE), Magellan (IMACS, FIRE), MMT Hectospec and Palomar (DoubleSpec, TripleSpec). Table 3 shows the amount of telescope time we can reasonably expect from Yale (Urry), IfA/ Hawai`i (Sanders), ESO (Schawinski), Harvard (Civano) and Chile (Treister). In addition, spectroscopy of broad-line AGN will be used to derive black hole masses (§IIc) and we will target normal (non-AGN) galaxies in all three fields for comparison samples, especially for the clustering measurement (§IIc). We will also target a smaller number of host galaxies to look for direct signs of AGN feedback.

Stripe 82X is the least complete of the three survey layers, so the majority of spectroscopy will be done in that field. Roughly 800 Stripe 82X sources were targeted with eBOSS spectroscopy last year (as part of SDSS IV), yielding ~500 additional spectroscopic redshifts. High-quality photometric redshifts are available in all 3 fields: COSMOS, Δz/(1+z)<0.02, <6% outliers, 91% complete (Salvato et al. 2009, Salvato et al. 2011); CDFS Δz/(1+z) < 0.02, < 6% outliers, 96% complete (Hsu et al. 2014); Stripe 82X, Δz/(1+z)<0.06, <11% outliers, 92% complete (Ananna, Salvato et al. 2016, in prep).
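The photometric-redshift quality figures quoted above are of the kind computed by comparing photometric to spectroscopic redshifts for a calibration sample. The sketch below uses one common convention, a normalized median absolute deviation of Δz/(1+z) and an outlier cut at |Δz|/(1+z) > 0.15; both the exact scatter definition and the 0.15 cut are assumptions here and may differ from those used in the cited catalogs.

```python
import statistics


def photoz_quality(z_spec, z_phot, outlier_cut=0.15):
    """Return (scatter, outlier fraction) for a photo-z calibration sample.

    scatter = 1.48 * median(|dz| / (1 + z_spec)), a robust sigma estimate.
    outlier_cut is an assumed threshold on |dz| / (1 + z_spec).
    """
    dz = [abs(zp - zs) / (1.0 + zs) for zs, zp in zip(z_spec, z_phot)]
    scatter = 1.48 * statistics.median(dz)
    outlier_frac = sum(1 for d in dz if d > outlier_cut) / len(dz)
    return scatter, outlier_frac


# Four made-up sources; the last one is a catastrophic outlier.
scatter, outliers = photoz_quality([1.0, 2.0, 3.0, 0.5], [1.02, 1.97, 3.04, 1.4])
print(f"scatter = {scatter:.4f}, outlier fraction = {outliers:.2f}")
```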

Informing AGN Selection in Future Surveys: We will compare AGN selection functions, especially in the X-ray, IR, and optical, to probe the characteristics of the (complete) underlying AGN population. Understanding what AGN are being selected and missed is critical to designing new surveys and interpreting results from new telescopes like JWST, WFIRST, Euclid, TMT and Athena. With the largest, most complete sample of AGN from 0< z <5 to date, we are in a unique position to provide these tools for future observations. Moreover, with our wealth of optical/NIR spectroscopy for high redshift AGN, we will identify new line diagnostics (e.g., HeII, see Oh et al. 2016) and calibrate existing ones. The BPT dividing lines are highly uncertain at high redshift. Our sample includes galaxies with high specific star-formation rates and well-quantified AGN emission (identified in the IR), allowing us to observationally test how these theoretical lines evolve for AGN and SF+AGN composites.                    

Table 3: Access to 8-10-meter Telescopes*

                 Year 1            Year 2            Year 3
Yale             7n Keck           7n Keck           7n Keck
Hawai`i          8n Keck+Subaru    8n Keck+Subaru    8n Keck+Subaru
Chile            6n VLT            6n VLT            6n VLT
Harvard/Texas    4n MMT            4n MMT            4n MMT
Total            25 nights         25 nights         25 nights

*Includes team access to 8-10m telescopes on Mauna Kea and in Chile, based on previous awards of time. Further details given in Facilities section.

Figure 4. Examples of AGN selection methods. Top left: AGN are easily distinguished from star-forming galaxies using dust temperature, here measured from mid- to far-IR data from Herschel (Kirkpatrick et al. 2013, 2015). AGN-heated dust is much warmer than cool interstellar dust where stars form. Top right: X-ray-to-optical flux ratio versus R-K for Stripe 82 X-ray sources; obscured AGN candidates are X-ray-bright and have red optical/IR colors (Brusa et al. 2010, Civano et al. 2012, LaMassa et al., in prep). Box shows obscured candidates and excludes stars (LaMassa et al. 2016b). Some have strong outflows, possibly feedback in action (Brusa et al. 2015, Perna et al. 2015). Bottom right: Rest-frame BPT diagnostic for IR-bright AGN and star-forming galaxies (Chu et al. 2016, in prep) works well: red dots are well fit by power-law AGN-heated dust emission (Kirkpatrick, priv com), while black dots are optically-selected star-forming galaxies (Steidel et al. 2014). 

IIb. Population Synthesis Using AGN SEDs to Study Black Hole Growth and Radiative Energy

A “population synthesis model” is a self-consistent phenomenological model of an evolving AGN population that can be compared to, and constrained by, observations. It attempts to describe reality—typically, the numbers of AGN as a function of luminosity and redshift, and their spectral energy distributions as a function of obscuring column, host galaxy type, etc.—before selection effects operate. We will apply specific cuts to the model (e.g., the flux limit and sky coverage of a survey), then compare to the corresponding survey data. Where there are disagreements, the model assumptions and parameters must be adjusted. Each set of cuts in a given waveband constitutes an independent constraint.

X-ray astronomers have frequently used X-ray-only population synthesis models to show that the summed X-ray spectra of a mix of absorbed and unabsorbed AGN fits the extragalactic X-ray “background” well (Setti & Woltjer 1989, Madau et al. 1994, Comastri et al. 1995, Gilli et al. 2001, 2007, Ballantyne et al. 2006, Treister & Urry 2005, Treister et al. 2009, Ueda et al. 2015). Our group built the first population synthesis model that connected X-rays to the rest of the spectral energy distribution, so that the data in every waveband constrained the model and thus the intrinsic properties of AGN (Treister et al. 2004). However, that model was constrained only by the very small GOODS survey, so it probed only moderate-luminosity AGN, and its assumptions were overly simple.

The GOODS+COSMOS+Stripe 82X survey collectively contains at least 8,000 AGN out to redshift z~5 (double that if the XMM proposal is approved), each with redshifts and fluxes in 10-30 photometric bands from radio through X-rays (Tables 1 and 2), easily an order of magnitude more data than previously available. Most importantly, it includes luminous quasars, both obscured and unobscured. By tuning a fully developed population synthesis model to match these data, we will characterize the full AGN population across redshift and luminosity, in terms of accretion rate, obscuration, host galaxy, dust content and more. Moreover, this kind of multi-wavelength model constitutes a description of IR-through-X-ray light emitted over cosmic time due to black hole accretion. Assuming an accretion efficiency (which depends on black hole spin), this translates to a description of the total mass accreted onto black holes. The model can also be disaggregated to explore the amount of radiation emitted (or black hole mass accreted) as a function of waveband, quasar luminosity, obscuring column density, instantaneous black hole mass, etc.
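The step from emitted light to accreted mass can be made explicit. For radiative efficiency η, a bolometric luminosity L_bol implies a black hole growth rate of (1 - η) L_bol / (η c^2). The sketch below evaluates this; the default η = 0.1 is a conventional assumption, not a value adopted in this proposal.

```python
C_CM_S = 2.998e10   # speed of light, cm/s
MSUN_G = 1.989e33   # solar mass, g
YEAR_S = 3.156e7    # seconds per year


def bh_growth_rate_msun_per_yr(l_bol_erg_s, efficiency=0.1):
    """Black hole mass growth rate implied by a bolometric luminosity.

    efficiency is the assumed radiative efficiency (depends on black hole spin).
    """
    mdot_g_s = (1.0 - efficiency) * l_bol_erg_s / (efficiency * C_CM_S ** 2)
    return mdot_g_s * YEAR_S / MSUN_G


# A luminous quasar with L_bol = 1e46 erg/s, for two assumed efficiencies.
for eta in (0.1, 0.05):
    rate = bh_growth_rate_msun_per_yr(1e46, eta)
    print(f"eta = {eta}: {rate:.2f} Msun/yr")
```

Integrating this rate over the luminosity function and over cosmic time gives the accreted mass density, which is why the assumed efficiency enters every comparison with the local black hole mass density.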

A comprehensive description of radiation emitted by accreting black holes over cosmic time is a critical ingredient for theorists’ assumptions about radiative feedback onto host galaxies. Cosmological simulations will be improved by using data-validated descriptions of radiation deposited into galaxies. That is one of the broad scientific impacts of this proposal.

Ingredients of the New Population Synthesis Model: The wealth of new data, especially in Stripe 82X, motivates a new population synthesis analysis, which will form the core of the graduate thesis of Yale graduate student Tonima Ananna. Two ingredients are needed: a hard X-ray luminosity function and a library of AGN spectral energy distributions (SEDs). Following Treister et al. (2004), Ballantyne et al. (2006) and Simmons et al. (2011), we will create a library of SEDs based on observed Type 1 AGN (Fig. 5a), modified by absorption at optical/ultraviolet/soft X-ray energies, plus re-radiated energy in the IR as in the clumpy-torus models of Nenkova et al. (2002) and Elitzur et al. (2006). These SEDs will be validated against the observed data in GOODS+COSMOS+Stripe 82X (Fig. 5b).

The second ingredient is the evolving hard X-ray luminosity function, for which the new Stripe 82X data are essential. Specifically, the recent luminosity function from Ueda et al. (2014) used soft X-ray-selected samples for high redshift, high luminosity AGN; instead we will use the hard X-ray-selected quasars in Stripe 82X to derive a new luminosity function that is valid at high luminosity and high redshift. Other hard-X-ray-selected samples, such as GOODS and COSMOS, contain few or no quasars because the survey volume is small.

Using these two inputs, we then simulate a universe of AGN, sample it at the wavelengths and flux limits of the GOODS, COSMOS and Stripe 82X survey layers, and compare to the observed flux and redshift distributions. This allows us to tune key parameters of the model, such as the accuracy of the overall SED shapes and the intrinsic ratio of obscured to unobscured AGN. How this ratio evolves with redshift is a strong clue to how black hole accretion evolved, and whether mergers are relevant to galaxy-AGN co-evolution.
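The simulate, sample, and compare loop can be illustrated with a deliberately crude toy. Everything quantitative below is a placeholder assumption: luminosities are drawn flat in log L rather than from a luminosity function, redshifts are uniform, obscuration is a single dimming factor rather than an SED library, and the cosmology (H0 = 70, Ωm = 0.3, flat) is assumed. The point is only the structure: an intrinsic population, a survey cut, and an observed quantity to compare with data.

```python
import math
import random

C_KM_S = 299792.458   # speed of light, km/s
MPC_CM = 3.0857e24    # centimeters per megaparsec


def luminosity_distance_cm(z, h0=70.0, omega_m=0.3, steps=200):
    """Flat LambdaCDM luminosity distance by trapezoidal integration."""
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        zi = i * dz
        inv_e = 1.0 / math.sqrt(omega_m * (1.0 + zi) ** 3 + (1.0 - omega_m))
        total += inv_e * (0.5 if i in (0, steps) else 1.0)
    comoving_mpc = (C_KM_S / h0) * total * dz
    return (1.0 + z) * comoving_mpc * MPC_CM


def simulate_survey(n, obscured_frac, flux_limit, dimming=0.1, seed=1):
    """Draw a toy AGN population and keep sources above the flux limit."""
    rng = random.Random(seed)
    detected = []
    for _ in range(n):
        z = rng.uniform(0.1, 5.0)            # placeholder redshift distribution
        log_l = rng.uniform(43.0, 46.0)      # placeholder luminosity distribution
        obscured = rng.random() < obscured_frac
        d_l = luminosity_distance_cm(z)
        flux = 10.0 ** log_l / (4.0 * math.pi * d_l ** 2)
        if obscured:
            flux *= dimming                  # hypothetical absorption factor
        if flux > flux_limit:
            detected.append((z, log_l, obscured))
    return detected


def observed_obscured_fraction(detected):
    return sum(1 for _, _, obs in detected if obs) / len(detected)


sample = simulate_survey(n=2000, obscured_frac=0.75, flux_limit=1e-14)
print(len(sample), "detected; observed obscured fraction =",
      round(observed_obscured_fraction(sample), 2))
```

Because obscured sources are dimmed, the observed obscured fraction falls below the intrinsic 0.75 put in. Tuning the intrinsic inputs until such observed quantities match the survey data in every band is the fitting step described above.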

Because of the greatly improved statistics and L-z coverage, Ms. Ananna’s population synthesis model will probe quantities previously ignored, such as how the properties of the host galaxy link to the active nucleus. We have information about the host galaxy types and masses from Stripe 82 value-added data, including Galaxy Zoo identifications. We are able to fully sample host galaxy types by combining different methods of AGN selection, particularly IR and optical, which are sensitive to different host properties. These can be tied to the black hole mass, which we will obtain from broad-line widths in a few dozen cases (§IIc), but in general will be estimated from the combination of measured luminosity of each AGN, with the host contribution carefully removed, plus a random Eddington ratio taken from the observed distribution for AGN in that luminosity range (taken from samples that have black hole mass estimates, e.g., Woo et al. 2002, 2004, Simmons et al. 2011).
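The statistical mass estimate described above amounts to inverting the definition of the Eddington ratio, λ = L_bol / L_Edd, with L_Edd ≈ 1.26 × 10^38 (M_BH/M☉) erg/s. A minimal sketch follows; the list of Eddington ratios stands in for the observed distribution and its values are hypothetical.

```python
import random

L_EDD_PER_MSUN = 1.26e38  # Eddington luminosity per solar mass, erg/s


def bh_mass_msun(l_bol_erg_s, edd_ratio):
    """Black hole mass implied by a bolometric luminosity and Eddington ratio."""
    return l_bol_erg_s / (edd_ratio * L_EDD_PER_MSUN)


def draw_bh_mass(l_bol_erg_s, observed_edd_ratios, rng):
    """Assign a mass by drawing one Eddington ratio from an observed sample."""
    return bh_mass_msun(l_bol_erg_s, rng.choice(observed_edd_ratios))


rng = random.Random(0)
edd_sample = [0.03, 0.1, 0.1, 0.3, 1.0]   # hypothetical stand-in distribution
print(f"{bh_mass_msun(1e46, 0.1):.2e} Msun at lambda = 0.1")
print(f"{draw_bh_mass(1e46, edd_sample, rng):.2e} Msun for one random draw")
```

Masses assigned this way are meaningful only for the population as a whole, not for individual objects, which is why the broad-line measurements in §IIc serve as a check.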

Quasar Luminosity Function at High Redshift: The luminosity function of X-ray-selected quasars, and thus total black hole growth, is not well constrained at high redshifts (z>2) and luminosities. Roughly speaking, the quasar luminosity function (QLF) is a broken power law with a break that evolves with redshift; accretion at the break dominates overall black hole growth, so the break is important to determine. Stripe 82X surveys a large enough volume to constrain the number density of high redshift quasars. We will empirically determine the Stripe 82X selection function (i.e., the fraction of missed quasars as a function of luminosity, redshift and location in the survey), in order to calculate the probability of detecting a quasar of a given X-ray luminosity at the flux limit of the survey (LaMassa et al. 2013b, based on Cappelluti et al. 2007 and Ranalli et al. 2013). We then generate the QLF of hard X-ray-selected quasars by summing the density contributed by each quasar, in bins of luminosity modified to account for systematic errors near the flux limits (Schmidt 1968, Page & Carrera 2000), and taking into account the shape of the probability distribution function of each photometric redshift following the prescription of Marchesi et al. (2016). We will incorporate data from the XMM-XXL survey (Georgakakis et al. 2015) and other relevant large-volume surveys, maximizing statistics while anchoring the selection functions in our derived knowledge of the SED shapes. In this way we can determine the X-ray luminosity function, the bivariate optical luminosity function, and (with multi-wavelength data) the bolometric luminosity function for a relatively unbiased sample of X-ray-selected quasars.
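The density-summing step can be sketched with the classic 1/V_max estimator of Schmidt (1968). This sketch covers only that basic estimator: it assumes each quasar's maximum accessible comoving volume has already been computed from the survey selection function, and it omits the Page & Carrera (2000) correction near the flux limit and the photometric-redshift probability weighting mentioned above. The input numbers are made up.

```python
import math
from collections import defaultdict


def binned_lf(log_lums, v_max_mpc3, bin_width=0.5, log_l_min=44.0):
    """Binned luminosity function from per-object maximum volumes.

    Returns {bin center in log L: (phi, sigma_phi)}, with phi in
    Mpc^-3 dex^-1 and sigma_phi the usual Poisson-style error.
    """
    phi = defaultdict(float)
    var = defaultdict(float)
    for log_l, v in zip(log_lums, v_max_mpc3):
        k = math.floor((log_l - log_l_min) / bin_width)
        if k < 0:
            continue  # fainter than the first bin
        phi[k] += 1.0 / v
        var[k] += 1.0 / v ** 2
    return {
        log_l_min + (k + 0.5) * bin_width: (phi[k] / bin_width,
                                            math.sqrt(var[k]) / bin_width)
        for k in sorted(phi)
    }


# Three made-up quasars: log L_X and accessible volume in Mpc^3.
lf = binned_lf([44.1, 44.3, 45.2], [1e9, 2e9, 4e9])
for center, (density, err) in lf.items():
    print(f"log L = {center:.2f}: phi = {density:.2e} +/- {err:.2e}")
```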

This will be the most accurate high-redshift QLF ever, given the hard X-ray selection and excellent statistics. The comprehensive X-ray QLF of Ueda et al. (2014) had ~25 L_X > 10^45 erg/s AGN detected at z>2; however, these quasars were selected in the soft X-ray ROSAT band (0.5-2 keV), which biases the sample toward unobscured sources in a similar way to the SDSS. Georgakakis et al. (2015) did use hard X-ray-selected AGN, from 20 deg^2 in the northern XXL field (similar depth to our Stripe 82X); they found 60 AGN at z>3, while our data will more than double that number. The QLF using the slightly smaller 10 deg^2 ChaMP survey was based on 35 z>2 quasars at L_X > 10^45 erg/s (Silverman et al. 2008), whereas we expect roughly 600 quasars in this redshift and luminosity range, roughly 5 times as many per square degree (due to deeper exposures).

Separating AGN Light from Stellar Emission: Using the extensive multi-wavelength data available in Stripe 82, we will study the SEDs of hundreds of X-ray-selected quasars at high redshift. Host galaxy light can be a significant source of contamination in the SED, particularly at optical and IR wavelengths, but a well-sampled SED can be separated into its stellar and AGN components, which each have distinct emission signatures (e.g., Hainline et al. 2011, Mullaney et al. 2011, Kirkpatrick et al. 2015). We will decompose SEDs into stars and AGN to study both dust-obscured and less luminous AGN alongside their host galaxies. Better-characterized AGN SEDs, particularly at high luminosity and redshift, will be an important product of this project, and are an essential input to the population synthesis model. Comparison of the GOODS, COSMOS (Elvis et al. 2012, Hao et al. 2010, 2013, Lusso et al. 2012) and Stripe 82X SEDs will show the relation of galaxy components to underlying AGN power.

Figure 6. We reproduce here Figure 21 from Ueda et al. (2014), which shows the evolution of the co-moving mass density in supermassive black holes as a function of redshift, for all AGN luminosities (solid black line) or separated into different luminosity bins (colored lines). According to their X-ray population synthesis model, which converts mass to light with average efficiency η=0.05, the total black hole mass accreted is dominated by moderate to high luminosity AGN, 10^45-10^47 erg/s (cyan solid and magenta dashed lines), and a large fraction of these quasars are obscured. But their quasar sample is largely unobscured, whereas Stripe 82X is designed to find obscured quasars. That is, we will find black hole growth so far unobserved, if it is there.

In some cases, it is possible to separate the black hole-powered point source cleanly from the diffuse host galaxy light, improving the SED decomposition—a technique our group pioneered (Simmons et al. 2011). A consistency check is the absorption implied by the X-ray spectrum, which is usually parameterized in terms of the equivalent hydrogen column density, NH, assuming solar abundances. Another consistency check is the ratio of Balmer lines, which gives an estimate of the reddening, AV. These can be compared to the values measured in the Milky Way (Güver & Özel 2009), and interpreted in terms of the gas-to-dust ratio in other galaxies (e.g., Maiolino et al. 2001).
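The comparison with the Milky Way can be sketched numerically. We assume here the Galactic relation N_H ≈ 2.21 × 10^21 cm^-2 × A_V, the value commonly quoted from Güver & Özel (2009); the coefficient should be checked against that paper before use, and the function names are illustrative.

```python
GALACTIC_NH_PER_AV = 2.21e21  # cm^-2 per magnitude of A_V (assumed Galactic value)


def expected_av(n_h_cm2, nh_per_av=GALACTIC_NH_PER_AV):
    """A_V expected from an X-ray column density for a Galactic gas-to-dust ratio."""
    return n_h_cm2 / nh_per_av


def gas_to_dust_relative_to_galactic(n_h_cm2, a_v_measured):
    """Ratio of the measured N_H/A_V to the Galactic value.

    Values above 1 mean less extinction per unit gas column than in the
    Milky Way.
    """
    return (n_h_cm2 / a_v_measured) / GALACTIC_NH_PER_AV


# Made-up example: N_H = 1e23 cm^-2 from the X-ray spectrum,
# A_V = 5 mag from the Balmer decrement.
print(f"expected A_V = {expected_av(1e23):.1f} mag")
print(f"N_H/A_V relative to Galactic = "
      f"{gas_to_dust_relative_to_galactic(1e23, 5.0):.1f}")
```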

IIc. The Black Hole-Host Galaxy Connection   

Overcoming Selection Biases to Get A Full AGN Sample: The AGN-star formation connection has yet to be broadly studied within galaxies for a statistical sample across cosmic time, largely due to difficulties in observing AGN in actively star-forming galaxies, particularly at z>0.5. Using deep Spitzer spectroscopy to identify buried AGN within host galaxies, Kirkpatrick et al. (2015) found that AGN lurk in at least 40% of IR-selected galaxies at z>0.5; however, less than a quarter of these would be identified with current X-ray or Spitzer IRAC selection techniques (Kirkpatrick et al. 2012, 2015). Even with deep observations, X-ray selection can miss heavily obscured AGN (e.g., Akylas et al. 2012, Assef et al. 2015). Spitzer IRAC colors provide evidence of a hot torus (e.g., Donley et al. 2012), but these colors predominantly select luminous AGN (L_X > 10^43 erg/s) and suffer from galaxy contamination in deeper data (e.g., Mendez et al. 2013). Within massive dusty galaxies, AGN and star formation may be linked through major mergers that provide the fuel for both (e.g., Draper & Ballantyne 2012). Galaxies undergoing a merger are more likely to be Compton thick, so the predicted merger link between star-forming galaxies and AGN can be missed if only X-ray detected AGN are considered (Kocevski et al. 2012, 2015). This makes identifying luminous quasars and buried AGN crucial to understanding what fueling mechanism drives black hole growth in massive star-forming host galaxies. At the same time, it appears that only the most luminous AGN are triggered by mergers; for most galaxies, high star-formation rates likely depend on gas fraction, which increases with look-back time (e.g., Treister et al. 2012, Steinborn et al. 2016). Our data span the full luminosity and mass range, and using multiple, proven AGN selection techniques, we are in a unique position to explore the AGN-host galaxy connection for a range of host galaxy and black hole properties.

Understanding the History of Black Hole Accretion: It is currently a matter of debate whether moderate luminosity AGN or quasars dominate the total mass accreted onto black holes. Theories invoking mergers suggest the accretion rate is highest during the most luminous and obscured phase, before excess gas and dust is expelled by the quasar (e.g., Hopkins et al. 2006, 2008). Earlier population synthesis models suggest substantial (perhaps most) mass accretion onto supermassive black holes occurred in obscured, high luminosity systems (Treister & Urry 2006, Treister et al. 2009a, Ballantyne et al. 2011, Ueda et al. 2014). The model from Ueda et al. (2014; see Fig. 6) strongly favors quasars as the dominant contributors, while the Treister et al. (2009a) model has roughly equal contributions from quasars and lower luminosity AGN. But even this recent luminosity function from Ueda et al. (2014) had to rely on soft-X-ray-detected AGN at high redshift and luminosity, i.e., similar to optically-selected samples. The Stripe 82X data we will use for this regime are much more sensitive to obscured quasars. There may be three times as many quasars as detected by soft X-ray experiments, as suggested by fluctuations analysis (Shafer 1983) and initial results from the Stripe 82X survey (LaMassa et al. 2013a,b), meaning it is likely that an important piece of the story of cosmic black hole growth is still missing. Fitting to the multi-wavelength Stripe 82X samples, in particular, will quantify the amount of black hole growth in rare, luminous quasars, and allow us to estimate for the first time the fraction of black hole growth that is obscured at high luminosity and high redshift. In this context, it is notable that recent NuSTAR data show many of the brightest obscured AGN accreting at very high rates (Arevalo et al. 2014, Puccetti et al. 2014, Bauer et al. 2015).

We will produce a data-validated description of when and where supermassive black holes grew, i.e., the evolving mass function. This is one of the broader impacts from our proposal.

Additionally, although the majority of present-day black hole and stellar mass formed contemporaneously at z~1–3, it is not yet clear whether the bulk of black hole growth occurs in the same IR-luminous galaxies that dominate the buildup of stellar mass. Using our IR-identified AGN and host galaxies we will measure, for the first time, the concurrent black hole accretion rate and star-formation rate density in individual galaxies across cosmic time. Because GOODS and COSMOS are such well-studied fields, IR luminosity functions already exist (which can be converted into star-formation rate densities). Our black hole accretion rate measurements and updated IR AGN luminosity functions will enable us to measure the accretion rate density over time in these AGN+SF composite systems. Comparing to quasar accretion rates (described above), we can quantify which population of AGN dominate black hole growth.

Black Hole Mass Estimates and Evolving Mass Functions at High Redshift: For quasars at z~2-3, we use IR spectroscopy to measure broad emission-line widths with which to estimate the black hole masses. Specifically, we have been using Keck and VLT IR spectrometers to measure broad Hα and Hβ lines for Type 1 quasars, then applying the virial theorem to get M_BH (Trakhtenbrot & Netzer 2012). Previously, we targeted AGN in the COSMOS field; now we will target Stripe 82X quasars to explore the black hole mass function at much higher luminosity. In some cases, these quasars will be brighter in the IR and thus accessible with Palomar or other 4-meter class telescopes that have excellent instrumentation. The black hole masses provide a direct check on the efficiency of converting mass to light and an independent constraint on the global accretion of mass onto black holes, which can be compared to the comprehensive population synthesis models discussed earlier. Comparing black hole mass functions and AGN SEDs in several redshift bins, we can constrain the evolving distribution of Eddington ratios, providing an important constraint on theories of accretion (another broader impact of this proposal). In addition, the wealth of multi-wavelength data allows us to measure the mass function of the host galaxies in our samples (to which we can add galaxy mass functions in the literature), and thus to determine whether black hole mass evolves proportionally with galaxy mass, or whether smaller galaxies hosted larger black holes in the past. This could shed light on when the M-σ relation fell into place (another broader impact).
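The virial estimate has the generic form M_BH = f R_BLR v^2 / G, with v taken from the broad-line width. The sketch below evaluates that generic form only. Single-epoch estimators such as those of Trakhtenbrot & Netzer (2012) infer R_BLR from a continuum luminosity through a calibrated radius-luminosity relation, which is not reproduced here, so R_BLR is taken as an input; the virial factor f = 1 is a placeholder assumption.

```python
G_CGS = 6.674e-8         # gravitational constant, cm^3 g^-1 s^-2
MSUN_G = 1.989e33        # solar mass, g
LIGHT_DAY_CM = 2.590e15  # one light-day in cm


def virial_bh_mass_msun(fwhm_km_s, r_blr_light_days, f_virial=1.0):
    """Virial black hole mass from a broad-line FWHM and a BLR radius.

    f_virial is a geometry-dependent factor (placeholder default of 1).
    """
    v_cm_s = fwhm_km_s * 1.0e5
    r_cm = r_blr_light_days * LIGHT_DAY_CM
    return f_virial * r_cm * v_cm_s ** 2 / G_CGS / MSUN_G


# Made-up example: FWHM = 4000 km/s and R_BLR = 100 light-days.
print(f"M_BH = {virial_bh_mass_msun(4000.0, 100.0):.2e} Msun")
```

Because the mass scales as the square of the line width, the spectral resolution and signal-to-noise of the IR spectra matter more than the uncertainty on R_BLR for individual objects.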

Measuring the Clustering of High-Redshift AGN: The study of AGN clustering and evolution is a powerful tool to understand, from a statistical point of view, what kind of environment is most likely to host AGN and how gas loses its angular momentum, is funneled to the central region of the galaxy, and becomes available for accretion. The proposed work is unprecedented in combining data on large scales with relatively unbiased samples. So far, different AGN selection methods have yielded discrepant results. In the same redshift bin, optically selected, luminous AGN (Lbol > 10^46 erg/s) are found to be hosted in dark matter (DM) halos with masses log(Mh/M⊙) = 12.0-12.5, consistent with being triggered by mergers (Shanks et al. 2011). X-ray-selected, moderate-luminosity AGN (LX = 10^44-10^45 erg/s; Lbol = 10^45 erg/s) reside in more massive halos (log(Mh/M⊙) ~ 13.5), suggesting the AGN are triggered by tidal disruptions or disk instabilities (Cappelluti et al. 2012). These discrepancies likely arise from the different efficiencies of X-ray and optical selection and the very different volumes probed. Moreover, the X-ray selection function is better determined than the optical one because of lower contamination and homogeneous depth. Unfortunately, knowledge of the redshift evolution of the bias associated with X-ray-selected AGN is limited to redshifts z < 2 (Allevato et al. 2011, 2014, Cappelluti et al. 2012); over that range, bias increases with redshift and with the mass of the DM halo. However, these trends cannot continue to early times in the Universe because the number density of AGN rises quickly with increasing redshift, even as the mass of the DM halo falls rapidly with redshift. Clustering measurements at 2 < z < 4 then become important for determining whether the typical mass of a DM halo hosting an AGN changes and/or whether the occupation number of AGN per DM halo evolves as a function of redshift. (Clustering of X-ray-selected AGN or quasars is poorly known for z > 2.5; cf. Allevato et al. 2014.) This measurement bears directly on our understanding of the interplay between the growth of SMBH and large-scale structure.

We propose to measure the 2-point spatial correlation function (2pcf) of Stripe 82X AGN. From this analysis, we will be able to derive the average mass of the host dark matter halos and the number of AGN per mass interval at 2 < z < 4. Following our campaign of spectroscopic/photometric identification of Stripe 82X X-ray sources, we expect <10% accuracy for the z > 2 2pcf of X-ray-selected AGN at all luminosities (see Fig. 7). Because of its large number of sources, Stripe 82X will provide at least a factor of 16 improvement over previous work (e.g., Allevato et al. 2014). In particular, this program will allow detailed study of the halo occupation distribution (HOD, i.e., the probability that a dark matter halo of a given mass hosts a central and satellite AGN above a given luminosity; Cooray & Sheth 2002, Allevato et al. 2012). The HOD describes the physical relation between AGN and dark matter halos at the level of individual halos, which will allow us to:

  • search for distinct evolutionary paths in the HOD as a function of luminosity, where we have enough sources to ensure the results are not biased by small-number statistics;
  • determine the dominant AGN triggering mechanism (mergers vs. secular processes) by analyzing the HOD as a function of Lbol, z, and AGN selection, and comparing results with semi-analytic models.
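
To make the 2pcf measurement concrete, the sketch below implements the widely used Landy-Szalay estimator, ξ = (DD − 2DR + RR)/RR, on normalized pair counts. The choice of estimator is an assumption made for illustration; the text above does not specify one. Positions are taken to be comoving Cartesian coordinates, the random catalog is assumed to follow the survey selection function, and the brute-force pair counting is suitable only for small samples. All function names are hypothetical.

```python
import math
from itertools import combinations, product


def pair_counts(sample_a, sample_b, edges, auto):
    """Count pairs per separation bin.

    auto=True counts unique pairs within sample_a (sample_b is ignored);
    auto=False counts all cross pairs between sample_a and sample_b.
    """
    counts = [0] * (len(edges) - 1)
    pairs = combinations(sample_a, 2) if auto else product(sample_a, sample_b)
    for p, q in pairs:
        sep = math.dist(p, q)
        for i in range(len(counts)):
            if edges[i] <= sep < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def landy_szalay(data, randoms, edges):
    """Landy-Szalay estimate of xi(r) in each separation bin.

    Bins with no random-random pairs are returned as NaN.
    """
    n_d, n_r = len(data), len(randoms)
    dd = pair_counts(data, data, edges, auto=True)
    rr = pair_counts(randoms, randoms, edges, auto=True)
    dr = pair_counts(data, randoms, edges, auto=False)
    norm_dd = n_d * (n_d - 1) / 2.0   # number of unique data pairs
    norm_rr = n_r * (n_r - 1) / 2.0   # number of unique random pairs
    norm_dr = float(n_d * n_r)        # number of data-random pairs
    xi = []
    for d, c, r in zip(dd, dr, rr):
        if r == 0:
            xi.append(float("nan"))
            continue
        rr_n = r / norm_rr
        xi.append((d / norm_dd - 2.0 * c / norm_dr + rr_n) / rr_n)
    return xi
```

For an unclustered (uniform) data sample, the estimator should scatter around zero, with a fractional uncertainty that shrinks as the number of pairs per bin grows; this is why the larger Stripe 82X sample tightens the measurement.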

We will thereby achieve statistics similar to those of optical and infrared quasar samples that cover from ~1/3 of the sky to the full sky, breaking the degeneracy with AGN selection method (see, e.g., Mendez et al. 2016; our sample will be larger and will probe higher redshifts and luminosities). We will also be able to address a current controversy in the community: the simplest AGN unification models posit that obscured and unobscured AGN should share the same distribution of environments (when selection effects are properly taken into account). On the other hand, if obscured AGN are preferentially found in an earlier evolutionary phase than unobscured AGN, their clustering properties can differ. Thus far, results in the literature have been contradictory, perhaps because they were limited to small samples or did not account for the different volumes sampled as a result of the different selection effects on each type (e.g., Hickox et al. 2011, DiPompeo et al. 2014, Donoso et al. 2013). As with the mass functions, we can use the rich body of literature on the GOODS and COSMOS fields to compare our quasar clustering measurements with the clustering of massive galaxies, to form a more complete picture of whether AGN live in particularly special environments compared with non-active galaxies. This sort of broad environmental picture naturally complements more detailed measurements of the AGN merger fraction with time, made possible with morphological imaging (GOODS+COSMOS+Stripe 82X AGN already have morphological classifications). Comparing to the merger fraction of massive galaxies with time can yield better constraints on whether mergers are needed to trigger the most luminous AGN, even in the presence of high gas fractions, as predicted (Steinborn et al. 2016).

[Figure 7: image missing]

Figure 7. Solid lines show the measured fractional error on the spatial correlation function of Stripe 82X quasars in the redshift bins z=1-2 and z=2-4 in black and red, respectively. Dashed lines show the simulated accuracies for our proposed program in the same redshift bins. Current data give a precision of ~30% on large scales, which will improve to ~3% with our S82X data.

IId. Future Work

Our comprehensive population synthesis model will allow for many future studies, including short-term projects for undergraduates and first-year graduate students. Specifying the exact student projects is somewhat speculative, since building the model is a necessary first step. With 1-2 orders of magnitude more data, from GOODS+COSMOS+Stripe 82X, as well as CANDELS, AEGIS, ECDFS and other deep fields, plus the Boötes and XMM-XXL large fields (~25 deg^2 each), we will understand black hole growth far better than was possible a decade ago. We here describe two of the most interesting questions that we expect to fuel exciting future research projects.

(1) How is galaxy mass linked to black hole mass? This connection is suggested by the local MBH-σ* relation, but recent results suggest this relation can be strongly violated in individual systems (van den Bosch et al. 2012, Trakhtenbrot et al. 2015). We can investigate whether the population synthesis model requires (or just allows) that the local MBH-σ* relation is obeyed.

(2) What are the relations among the amount of dust in a given system, the dust-to-gas ratio, the star-formation rate, the accretion and obscuration geometry (which could depend on AGN luminosity, as suggested by Lawrence 1991), and perhaps even accretion rate and efficiency (spin)? To add statistical power, we will assume all these variables have smooth distributions beyond the observable limits; for example, the observed NH distribution cuts off near 10^24 atoms/cm^2 because of absorption, but galaxies surely don’t “know” about Compton scattering, meaning the distribution from 10^23 to 10^25 atoms/cm^2 must be flat or changing gradually.

Finally, we will use the results of the proposed project to write compelling JWST proposals. First, we will propose for mid-IR spectra in order to measure the fine-structure Si and Ne lines, with which we can probe metallicity and the ionization parameter (estimated to be higher in high-redshift star-forming galaxies; Cullen et al. 2016). Second, we will propose to use the MIRI integral-field capability to look for shocks (traced by molecular hydrogen) in composite AGN (near the BPT dividing line), to test whether the increased ionization is really due to an accreting black hole.

IIe. Summary and Deliverables

We have described a large ground-based observing campaign (90 nights with 6-10-meter telescopes, achievable through guaranteed time) and theoretical analysis of an evolving population of AGN, designed to provide a more complete picture of black hole growth since z~5. Roughly half the observations will be spectroscopic follow-up of Stripe 82X, to complete the census of the most luminous sources at all redshifts, while the other half will focus on detailed observations of the most interesting objects from all three cake layers. Nearly 2000 2-dimensional spectra from FMOS-COSMOS are available for analysis starting immediately in Year 1, with ~2000 new spectra (focused on targets at high redshift) to be obtained in Years 1-3 with MOSFIRE and DEIMOS on Keck and, if still available, MOIRCS on Subaru. A more complete timeline of our observing plan and data analysis plan is presented in Sec. IIg.

These deliverables will be made available to the astronomical community:

  • Multi-wavelength catalogs of AGN in GOODS+COSMOS+Stripe 82X, selected in complementary ways (X-ray, IR, radio).
  • Calibrated radio-to-X-ray SEDs for up to ~16,000 unique AGN and quasars, across a broad range of luminosity, redshift, and type.
  • New optical and near-IR spectra for ~8000 AGN in GOODS+COSMOS+Stripe 82X, including line widths, line strengths, reddening, MBH and Lacc.
  • A comprehensive population synthesis model, fitted to the GOODS+COSMOS+Stripe 82X AGN samples, which will indicate the amount of black hole growth as a function of obscuration, luminosity, waveband, and redshift.
  • Evolving black hole mass and quasar luminosity functions at high redshift.
  • Halo masses and bias from clustering analysis.
  • Published analyses and all reduced data sets for the above.

IIf. Broader Impacts of the Proposed Work

We mentioned above some of the broader impacts of our proposed scientific investigation, such as data-validated measures of the radiation history and mass accretion history of supermassive black holes over cosmic time, which will impact cosmological simulations of galaxy evolution. In this section, we focus on the broader impacts of our efforts to improve equity and inclusion in the astrophysics enterprise.

Increasing Diversity in STEM with the Meyerhoff Scholars Summer Program: As part of this proposal, we will bring up to 6 undergraduates each summer from the University of Maryland, Baltimore County (UMBC) Meyerhoff Scholars Program to Yale and the U. Hawai`i Institute for Astronomy (IfA) to work on a research project under the supervision of a carefully chosen faculty mentor. The goals are to de-mystify the off-putting aura of a highly ranked, research-oriented, Ivy League university like Yale or a top-ranked astronomy program like IfA, and to provide the students with a cutting-edge research experience that feeds their interest in science, in a carefully created environment of support and encouragement. Here we describe the UMBC Meyerhoff Scholars program and detail specific plans to ensure the students have a successful research experience.

Briefly, Meyerhoff Scholars are a competitively selected cohort of students who plan to pursue a PhD in science, technology, engineering or math (the STEM fields). As undergraduates, these honor students are immersed in a rigorous academic program and provided with the resources and mentoring necessary for success. Applicants to the program must have an interest in advancing minorities in those fields, which in practice means many Meyerhoff Scholars are from groups traditionally under-represented in STEM. Thus their individual successes make a measurable difference in the much-too-slow rate at which under-represented minorities earn advanced degrees in STEM (see Table 4). Part of the students’ success comes from the sense of belonging and a supportive community of classmates and Meyerhoff alumni. Graduate programs at Yale or the Institute for Astronomy can inspire trepidation, so Urry and Sanders will carefully select mentors who will provide a positive, supportive environment. Harmon, Turner and Johnson will select students so that the equation makes sense at both ends.

Table 4. Results from the University of Maryland, Baltimore County Meyerhoff Scholars Program

Meyerhoff Program results over 4 years are impressive: 126 Physical Sciences majors have graduated (109 were under-represented minorities or women), including 30 physics majors (22 URM + women).

Professional Development for Summer Researchers: At Yale and UH we regularly do collective training for incoming summer students. As part of this proposal, we will expand that training into a one-week “Granville Academy” at Yale, named after Evelyn Boyd Granville, one of the first African-American women to earn a PhD in Mathematics (Yale, 1949). Inspired by the Banneker[2] and Aztlán[3] Institutes, created by John Johnson and Jorge Moreno, respectively, Louise Edwards (recently at Yale, now at Cal Poly San Luis Obispo) has created a similar set of activities that mix learning about astronomical tools and social activism. Topics for sessions include Introductions and Ground Rules; Notebooks and Library Tools; Demographics, Statistics and the Scientific Process; Python; Privilege, Implicit Bias and Stereotype Threat; Image Processing; and Teaching Techniques and Public Engagement. Summer students from Yale and Cal Poly San Luis Obispo will interact with Yale graduate students and postdocs (who will teach some sessions), helping build networks. Throughout the summer we will provide other career development activities, as part of the UH IfA REU program and the Yale SURF program.

The Yale SURF Program, operating for 18 years, is led by Collaborator Michelle Nearon, Director of Yale’s Office for Diversity and Equal Opportunity. Yale SURF undergraduates spend 8 weeks at Yale, becoming familiar with the kind of work they can expect in graduate school, learning the many steps involved in building a career, and developing confidence in their own abilities and potential. Examples of career development activities provided by the SURF Program include writing a research proposal or abstract, preparing and delivering oral presentations, and crafting a strong application to graduate school. Several Meyerhoff Scholars participated in the SURF Program in past years and Urry has hosted SURF Program researchers in previous summers. All students develop a proposal, give a final presentation to their peers, submit a written final paper, and attend the Leadership Alliance National Symposium to present their research.

The University of Hawai`i, Institute for Astronomy Summer Research-Teaching Programs: The IfA partners with the Akamai Workforce Initiative to advance students into science and technology careers. College students are offered internships to gain summer work experience at an observatory, company or scientific/technical facility in Hawai`i. Each intern is carefully matched with a 7-week project and a mentor who supervises the work and integrates the intern into the work environment. The internships are led by Institute faculty, staff and engineers in the community, with overall supervision provided by Dr. Paul Coleman. Meyerhoff Scholars who spend the summer at the IfA will participate in the Akamai program, on projects related to this proposal.

Commitment to Diversity, Equity and Inclusion: A number of the proposers are known for their efforts to increase diversity in astrophysics. Coleman has connected members of the local community, including Native Hawai`ians, to the world of professional astronomy. Urry organized the first conference on Women in Astronomy in 1992 (and co-organized the next two); co-wrote the Baltimore Charter; and served on numerous women-in-science committees. Now that (mostly white) women have risen to approximate parity in the undergraduate ranks, and are visible in healthy and increasing numbers at higher levels as well, it is past time to focus on the severe deficit of minority scholars (of African American, Native American, Hispanic, and Asian/Pacific Islander heritage, as well as the LGBT+ community, people with disabilities, and other underrepresented groups). This is why we are focusing our collective efforts, at the retail level, student by student, on increasing retention of promising young students from under-represented communities, and preparing the ground for them to matriculate in graduate programs at the most highly ranked research universities. Urry, Sanders, and other collaborators have mentored many graduate students and postdocs; Urry was nominated as outstanding postdoctoral advisor at Yale. Collectively we advise dozens of undergraduates every year, as well as outside college students, high school students and quite a few women and URM scientists at various career stages. Urry is currently Past President of the American Astronomical Society, where her two principal focus issues were diversity and connecting academic astronomy more closely with astronomy alumni who are employed in other sectors.

IIg. Work Plan, Effort, and Timeline

Principal contributions and expertise of team members are as follows:

Meg Urry (PI, Yale): Expert on AGN unification and evolution; co-designer of the GOODS survey, member of COSMOS team, designed the Stripe 82X survey. Responsible for the overall management of the project, postdoc and graduate student mentoring, and ensuring timely dissemination of the results.

Dave Sanders (PI, U. Hawai`i): Expert on IR quasars, Luminous Infrared Galaxies, and molecular gas and star formation in galaxies. Principal member of the COSMOS Collaboration. Responsible for overall management of the proposed COSMOS observations, and Hawai`i graduate student mentoring.

Jane Turner (PI, UMBC): Expert on X-ray and multi-wavelength studies of AGN. Will help select UMBC Meyerhoff Scholars for the Yale and U Hawai`i summer research programs.

Stephanie LaMassa (funded collaborator, NASA Goddard): Co-leader of Stripe 82X survey, expert on X-ray data analysis, leading the quasar luminosity function project.

Unfunded Collaborators: This project benefits greatly from collaboration with a broad international team who contribute expertise on a wide range of X-ray and AGN related topics, who have access to ESO and Chilean observational facilities, and who are experts on diversity and mentoring. Their specific roles are outlined in the Facilities section.

Timeline:

Year 1: For Stripe 82X, we will propose for more telescope time for spectroscopy, observe, reduce spectra and write up results for publication. LaMassa will lead this effort (working with students). We will publish photometric redshifts for the Stripe 82X catalog, obtained by Ananna working with Salvato. Ananna will construct SEDs and code the population synthesis model. LaMassa will lead the work on the X-ray QLF at z >2 and z >3. Trakhtenbrot, Civano and Urry will continue their work on the black hole mass function. Cappelluti will explore clustering on the three spatial scales of the Wedding Cake.

Analysis of the large volume of FMOS-COSMOS NIR-spectroscopic data already in hand (~2000 targets; ~30 nights of FMOS-HiRes data) will be completed at the IfA (Sanders) and RIT (Kartaltepe), with the help of the Hawai`i graduate student. These data will be used to construct SED templates for obscured AGN at z <2, selected by different techniques (e.g., X-ray, far-IR, radio, near-IR colors). Ten nights of Keck-MOSFIRE time will be proposed for obtaining rest-frame optical emission line diagnostics for ~800 additional COSMOS sources at higher redshift (z ~2-4).

Year 2: Hawai`i will continue to propose for 10 nights of Keck-MOSFIRE time to obtain spectroscopic observations of ~800 additional COSMOS X-ray, far-IR, radio and near-IR–selected targets at z >2. The black hole mass function, from fitting broad-line AGN found in existing databases and those that we target in dedicated follow-up observations, will be completed (Trakhtenbrot, Civano, Urry). Yale and UMBC summer students will do this work under coordinated supervision from the lead collaborators.

Year 3: Over the 3 years of NSF funding, Ananna will complete the population synthesis modeling and compare it to survey data to determine the global accretion history of black holes, explore uncertainties, and compare the energy input from AGN with that from stars. If she has additional time, she will develop a fully parameterized, self-consistent analytic description of black hole Eddington ratios and the black hole mass function over cosmic time. Hawai`i will continue to propose for 10 nights of Keck-MOSFIRE time to obtain spectroscopic observations of an additional ~800 COSMOS X-ray, far-IR, radio and near-IR–selected targets at z >2. The Hawai`i graduate student will incorporate all of the MOSFIRE observations of the COSMOS field into their PhD thesis, which is expected to be completed by the end of Year 3.

Throughout this time-frame, we plan to publish results from our ground-based optical and near-IR observing runs on an annual basis, building a statistically significant sample of obscured black hole growth at cosmological distances. Additionally, unexpected discoveries in a dataset this rich are bound to occur and will be published as they are uncovered.

IIh. Prior NSF Support

No PI/Co-PI has received a previous NSF grant in this area.

[1] We use the term “obscured” to mean roughly NH > 10^22 cm^-2, and more specifically, “unobscured” means AV < 1 or NH < 10^21 cm^-2; moderately obscured means 10^21 cm^-2 < NH < 10^23 cm^-2; and heavily obscured means NH > 10^23 cm^-2. Also, “Type 1” (“Type 2”) AGN are defined by the presence (absence) of broad emission lines in optical spectra.