Speckle-free holography with partially coherent light sources and camera-in-the-loop calibration

A holographic display combines artificial intelligence with partially coherent light sources to reduce speckle.

This supplementary document includes complementary derivations related to propagation and optimization models of a partially coherent holographic system, implementation details of both display hardware and computer-generated hologram algorithms, and additional experimental results.
Here we list the abbreviations and notations used across this document.
• SLM: a spatial light modulator
• CGH: computer-generated holography
• LED: light emitting diode
• SLED: super-luminescent light emitting diode
• ASM: the angular spectrum method [9]
• CITL: camera-in-the-loop optimization
• SGD: stochastic gradient descent-based phase retrieval
• laser CITL: an optical implementation that uses a laser and SGD-based CITL optimization
• LED CITL: an optical implementation that uses an LED and SGD-based CITL optimization
• SLED CITL: an optical implementation that uses an SLED and SGD-based CITL optimization
• Coherent Model: propagation model that considers only one single wavelength emitting from one spatial point
• Partially Coherent Model: propagation model that considers multiple wavelengths emitting from multiple spatial points

DERIVATION OF PARTIALLY COHERENT HOLOGRAM SYNTHESIS
Throughout this work, we consider an electronic holographic display system that uses a single phase-only SLM. In this system, either a coherent laser or a partially coherent source, such as an LED or SLED, is used. The coherent or partially coherent source field is incident on the SLM, where the phase is delayed in a spatially varying manner. The field then propagates in free space over some distance to a target plane, where a user or a detector observes the intensity. Thus, we consider a free-space propagation-type holographic display system as opposed to a Fourier-type system.

A. Coherent Image Formation and Inversion
To model the free-space propagation between the source plane, i.e., the SLM, and the target plane, the ASM [9,26] is applied. For a coherent source $u_{\mathrm{src}}(x, y, \lambda)$, this model propagates the field by a fixed distance $z$ as

$$ g_c(\phi, u_{\mathrm{src}}, \lambda) = \iint \mathcal{F}\!\left( u_{\mathrm{src}}(x, y, \lambda)\, e^{i\phi(x, y, \lambda)} \right) H(k_x, k_y, \lambda)\, e^{i 2\pi (k_x x + k_y y)}\, dk_x\, dk_y, \tag{S1} $$

where $\lambda$ is the wavelength, $k_x$ and $k_y$ are spatial frequencies, $H$ is the transfer function, and $\mathcal{F}(\cdot)$ denotes the Fourier transform. In this coherent case, a phase retrieval algorithm aims to estimate the SLM phase pattern $\phi$ that minimizes the error with respect to some target amplitude $a_{\mathrm{target}}$ by solving the optimization problem

$$ \underset{\phi}{\text{minimize}} \;\; \mathcal{L}\!\left( \left| g_c(\phi, u_{\mathrm{src}}, \lambda) \right|,\; a_{\mathrm{target}} \right), \tag{S2} $$

where $\mathcal{L}(\cdot)$ is a loss function, such as the $\ell_2$ norm. Numerous approaches to solving phase retrieval problems of this form have been proposed in the literature [30]. However, even the most sophisticated algorithms cannot prevent speckle in the physical display system, because speckle is caused by the coherence of the light source itself.
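The coherent model of Eq. S1 can be sketched numerically as follows. This is a minimal NumPy illustration rather than the authors' implementation: the function names are hypothetical, and the transfer function is assumed to take the standard angular-spectrum form $H = \exp\!\big(i \tfrac{2\pi}{\lambda} z \sqrt{1 - (\lambda k_x)^2 - (\lambda k_y)^2}\big)$ with evanescent components set to zero, which the text above does not spell out.

```python
import numpy as np

def asm_transfer_function(shape, pitch, wavelength, z):
    """Assumed standard ASM transfer function H(kx, ky, lambda) for distance z."""
    ny, nx = shape
    kx = np.fft.fftfreq(nx, d=pitch)  # spatial frequencies in cycles per metre
    ky = np.fft.fftfreq(ny, d=pitch)
    kxx, kyy = np.meshgrid(kx, ky)    # both arrays have shape (ny, nx)
    arg = 1.0 - (wavelength * kxx) ** 2 - (wavelength * kyy) ** 2
    propagating = arg > 0             # evanescent components are dropped
    kz = 2.0 * np.pi / wavelength * np.sqrt(np.where(propagating, arg, 0.0))
    return np.where(propagating, np.exp(1j * kz * z), 0.0)

def propagate_coherent(phase, u_src, pitch, wavelength, z):
    """Eq. S1 on a grid: FFT of the modulated field, multiply by H, inverse FFT."""
    field = u_src * np.exp(1j * phase)
    H = asm_transfer_function(field.shape, pitch, wavelength, z)
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

For example, `propagate_coherent(phase, np.ones_like(phase), 6.4e-6, 532e-9, z)` propagates a unit-amplitude plane wave modulated by `phase`, using the 6.4 µm pixel pitch of the SLM described in this document; the distance `z` is a free parameter here.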

B. Partially Coherent Image Formation and Inversion
A partially coherent source, e.g., an LED or SLED, may have both a reasonably broad emission spectrum $q(\lambda)$ and a finite-sized area over which it emits light with some spatial intensity profile. Thus, it can exhibit partial coherence in the spatial domain, the temporal domain, or both. We assume that the light source is reasonably well collimated, such that the spatial emission profile directly maps to the angular profile of the source field $w(\omega)$, where $\omega = (\omega_x, \omega_y)$ is a vector indicating a two-dimensional direction. The specific mapping from area to angle depends on the focal length of the lens that collimates the source (see Eq. S8). Taking this into account, we derive the image intensity $\hat{I}$ using a partially coherent image formation model $g$ as

$$ \hat{I} = \left| g(\phi) \right|^2 = \iint \left| g_c\!\left( \phi,\; u_{\mathrm{src}}\, e^{i(\omega_x x + \omega_y y)},\; \lambda \right) \right|^2 w(\omega)\, q(\lambda)\, d\omega\, d\lambda, \tag{S3} $$

where $e^{i(\omega_x x + \omega_y y)}$ models a tilted plane wave propagating into direction $\omega$, and $w$ is the relative intensity of the source field along direction $\omega$.
In practice, we deal with discrete signals, including the SLM phase $\phi$, sampled wavelengths $\{\lambda_n\}_{n=1}^{N}$, and sampled angles $\{\omega^{(k)}\}_{k=1}^{K}$, so Eq. S3 can be represented as a weighted sum over the spectral and spatial domains:

$$ \hat{I} = \sum_{n=1}^{N} \sum_{k=1}^{K} \left| g_c\!\left( \phi,\; u_{\mathrm{src}}\, e^{i(\omega^{(k)}_x x + \omega^{(k)}_y y)},\; \lambda_n \right) \right|^2 w\!\left(\omega^{(k)}\right) q(\lambda_n). \tag{S4} $$

A phase retrieval algorithm that takes partial coherence into account can then be formulated as

$$ \underset{\phi}{\text{minimize}} \;\; \mathcal{L}\!\left( \sqrt{\hat{I}(\phi)},\; a_{\mathrm{target}} \right). \tag{S5} $$

Problem S5 can be solved efficiently with many existing approaches. In this work, we employ the SGD algorithm because it has recently been demonstrated as an intuitive and robust solver for computer-generated holography problems [23]. What makes this problem challenging is the integration over the wavelength spectrum and over the incident angles, the very properties that provide the benefit of speckle reduction. These integrals make the partially coherent formulation more akin to a deconvolution problem embedded in a phase retrieval problem than to the coherent phase retrieval problem alone. Next, we describe how to characterize the propagation in the discrete spectral and spatial domains, respectively.

Derivation for temporal incoherence. For a broadband incident beam emitted from a point source, and ignoring the source field $u_{\mathrm{src}}$, the forward wave propagation model incorporates a sum of propagated intensities over multiple sampled wavelengths,

$$ \hat{I} = \sum_{n=1}^{N} q(\lambda_n) \left| \iint \mathcal{F}\!\left( e^{i\phi(x, y, \lambda_n)} \right) H_{\lambda_n}(k_x, k_y)\, e^{i 2\pi (k_x x + k_y y)}\, dk_x\, dk_y \right|^2, \tag{S6} $$

where $H_{\lambda_n}$ is the transfer function for wavelength $\lambda_n$.
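The weighted sum of Eq. S4 can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the source field is taken as unit amplitude, the SLM phase is treated as wavelength independent, each angle is given directly as $(\omega_x, \omega_y)$ in rad/m, and the transfer function uses the standard angular-spectrum form.

```python
import numpy as np

def asm(field, pitch, wavelength, z):
    """Angular spectrum propagation with an assumed standard transfer function."""
    ny, nx = field.shape
    kxx, kyy = np.meshgrid(np.fft.fftfreq(nx, d=pitch), np.fft.fftfreq(ny, d=pitch))
    arg = 1.0 - (wavelength * kxx) ** 2 - (wavelength * kyy) ** 2
    phase_term = np.exp(2j * np.pi / wavelength * np.sqrt(np.abs(arg)) * z)
    H = np.where(arg > 0, phase_term, 0.0)  # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)

def partially_coherent_intensity(phase, pitch, z, wavelengths, q, angles, w):
    """Eq. S4: sum of coherent intensities weighted by q(lambda_n) and w(omega_k)."""
    ny, nx = phase.shape
    y, x = np.meshgrid(np.arange(ny) * pitch, np.arange(nx) * pitch, indexing="ij")
    intensity = np.zeros(phase.shape)
    for lam, q_n in zip(wavelengths, q):
        for (wx, wy), w_k in zip(angles, w):
            tilt = np.exp(1j * (wx * x + wy * y))  # tilted plane wave for direction omega
            field = asm(tilt * np.exp(1j * phase), pitch, lam, z)
            intensity += q_n * w_k * np.abs(field) ** 2
    return intensity
```

The double loop makes the cost of N × K propagations explicit, which is the memory and time burden that motivates the stochastic sampling described later in this section.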
Derivation for spatial incoherence. Here, we consider a single wavelength and multiple mutually incoherent sources that are laterally shifted at the source plane. The wave emitted from an ideal point source at the center, i.e., $\delta(r_0)$, is first collimated by a collimating lens and hits the SLM as a plane wave,

$$ \underbrace{a(r_0)\, \delta(r_0 - 0)}_{\text{source plane}} \;\longrightarrow\; \underbrace{c \cdot a(0)}_{\text{SLM plane}}, \tag{S7} $$

where $a$ indicates the spatial amplitude profile of the finite source, and $r$ and $r_0$ are the coordinates at the SLM plane and the source plane, respectively. $r_s$ indicates the lateral shift of the point light source, which is 0 here, and $c$ is a constant. If the point light source is shifted by $r_s \neq 0$, the wave incident on the SLM becomes a tilted plane wave [31],

$$ \underbrace{a(r_0)\, \delta(r_0 - r_s)}_{\text{source plane}} \;\longrightarrow\; \underbrace{c \cdot a(r_s)\, e^{\, i \frac{2\pi}{\lambda f_c} r_s \cdot r}}_{\text{SLM plane}}, \tag{S8} $$

where $f_c$ is the focal length of the collimating lens. This implies that a lateral shift of the light source leads to a tilted plane wave with direction $\omega = 2\pi r_s / (\lambda f_c)$, and that the angular profile of the source intensity $w$ maps to the spatial amplitude profile at the pinhole as $w(\omega) = (c \cdot a(r_s))^2$.
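As a worked example of the mapping $\omega = 2\pi r_s / (\lambda f_c)$, the short sketch below evaluates it for a source point at the edge of the 75 µm pinhole (so $r_s$ = 37.5 µm), the 200 mm collimating lens, and the 532 nm filter mentioned in this document. The function name is hypothetical, and the tilt angle $\theta = r_s / f_c$ is the usual paraxial approximation, added here for intuition.

```python
import math

def tilt_from_source_shift(r_s, wavelength, f_c):
    """Return (omega, theta) for a source point shifted laterally by r_s.

    omega = 2*pi*r_s / (wavelength * f_c) is the phase ramp in rad/m on the SLM;
    theta = r_s / f_c is the corresponding paraxial tilt angle in radians.
    """
    omega = 2.0 * math.pi * r_s / (wavelength * f_c)
    theta = r_s / f_c
    return omega, theta

# Edge of a 75 um pinhole, 532 nm light, 200 mm collimating lens.
omega, theta = tilt_from_source_shift(37.5e-6, 532e-9, 200e-3)
print(f"omega = {omega:.1f} rad/m, theta = {theta:.3e} rad")
```

With these values the tilt is about 1.9 × 10⁻⁴ rad, i.e., the pinhole subtends a very small but non-zero range of illumination angles.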
A finite-sized emitting area can be represented as a set of many mutually incoherent point sources. Thus, contributions from the angles $\{\omega^{(k)}\}_{k=1}^{K}$ are added in intensity to model the spatial incoherence as

$$ \hat{I} = \sum_{k=1}^{K} w\!\left(\omega^{(k)}\right) \left| g_c\!\left( \phi,\; e^{i(\omega^{(k)}_x x + \omega^{(k)}_y y)},\; \lambda \right) \right|^2. \tag{S9} $$

The image intensity produced by a shifted point light source is a laterally shifted copy of the image intensity produced by the centered point light source, so the sum can be calculated as a convolution with a spatially invariant kernel. Relevant details can be found in the literature [14,18,19].
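The convolution view can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes a uniformly emitting circular pinhole, a paraxial image shift of about $z \cdot r_s / f_c$ for a source point at $r_s$, and circular (FFT-based) boundary handling; the function names and the distance `z` are hypothetical.

```python
import numpy as np

def pinhole_blur_kernel(shape, pitch, pinhole_diameter, f_c, z):
    """Normalized disk kernel: the pinhole footprint scaled by z / f_c (assumed paraxial)."""
    ny, nx = shape
    y = (np.arange(ny) - ny // 2) * pitch
    x = (np.arange(nx) - nx // 2) * pitch
    yy, xx = np.meshgrid(y, x, indexing="ij")
    radius = 0.5 * pinhole_diameter * z / f_c
    kernel = (xx ** 2 + yy ** 2 <= radius ** 2).astype(float)  # always includes the center pixel
    return kernel / kernel.sum()

def spatially_incoherent_intensity(centered_intensity, kernel):
    """Circular convolution of the centered-source intensity with the kernel."""
    K = np.fft.fft2(np.fft.ifftshift(kernel))  # move the kernel center to index (0, 0)
    return np.real(np.fft.ifft2(np.fft.fft2(centered_intensity) * K))
```

Because the kernel is normalized, the total image energy is preserved; a larger pinhole only spreads it over a wider area, which is the blur discussed in the ablation study.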
Stochastic sampling over the spectral and spatial domains. The discrete partially coherent propagation model described in Eq. S4 requires a sufficient number of samples and is memory demanding, since it computes a wave propagation $NK$ times. In practice, at each iteration we dynamically sample $M$ tuples $(\omega^{(m)}, \lambda^{(m)})$ uniformly over the finite size of the source (or physical pinhole) and over the characterized spectrum, as shown in Figure S5. We then weight the propagated intensities by the emission spectrum $q(\lambda^{(m)})$ and the angular profile of the source intensity $w(\omega^{(m)})$, as in Eq. S4, and sum them over all $M$ tuples. One can also pre-compute $N$ ($> M$) tuples $(\omega^{(m)}, \lambda^{(m)})$ as a pool and randomly pick from this pool at every iteration, which gives a significant speedup. We show the convergence of the optimization with different light sources and propagation models in Figure S1.
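The pool-based sampling described above could look like the following sketch. It is illustrative only: the function names, the dictionary layout, and the uniform source profile ($w = 1$) are assumptions, and the spectrum is represented by hypothetical arrays `spectrum_nm` (increasing wavelengths in nm) and `spectrum_q` (measured relative emission).

```python
import numpy as np

def build_sample_pool(n_pool, pinhole_diameter, f_c, spectrum_nm, spectrum_q, rng):
    """Pre-compute n_pool tuples (omega, lambda) with their weights q and w."""
    # Uniform over the pinhole disk; the sqrt makes the density uniform in area.
    rho = 0.5 * pinhole_diameter * np.sqrt(rng.uniform(0.0, 1.0, n_pool))
    theta = rng.uniform(0.0, 2.0 * np.pi, n_pool)
    r_s = np.stack([rho * np.cos(theta), rho * np.sin(theta)], axis=1)
    # Uniform over the characterized spectral support, weighted by q afterwards.
    lam = rng.uniform(spectrum_nm[0], spectrum_nm[-1], n_pool) * 1e-9
    q = np.interp(lam * 1e9, spectrum_nm, spectrum_q)
    omega = 2.0 * np.pi * r_s / (lam[:, None] * f_c)  # source shift to tilt, as in Eq. S8
    w = np.ones(n_pool)  # assumed uniform spatial emission profile
    return {"omega": omega, "wavelength": lam, "q": q, "w": w}

def draw_batch(pool, m, rng):
    """Pick m distinct tuples from the pool for one optimization iteration."""
    idx = rng.choice(len(pool["wavelength"]), size=m, replace=False)
    return {key: value[idx] for key, value in pool.items()}
```

A typical use would build the pool once, e.g. `pool = build_sample_pool(1000, 75e-6, 200e-3, spectrum_nm, spectrum_q, np.random.default_rng(0))`, and then call `draw_batch(pool, M, rng)` inside the optimization loop.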

C. Camera-in-the-loop Hologram Optimization
The mismatch between the simulated wave propagation model $g$ (Eqs. S1, S5) and the physical wave propagation of the display, denoted $\hat{g}$ here, strongly influences the holographic image quality. Optical aberrations, phase nonlinearities of the SLM pixels, source intensity variation, the limited diffraction efficiency of the SLM, and many other factors typically degrade the image quality of experimental holographic displays relative to the corresponding simulations.
To mitigate this discrepancy, Peng et al. [23] recently introduced a camera-in-the-loop hologram optimization strategy. Their approach builds on the insight that an SGD optimization strategy for problems like Eqs. S2 or S5 can use a camera to work directly with the physical propagation $\hat{g}$ and parts of its gradients for many of the required calculations, rather than relying on the approximate model $g$. Specifically, a CITL-type gradient descent solver starts with some initial guess $\phi^{(0)}$ (we use random values) and then iterates as

$$ \phi^{(k)} \leftarrow \phi^{(k-1)} - \alpha \left( \frac{\partial \mathcal{L}}{\partial |\hat{g}|} \cdot \frac{\partial |g|}{\partial \phi} \right)^{T}, \qquad \mathcal{L} = \mathcal{L}\!\left( s \cdot \left| \hat{g}\!\left(\phi^{(k-1)}\right) \right|,\; a_{\mathrm{target}} \right), \tag{S10} $$

where $\alpha$ is the step size. Note that both the forward image formation $\hat{g}(\phi^{(k-1)})$ and the partial derivative $\partial \mathcal{L} / \partial |\hat{g}|$ (formulated as a Jacobian matrix) are computed from the captured camera image, i.e., using the physical image formation $\hat{g}$. Thus, the only part where a model mismatch occurs is the term $\partial |g| / \partial \phi$, which uses the gradients of the proxy model from Eq. S5 for the backpropagation pass, because these gradients are physically inaccessible to the camera. $s$ is a scale factor that helps bridge the gap between the captured and target amplitudes. We adopt this CITL optimization strategy for our partially coherent image formation and implement the forward model as well as the optimization routines in PyTorch, which makes it easy to have PyTorch's automatic differentiation engine calculate the partial derivatives of the proxy model. Refer to the next section for implementation details.

Fig. S1. Convergence graph of CITL optimization with different light sources and propagation models. This evaluation is run on the red channel. The PSNRs are averaged over 5 images at each iteration. Note that the coherent and partially coherent models only differ in their gradients, since the CITL optimization uses captured images for forward propagation. Our partially coherent model leads to a performance improvement of around 1.5 dB when using the SLED and the LED.
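The structure of one CITL update, with the loss derivative taken from the captured amplitude and the Jacobian taken from the proxy model, can be illustrated on a toy problem. This sketch is not the authors' implementation: it replaces the wave propagation model by a hypothetical linear proxy `A_model @ exp(i*phi)` so that the Jacobian can be written by hand in NumPy, uses a squared ℓ2 loss, keeps the scale factor `s` fixed, and represents the camera by a user-supplied `capture` callable.

```python
import numpy as np

def citl_step(phi, s, a_target, A_model, capture, lr=0.006, eps=1e-12):
    """One CITL-style gradient step on a toy proxy g(phi) = A_model @ exp(i*phi).

    capture(phi) returns the measured amplitude (stand-in for the camera image);
    only the Jacobian d|g|/dphi comes from the proxy model.
    """
    g_model = A_model @ np.exp(1j * phi)                   # proxy field, used for gradients only
    amp_cam = capture(phi)                                 # physical forward pass
    dL_damp = 2.0 * s * (s * amp_cam - a_target)           # dL/d|g_hat| for a squared l2 loss
    v = dL_damp * np.conj(g_model) / (np.abs(g_model) + eps)
    grad_phi = np.real(1j * np.exp(1j * phi) * (A_model.T @ v))  # (d|g|/dphi)^T applied to dL/d|g_hat|
    return phi - lr * grad_phi, grad_phi
```

When `capture` evaluates the proxy itself, `grad_phi` equals the exact gradient of the loss, which is a convenient sanity check; with a real camera the two differ, and the step becomes the approximate-gradient update described above. The default `lr=0.006` mirrors the phase learning rate reported in this document.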

A. Hardware Implementation
As shown in Figure S2, we build a benchtop holographic near-eye display setup using two types of partially coherent illumination sources: an LED and an SLED.
For the first demonstration, we use a white mounted LED (Thorlabs MNWHL4f) with a maximum output power of 880 mW, a multi-mode fiber (Thorlabs M72L01) with a diameter of 200 µm and an NA of 0.39, a pinhole with a diameter of 75 µm, and one of three laser line filters with a 1 inch diameter and central wavelengths of 633 nm, 532 nm, and 460 nm, respectively. The FWHM of the filters is 10 nm. For the second demonstration, we use an SLED module (EXALOS RGB-SLED engines) that contains three aligned diodes and is coupled to a single-mode fiber with a maximum output power of 5 mW. The central wavelengths are 635 nm, 510 nm, and 450 nm, respectively.

[Figure labels: Step 1, Step 2, …, Step n-1, Step n; PH; partially coherent holographic display.]

The collimating lens (L3 in Figure S2) is an achromatic doublet with a focal length of 200 mm. The baseline laser for comparison experiments is a FISBA RGBeam fiber-coupled module with three optically aligned laser diodes with a maximum output power of 50 mW. In our implementation, color images are captured as separate exposures for each channel and then combined in post-processing.
The SLM is a HOLOEYE LETO phase-only LCoS with a resolution of 1,920 × 1,080 and a pixel pitch of 6.4 µm. This device provides a bit depth of 8 bits and a diffraction efficiency of over 80%. The eyepiece is a Nikon AF-S 50 mm f/1.4D lens (L6). Other components include a polarizer (Thorlabs WP25M-VIS), and a beam splitter (BS, Thorlabs BS016). We further use a 4f system consisting of two Nikon 50 mm f/1.4D lenses (L4 and L5) and an iris with a diameter of 4 mm to optically filter out higher diffraction orders. Note that this mechanism does not filter out the undiffracted light (i.e., the direct current or DC component). All images are captured with a FLIR Grasshopper3 2.3 MP color vision sensor through a Nikon AF-S Nikkor 35 mm f/1.8G lens. Captured images are processed on a PC to update the patterns displayed on the SLM. As such, we have realized a partially coherent holographic display with camera-in-the-loop optimization.
We experimentally characterize the coherence property of both light sources. The spectral response curves of the customized partially coherent light sources are illustrated in Figure S3. In the optimization, we extract the weighting response q from these characterized plots.

B. Software Implementation
All CGH algorithms are implemented in PyTorch [29]. Pseudocode for the SGD and CITL algorithms with stochastic sampling is outlined in Algorithms S1 and S2, respectively. The homography used in the experiments follows the same procedure as in recent work [23]. As a specific instance, with the SLED-based implementation on an Nvidia RTX 2080Ti GPU, the optimization processes outlined in Algorithms S1 and S2 take about 100 s and 480 s for 500 iterations, respectively. For all algorithm implementations in this work, we set the learning rate to 0.006 for all phase variables and 0.001 for the scalar, and we use the $\ell_2$ loss function.

Algorithm S1 (surviving fragment): randomly sample $\omega_1, \cdots, \omega_M, \lambda_1, \cdots, \lambda_M$
Algorithm S2 (surviving fragment): randomly sample $\omega_1, \cdots, \omega_M, \lambda_1, \cdots, \lambda_M$; 3: |g| ← replace(|g|, camera_p(φ))

ADDITIONAL RESULTS
Ablation study on temporal and spatial incoherence modeling. To explore the impact of the pinhole size on the spatial coherence and on the perceptual quality, we conduct an ablation study. In Figure S4, we present comparison results using the laser, the SLED, and the LED with a 100 µm, 75 µm, and 50 µm pinhole, respectively. We observe that decreasing the pinhole size leads to stronger spatial coherence, which in turn makes the optimization problem less ill-posed. However, a smaller pinhole comes at the cost of light efficiency. We experimentally observe that a 75 µm pinhole provides a display quality for the LED that is almost comparable to laser-based holography. Importantly, speckle artifacts are significantly mitigated. In these experiments, an achromatic doublet with a focal length of 200 mm is used to collimate the light emitted from the pinhole.

Next, we present a more comprehensive study with different sampling schemes along the wavelength and angle dimensions (see Figure S5), as a complement to Table 1 in the main text. We observe that when sampling only one wavelength and one source point for the LED, the results exhibit blur and low contrast. Increasing the number of samples in either the spectral or the spatial domain leads to improved image contrast as well as higher spatial resolution. The full partially coherent model leads to the best reconstruction quality, i.e., the highest PSNR. However, compared with the target image, partially coherent holography with the LED still exhibits an expected amount of resolution and contrast loss. In contrast, using an SLED leads to a much better compromise between speckle and resolution (see the bottom row of Figure S5).

Analyzing SLED Bandwidth and LED Pinhole Size
Our results, especially the ablation study shown in the previous subsection, indicate that there may be an optimal spectral bandwidth of an SLED source and an optimal pinhole size of the LED. In this subsection, we aim at answering the following questions: (1) what is the optimal bandwidth of an SLED?; (2) what is the optimal pinhole size for an LED with an extremely narrow bandwidth?; (3) what is the optimal pinhole size for an LED with a moderate bandwidth?
To answer these questions, we perform three experiments in simulation. For all of these, we compare the results achieved by a conventional SGD solver and our CITL approach. Here, SGD uses our partially coherent forward model (Eq. S4) to simulate the forward pass from the phase-only SLM pattern to the intensity on the target plane, and it also uses the gradients of this partially coherent model for the error backpropagation pass. With the SGD approach, we optimize the SLM phase pattern using this partially coherent model and then simulate how this phase pattern would look on a physical display by introducing some amount of mismatch between the wave propagation model and the physical optics. Specifically, we simulate this mismatch by adding zero-mean Gaussian noise with a standard deviation of σ = 0.01, 0.3, or 0.42 to the SLM phase before propagating the field to the target plane. SGD is thus oblivious to this model mismatch, and we expect its results to degrade with increasing σ, as in conventional holographic displays. Our CITL approach uses the proposed camera-based forward model, which perfectly describes the physical optical image formation, including the SLM phase noise, as captured by the camera, but it uses the same approximated gradients as SGD for the error backpropagation pass. We expect CITL to be better than SGD because it uses a more accurate forward model, which is what we have observed in all of our experimental results so far.

Experiment 1 (SLED). Our first experiment, shown on the left of Figure S6, aims to answer question (1): what is the optimal bandwidth of an SLED? We simulate varying bandwidths of a collimated SLED source, measured as the full width at half maximum (FWHM) of the SLED's spectral emission profile, for 3 different amounts of model mismatch σ. As expected, when the model mismatch is very low (σ = 0.01), i.e., the difference between simulated and physical wave propagation is small, SGD and CITL perform roughly equally.
In this case, there is probably no need to use a CITL approach, as the physical optics can be modeled almost perfectly. For an increasing bandwidth, the performance of both SGD and CITL drops, as it becomes increasingly difficult to correct the blur introduced by the increasing spectral bandwidth of the source; remember, this is an ill-posed inverse problem. For larger amounts of model mismatch (σ = 0.3, 0.42), however, CITL outperforms SGD by a large margin of about 10 dB of PSNR. The model mismatch results in speckle when the SLED has a very narrow bandwidth, which is mostly but not fully corrected by the CITL optimization approach. Although the forward model for CITL matches the physical optics, the approximated gradients prevent CITL from achieving optimal results for very narrow bandwidths, so there will always be some speckle. This is exactly the effect we observe in our experimental results with the laser. As the bandwidth of the SLED increases, the system becomes more tolerant to a model mismatch and achieves better results. Nevertheless, as the bandwidth increases further, it becomes increasingly difficult to solve the ill-posed inverse problem. We note that a bandwidth of approximately 3 to 5 nm seems to be optimal for the experimental conditions simulated here. This is lower than the bandwidth of 15 nm we determined to be optimal in Table 1 of the main paper. The difference arises because the model mismatch simulated in this experiment is not the same as the mismatch observed in our physical display. Moreover, the fact that the optimal bandwidth differs between the two conditions indicates that the best spectral bandwidth of an SLED depends on the specific display and its particular optical characteristics. Also see Figure S7 for qualitative results of this experiment for σ = 0.42.

Experiment 2 (LED with single wavelength). Our second experiment, shown in the center of Figure S6, aims to answer question (2): what is the optimal pinhole size for an LED with an extremely narrow bandwidth?
This experiment is somewhat contrived, because LEDs typically do not have such narrow bandwidths, but it is still instructive. Unsurprisingly, we observe the same trends as for experiment 1. Varying the spatial coherence of this source leads to a tradeoff between speckle and blur whenever there is a model mismatch between simulated wave propagation and physical optics. The optimal pinhole size is between 25 and 35 µm for this experiment, but the optimal diameter depends on the specific amount of model mismatch. This experiment thus confirms that the insights from our first experiment, which tested temporal coherence, carry over to spatial coherence. Also see Figure S8 (top row) for qualitative results of this experiment.

Experiment 3 (LED with 5 nm bandwidth). Our third experiment, shown on the right of Figure S6, aims to answer question (3): what is the optimal pinhole size for an LED with a moderate bandwidth? This is a more realistic scenario than experiment 2, because we now pick a bandwidth of 5 nm for the LED, as determined for the SLED under these simulated experimental conditions in experiment 1, and we also vary the pinhole size, as in experiment 2. We observe that smaller pinhole sizes are always better than larger sizes when using the CITL approach. Also see Figure S8 (bottom row) for qualitative results of this experiment.
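The simulated model mismatch and the PSNR metric used in these experiments can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that σ is expressed in radians of SLM phase and that images are normalized so that the peak value is 1, neither of which is stated explicitly in the text.

```python
import numpy as np

def add_slm_phase_noise(phase, sigma, rng):
    """Model mismatch: zero-mean Gaussian noise added to the SLM phase pattern."""
    return phase + rng.normal(0.0, sigma, size=phase.shape)

def psnr(estimate, target, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images with the given peak value."""
    mse = np.mean((estimate - target) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
phase = rng.uniform(0.0, 2.0 * np.pi, (100, 100))
# The three mismatch levels used in the simulated experiments.
noisy = {sigma: add_slm_phase_noise(phase, sigma, rng) for sigma in (0.01, 0.3, 0.42)}
```

In such a simulation, an SGD solver would optimize `phase` with the noise-free model and be evaluated on the noisy pattern, whereas a CITL-style solver would see the noisy forward pass during optimization.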
These three experiments indeed validate the primary conclusions of our paper: SLEDs, which are spatially coherent (i.e., collimated) and thus do not require a pinhole, always achieve a better quality than LEDs, which are spatially and temporally incoherent. We also confirm that the SLED is the best choice, even over a coherent laser, whenever there is some amount of mismatch between the simulated wave propagation model and the physical optics. If the model perfectly matches the physical optics, however, then a laser is the best choice. However, we have not been able to perfectly calibrate an analytic wave propagation model in our experimental settings, nor have we seen that in the literature. Therefore, SLEDs seem like the ideal light sources for practical holographic displays.
Please note that whenever a pinhole is used, as simulated here for the LEDs, the light efficiency of the optical system suffers, which is a severe problem in practice, especially for optical see-through augmented reality displays. We did not take this loss of light efficiency into account in our quality metrics. Doing so would further strengthen our conclusion that SLEDs are ideal light sources for holographic displays. Also, we simulated the mismatch between the simulated model and the physical optics as phase noise on the SLM, whereas in practice it results from a variety of factors, including phase nonlinearity, optical aberrations, and undiffracted light. Therefore, the specific values of the optimal bandwidth for SLEDs and the optimal pinhole size for LEDs derived here should not be taken too literally, although we expect the trends to hold under different experimental conditions.

Verification of CITL improvement. In Figure S9, we present the results obtained with the same setting using our full partially coherent model without and with CITL calibration. These results are obtained using the LED on a single channel (green) and are visualized as grayscale images.
Resolution assessment. To show the perceived resolution of partially coherent holography, we present results for the USAF-1951 resolution chart in the green channel, as shown in Figure S10. We observe resolution and contrast loss (see the cropped and zoomed patches) in the LED result. However, for block regions with relatively uniform target intensities, the two partially coherent holography results show far fewer speckle artifacts, with the SLED preserving a sharpness comparable to that of the laser. The SLM phase patterns corresponding to these results are shown in Figure S20.
Additional comparison results of laser-, LED-, and SLED-based CITL holography. We present additional comparison results of laser-, LED-, and SLED-based CITL holography in Figures S12, S14, and S16. We recommend that readers zoom in to distinguish the differences in perceptual resolution and speckle level. The SLM phase patterns corresponding to these results are shown in Figures S13, S15, and S17, respectively.

Additional comparison results of LED- and SLED-based CITL holography
We present additional comparison results of LED- and SLED-based CITL holography, as shown in Figure S18. We recommend that readers zoom in to distinguish the differences in perceptual resolution and speckle level. In addition, we show that our SLED-based CITL holography is able to deliver high image quality on realistic 2D scenes, as shown in Figure S19. The corresponding phase patterns displayed on the SLM are shown in Figure S20.

Fig. S5. Experimental ablation study on sampling diversity of wavelengths and angles. For each sub-figure, we show the peak signal-to-noise ratio (PSNR) in addition to the captured images. Here, a Dirac delta $\delta_A$ represents the sampling of a single angle, which is equivalent to the assumption that the incident beam is emitted from one single point and is well collimated. 37.5 µm and 75 µm represent sampling over a finite-sized spot of 37.5 µm and 75 µm, respectively. The latter matches the physical size of the pinhole placed in the prototype display. A Dirac delta $\delta_\lambda$ represents the sampling of one single wavelength; 5 nm and 15 nm represent sampling over spectra of 5 nm and 15 nm, respectively. Again, all of these results are obtained with CITL calibration on one single channel (green). Image Credit: Eirikur Agustsson, Radu Timofte, and Rachel Davis.

Fig. S6. [...] The optimal bandwidth of the SLED is between 3 and 5 nm. (B) The spatial coherence of an LED with a very narrow bandwidth can be controlled by a pinhole. Similar to the varying temporal coherence of the SLED, this results in a tradeoff between speckle and blur whenever there is a model mismatch. (C) For an LED with a bandwidth of 5 nm, larger pinholes always reduce the image quality. Note that using a pinhole with an LED always reduces the light efficiency of the system compared to the SLED; this degradation is not taken into account in the PSNR reported for these experiments.
Fig. S7. Analyzing SLED bandwidth in simulation. Panels: (A) single wavelength, (B) 5 nm FWHM spectrum, (C) 40 nm FWHM spectrum. When the bandwidth is just a single wavelength, as is the case for a coherent laser, we observe speckle when the simulated wave propagation model does not exactly match the physical optics (A). An SLED with a small bandwidth can optimize the image quality (B). As the bandwidth increases, the blur introduced by the optical image formation becomes too severe to be compensated by the algorithm (C). All of these results are optimized with the proposed CITL approach. Image Credit: Eirikur Agustsson, Radu Timofte, and Rachel Davis.

Fig. S8. [...] For an LED emitting just a single wavelength over a finite area, a pinhole can control its spatial coherence. A smaller pinhole diameter makes the source more coherent, but results in speckle, whereas a larger pinhole naturally blurs out the speckle but at the cost of degraded image sharpness. (D-F) For an LED with a small bandwidth of 5 nm, a smaller pinhole always results in better image quality. All of these results are optimized with the proposed CITL approach. Image Credit: Eirikur Agustsson, Radu Timofte, and Rachel Davis.