Single-sample image-fusion upsampling of fluorescence lifetime images

Fluorescence lifetime imaging microscopy (FLIM) provides detailed information about molecular interactions and biological processes. A major bottleneck for FLIM is image resolution at high acquisition speeds, due to the engineering and signal-processing limitations of time-resolved imaging technology. Here, we present single-sample image-fusion upsampling, a data-fusion approach to computational FLIM super-resolution that combines measurements from a low-resolution time-resolved detector (that measures photon arrival time) and a high-resolution camera (that measures intensity only). To solve this otherwise ill-posed inverse retrieval problem, we introduce statistically informed priors that encode local and global correlations between the two “single-sample” measurements. This bypasses the risk of out-of-distribution hallucination inherent in traditional data-driven approaches and delivers enhanced images compared with, for example, standard bilinear interpolation. The general approach laid out by single-sample image-fusion upsampling can be applied to other image super-resolution problems where two different datasets are available.


I. INTRODUCTION
Fluorescence lifetime imaging microscopy (FLIM) finds extensive applications in biological studies, where the lifetimes of fluorophores can be used as indicators of cellular metabolism [1][2][3][4][5], cellular environment [6][7][8] or changes in molecular conformation visible through Förster resonance energy transfer (FRET), enabling measurement of protein:protein interactions during processes such as cellular signalling [9][10][11][12]. In medical settings, endogenous FLIM can be used for identifying cancerous tissue [13,14]. FLIM setups excite a sample with short-wavelength light and measure the temporal profile of long-wavelength fluorescence from the sample [15]. Excitation is achieved using a pulsed or amplitude-modulated laser for time-domain and frequency-domain FLIM, respectively [11], while emission is usually collected with time-correlated single-photon counting (TCSPC) or time-gated hardware. Fluorescence lifetime is then recovered from the temporal decay of fluorescence emission. Popular lifetime estimation schemes include least squares deconvolution [16], Laguerre expansion [17], phasor fitting [2,3], rapid lifetime determination [18,19], centre-of-mass estimation [20,21] and machine learning [22][23][24]. Images are formed through raster-scanning or widefield detection. Scanning systems allow confocal or 2-photon microscopy setups, giving excellent image resolution, and aligning well with TCSPC methods that give rich fluorescence information.
However, scanning also presents drawbacks, such as the lack of instantaneous complete field-of-view information, and long acquisition times which are incompatible with the rapid intracellular dynamics of living cells [25,26]. Widefield systems overcome these challenges by measuring temporal decay from the full field of view in parallel, often using time-gated cameras like intensified charge-coupled devices (iCCDs) [27,28], externally gated devices [29,30], or single photon avalanche diode (SPAD) arrays [23,31]. However, iCCD resolution is limited by the intensifier point-spread-function, whilst SPAD arrays typically have low pixel counts and/or low fill factors. Computational super-resolution (SR) provides a route to overcome the trade-off between acquisition time and spatial resolution by offloading imaging from optics onto software. SR takes an undersampled image of a scene and estimates its high-resolution features. Multiple flavours of SR exist, which are generally either interpolation, reconstruction (inverse retrieval) or example-(learning) based. Interpolation is the simplest form of upsampling, encompassing several methods for connecting datapoints with some curve [32]. For images, this ranges from simple schemes like nearest, bilinear and bicubic interpolation, through frequency-based approaches like sinc and Lanczos interpolation, to covariance-based algorithms like kriging (Gaussian processes) [33]. While interpolation is fast and computationally inexpensive, it does not add new information to the image. Reconstruction-based modelling instead manipulates the detection to optically redistribute information about the high-resolution target into fewer measurements. This encoding provides a mathematical forward model that is employed to reconstruct the non-sampled points in an inverse retrieval framework, for example via point spread function (PSF) engineering [34], blurring [35], or compressed sensing [36][37][38][39]. Lastly, example-based schemes rely on computation and 
pattern recognition to upsample images in a data-driven manner [40].
Classical approaches include neighbour embedding [41], sparse coding [42] and anchored neighbourhood regression [43]. More recently, machine learning algorithms have seen widespread adoption for super-resolution [44,45]. These range from super-resolution convolutional neural networks [46,47], through generative adversarial networks [48,49], to diffusion models [50]. However, learning-based schemes traditionally need large, diverse training datasets, which can pose a bottleneck in niche fields like FLIM; further, different fluorophores behave differently, hampering generalisation in traditional machine learning methods [51]. Self-similarity-based super-resolution [52] and self-supervised clustering [53] approaches offer an alternative to external training sets, deriving statistical information for super-resolution from the very image that is up-sampled. Data from different sensing modalities can yield more information about a subject than is contained in each modality alone [45,54]. Fusion-based inference is a growing field with applications from medical imaging using PET and MRI [55], through autonomous driving using camera and LiDAR [56], to content classification using video and text [57]. Data fusion has been applied to FLIM by interpolating lifetime images and weighting them with intensity images for visualisation [24,58].
Here we introduce a super-resolution method that relies on the fusion of two images: a high-resolution intensity image (no lifetime information) and a low-resolution lifetime image. Our method is called 'single sample image fusion upsampling' (SiSIFUS). SiSIFUS generates data-driven lifetime priors matching the resolution of the intensity image, which is relatively easy and inexpensive to acquire at high resolution compared to FLIM images. Crucially, our method generates 'single sample' priors: all information in our scheme comes from the given field of view, not external training data. We develop two priors, which extract this information from the FLIM-intensity image pair in different ways. Local priors correlate low-resolution FLIM pixels with corresponding intensity pixels in small neighbourhoods. Global priors instead exploit morphological signatures in the image, using a neural network to predict fluorescence lifetime from intensity patches. SiSIFUS combines data fusion and self-supervised learning into a practical super-resolution framework. Like example-based self-similarity approaches, it avoids complex hardware modifications and external training data. Like reconstruction-based modelling, we optically measure high-resolution features, giving more information than is available in the low-resolution images alone.
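At a high level, the pipeline described above composes three stages: local prior generation, global prior generation, and prior-constrained inverse retrieval. The sketch below is a minimal stand-in only, assuming numpy arrays and using nearest-neighbour upsampling as a placeholder for all three stages (function names and the averaging step are illustrative, not the authors' implementation):

```python
import numpy as np

def upsample_nearest(tau_lr, factor):
    """Placeholder prior generator: nearest-neighbour upsampling."""
    return np.repeat(np.repeat(tau_lr, factor, axis=0), factor, axis=1)

def sisifus_sketch(intensity, tau_lr):
    """Illustrative SiSIFUS flow: build two priors, then fuse.

    intensity : (M, N) high-resolution intensity image
    tau_lr    : (m, n) low-resolution lifetime image, with M = factor * m
    """
    factor = intensity.shape[0] // tau_lr.shape[0]
    tau_local = upsample_nearest(tau_lr, factor)   # stand-in for the local prior
    tau_global = upsample_nearest(tau_lr, factor)  # stand-in for the global prior
    # Stand-in for the inverse retrieval: average the two priors.
    return 0.5 * (tau_local + tau_global)

tau_lr = np.array([[1.0, 2.0], [3.0, 4.0]])   # toy 2x2 lifetime map (ns)
intensity = np.ones((4, 4))                   # toy 4x4 intensity image
tau_hr = sisifus_sketch(intensity, tau_lr)
print(tau_hr.shape)  # (4, 4)
```

The point of the sketch is only the data flow and the array shapes; the actual prior generators and the ADMM-based retrieval are described in the Results and Methods.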

II. RESULTS
Forward and inverse models. We apply SiSIFUS to both raster-scanning and widefield FLIM. The scanning system uses a PMT to gather both the FLIM and intensity image, while the widefield system uses a SPAD array to measure FLIM and a complementary metal-oxide-semiconductor (CMOS) camera to measure intensity. Both setups are detailed in the Methods. SiSIFUS involves two measurements. The first is the time-resolved, low-resolution datacube, r ∈ N^{m,n,t}, where m, n denote spatial position and t denotes time. The fluorescence lifetime image, τ_LR ∈ R^{m,n}, is estimated from r via a standard least squares deconvolution; other schemes, like phasor analysis or centre-of-mass estimation, could be used equivalently. The second measurement is the high-spatial-resolution intensity measurement, I ∈ N^{M,N}, where M, N denote the pixel numbers of the high-spatial-resolution sensor. SiSIFUS then super-resolves the lifetime image τ_LR to match the pixel count of the intensity image I. Our setups sample fluorescence lifetime sparsely across the field of view. In the widefield setup, this arises from the low fill factor, i.e. large dead spaces between the active areas of the SPAD pixels. In the scanning setup, this arises from the large sampling period relative to the spot size of the excitation beam in the object plane. We also assume that the intensity measurement has approximately 100% fill factor. Fig. 1A and Fig. 2A depict how intensity and lifetime are sampled. For a 256×256-sized high-resolution intensity image I of the sample, the acquired dataset is integrated along the time axis. In a practical scenario for upsampling a confocal scan image, the FLIM samples would be acquired by taking a large line-average of low-resolution scans. A large line-average is needed for the fitted lifetime to have decent signal-to-noise ratio (SNR). The intensity image has decent SNR even with just a few line-averages; therefore the high-resolution intensity image could be obtained without adding an external sensor, by simply scanning a second time with a higher pixel count but much less line averaging. Consequently, τ_LR is decimated (sparsely sampled) from the high-resolution fluorescence lifetime target, τ_HR ∈ R^{M,N}, that we aim to reconstruct:

τ_LR = A(τ_HR),    (1)

where A represents sparse sampling (decimation). We feed the two images, I(M, N) and τ_LR(m, n), to our prior-generation pipeline (explained below), which outputs a local and a global prior, τ_LP(M, N) and τ_GP(M, N), respectively. These priors constrain an (otherwise ill-posed) inverse retrieval algorithm. We finally recover the high-resolution lifetime image τ*_HR by minimizing the following cost function:

τ*_HR = arg min_{τ_HR} C(τ_HR),    (2)

where

C(τ_HR) = ‖A(τ_HR) − τ_LR‖²₂ + γ‖τ_HR − τ_LP‖²₂ + β‖τ_HR − τ_GP‖²₂ + α‖Dτ_HR‖₁.

The first term in C(τ_HR) ensures data fidelity between the low-resolution measured lifetime image and the downsampled optimal high-resolution lifetime solution in each iteration. Prior constraints on the target high-resolution lifetime image are enforced through the second and third data-fidelity terms, weighted by the factors γ and β respectively, which are empirically optimised to yield the best results. The fourth term is the L1 norm of the 2D total variation (TV) evaluated on the high-resolution lifetime image and weighted by α [59].

[Fig. 1 caption, continued] (B) We zoom in on a 5 × 5 window. All SPAD pixels have a corresponding CMOS measurement, but so do the areas in between SPAD pixels. We aim to find the lifetime at points with no SPAD samples. For this, we fit a function, for instance linear interpolation, a cubic spline or a radial-basis-function Gaussian process. Then, the high-resolution CMOS pixels x_HR which we wish to upsample are fed to this function, producing a lifetime estimate τ_HR. (C) We slide the window across the field of view, fitting new functions for each new window and predicting the centres, upsampling the FLIM image to the resolution of the intensity image window by window.
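The window-by-window fitting described above can be sketched as follows. This is a simplified illustration assuming a linear fit of lifetime against intensity in each 3×3 neighbourhood of samples (the paper also mentions cubic splines and Gaussian processes; all names and the window size are illustrative):

```python
import numpy as np

def local_prior(intensity, tau_lr, factor):
    """Local prior sketch: in each low-res neighbourhood, fit lifetime
    against intensity at the sampled pixels, then predict lifetime for
    every high-res pixel in the window from its intensity alone."""
    m, n = tau_lr.shape
    tau_lp = np.zeros_like(intensity, dtype=float)
    # Intensity at the sparsely sampled (SPAD / scan) positions.
    I_samp = intensity[::factor, ::factor]
    for i in range(m):
        for j in range(n):
            # 3x3 neighbourhood of lifetime samples around (i, j).
            i0, i1 = max(i - 1, 0), min(i + 2, m)
            j0, j1 = max(j - 1, 0), min(j + 2, n)
            x = I_samp[i0:i1, j0:j1].ravel()
            y = tau_lr[i0:i1, j0:j1].ravel()
            # Linear intensity-to-lifetime fit (constant if intensity is flat).
            a, b = np.polyfit(x, y, 1) if np.ptp(x) > 0 else (0.0, y.mean())
            # Predict the high-res window associated with this sample.
            win = intensity[i * factor:(i + 1) * factor,
                            j * factor:(j + 1) * factor]
            tau_lp[i * factor:(i + 1) * factor,
                   j * factor:(j + 1) * factor] = a * win + b
    return tau_lp
```

On a toy sample where lifetime is exactly proportional to intensity, this prior recovers the high-resolution lifetime map from the intensity image alone; on real data it only supplies a statistically informed estimate, which the inverse retrieval then reconciles with the measurements.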
We consider the anisotropic form of the TV [60], and so the operator D represents the finite differences approximation of the horizontal and vertical image gradients.
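The cost being minimised, with its data-fidelity, prior and anisotropic-TV terms, can be written down directly. A minimal numpy sketch (names illustrative; the decimation operator A is modelled as strided subsampling):

```python
import numpy as np

def tv_l1(img):
    """Anisotropic total variation: L1 norm of the horizontal and
    vertical finite-difference gradients (the operator D)."""
    dh = np.abs(np.diff(img, axis=1)).sum()
    dv = np.abs(np.diff(img, axis=0)).sum()
    return dh + dv

def cost(tau_hr, tau_lr, tau_lp, tau_gp, factor, gamma, beta, alpha):
    """Sketch of the SiSIFUS cost: data fidelity on the decimated image,
    local/global prior fidelity, and TV regularisation."""
    A_tau = tau_hr[::factor, ::factor]          # sparse sampling (decimation)
    fidelity = np.sum((A_tau - tau_lr) ** 2)    # matches the measurement
    local = gamma * np.sum((tau_hr - tau_lp) ** 2)   # local prior term
    glob = beta * np.sum((tau_hr - tau_gp) ** 2)     # global prior term
    return fidelity + local + glob + alpha * tv_l1(tau_hr)
```

A candidate high-resolution lifetime image that agrees with the measurement and both priors, and is piecewise smooth, drives this cost towards zero; the ADMM scheme in the Methods minimises it iteratively.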
A. Dependence between lifetime and intensity.
SiSIFUS priors exploit the interdependence between fluorescence lifetime and intensity. Although these variables are interdependent at the single-molecule level via the fluorescence quantum yield, this dependence is modulated by fluorophore concentration and other complex and often unpredictable biophysical mechanisms, thus necessitating statistical methods to create our priors. The fluorescence quantum yield Q is the ratio of the number of emitted photons to the number absorbed. It depends on the radiative and non-radiative decay rates k_r and k_nr that depopulate excited molecules:

Q = k_r / (k_r + k_nr).

The measured fluorescence lifetime τ also depends on these rates [16]:

τ = 1 / (k_r + k_nr),

therefore Q = k_r τ for a single molecule. Across a given field of view, fluorescence intensity variations are given by the quantum yield of the fluorophores (equivalently, fluorescence lifetime) multiplied by their absorbance (concentration times absorptivity times sample thickness).
Absorbance is typically unknown and unpredictable; hence it acts as a confounding variable in intensity-lifetime dependencies, so fluorescence intensity alone cannot give us full lifetime information. This means that two samples might have the same lifetime but completely different intensities, or vice versa.
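The rate relations above can be checked numerically; a small sketch (the rate values are arbitrary, for illustration only):

```python
# Radiative and non-radiative decay rates (illustrative values, in 1/ns).
k_r, k_nr = 0.2, 0.3

tau = 1.0 / (k_r + k_nr)   # measured fluorescence lifetime (ns)
Q = k_r / (k_r + k_nr)     # quantum yield: emitted photons / absorbed photons

# Single-molecule relation from the text: Q = k_r * tau.
assert abs(Q - k_r * tau) < 1e-12
print(tau, Q)  # 2.0 0.4
```

Anything that opens an extra non-radiative pathway (larger k_nr) lowers both τ and Q together, which is the molecular-level origin of the intensity-lifetime dependence exploited by the priors.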
However, across a single sample, fluorophore concentration typically varies slowly compared to lifetime and/or covaries with it on local scales, such that it is possible to build local priors that capture the resulting intensity-lifetime dependencies. Further, absorbance and lifetime often co-vary with cellular morphology, enabling us to create global priors. As a fail-safe, if neither local nor global dependencies exist across a specific sample or sub-region, TV-minimisation (a form of edge-preserving interpolation) in our inverse retrieval ensures that our method still performs at least as well as standard interpolation (see supplemental material for details).
Local prior. The local prior (LP) relies on direct, pixel-wise dependencies between lifetime and intensity on micron scales; Fig. 1 illustrates our workflow. If the images come from different detectors, the lifetime and intensity images are first co-registered to match their fields of view. Fig. 1A shows a sparse, low-resolution lifetime image (RGB) overlaid on the corresponding intensity image (grayscale). The field of view (FOV) is divided into windows, each containing a set of corresponding intensity-lifetime samples. These samples neighbour the intensity pixels in the window centre, hence this window is used to create a prior for those pixels. In each window, the intensity and lifetime pairs are vectorised and fitted with a function, f; see Fig. 1B. Thus, our lifetime estimate τ̂ for pixel (λi + x, λj + y) is

τ̂(λi + x, λj + y) = f_{i,j}( I(λi + x, λj + y) ),

with samples i ∈ {0, 1, ..., n − 1} and j ∈ {0, 1, ..., m − 1}, and 0 ≤ x, y < λ. Importantly, the functions f_{i,j} are fitted locally, not globally. Consequently, this procedure is repeated by sliding the window across the field of view, as shown in Fig. 1C.

Global prior. Images often contain multiple examples of similar features, with similar lifetime distributions, across the field of view. This motivates our development of global priors (GPs) that exploit correlations between high-resolution morphology and lifetime. We extract high-resolution intensity patches centred on pixels neighbouring each lifetime sample, and label these patches with the same lifetime as their sampled neighbour. Our approach is visualised in Fig. 2B; see Methods for details.
Our global priors are designed to generalise to new samples with previously unseen morphologies, morphology-lifetime dependencies, and lifetime ranges. A deep neural network (DNN), shown in Fig. 2B, is trained from scratch for each new sample, on the intensity-patch inputs and lifetime labels obtained from the given microscope field of view. Consequently, different DNN initialisations give slightly different predictions. To estimate high-resolution lifetime, we pass each intensity patch through our trained DNN, predicting the central lifetime value, as shown in Fig. 2C.

Quality metrics. We track reconstruction quality using three metrics: learned perceptual image patch similarity (LPIPS), structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR). LPIPS measures the distance between images in feature space. It has a minimum of 0 and grows with image dissimilarity (higher values are worse). SSIM tracks the similarity in luminance, contrast and structure between two images; it is bounded between -1 and 1, with larger values indicating better image similarity. Lastly, PSNR is a pixel-to-pixel comparison, where larger values are better. See the Supplement for details.

Sample 1: 16x16 (MDCK Flipper-TR). We examined a validation sample of Madin-Darby canine kidney (MDCK) cells that had been treated with Flipper-TR dye (Spirochrome Inc.), which allows quantitation of tension in living, migrating cells. Data was acquired using a commercial LaVision BioTec TriM Scope system, using two-photon excitation scanning and detecting emission via a photomultiplier tube (PMT). The sample was imaged at 512 × 512 spatial points covering a 167 × 167 µm² FOV, and binned into 75 time bins, giving a 512 × 512 × 75 datacube. See Methods for details.
Ground truth (GT) fluorescence lifetime was estimated from this datacube using least squares deconvolution. This was decimated 16-fold to give a low-resolution FLIM image, shown in Fig. 3A. The low-resolution FLIM image is severely undersampled; much of the detail is lost. Fluorescence intensity (Fig. 3B) was obtained in parallel, by summing the datacube along time.
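Both measurements can be derived from the datacube as above. A short sketch with a synthetic datacube (the paper fits lifetime by least-squares deconvolution; here a synthetic ground-truth lifetime map is simply decimated to emulate the undersampled FLIM image):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic time-resolved datacube r[m, n, t] of photon counts.
r = rng.poisson(5.0, size=(512, 512, 75))

# Fluorescence intensity: integrate the datacube along the time axis.
intensity = r.sum(axis=2)                  # shape (512, 512)

# Synthetic ground-truth lifetime map, decimated 16-fold to emulate
# the severely undersampled low-resolution FLIM image of Fig. 3A.
tau_gt = rng.uniform(1.0, 4.0, size=(512, 512))
tau_lr = tau_gt[::16, ::16]                # shape (32, 32)

print(intensity.shape, tau_lr.shape)  # (512, 512) (32, 32)
```

The 512×512 intensity image and the 32×32 lifetime image are exactly the pair fed to the SiSIFUS prior-generation pipeline for the 16×16 upsampling case.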
In the intensity image, we see that the probe mainly localised to two types of structures: small blobs (vesicles) and edges (cell membranes). Fig. 3E shows the ground truth lifetime, weighted with local-contrast-enhanced fluorescence intensity for visualisation (details in the Supplement). In Fig. 3(F-G), […] SiSIFUS reconstructs the contrast between these structures more consistently than bilinear interpolation. We do note, though, that global SiSIFUS misses certain hotspots in the GT lifetime image (high-lifetime, yellow/red coloured areas), likely because few globules in the training set have these lifetimes, hence the model treats them as outliers. SiSIFUS achieves an LPIPS of 0.[…]

[…] dependencies and priors. This sample shows non-linear, negative local interdependencies in most regions; SiSIFUS can exploit these to accurately determine the lifetime based on local intensity patterns. Conversely, global patch-lifetime dependencies are negligible. This is mainly because the field of view lacks repeating morphological features (in contrast to the MDCK and convallaria samples in Fig. 3 and Fig. 4). Our algorithm prioritises the local priors. Finally, Fig. 6(E-G) shows the ground truth compared to SiSIFUS and bilinear interpolation. SiSIFUS succeeds in reconstructing the lifetime boundaries seen at the cell edges, and also reconstructs the speckliness of the ground truth lifetime map, allowing the user to infer that there might be lifetime-estimation uncertainty. Interpolation fails in these regards: edges are blurred, and the smooth lifetime estimates give the impression of structures that are absent in the ground truth.

B. Acquisition times
SiSIFUS provides an advantage in terms of acquisition times. For example, if we consider the case of measurements taken with our TriM Scope I (Figures 3 and 5), the acquisition time scales linearly with pixel number, as this is a galvo-scanning system. Therefore, we have an immediate advantage given by the SiSIFUS resolution enhancement factor that is applied. Specifically, in Figure 3, where we apply 16x16 resolution enhancement, we have a 256x reduction in the number of points that need to be scanned and hence a 256x reduction in acquisition time. In a scanning system we still, however, need to perform a second scan for the high-resolution intensity measurement, but this typically can be at substantially higher speed, of order 35x in our system (and this therefore remains the limiting factor). If we therefore consider the specific case of a 512x512 image (Figure 3), the total acquisition time without SiSIFUS was 73 seconds and with SiSIFUS is 2.4 seconds, allowing a 0.4 fps acquisition rate. If instead we consider the case of measurements with a SPAD camera (in our case, the Horiba FLIMera system, Figures 4 and 6), this currently operates at 30 fps, i.e. 33 ms acquisition time for a SiSIFUS image of any size (all pixels are acquired in parallel, without the point-by-point scanning used in confocal imaging systems). We note that in this case the intensity CMOS image is acquired in parallel and hence does not add to the acquisition time.
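The timing figures quoted above are consistent with a 256x-shorter low-resolution FLIM scan plus a roughly 35x-faster high-resolution intensity scan; a quick arithmetic check (numbers taken from the text, the decomposition into two scans is our reading):

```python
full_scan = 73.0                  # s, 512x512 FLIM scan without SiSIFUS
flim_scan = full_scan / 256       # 16x16 enhancement: 256x fewer scan points
intensity_scan = full_scan / 35   # ~35x faster second scan for intensity only

total = flim_scan + intensity_scan
print(round(total, 1), round(1 / total, 2))  # 2.4 s total, ~0.42 fps
```

The intensity scan dominates the total, which is why the text identifies it as the limiting factor in the scanning configuration.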
We may compare this also with existing commercial systems: e.g. current B&H FLIM systems can measure 512x512 pixels in 1 second [62]. Previous work operated directly with megapixel SPAD arrays, in which, however, the smaller pixel size implied longer acquisition times in order to accumulate sufficient signal, and which was thus limited to ∼1 second acquisition times [23].

III. DISCUSSION
We introduce SiSIFUS, a robust data-fusion pipeline based on prior-augmented inverse retrieval for upsampling fluorescence lifetime images. We create two classes of priors that explicitly exploit a high-resolution intensity image to provide approximations for the non-sampled datapoints in a fluorescence lifetime image.
The goal of SiSIFUS is to provide a "physics inspired" approach to image resolution enhancement that performs better than standard bilinear or similar interpolation methods.
Local priors capture pixel-wise correlations between fluorescence lifetime and intensity. For this, we find a direct mapping from intensity values to lifetime in small, local neighbourhoods, and use this mapping to predict the lifetime of intensity pixels that lack corresponding lifetime pixels. This allows SiSIFUS to maintain sharp spatial boundaries, tracking the boundaries of our intensity image. The local prior is limited by measurement noise and sampling frequency. Since structures of similar intensity are assigned the same lifetime, undersampled regions may receive homogeneous lifetime estimates with sharp boundaries, as seen in the leftmost cell in Fig. 5(c). Noisy regions can instead artificially track the intensity of an image's noise, as in Fig. 6(c). TV-minimisation and the global prior help combat these issues.
Global priors capture the interdependence between FLIM and intensity on a morphological level. This is achieved by learning a mapping (a deep neural network) from intensity patches to central-pixel lifetime samples, and then using this mapping to predict the lifetime of patches that have no central lifetime measurements. Thus, we capture non-linear correlations between the brightness and shape of intensity features and lifetime. In microscopic samples, there often exist strong global trends between these variables, allowing the model to predict the lifetimes of patches with unsampled centres. However, the global prior has limited ability to distinguish between similar morphologies with different lifetimes, typically assigning them the average lifetime of such structures. This causes outliers like the low-lifetime globules on the left of Fig. 3 or the high-lifetime vesicles in Fig. 4. […]

A key feature that we believe will be beneficial in any such approach is that image reconstruction is never based on statistical inference from other images: only the single image samples acquired from the two cameras are used, strongly reducing or eliminating the artefacts that may occur, for example, in machine-learned approaches that do rely on large sets of additional data and images, which represent a potential point of failure of concern in many applications.

A. Experimental design
Mammalian cell culture conditions. Both the SKOV3 ovarian cancer cells and the Madin-Darby canine kidney (MDCK) cells were maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% FBS, 2 mM L-Glutamine and 1X PenStrep. Cell lines were maintained in 10 cm dishes at 37 °C and 5% CO2. SKOV3 cells were transfected in the morning using an Amaxa Nucleofector (Lonza) kit V, program V-001, with either 5 µg Raichu-Rac1-Clover-mCherry or pcDNA3.1-mClover DNA (adapted from [63]) following the manufacturer's guidelines, and replated on 6 cm TC-treated dishes at 37 °C and 5% CO2. For live-cell imaging, cells were collected and replated onto 35 mm glass-bottom MatTek dishes that were previously coated overnight with laminin (10 µg ml⁻¹) diluted in PBS. These were left overnight at 37 °C, 5% CO2. The next morning, prior to imaging, the dishes were washed twice with pre-warmed PBS and the medium replaced with pre-warmed FluoroBrite DMEM supplemented with 10% FBS, 2 mM L-Glutamine and 1X PenStrep. For fixed-cell imaging, the cells were collected and replated onto 22 mm glass coverslips that were previously coated overnight with laminin (10 µg ml⁻¹) diluted in PBS. These were left overnight at 37 °C, 5% CO2. The next day, these cells were fixed in 4% PFA for 10 minutes, washed with PBS and mounted using Fluoromount-G (Southern Biotech). The MDCK cells were trypsinised and plated on 35 mm glass-bottom MatTek dishes and left to settle for 4 hours. Flipper-TR® probe (Cytoskeleton; CY-SC020) was resuspended in 50 µl anhydrous DMSO as per the manufacturer's instructions to yield a 1 mM stock. Flipper-TR was diluted in culture media to 2 µM and incubated on the cells overnight at 37 °C, 5% CO2. The next morning, prior to imaging, the dishes were washed twice with pre-warmed PBS and the medium replaced with pre-warmed FluoroBrite DMEM (ThermoFisher Scientific; A1896701) supplemented with 10% FBS, 2 mM L-Glutamine, 1X PenStrep and 2 µM Flipper-TR.

Multiphoton raster-scanning time-domain FLIM: Experimental set-up 
details. For the dataset shown in Fig. 5, cells were left to equilibrate on a heated microscope insert at 37 °C, perfused with 5% CO2, prior to imaging. Images were acquired in the dark using a multiphoton LaVision TRIM scan head mounted on a Nikon Eclipse inverted microscope with a 20X water objective. Illumination was provided by a Ti:Sapphire femtosecond laser (Coherent Chameleon Ultra II) used at 920 nm (12% power). The fluorescence signal was passed through a 525/50 nm emission band-pass filter and acquired using a FLIM X-16 Bioimaging Detector TCSPC FLIM system (LaVision BioTec). A 301 × 301 µm² FOV corresponding to 256 × 256 pixels was imaged at 600 Hz with a 10-line average in a total acquisition time of 5199 ms. For the dataset shown in Fig. 3, cells were left to equilibrate on a heated microscope insert at 37 °C, perfused with 5% CO2, prior to imaging. Images were acquired in the dark using a multiphoton LaVision TRIM scan head mounted on a Nikon Eclipse inverted microscope with a Nikon Apo 60X oil objective, 1.4 NA. Illumination was provided by a Ti:Sapphire femtosecond laser used at 970 nm (8% power) with an acquisition delay of 5.440 ns. The fluorescence signal was passed through a 600/60 nm emission band-pass filter and acquired using a FLIM X-16 Bioimaging Detector TCSPC FLIM system (LaVision BioTec). A 163×163 µm² field of view corresponding to 512×512 pixels was imaged at 600 Hz with a 70-line average for a total acquisition time of 72575 ms (High-Res). A total of 100 High- and Low-Res images were taken from 3 independent experiments. Background images (High- and Low-Res) were obtained by closing the scan-head using the above settings. The instrument response function (IRF) was obtained using carbon nanorods with the above settings and 1% laser power.

Widefield time-domain FLIM: Experimental set-up details. For the datasets shown in Figures 4 and 6, a custom microscope system was built using a high-spatial-resolution sCMOS sensor (Andor's Zyla) and the FLIMera SPAD array 
sensor. Spatial registration was achieved by identifying a set of four co-registered points on the SPAD and CMOS, and mapping the CMOS image with a perspective transformation to match the field of view of the SPAD image. See the Supplement for a schematic of the experimental set-up.
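The four-point registration described above amounts to estimating a perspective (homography) transform. In practice this is a single call to a library routine such as OpenCV's getPerspectiveTransform, but the underlying linear system is short enough to write out; a minimal numpy sketch with illustrative point coordinates (not the calibration used in the paper):

```python
import numpy as np

def perspective_from_points(src, dst):
    """Solve for the 3x3 homography H mapping four src points (e.g. on
    the CMOS) to four dst points (on the SPAD), with H[2, 2] fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pt):
    """Map one point through the homography (homogeneous coordinates)."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Illustrative co-registered points: CMOS corners -> SPAD positions.
src = [(0, 0), (2047, 0), (2047, 2047), (0, 2047)]
dst = [(0, 0), (191, 2), (190, 119), (1, 120)]
H = perspective_from_points(src, dst)
```

Once H is known, warping the CMOS image into the SPAD frame (e.g. with cv2.warpPerspective) aligns the two fields of view pixel-for-pixel.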

B. Statistical Analysis
Inverse Retrieval Algorithm. The optimisation is implemented using the alternating direction method of multipliers (ADMM) algorithm. For this, the minimisation in Eq. 2 can be re-formulated as

min_{τ_HR, z} ‖A(τ_HR) − τ_LR‖²₂ + γ‖τ_HR − τ_LP‖²₂ + β‖τ_HR − τ_GP‖²₂ + α‖z‖₁   subject to   Dτ_HR = z.

The Augmented Lagrangian for this problem can be written as

L(τ_HR, z, y) = ‖A(τ_HR) − τ_LR‖²₂ + γ‖τ_HR − τ_LP‖²₂ + β‖τ_HR − τ_GP‖²₂ + α‖z‖₁ + yᵀ(Dτ_HR − z) + (ρ/2)‖Dτ_HR − z‖²₂.

Here y is the Lagrangian multiplier (or the dual variable) and ρ is the penalty parameter. The ADMM approach involves minimizing the Lagrangian over the primal variables, followed by updates of the dual variable. The primal updates for the variables τ_HR and z are given by

τ_HR^{k+1} ← arg min_{τ_HR} L(τ_HR, z^k, y^k),
z^{k+1} ← arg min_{z} L(τ_HR^{k+1}, z, y^k),

and the dual update is given by

y^{k+1} ← y^k + ρ(Dτ_HR^{k+1} − z^{k+1}).

For the primal minimisation update, we use the standard optimisation technique based on the fast iterative soft-thresholding algorithm (FISTA). Each iteration of the ADMM hence comprises 90 iterations of FISTA for the τ_HR variable update. The weighting factor γ for the local prior term in the cost function has been kept constant for all cases, with γ = 1. The factor β, on the other hand, is varied for different upsampling factors, such that it is 0.02 for the 2x and 4x upsampling factors and 0.5 for the higher upsampling factors of 8x and 16x. The GP cannot predict lifetimes within 6 pixels of the edges of the sample, since one cannot extract a 13x13 window centred on these pixels. Consequently, the GP's contributions from these regions are removed from the IR reconstruction. A total of 20 […]

Lifetime and quantum yield. Fluorescence lifetime is described in the literature as being independent of fluorescent intensity [12,65], of fluorophore concentration [66] and of excitation intensity. Here, we examine the context of these claims, and demonstrate the limitations of these generalisations.
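In the ADMM splitting described above, the z-update reduces to elementwise soft-thresholding and the dual update is a simple residual step. A sketch of these two steps (variable names illustrative; the τ_HR quadratic update, which the authors perform with FISTA, is taken as given):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def admm_z_and_dual_step(D_tau, z, y, alpha, rho):
    """One z-update and one dual update, given the current gradient image
    D_tau = D(tau_HR) produced by the tau-minimisation step."""
    z_new = soft_threshold(D_tau + y / rho, alpha / rho)  # prox of alpha*||z||_1
    y_new = y + rho * (D_tau - z_new)                     # dual ascent on residual
    return z_new, y_new
```

Iterating the τ-update, this z-update and the dual step drives Dτ_HR and z towards agreement, enforcing the TV-sparsity of the recovered lifetime image.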
In fluorescence, a photon excites a ground-state electron into an excited state, which then decays back to the ground state radiatively at a rate known as the decay rate.Other decay pathways compete with fluorescence, such as non-radiative decay and inter-system energy transfer between the fluorescent molecule and its environment.
The probability of emitting a fluorescent photon per excitation event is called the quantum yield of fluorescence. Fluorescent intensity is the product of the excitation intensity, the absorbance of the fluorophores (which depends strongly on their concentration) and the fluorescence quantum yield. The decay rate is the inverse of the fluorescence lifetime, which is the expected time that an electron spends in the excited state before decaying via fluorescence. This is an intrinsic property of the molecule and thus is assumed to be independent of factors like fluorophore concentration. Consequently, fluorescence lifetime can be used to distinguish between different molecule populations. However, fluorophores interact with their environment. The environment, in turn, can modulate both the excitation and emission pathways, changing both intensity and lifetime. Excitation can be enhanced or quenched by metallic surfaces or particles within the sample, such as silver [67], via plasmonic resonance. Emission is modulated via non-radiative (or alternative) decay pathways, quenching the molecule's radiative fluorescence as well as its lifetime, as derived in the main section of the paper.

Fluorescence intensity. An imaging system generates a fluorescence intensity signal that depends on the spectral radiance L_f(λ_o) of the sample and the net photon detection efficiency PDE(λ_o) of the imaging system. Let us consider a thin sample within the focal length of the optical system, using an epifluorescence setup. Using nomenclature from [68], the spectral radiance L_f(λ_o) [W sr⁻¹ m⁻² nm⁻¹] emitted by the sample at wavelength λ_o from excitation light at λ_x is given by […], where I_x is the incident excitation power [W], Ñ(x, y) is the 2D concentration of fluorophores [m⁻²] (the integral of the 3D concentration N of fluorophores along the length of the sample along the optical axis z, Ñ(x, y) = ∫_z N(x, y, z) dz), Ω is the solid angle through which emitted light is collected from the sample [sr], ϵ(λ_x) is the absorptivity [m²] of the fluorophore as per the Beer-Lambert law, and Q(τ, λ_o, λ_x) is the lifetime-dependent, spectral quantum yield of fluorescence. This spectral radiance is imaged onto a detector that has a response R [A W⁻¹] using a system with some étendue Γ [m² sr]. To obtain the signal generated by the emission spectrum, we must integrate over the emission spectrum, giving […]. Substituting Eq. 10 into Eq. 12 and integrating over the acquisition time t_a gives us the measurement M [C]: […]. A fixed excitation and detection system allows us to calibrate the intensity I_x, the collection solid angle Ω, the étendue Γ, the response R(λ_o), and the acquisition time t_a. Therefore, variations of intensity across the field of view will depend on the molecular concentration Ñ and absorptivity ϵ(λ_x) (whose product is the absorbance of the fluorophores), as well as the spectral quantum yield of fluorescence Q, which depends on fluorescence lifetime.

Dependence of intensity on lifetime. Absorbance and fluorescence lifetime appear to be unrelated; hence absorbance (and therefore fluorophore concentration) is an unpredictable confounding variable in intensity-lifetime dependencies. Consequently, a fluorescence intensity measurement alone cannot give us full lifetime information. We therefore must use statistical priors to extract intensity-lifetime dependencies in the presence of biological confounding variables. A local prior is developed to extract dependencies when lifetime varies more rapidly in space than these confounding variables, or when they correlate with lifetime on local scales (either positively or inversely). Further, many biological samples absorb fluorophores into particular subcellular compartments such as the cell membrane [69], vesicles [70] or the nucleus [71]. This results in lifetime patterns that often track cellular morphology. A global prior is developed to extract such dependencies. If absorbance were completely randomly distributed (which tends 
not to be the case in real samples), our method would not offer improvement over interpolation; instead, it might overfit to noise patterns. To prevent this, our algorithm uses total-variation (TV) filtering to suppress unrealistically noisy lifetime estimates. The question is whether recognisable intensity-lifetime dependencies actually exist in biological samples, or whether absorbance renders them unusable. Below, we consider a series of case studies of fluorophore-environment interactions reported in the literature, focusing on how these interactions modulate intensity and lifetime.

Case studies. Okabe et al. [6] used a complex fluorescent molecule made of a thermosensitive unit, a hydrophilic unit and a fluorescent unit to monitor temperature. In response to higher temperature, the molecule becomes hydrophobic, curling up and increasing both the fluorescence quantum yield (and thereby intensity) and the fluorescence lifetime. Fluorophore concentration still affects fluorescence intensity; however, locally (in regions of uniform concentration or at organelle edges), intensity and lifetime covary. Indeed, the authors use this probe to demonstrate temperature differences between the nucleus and cytoplasm of cells, which are visibly differentiable on both the lifetime and intensity maps. Ogikubo et al. [7] used cellular autofluorescence of NADH to monitor intracellular pH. Their results show evident covariance of fluorescence intensity with fluorescence lifetime within cells; even though intensity is not a marker of pH, both intensity and fluorescence lifetime depend on the location of NADH within the cell. The reason for this is not explicitly explored, but several works have shown that the ratio of bound to free NADH depends on the local metabolism of the cell, which influences both the fluorescence lifetime and the concentration of NADH autofluorescence [11]. Correlations are similarly visible between NADH fluorescence intensity and lifetime in works by Stringari et al.
[12], as both of these parameters covary with the cellular redox ratio.
Van der Linden et al. [72] use FLIM as a tool for a quantitative measurement of calcium levels, independent of hardware. However, for a given hardware configuration, fluorescence intensity spikes clearly reveal calcium spikes, even if they do not give absolute calcium concentrations on their own. Indeed, the authors demonstrate that their FLIM probe works by showing supplementary videos of fluorescence intensity and lifetime side by side, both of which show synchronised flickering. Lifetime and intensity are strongly temporally correlated and are also locally correlated: cellular organoids have quasi-uniform intensity and lifetime, both of which experience sudden gradients at organoid boundaries.
Verboogen et al. [73] demonstrate a FLIM-FRET probe for the imaging of SNARE trafficking in dendrites. Their probe relies on Förster resonance energy transfer (FRET). In FRET, one fluorophore, known as the donor, is linked to another fluorophore, known as the acceptor, such that their relative conformation can change. The donor molecule is excited and its fluorescence measured. If the donor and acceptor are far apart, the donor decays as if it were alone. If they are in close vicinity, excited electrons can transfer energy from the donor to the acceptor molecule, providing an alternative decay path and decreasing both the fluorescence quantum yield (and thus intensity) and the lifetime. Gorpas et al. [14] use skin autofluorescence to determine qualitative boundaries between cancerous and healthy skin tissue. They demonstrate that FLIM can delineate skin cancer by overlaying an augmented-reality FLIM image onto a visibly melanated patch of skin, whose colour correlates strongly with its lifetime.
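The TV-filtering safeguard mentioned above can be illustrated with a minimal sketch: gradient descent on a smoothed (Charbonnier) total-variation functional. This is a simplified stand-in, not the exact filter used in our pipeline, and the weight, step size and iteration count are illustrative assumptions.

```python
import numpy as np

def tv_denoise(img, weight=0.1, n_iter=200, eps=1e-6, step=0.2):
    """Minimise 0.5*||u - img||^2 + weight * TV_eps(u) by gradient descent,
    where TV_eps is a smoothed total variation (Charbonnier penalty)."""
    u = img.astype(float).copy()
    for _ in range(n_iter):
        # forward differences (replicated boundary gives zero gradient at edges)
        gx = np.diff(u, axis=1, append=u[:, -1:])
        gy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(gx**2 + gy**2 + eps)
        px, py = gx / mag, gy / mag
        # divergence of the normalised gradient field (backward differences)
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        grad = (u - img) - weight * div
        u -= step * grad
    return u

# demo: piecewise-constant "lifetime map" corrupted by Gaussian noise
rng = np.random.default_rng(1)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + 0.2 * rng.standard_normal((32, 32))
denoised = tv_denoise(noisy)
```

In this setting, the `weight` parameter trades data fidelity against smoothness: larger weights suppress more noise at the cost of blurring genuine lifetime gradients.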

II. LOCAL PRIOR
We performed a study to find the best window size and the best function to map fluorescence intensity onto fluorescence lifetime with local priors. The window sizes ranged from 2 to 8, while the functions were a set of common schemes: B-splines (linear, quadratic and cubic), standard interpolation (nearest, linear and cubic), and kriging (Gaussian-process fitting with radial basis functions).
We applied these window sizes and functions to 4 samples (including the three shown in Figures 4, 5, and 6) and 4 upsampling factors (2×, 4×, 8×, and 16×). We evaluated the methods based on the mean absolute error and LPIPS between the reconstruction and the ground truth, averaged over these 4×4 scenarios. Our results are shown in Supplementary Fig. 1 and Supplementary Fig. 2. Based on these results, we decided to use a window size of 5 and linear interpolation for generating LPs.

Supplementary Fig. 1. Mean LPIPS for priors generated using various window sizes, averaged across our 4 samples and 4 upsampling factors (2×, 4×, 8×, 16×). Both the geometric and arithmetic means are plotted.
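As a concrete sketch of this chosen scheme, a single 5 × 5 window can be upsampled by 1-D linear interpolation of lifetime against intensity. This is a minimal illustration on synthetic data: `np.interp` stands in for the selected linear interpolation, and all shapes and values are illustrative assumptions.

```python
import numpy as np

def local_prior_window(lr_intensity, lr_lifetime, hr_intensity):
    """Fit lifetime as a function of intensity inside one low-res window
    (1-D linear interpolation over sorted intensity-lifetime pairs),
    then evaluate the fit at every high-resolution intensity value."""
    i = lr_intensity.ravel()
    t = lr_lifetime.ravel()
    order = np.argsort(i)  # np.interp needs monotonically increasing sample points
    return np.interp(hr_intensity, i[order], t[order])

# synthetic 5x5 window where lifetime rises linearly with intensity
rng = np.random.default_rng(0)
lr_i = rng.uniform(0.0, 1.0, (5, 5))
lr_tau = 1.0 + 2.0 * lr_i                      # tau = 1 + 2*I
# high-res intensities within the window's observed intensity range
hr_i = rng.uniform(lr_i.min(), lr_i.max(), (16, 16))
tau_hr = local_prior_window(lr_i, lr_tau, hr_i)
```

Because the synthetic intensity-lifetime relation is linear, the interpolated estimate reproduces it exactly; on real windows the fit captures whatever local monotonic trend is present.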

FIG. 1. Schematic of the local prior method. (A) A CMOS (fluorescence intensity) field of view, with the SPAD field of view (fluorescence lifetime) overlaid on top so as to match the sparse, low fill-factor pixel layout of the SPAD array. (B) We zoom in on a 5 × 5 window. All SPAD pixels have a corresponding CMOS measurement, but so do the areas in between SPAD pixels. We aim to find the lifetime at points with no SPAD samples. For this, we fit a function, for instance linear interpolation, a cubic spline or a radial-basis-function Gaussian process. The high-resolution CMOS pixels x_HR which we wish to upsample are then evaluated with this function, producing a lifetime estimate τ_HR. (C) We slide the window across the field of view, fitting new functions for each new window and predicting the centres, upsampling the FLIM image to the resolution of the intensity image window by window.

FIG. 2. Schematic of the global prior method. (A) Fluorescence intensity of a convallaria-acridine orange sample, with 8 × 8 sparse lifetime samples overlaid. We extract intensity patches from this image; a few of them correspond to a central lifetime sample. Such patches are training data, which we can use to predict the central lifetime of the remaining patches. (B) Training inputs (patches) are augmented via rotation and mirroring. They can be further augmented by adding the patches which are nearest neighbours of training patches and allocating them the same label (lifetime) as the sampled patch. The deep neural network (DNN) architecture is simple, consisting of three 2D convolutional layers followed by three fully connected layers. (C) Finally, the trained DNN evaluates patches with unsampled centres, thus super-resolving the lifetime image.
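The patch-extraction step of Fig. 2A can be sketched as follows. This is a minimal illustration: the 13 × 13 patch size follows the figure, while the image size, SPAD grid and variable names are illustrative assumptions.

```python
import numpy as np

def extract_training_patches(intensity, lifetimes, spad_coords, patch=13):
    """Cut a patch of high-res intensity around each SPAD sample location;
    each patch is labelled with the lifetime measured at its centre."""
    half = patch // 2
    # reflect-pad so patches centred near the border stay in bounds
    padded = np.pad(intensity, half, mode="reflect")
    patches, labels = [], []
    for (r, c), tau in zip(spad_coords, lifetimes):
        patches.append(padded[r:r + patch, c:c + patch])
        labels.append(tau)
    return np.stack(patches), np.array(labels)

# 8x8 grid of lifetime samples over a 64x64 intensity image
rng = np.random.default_rng(0)
intensity = rng.random((64, 64))
coords = [(r, c) for r in range(4, 64, 8) for c in range(4, 64, 8)]
taus = rng.random(len(coords))
X, y = extract_training_patches(intensity, taus, coords)
```

The resulting `(X, y)` pairs form the single-sample training set for the DNN described in the caption.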

Fig. 2 shows our pipeline. We first extract intensity patches centred on our SPAD pixels, as shown in Fig. 2A. To deal with the relatively small number of patch-lifetime pairs contained in a single sample image, we augment our training set. We use a common dataset-augmentation technique, reflecting and rotating the intensity windows in the training set. These operations increase our dataset 8-fold, as shown in Fig. 2B. For high upsampling factors (8 × 8 and 16 × 16), we further augment the training set by estimating the lifetimes of the patches neighbouring our sampled patches.
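The 8-fold augmentation corresponds to the eight symmetries of the square (the dihedral group D4), which a few lines of NumPy enumerate. This is a minimal sketch; the 3 × 3 test patch is illustrative.

```python
import numpy as np

def dihedral_augment(patch):
    """Return the 8 rotations/reflections of a square patch; downstream,
    each variant inherits the lifetime label of the original patch."""
    variants = []
    for k in range(4):                   # 0, 90, 180, 270 degree rotations
        rot = np.rot90(patch, k)
        variants.append(rot)
        variants.append(np.fliplr(rot))  # mirrored copy of each rotation
    return variants

patch = np.arange(9).reshape(3, 3)       # asymmetric test patch
augmented = dihedral_augment(patch)
```

For a patch with no internal symmetry, all eight variants are distinct, giving the 8-fold increase described above.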

FIG. 3. 16×16 upsampling of MDCK cells. (A) Low-resolution fluorescence lifetime image (32×32) of Madin-Darby canine kidney (MDCK) cells labelled with the Flipper-TR dye. (B) Corresponding high-resolution intensity image (512×512) of the sample. (C) 5×5 windows of low-resolution FLIM are fitted to corresponding intensity values to generate a local prior image (two example windows are shown). (D) A global prior image is generated from 13×13 intensity patches with central FLIM measurements (two examples are shown). (E) The ground-truth high-resolution FLIM target, intensity-weighted for visualisation. (F) The proposed method, upsampling the low-resolution measurement by a factor of 16×16. (G) Bilinear interpolation, upsampling the FLIM measurement by 16×16.

FIG. 4. 8×8 upsampling of convallaria images. (A) Low-resolution fluorescence lifetime image (24 × 32) of a convallaria rhizome sample stained with acridine orange, viewed under a widefield microscope. (B) High-resolution intensity image (192 × 256). (C) Example 5×5 windows of low-resolution intensity vs. FLIM, used for generating the local prior shown on the right. (D) High-resolution intensity patches are labelled with lifetime, letting us create a global prior. (E-G) Ground truth, 8×8 SiSIFUS and 8×8 bilinear interpolation of the data, weighted by local-contrast-enhanced intensity for visualisation (see Supplementary Materials for details).