Sample deposition and characterization
An approximately 45 nm thick thin film of VO2 was deposited onto a 001 (c-cut) sapphire substrate using pulsed laser deposition at room temperature and subsequent annealing37. The mean-square strain was measured using the Williamson-Hall method to be 0.13, consistent with an earlier determination for a similar film. The dielectric properties and thickness were determined using ellipsometry. Data were taken at three angles of incidence (60, 65, and 70 degrees) and wavelengths spanning from 200 to 2700 nm, which were then fit with a similar model to that proposed in Qazilbash et al. 21, yielding good agreement with the parameters determined there. A film thickness of 45 nm and a penetration depth of 65 nm at 660 nm wavelength were determined. Upon heating above 340 K a sharp change in the optical properties was observed as the sample underwent the phase transition into the rutile metallic phase. The full reflectivity of the rutile metallic phase was then fit to the same dielectric functions as in Qazilbash et al. 21, with excellent agreement found in the dielectric parameters as compared to previous literature. In particular the imaginary part of the dielectric function was modelled as:
(1)
where \(\omega\) is the photon energy, G denotes a causality-corrected Gaussian function38 and TL denotes a Tauc-Lorentz function29; the real part is related via the Kramers-Kronig relations. The terms in the expression are a UV pole, IR pole, Gaussian resonance describing the \({{{{\rm{O}}}}_{2p}\to d}_{\parallel }\) transition, a Tauc-Lorentz describing the \({{{\rm{O}}}}_{2p}\to {\pi }^{*}\) transition, a Tauc-Lorentz describing the \({{{\rm{O}}}}_{2p}\to {\sigma }^{*}\) transition, and the Drude plasma term. The Drude plasma term is determined by the plasma frequency ωp and damping term γ. Apart from the Drude term, in each case \({A}_{X}\), \({\omega }_{X}\), and \({\sigma }_{X}\) denote the amplitude, resonant frequency and width of resonance X. The Tauc-Lorentz resonances include an additional term \({g}_{X}\) which denotes the gap term. In a semiconductor or insulator the gap term denotes the energy of the bandgap; for photon energies below this, the probability of the optical transition goes to zero, leading to a strong asymmetry in the resonance structure. In materials without a bandgap there is no sharp turn-on energy to the resonance, and this term instead relates approximately to the next lowest lying energy band into which carriers can be promoted instead of to the Tauc-Lorentz resonance, leading to a more gradual turn-on of the resonance.
Ultrafast spectroscopy
A schematic of the experimental apparatus for ultra-broadband few-femtosecond spectroscopy is shown Supplementary Fig. 6. A titanium-doped sapphire amplifier (Coherent Legend Elite Duo USP) delivers 35 fs pulses (full width at half-maximum, FWHM) with 8 mJ of energy at 1 kHz repetition rate. These are converted to 1.45 mJ pulses at 1750 nm in an optical parametric amplifier (Light Conversion TOPAS-PRIME-HE). The infrared pulses are spectrally broadened in a gas-filled hollow capillary fibre (HCF) with 700 μm core diameter and 1.25 m length which is filled with argon gas at 450 mbar pressure, and subsequently compressed by propagation through bulk material. This results in 12 fs FWHM pulses with around 1 mJ of energy. A beamsplitter divides the beam into two, and both beams pass through variable attenuators formed of an achromatic half-wave plate and a silicon plate at Brewster’s angle before being coupled into separate HCFs for the generation of pump and probe pulses.
The pump pulse is generated via soliton self-compression and resonant dispersive wave (RDW) emission in a 1.3 m HCF with 450 μm core diameter filled with 1050 mbar of argon. When driving this HCF with around 135 μJ of pulse energy (measured before the HCF entrance window), around 22 μJ of energy is converted into an RDW pulse centred at 610 nm with a transform-limited pulse duration of approximately 3.5 fs. After exiting the HCF system this pulse passes through a 0.5 mm CaF2 beam sampler and a motorised chopper wheel and then into the vacuum chamber. After collimation with a spherical metallic mirror, four chirped mirrors (Ultrafast Innovations PC70) both compensate for the dispersion of optical components and act as dichroic mirrors to remove the infrared supercontinuum from the pump beam. An in-vacuum half-wave plate and silicon-plate form an attenuator for the pump pulse. The filtered and attenuated pulse is then focused onto the VO2 sample with a spherical metallic mirror to a spot size of approximately 200 μm FWHM. Supplementary Fig. 1b shows the result of an in-situ self-diffraction cross-correlation frequency-resolved optical gating (SD-XFROG) measurement of the pump pulse, taken by replacing the VO2 sample with a 170 μm thick piece of silica glass and one of the wedges in the probe arm with a mirror and measuring the transmitted pump spectrum as a function of delay. While the pump pulse profile cannot be retrieved from this measurement without precise knowledge of the on-target probe spectrum and/or pulse profile, the trace clearly shows a single isolated pulse with negligible pre- or post-pulse structure (note the logarithmic colour scale). The temporal width of the cross-correlation, obtained by integrating the SD-XFROG trace over all wavelengths, is consistent with a root-mean-square time resolution of \(\sim 5\,\) fs, which in turn is consistent with the fastest feature observed in Fig. 2b.
A second HCF with 200 μm core diameter and 40 cm length provides the probe pulses. This HCF is directly connected to the vacuum chamber, creating a pressure gradient; the argon pressure in the entrance cell is maintained at 800 mbar with an electronic flow regulator. With 150 μJ of driving energy (measured before the HCF entrance window), a supercontinuum spanning 220 nm to 2550 nm is generated by soliton self-compression. After collimation with a spherical mirror with a UV-enhanced aluminium coating, the probe beam is attenuated by reflection from two wedged glass windows before being refocused to a spot size of approximately 38 μm FWHM on the sample. The reflected probe beam is then reimaged with 6x magnification onto the 25 µm entrance slit of a CCD spectrometer (Avantes ULS2048XL, detection range 200–1160 nm with 1715 pixels) outside the vacuum chamber by another spherical mirror, thus sampling only the central region of the probe. A half-wave plate before the probe generation HCF rotates the probe polarisation to be opposite to that of the pump; a polariser just before the spectrometer thus passes the probe but suppresses scatter from the pump.
Pump-probe measurements are acquired in a single-shot pump-on/pump-off scheme. The chopper wheel is synchronised to switch the pump pulse at half the laser repetition rate. The pump-probe delay is scanned in steps of 1 fs by a motorised delay stage in the probe arm. 2000 probe shots are acquired at each delay (with the pump on for half of these).
We calculate the differential reflectivity as \(\Delta R/R=({I}_{{{\rm{on}}}}-{I}_{{{\rm{off}}}})/{I}_{{{\rm{off}}}}\), where \({I}_{{{\rm{on}}}}\) and \({I}_{{{\rm{off}}}}\) are the spectra with the pump on and off, respectively. To reduce the effect of detector noise, we average all spectra in five-pixel bins along the wavelength axis, resulting in wavelength samples separated by approximately 3 nm. Because of minor alignment drifts between pump-probe scans, we numerically correct the delay axis of each scan by overlapping the fastest delay-dependent feature in our data (see Supplementary Fig. 7). To remove fast oscillations caused by interference with residual pump scatter, we then re-bin along the delay axis, resulting in a spacing of 3 fs between delay points. Beyond removing the fast oscillations this does not affect the observed dynamics (see Supplementary Fig. 8). We then further de-noise the data through a principal-component analysis (see Supplementary Fig. 9). We retain the four most significant components, which captures all the salient features in our experimental data.
Dynamics Fitting
The differential reflectivity reported in Fig. 2a is first converted into an absolute reflectivity using the reflectivity of the M1 phase extracted from ellipsometry. The reflectivity at each time step as a function of probe photon energy is then fit using the dielectric function of the rutile metallic phase; we begin by fitting the long time delays (200 fs) at which the reflectivity agrees well with the thermal metallic state, and then move backwards in pump-probe delay, updating our initial guess with the output from the previous time delay. The full dielectric function has 14 free parameters and clearly overfits the data. Instead, we systematically fix a number of parameters to agree with the rutile metallic phase values and then allow the remaining parameters to vary freely. We examine all permutations of this fitting procedure (1969 combinations). The seven parameters shown in Fig. 2c are found to be the optimal minimal set which can accurately describe the phase transition for the following reasons: (1) they are the set that minimizes the error when allowing seven parameters to vary. (2) When allowing fewer parameters to vary, the optimal parameters are always from this set of seven parameters. (3) They show self-consistency, particularly in the dynamics of the Tauc-Lorentz gap term (Fig. 2i) and the Gaussian transition (Fig. 2f, g), in which the gap term should be expected to track the edge of the next lowest transition. (4) Adding additional parameters induces unphysical behaviour in the time-behaviour, namely sudden jumps. (5) Qualitatively, they are the minimal set which reproduces all salient features of the data after 20 fs delay. Additional parameters do not introduce improvements discernible by eye. (6) They are physically motivated by the known physics of the system as they relate exclusively to the \({\pi }^{*}\), \({d}_{\parallel }\) and Drude terms which are known to exhibit the largest changes through the phase transition.
Time-dependent simulations of a dimer chain
These calculations are fully described in Zhang et al. 31. We consider the \({d}_{\parallel }\) singlet and one \(\begin{array}{c}{\pi }^{*}\end{array}\) orbital to simplify the theoretical model without losing the important physics of VO2. Since the Peierls instability mainly occurs in the \({c}_{{{\rm{R}}}}\) axis, we model VO2 as a quasi-one-dimensional system, for which the displacement \(\begin{array}{c}{{{\bf{X}}}}\equiv ({X}_{1},{X}_{2})\end{array}\) is introduced to capture the dimerizing displacement along \({c}_{{{\rm{R}}}}\) axis and the band-splitting displacement in the perpendicular plane, respectively. The total Hamiltonian is given by
$$\begin{array}{c}H={H}_{{{\rm{e}}}}+{H}_{{{\rm{e}}}-{{{\boldsymbol{X}}}}}+\varPhi \left({{{\bf{X}}}}\right).\end{array}$$
(2)
Here the pure electronic part reads
$${H}_{{{\rm{e}}}}= -{\sum}_{i}{\sum}_{a=1,2}{\sum}_{\sigma=\uparrow,\downarrow }{t}_{a}{c}_{a,\sigma,i}^{{{\dagger}} }{c}_{a,\sigma,i+1}-{t}_{12}{\sum}_{i}{\sum}_{\sigma=\uparrow,\downarrow }{c}_{1,\sigma,i}^{{{\dagger}} }{c}_{2,\sigma,i}+{{\rm{H.c.}}}\\ +{\sum}_{i}{\sum}_{a=1,2}{{{{\rm{\varepsilon }}}}}_{a}{n}_{a,i}+\frac{U}{2}{\sum}_{i}{n}_{i}({n}_{i}-1),$$
(3)
where \(\begin{array}{c}a={{\mathrm{1,2}}}\end{array}\) denotes the \({d}_{\parallel }\) and \({\pi }^{*}\) orbital, respectively, \({t}_{a}\) is the nearest-neighbour intra-orbital hopping, \({t}_{12}\) is the onsite inter-orbital hopping, and \(\begin{array}{c}{\varepsilon }_{a}\end{array}\) is the onsite energy potential. We have
$$\begin{array}{c}{n}_{a,i}={\sum}_{\sigma=\uparrow,\downarrow }{n}_{a,\sigma,i},\end{array}$$
(4)
$${n}_{i}={\sum}_{a=1,2}{\sum}_{\sigma=\uparrow,\downarrow }{n}_{a,\sigma,i},$$
and \(U\) is the onsite Hubbard repulsive interaction. The lattice distortion is modelled through the classical potential energy3
$$\begin{array}{c}\varPhi (X)=L\left[\frac{\alpha }{2}({X}_{1}^{2}+{X}_{2}^{2})+\frac{{\beta }_{1}}{4}{(2{X}_{1}{X}_{2})}^{2}+\frac{{\beta }_{2}}{4}{({X}_{1}^{2}-{X}_{2}^{2})}^{2}+\frac{\gamma }{6}{({X}_{1}^{2}+{X}_{2}^{2})}^{3}\right],\end{array}$$
(5)
which is obtained from the Landau functional for improper ferroelectrics expanded up to the sixth order in the lattice displacements. Here \(L\) is the number of lattice sites. For the electron-lattice coupling, we have
$$\begin{array}{c}{H}_{{{\rm{e}}}-{{{\boldsymbol{X}}}}}=-g{X}_{1}{\sum}_{i}{(-1)}^{i}{n}_{1,i}-\frac{\delta }{2}{X}_{2}^{2}{\sum}_{i}({n}_{1,i}-{n}_{2,i}),\end{array}$$
(6)
where the first term with coupling constant \(g\) describes the dimerization induced by the displacement \({X}_{1}\) along \({c}_{{{\rm{R}}}}\) axis while the second term with strength \(\delta\) represents the crystal field splitting generated by \({X}_{2}\). Here we take the half-bandwidth as the unit of energy and set \(\begin{array}{c}{t}_{1}={t}_{2}=0.5\end{array}{{{\rm{eV}}}}\). The inter-orbital hopping is chosen as \(\begin{array}{c}{t}_{12}=0.1\end{array}{{{\rm{eV}}}}\). We also assume the centre of gravity for these two bands to be the same and set \(\begin{array}{c}{\varepsilon }_{1}={\varepsilon }_{2}=0\end{array}\). The Hubbard interaction is \(U=0.6\) eV, and the parameters for the lattice potential are set as \({{{\rm{\alpha }}}}=0.155\) eV/Å2, \(\begin{array}{c}{\beta }_{1}=17.5\end{array}\) eV/Å4, \(\begin{array}{c}{\beta }_{2}=2{\beta }_{1}\end{array}\), and γ = 672.2 eV/ Å6. Finally, we choose the electron-lattice coupling strength as g = 5.28 eV/Å and δ = 20 eV/Å2. The total system is at quarter filling. Though most of these parameters have not been constrained experimentally, they are reasonable for this type of system and yield good agreement in their derived quantitites. For instance the equilibrium positions of the displacements are then found to be X1 ≈ 0.205 Å and X2 ≈ 0.165 Å, close to the experimental values and with appropriate ratio31. The transition temperature from the M1 phase to the R phase is found to a factor two of the true value31, and the lattice natural frequency within a factor of two from previous DFT calculations28.
Simulation of the time evolution
We use the tensor network methods to simulate the time evolution of the system induced by light pulse, which couples to the electronic degrees of freedom through the Peierls substitution \(\begin{array}{c}{t}_{\alpha }\to {t}_{\alpha }{e}^{{{\rm{i}}}\frac{e}{\hbar }{A}_{{{\rm{pump}}}}(t)}\end{array}\). Here \({A}_{{{\rm{pump}}}}\left(t\right)\) is the vector potential for electric field \(\begin{array}{c}{E}_{{{\rm{pump}}}}(t)={E}_{0,{{\rm{pump}}}}{e}^{-{(t-{t}_{0,{{\rm{pump}}}})}^{2}/2{\sigma }_{{{\rm{pump}}}}^{2}}\cos [{\omega }_{{{\rm{pump}}}}(t-{t}_{0,{{\rm{pump}}}})]\end{array}\). We start from the equilibrium M1 phase at zero temperature with \(\begin{array}{c}{X}_{1}\approx 0.205\end{array}\) Å and \(\begin{array}{c}{X}_{2}\approx 0.165\end{array}\) Å, which minimizes the internal energy \(\begin{array}{c}{\varPhi }_{{{\rm{eff}}}}=\varPhi ({{{\boldsymbol{X}}}})+\langle {H}_{{{\rm{e}}}-{{{\boldsymbol{X}}}}}\rangle+\langle {H}_{{{\rm{e}}}}\rangle \end{array}\). The time evolution of the system is decomposed into two parts, i.e., the electronic and lattice degrees of freedom. For the evolution of electronic state \(\begin{array}{c}|\psi \rangle \end{array}\), we use the Born-Oppenheimer approximation within each time step \(\delta t\), i.e., the lattice distortions are approximated as fixed, while the electronic degrees of freedom are dynamic. The corresponding equation of motion is given by the Schrödinger equation and can be written as
$$\begin{array}{c}\left|\psi \left(t+\delta t\right) \right\rangle={e}^{-{{{\rm{i}}}}{{H}}[t,X(t)]\delta t/{{\hbar }}}\left|\psi \left(t\right) \right\rangle,\end{array}$$
(7)
which can be simulated by the infinite time-evolving block decimation method. On the other hand, we use the classical approximation for the lattice dynamics and invoke the Ehrenfest theorem for the motion of lattice degrees of freedom
$$\begin{array}{c}M\frac{{{{\rm{d}}}}^{2}{X}_{i}}{{{\rm{d}}}{t}^{2}}={F}_{i}(t)-\xi \frac{{{\rm{d}}}{X}_{i}}{{{\rm{d}}}t},\end{array}$$
(8)
where \(M\) is the effective mass of ions, which is set as 1082.41 eV·fs2/Å2 in this work, and ξ = 13.16 eV·fs/Å is the damping coefficient used to model the effects of disordering11. The forces \({F}_{i}\) are obtained through the Hellmann-Feynman theorem and explicitly read
$${F}_{1}= \frac{g}{2}{\sum}_{i=1,2}\cos ({Qi})\left\langle \psi |{n}_{1,i}|\psi \right\rangle -\alpha {X}_{1}-2{\beta }_{1}{X}_{1}{X}_{2}^{2}-{\beta }_{2}{X}_{1}\left({X}_{1}^{2}-{X}_{2}^{2}\right)\\ -\gamma {X}_{1}{\left({X}_{1}^{2}+{X}_{2}^{2}\right)}^{2}$$
(9)
and
$${F}_{2}= \frac{\delta }{2}{X}_{2}{\sum}_{i=1,2}\left\langle \psi |\left({n}_{1,i}-{n}_{2,i}\right)|\psi \right\rangle -\alpha {X}_{2}-2{\beta }_{1}{X}_{1}^{2}{X}_{2}+{\beta }_{2}{X}_{2}\left({X}_{1}^{2}-{X}_{2}^{2}\right)\\ -\gamma {X}_{2}\left({X}_{1}^{2}+{X}_{2}^{2}\right).$$
(10)
Time-dependent optical conductivity
Given the knowledge of the electronic wave function \(\begin{array}{c}| \psi (t)\rangle \end{array}\) under the action of an external field \(A(t)\), the temporal evolution of the current, defined as
$$\begin{array}{c}\left\langle J\left(t\right)\right\rangle=\left\langle \psi \left(t\right){|J}\left(t\right)|\psi (t)\right\rangle \end{array}$$
(11)
with
$$\begin{array}{c}J\left(t\right)=\frac{\delta H\left(t\right)}{\delta A\left(t\right)}=-{{{\rm{i}}}} {\sum}_{a,\sigma,i}{t}_{a}\left[{e}^{{{{\rm{i}}}}\frac{e}{{{\hbar }}}A\left(t\right)}{c}_{a,\sigma,i}^{{{\dagger}} }{c}_{a,\sigma,i+1}-{{{\rm{H}}}}.{{{\rm{c}}}}.\right]\end{array}$$
(12)
can be readily obtained. To calculate the optical conductivity for a nonequilibrium system induced by the pump pulse, we employ the pump-probe based method proposed in Ref. 39, where the temporal evolution of the system is traced twice in order to identify the response of the system with respect to the later probe pulse. The procedures are as follows. First, the time-evolution process induced by the pump pulse \(\begin{array}{c}{A}_{{{\rm{pump}}}}(t)\end{array}\) in the absence of probe pulse is evaluated, which describes the nonequilibrium development of the system, and we obtain the current \(\begin{array}{c}\langle {J}_{{{\rm{pump}}}}(t)\rangle \end{array}\). Second, in addition to the pump pulse, we also introduce a narrow probe pulse \(\begin{array}{c}{A}_{{{\rm{probe}}}}(t)\end{array}\) centered at time \(\begin{array}{c}{t}_{*}\end{array}\), which leads to the current \(\begin{array}{c}\langle {J}_{{{\rm{total}}}}(t)\rangle \end{array}\). The subtraction of \(\begin{array}{c}\langle {J}_{{{\rm{pump}}}}(t)\rangle \end{array}\) from \(\begin{array}{c}\langle {J}_{{{\rm{total}}}}(t)\rangle \end{array}\) produces the variation of the current due to the presence of probe pulse, i.e., \(\begin{array}{c}\langle {J}_{{{\rm{probe}}}}(t)\rangle \end{array}\), with which the time-dependent optical conductivity at time \(\begin{array}{c}{t}_{*}\end{array}\) can be calculated through
$$\begin{array}{c}\sigma (\omega )=\frac{{J}_{{{\rm{probe}}}}(\omega )}{{{{\rm{i}}}}(\omega+{\rm{i}}\eta )L{A}_{{{\rm{probe}}}}(\omega )},\end{array}$$
(13)
where \({J}_{{{\rm{probe}}}}(\omega )\) and \({A}_{{{\rm{probe}}}}(\omega )\) are the Fourier transformation of \(\begin{array}{c}\langle {J}_{{{\rm{probe}}}}(t)\rangle \end{array}\) and \(\begin{array}{c}{A}_{{{\rm{probe}}}}(t)\end{array}\), respectively. Numerically, a damping factor \(\begin{array}{c}{e}^{-\eta t}\end{array}\) is introduced in the Fourier transformations, and the same \(\eta\) is included in the denominator of the above equation.
Density functional theory simulations
The DFT calculations were carried out using a plane-wave basis. We use the projector-augmented wave method40, which is implemented in the Vienna Ab initio Simulation Package (VASP)41. The exchange and correlation were described using a tuned PBE0 hybrid functional42,43 with 7% Hartree-Fock exchange, same as ref. 33. For the vanadium pseudopotential, 13 electrons (3s23p63d34s2) were treated as valence electrons. For the oxygen pseudopotential, six electrons (2s22p4) were treated as valence electrons. The cutoff energy of the plane-wave was 400 eV. For structural optimizations, we used a Γ-centered 3 × 3 × 3 grid for the M1 and M0 phases (12 atoms per unit cell) and a 4 × 4 × 6 grid for the R phase (six atoms per unit cell). The electronic self-consistent calculations were converged to 10−4 eV between successive iterations, and the structural relaxations were converged to 10−3 eV between two successive ionic steps. The densities of states were calculated using a Γ-centered 7 × 9 × 7 grid for the M1 and M0 phase and a Γ-centered 9 × 9 × 15 grid for the R phase. The crystal structures of the M1, M0 and R phases are shown in Supplementary Fig. 10.