

Getting Coupled-Cluster Accuracy for the Price of MP2

September 22, 2025 · Anton Morgunov
Paper · Preprint · Code
TL;DR for Theoretical Chemists

A ΔMP2 calculation of the core-electron binding energy in a large basis set (or extrapolated to the CBS limit), corrected by the difference between the ΔCCSD and ΔMP2 energies in a small basis set, recovers ΔCC/CBS values to within 0.02 eV. A meta-lesson: when benchmarking new methods, CEBEs for ionizations of different elements should be analyzed separately to look for element-specific trends.

TL;DR for Chemists

You can calculate core-electron binding energies for 2nd row elements with accuracy matching that of the most expensive methods (within 0.10-0.15 eV of experimental values) at a significantly lower computational cost. Also, yet another example of Simpson's paradox in the wild.

TL;DR for the General Public

Imagine wanting a bespoke, custom-tailored suit (the most accurate quantum calculation) but only having the budget for an off-the-rack one (a cheaper method). Our work provides a set of precise, inexpensive tailoring instructions (a small correction) that makes the cheap suit fit almost identically to the bespoke one. This trick allows us to accurately model chemical systems that were previously too expensive to simulate.

Primary Contributions

Landscape & The Gap

You're probably familiar with UV-Vis spectroscopy, which measures electronic transitions between valence and virtual orbitals. In a similar fashion, X-ray Absorption Spectroscopy (XAS) reports excitations of electrons from core orbitals (e.g. 1s orbitals for 2nd row elements), and X-ray Photoelectron Spectroscopy (XPS) measures the energies (Core Electron Binding Energies, CEBEs) required to fully ionize those core orbitals. X-ray spectroscopy has several advantages:

  • electrons in core orbitals (unlike those in valence orbitals) are localized, so electronic transitions contain information about the local environments of individual atoms
  • CEBEs are not just element-specific, they're also sensitive to the electronic environment (just like chemical shifts in NMR spectroscopy). As a result, XPS can be used to infer oxidation states and coordination numbers of active centers in catalysis.

While downstream applications (e.g. ultrafast chemical dynamics) employ XAS more frequently, the ability to accurately assess the energy of a core orbital is required both for XAS (even if implicitly) and for XPS, making the latter the more foundational challenge.

Why can't we just use TDDFT?

Ejection of a core electron results in a significant redistribution of electron density, so traditional linear-response methods such as TDDFT exhibit errors of 10+ eV, whereas proper interpretation of experimental spectra requires errors below 0.2 eV. Errors can be reduced to 1-3 eV with the use of functionals specifically optimized for core spectroscopy (such optimizations often mean fitting the functional to experimental data, which, as you might imagine, is a slippery slope).

Alternatively, one can use coupled-cluster based methods (EOM-CC) within the Core-Valence Separation (CVS) approximation (CVS is needed to avoid calculating transitions from valence orbitals). Unfortunately, CVS-EOM-CCSD only brings the mean absolute error (MAE) down to 1.75 eV, and you need CVS-EOM-CCSDT in a quadruple-zeta basis to reduce it to 0.15 eV. CVS-EOM-CCSDTQ further reduces the MAE to 0.07 eV. Accurate, but incredibly expensive! (CCSD scales as $O(N^6)$, CCSDT as $O(N^8)$, and CCSDTQ as $O(N^{10})$, where $N$ is the number of basis functions, roughly $30n$ and $50n$ for triple- and quadruple-zeta basis sets, where $n$ is the number of atoms. And mind you, this is just the cost of a single CC iteration; you might need 10-100 to reach convergence.)
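To get a feel for those scalings, here's a back-of-the-envelope comparison in Python. The per-atom basis-function counts ($30n$, $50n$) come from the parenthetical above; the 10-atom molecule and the choice to normalize against a CCSD iteration are my own illustrative assumptions, and real costs have very different prefactors.

```python
# Toy cost comparison for a hypothetical 10-atom molecule.
# N ≈ 30n (triple-zeta) or 50n (quadruple-zeta) basis functions, per the text;
# prefactors are ignored, so only the relative growth is meaningful.
n_atoms = 10
scalings = {"CCSD": 6, "CCSDT": 8, "CCSDTQ": 10}

for basis, funcs_per_atom in [("triple-zeta", 30), ("quadruple-zeta", 50)]:
    N = funcs_per_atom * n_atoms
    reference = N ** 6  # one CCSD iteration in this basis as the unit of cost
    for method, power in scalings.items():
        print(f"{method:7s} in {basis}: ~{N**power / reference:>14,.0f}x "
              f"the cost of one CCSD iteration")
```

For 300 basis functions, a single CCSDTQ iteration already costs on the order of $300^4 \approx 10^{10}$ times a CCSD iteration; multiply by the number of iterations needed to converge.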

An entirely different approach is to explicitly optimize the wavefunction of the core-ionized state to properly account for orbital relaxation effects. The CEBE can then be calculated as the difference between the energies of the core-ionized and ground states. Remarkably, even ΔHF (Hartree-Fock is the cheapest and simplest method in quantum chemistry; it's almost trivially naive, assuming that the movements of electrons don't affect each other) calculates CEBEs to within 1 eV, and ΔMP2 (MP2 is a relatively cheap, non-iterative way of correcting HF) brings errors down to 0.5 eV.

Why not just massively parallelize CC?

The coupled-cluster method is considered the gold standard of computational chemistry. So yes, it's expensive, but why not just write some CUDA kernels and let GPUs go brrr?

Let's say you have a system with a core orbital $k$, valence orbitals $i, j$, and an empty (virtual) orbital $v_a$, with energies $\epsilon_k < \epsilon_i < \epsilon_j < \epsilon_a$. Say you want to find the energy after ionizing (removing one electron from) the orbital $k$. When you solve the coupled-cluster equations (within, say, CCSD), you'll have to calculate so-called double transitions $t_{ij}^{ak}$ (these have nothing to do with physically exciting the molecule; computing them is just part of the normal process of calculating the energy with CCSD) of the form:

$$t_{ij}^{ak} \propto \frac{1}{\epsilon_a + \epsilon_k - \epsilon_i - \epsilon_j}$$

Because $\epsilon_k < \epsilon_{i,j}$, a combination of energies may (and often does) exist such that the sum in the denominator is near zero, so $t_{ij}^{ak}$ explodes and the whole procedure diverges.
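Here's a tiny numerical caricature of that failure mode. The orbital energies below are hypothetical (in hartree); the point is just that as the virtual-orbital energy approaches $\epsilon_i + \epsilon_j - \epsilon_k$, the denominator passes through zero.

```python
# Made-up orbital energies (hartree) illustrating the divergence: with a deep
# core orbital k, a virtual orbital near eps_i + eps_j - eps_k = 18.5 makes
# the denominator eps_a + eps_k - eps_i - eps_j vanish.
eps_k = -20.0               # deep core orbital
eps_i, eps_j = -1.0, -0.5   # valence orbitals

for eps_a in [0.5, 5.0, 18.0, 18.49, 18.4999]:  # virtual orbital energies
    denom = eps_a + eps_k - eps_i - eps_j
    print(f"eps_a = {eps_a:8.4f}  ->  1/denom = {1.0 / denom:14.2f}")
```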

Zheng and Cheng (2019) showed that if you manually exclude such transitions and apply a few corrections, you can get accurate CEBE predictions. Arias-Martinez et al. (2022) proposed a few more systematic improvements and benchmarked the methods on 18 small organic molecules.

The Key Insight

To recap: we can get accurate CEBEs with $O(N^{10})$ methods that might suffer from convergence issues. Is there any chance we can get ΔCC-grade predictions from cheaper methods?

The answer is yes. If you extrapolate the ΔMP2 CEBEs to the complete basis set (CBS) limit (the CBS limit is the prediction a method would give with the true wavefunction, which is a linear combination of an infinite-dimensional basis; we can't work with infinite basis sets in practice, so we extrapolate from results in basis sets of different sizes) and add a (ΔCC - ΔMP2) correction evaluated in a small basis, you can quantitatively recover ΔCC energies in the CBS limit.
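Written out as arithmetic, the composite scheme is a one-liner. The numbers below are hypothetical placeholders, not values from the paper:

```python
# Composite recipe for one CEBE (all energies in eV; values are hypothetical):
mp2_cbs    = 290.80  # ΔMP2 extrapolated to the CBS limit
mp2_small  = 289.95  # ΔMP2 in a small basis
ccsd_small = 290.40  # ΔCCSD in the same small basis

correction = ccsd_small - mp2_small   # (ΔCC - ΔMP2) in the small basis
cebe = mp2_cbs + correction           # ≈ ΔCC/CBS, per the paper's scheme
print(f"composite CEBE ≈ {cebe:.2f} eV")
```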

Evidence and Impact

What This Figure Actually Shows

The y-axis is the absolute value of the difference between the predicted and experimental CEBE (smaller value means smaller error); the x-axis shows a few methods. The gold-standard ΔCCSD (extrapolated to the CBS limit, denoted by the ∞ symbol) scores an average (over 94 CEBEs) error of 0.123 eV (error bars show the standard deviations of the MAE, in this case roughly 0.15 eV). The CBS-extrapolated ΔMP2 scores 0.28 eV, but if you add the correction (our method, denoted by δ), it becomes practically equivalent to the CCSD predictions. δ(D) is evaluated in a small but still decent basis; 3-21G and STO-3G are laughably cheap.

Why This Is a Meaningful Improvement

Basically, instead of doing a ΔCC calculation in a large basis set, you do a ΔMP2 in a large basis and a ΔCC in a small one.

Method   Basis   Scaling                Practical Runtime
ΔMP2     small   $O(N^5)$, once         1 s
ΔMP2     big     $O(N^5)$, once         1 min
ΔCCSD    small   $O(N^6)$, iterative    30 s
ΔCCSD    big     $O(N^6)$, iterative    2.4 hrs

So instead of hours, you're done in 2 minutes.

Insight

Effectively, all of this rests on the observation that if you plot MP2 and CCSD energies as a function of basis set size, you get two curves that have the same shape but are vertically offset. This means the ΔCC - ΔMP2 difference is the same in the CBS limit, in a large basis, and in a small basis.
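A toy model makes the cancellation explicit. Here I assume both methods' basis-set errors decay as $A/X^3$ with the same coefficient (a common ansatz for correlation energies; all numbers are hypothetical):

```python
# Two methods with identical basis-set dependence (A / X^3) but a constant
# vertical offset: their difference is the same at every basis size X.
E_CBS_MP2, E_CBS_CC, A = 290.80, 291.10, 4.0  # hypothetical CEBEs (eV)

def mp2(X):  return E_CBS_MP2 + A / X**3
def ccsd(X): return E_CBS_CC  + A / X**3

for X in (2, 3, 4, 5):  # cardinal number of the basis set (2 = double-zeta)
    print(f"X={X}: ΔCCSD - ΔMP2 = {ccsd(X) - mp2(X):.3f} eV")
# The 0.300 eV offset is basis-independent, so evaluating it in the cheapest
# basis and adding it to ΔMP2/CBS recovers ΔCCSD/CBS in this idealized model.
```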

The paper has quite a few more interesting results on the nuances:

  • In theory, you perform the CBS extrapolation from calculations in 2-, 3-, 4-, 5-, and 6-zeta basis sets (see the extrapolation sketch after this list). The 5- and 6-zeta calculations are often prohibitively expensive, and without them, adding 2-zeta results often makes the extrapolations (on average) worse. However, for carbon-based core orbitals, the energies are 5x more accurate if you include the 2-zeta results (0.05 eV for ΔCCSD extrapolated from 2,3,4-zeta energies, as opposed to 0.26 eV extrapolated from 3,4-zeta). In other words, if you evaluate by looking at all CEBEs without controlling for the element on which the core orbital is ionized, you might miss element-specific trends, which is a classic example of Simpson's paradox. This is an artifact of how the basis sets are constructed, but it is nonetheless a practically important artifact, both if you want to predict CEBEs and if you want to benchmark a new method.
  • We investigate how the results change as you vary the size of the "small" basis and the size of the "large" basis (or which basis sets are included in the extrapolation). For example, even ΔMP2 in a 4-zeta basis corrected with a 2-zeta basis is pretty accurate!
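For reference, here is a sketch of a standard two-point inverse-cubic extrapolation of the kind commonly used for correlation energies. The exact extrapolation scheme used in the paper may differ, and the input values are hypothetical:

```python
# Two-point CBS extrapolation assuming E(X) = E_CBS + A / X^3,
# where X is the cardinal number of the basis set (3 = triple-zeta, ...).
def cbs_two_point(e_x: float, x: int, e_y: float, y: int) -> float:
    """Solve for E_CBS from energies in two basis sets with cardinals x < y."""
    return (y**3 * e_y - x**3 * e_x) / (y**3 - x**3)

# Hypothetical ΔMP2 CEBEs (eV) in triple- and quadruple-zeta bases:
print(f"E_CBS ≈ {cbs_two_point(290.55, 3, 290.70, 4):.3f} eV")
```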

Beyond the Paper: Meta-Lessons

All analysis must be automated

When I started bringing my first results (the errors of different methods, or some plots) to the weekly discussions, they were always taken as axiomatically correct. In other words, no one double-checked the accuracy of my calculations or plots; all discussion was predicated on the data being correct and centered on the implications of that data. While I appreciated the trust, given that this was my first theoretical project, I couldn't help but panic: what if I make a small mistake when collecting values from an output file (these are usually text files of at least 3k lines logging the progress of the calculation and all final results) and take the wrong number? What if I make some mistake when selecting values for plots or tables? This is especially concerning when your results are good: how do you prove it was an honest mistake, and not data manipulation?

I quickly decided on a solution: every single piece of data manipulation should be written as a script that ingests from the source (in this case, the output files) and ends up with a final table or figure to be used in the paper (or in any internal meetings). Now, obviously, you can still make a mistake in your script, but:

  • the mistake will be applied to all input values, which raises the probability of you noticing something is off
  • even if you never notice the mistake and someone finds it after you publish, it'll be very clear that it was a very subtle bug (otherwise you would have noticed it yourself) and an honest mistake (intentional data manipulation would require very obvious and illogical changes to the script's algorithm)

As a nice side bonus, this approach also significantly simplifies your research process.

  • Decided to add a few more data points? Recreating all tables and figures is just running a single .py file (e.g. perform_analysis.py in the CEBE repo).
  • Made a plot for ΔMP2 and ΔCC values but want to also add ΔHF? Assuming you wrote a generic enough parser, all you need is to add another element to a list. (This is actually the story of Fig. 2 in the paper: it was easy to add ΔHF, and it turns out it's surprisingly accurate for O-, N-, and F-based CEBEs. If adding ΔHF had required collecting all the energies manually, I probably would have hesitated, because it'd be reasonable to expect ΔHF to be wildly inaccurate. Another good lesson here: check whether your assumptions hold whenever you can!) And you'll get figures in exactly the same style as you had before, with no need to manually adjust font sizes, positioning, colors, etc.

Now, that "assuming" is doing a lot of heavy lifting: the magnitude of the benefits depends on the quality of the code you write, which might seem daunting; however, there's no better way to figure out how to do it than to actually start doing it. I think I refactored/rewrote my scripts for the CEBE project from scratch at least 3 times. And if I were to write them today, 18 months later, I'd do it completely differently. And that is great!
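To make the pattern concrete, here is a minimal sketch of the source-to-figure idea. Everything in it (the log format, the regex, the directory layout, the function names) is invented for illustration and is not the actual code from the CEBE repo; perform_analysis.py there is the real implementation.

```python
# Minimal "ingest from source, never copy numbers by hand" sketch.
# The FINAL ENERGY log format and outputs/ layout are hypothetical.
import re
from pathlib import Path

def parse_final_energy(output_file: Path) -> float:
    """Pull the final energy from a raw calculation log (hypothetical format)."""
    match = re.search(r"FINAL ENERGY:\s+(-?\d+\.\d+)", output_file.read_text())
    if match is None:
        raise ValueError(f"no final energy found in {output_file}")
    return float(match.group(1))

def collect(results_dir: str) -> dict:
    """Parse every output file so no value is ever transcribed manually."""
    return {path.stem: parse_final_energy(path)
            for path in sorted(Path(results_dir).glob("*.out"))}

if __name__ == "__main__":
    for name, energy in collect("outputs/").items():
        print(f"{name}: {energy:.6f}")
```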

Every figure should tell a clear story

During the preparation of the manuscript, Prof. Troy Van Voorhis gave me a great rule that I've tried to live by ever since:

Insight

Every single figure in the paper should convey a clear and concise idea. The standard: if you show it to anyone, they should be able to figure out the intended message without reading the paper.

Let's take Fig. 1 as an example.

Let me be bold and assume that the conclusions you draw are:

  • whatever the leftmost basis is, it's bad
  • the blue methods are a bit better than the purple method
  • both are significantly better than the green method, except for the leftmost basis, where the green bar is surprisingly accurate
  • the cyan and blue bars don't seem to differ much

This is pretty much exactly what a domain expert would conclude, except they would say ΔHF instead of "the green method" or ΔCCSD(T) instead of "the blue method".

This might all seem like common sense, but in practice, maximizing the clarity of a figure often means sacrificing details or nuances. For example, I initially intended to show the extrapolated ΔMP2 and ΔCC energies on the same figure, which would have had the benefit of showing how much the CBS extrapolation reduces the error, but would also have made the figure too loaded. As a result, the extrapolated values were moved to a separate Fig. 4.


Access & Citation

That's all for today. I hope I made you curious enough to check out the paper (or at least the figures).

  • Preprint
  • Paper
  • Code

The code repository above contains all the data and scripts needed to recreate all tables and figures from the paper.

I'm quite proud that other members of the Batista Group have followed suit and started to include figure reproduction scripts in their projects (e.g. CardioGenAI by Dr. Kyro, or Quantum to Classical Transfer Learning by Dr. Smaldone). If you're doing research, consider joining this little trend of ours.

Cite As

@article{mp2cebe,
author = {Morgunov, Anton and Tran, Henry K. and Meitei, Oinam Romesh and Chien, Yu-Che and Van Voorhis, Troy},
title = {MP2-Based Composite Extrapolation Schemes Can Predict Core-Ionization Energies for First-Row Elements with Coupled-Cluster Level Accuracy},
journal = {The Journal of Physical Chemistry A},
volume = {128},
number = {33},
pages = {6989-6998},
year = {2024},
doi = {10.1021/acs.jpca.4c01606},
}
