Category: crystal_structure_mining

  • More simple experiments with crystal data. The pyramidalisation of nitrogen.

    We are approaching 1 million recorded crystal structures (actually, around 716,000 in the CCDC and just over 300,000 in COD). One delight with having this wealth of information is the simple little explorations that can take just a minute or so to do. This one was sparked by my helping a colleague update a set of interactive lecture demos dealing with stereochemistry. Three of the examples included molecules where chirality originates in stereogenic centres with just three attached groups. An example might be a sulfoxide, for which the priority rule is to treat the lone pair as having atomic number zero. The issue then arises as to whether this centre is configurationally stable, i.e. does it invert in an umbrella motion slowly or quickly. My initial intention was to see if crystal structures could cast any light at all on this aspect.

    pyramidal
    Central atom has three bonded atoms as C, of which either all three must themselves have four attached atoms, or one can have just three attached atoms as shown above, along with acyclic character for the three bonds attached to the central atom, R ≤ 0.1, not disordered and no errors.

    Using the search definition above for R3N one gets the result below. It shows a hot spot for an angle subtended at the nitrogen of ~111°, indicating a pyramidal nitrogen. But how easily is that perturbed? (which is almost like asking how easily can it invert its configuration?).

    R3N, all sp3 attached carbons

    A perturbation can be applied by changing just one of the attached carbons as having three attached atoms of its own (sp2 hybridised). The response is that the hot spot moves to 120° (below). Of course now this includes compounds such as amides and the like. But we have learnt that it takes just one such attached sp2 hybridised carbon to planarize an adjacent nitrogen.

    R3N-1sp2-2sp3

    The control experiment will now be to apply the same test to a P. The hot spot moves from ~99° (P with three sp3 carbons attached) to ~103° (P with two sp3 and one sp2). This reminds us that the overlap and energy-match between a p-orbital on carbon to an adjacent p-orbital on nitrogen is good, whereas the same overlap/energy match to a p-orbital on P is significantly less so.
    R3P-sp3

    R3P-1sp2-2sp3

    One gets the same result when the central atom is S; the hotspot moves from ~102° to ~105°. Unfortunately, too few tri-substituted oxygen compounds are known to see how this element responds.

    R3S-sp3

    R3S-1sp2-2sp3
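    The hot-spot angles quoted above can be put on a common scale with a little arithmetic. What follows is only a sketch: it assumes that all three angles at the central atom equal the quoted hot-spot value (the searches plot one angle at a time, so this is an assumption), and the name `angle_deficit` is my own. A planar centre has angles summing to 360°, so its deficit is zero.

```python
# Hot-spot angles (degrees) quoted in the text for each search.
HOT_SPOTS = {
    "N, three sp3 C": 111.0,
    "N, one sp2 C": 120.0,
    "P, three sp3 C": 99.0,
    "P, one sp2 C": 103.0,
    "S, three sp3 C": 102.0,
    "S, one sp2 C": 105.0,
}


def angle_deficit(angle_deg: float) -> float:
    """Return 360 minus the sum of three angles, assumed equal (an assumption)."""
    return 360.0 - 3.0 * angle_deg


if __name__ == "__main__":
    for label, angle in HOT_SPOTS.items():
        print(f"{label:15s} angle {angle:5.1f}  deficit {angle_deficit(angle):5.1f}")
```

    On this crude measure the nitrogen deficit collapses to zero with one sp2 carbon attached, whereas P and S remain strongly pyramidal.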

    My point in illustrating these statistics is to show how much text-book chemistry can be recovered simply by a few quick explorations of crystal structures. One could even argue that much introductory chemistry could be taught by reference to the statistics of such structures.


    Acknowledgments

    This post has been cross-posted in PDF format at Authorea.

  • Amides and inverting the electronics of the Bürgi–Dunitz trajectory.

    The Bürgi–Dunitz angle describes the trajectory of an approaching nucleophile towards the carbon atom of a carbonyl group. A colleague recently came to my office to ask about the inverse, that is, at what angle would an electrophile approach an amide? It might approach either syn or anti with respect to the nitrogen, which is a feature not found with nucleophilic attack.

    amide

    My first thought was to calculate the wavefunction and identify the location and energy (= electrophilicity) of the lone pairs (the presumed attractor of an electrophile). But a better, more direct approach soon dawned: a search of the crystal structure database. Here is the search definition, with the C=O-E angle, the O-E distance and the N-C=O-E torsion defined (also specified for R factor < 5%, no errors and no disorder).

    search

    The first plot is of the torsion vs the distance, for E = H-X (X = O, F, Cl).

    amides

    1. The first observation is the prominent "hotspot" at a torsion of 180° and a (hydrogen bonding) distance of ~1.60-1.65Å. Amides, so it seems, prefer the electrophile (a proton) to approach anti to the nitrogen.
    2. There is a smaller hotspot at a torsion of 0° and a rather longer distance of ~1.8Å, corresponding to syn approach.
    3. And finally there is a barely discernible (but real) one at ~90°, corresponding to the proton attaching itself to the carbonyl π-bond.
    4. A plot of the angles involved reveals that the anti hotspot occurs at ~100° whilst the syn hotspot is at about 120°.
       amides-angles
    5. Replacing the proton as electrophile by any metal results in a distinct change.
       amides-angles1
       amides-angles2
    6. Syn approach now holds the (red) hotspot, and the angle opens up to ~135°, whilst the anti approach covers a wider angle range of 130-150°.
    7. A third hotspot region occurs for the 90° torsion, again due to metal-π-bond interactions.

    The above is a very general statistical survey. As with most bonding effects, one really should investigate every example to discover any perturbing circumstances or structural motifs that might distort the outcome. But for a ten minute exercise in response to a fascinating question from a colleague, it's not bad! And it certainly nicely inverts the usual Bürgi–Dunitz view of carbonyl groups.
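    The "hotspots" in such plots are simply the most populated cells of a two-dimensional histogram of torsion against distance. A minimal sketch of that binning is given here; the bin widths, the function name `hottest_bin` and the example points are my own inventions for illustration (the points are not database hits), and the database software will have its own way of doing this.

```python
from collections import Counter


def hottest_bin(points, torsion_step=10.0, distance_step=0.05):
    """Bin (torsion in degrees, distance in Å) pairs and return the fullest bin.

    Returns (torsion_bin_start, distance_bin_start, count).
    Torsions are folded to 0-180 by taking the absolute value.
    """
    last_torsion_bin = int(180.0 // torsion_step) - 1
    counts = Counter()
    for torsion, distance in points:
        # A torsion of exactly 180 is placed in the last bin rather than a new one.
        t_bin = min(int(abs(torsion) // torsion_step), last_torsion_bin)
        d_bin = int(distance // distance_step)
        counts[(t_bin, d_bin)] += 1
    (t_bin, d_bin), count = counts.most_common(1)[0]
    return t_bin * torsion_step, d_bin * distance_step, count


# Invented values for illustration only; they are not database hits.
EXAMPLE_POINTS = [
    (178.0, 1.62),
    (-176.0, 1.63),
    (179.0, 1.61),
    (2.0, 1.81),
    (91.0, 2.21),
]

if __name__ == "__main__":
    print(hottest_bin(EXAMPLE_POINTS))
```

    With the invented points above, the fullest cell is the one starting at a torsion of 170° and a distance of about 1.60Å, mimicking the anti hotspot.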

  • Trigonal bipyramidal or square pyramidal: Another ten minute exploration.

    This is rather cranking the handle, but taking my previous post and altering the search definition of the crystal structure database from 4- to 5-coordinate metals, one gets the following.

    Fe …
    Co …
    Ni …
    Cu …

    Trigonal bipyramidal coordination has angles of 90, 120 and 180°. Square pyramidal has no 120° angles, and the 180° angles might be somewhat reduced. Thus the Fe and Co series have plenty of 120, whereas the Ni and Cu series hardly any. The Ni series has many 160° values. It is clearly a serious issue that attempting any correlation with the spin states is going to be a lot of really hard work (I might next do another simple search where bond lengths can be shown to very closely correlate with low/medium/high spin states). I will not be trying a more finely grained analysis of the above plots; I just wanted to point out how very simple and quick they are to generate.
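    The criterion in the paragraph above (angles near 120° suggest trigonal bipyramidal, their absence suggests square pyramidal) can be written as a crude test on the ten L-M-L angles of a five-coordinate centre. This is a sketch only: the 10° tolerance, the "at least two angles" threshold and the function names are assumptions of mine, and the idealised angle sets are textbook values rather than search results.

```python
def count_near(angles, target, tol=10.0):
    """Count angles (degrees) lying within tol of target."""
    return sum(1 for a in angles if abs(a - target) <= tol)


def guess_five_coordinate(angles, tol=10.0):
    """Crude guess from the ten L-M-L angles of a five-coordinate centre.

    Follows the criterion in the text: angles near 120 degrees point to
    trigonal bipyramidal; their absence points to square pyramidal.
    """
    if count_near(angles, 120.0, tol) >= 2:
        return "trigonal bipyramidal"
    return "square pyramidal"


# Idealised angle sets, for illustration only.
IDEAL_TBP = [90.0] * 6 + [120.0] * 3 + [180.0]
IDEAL_SP = [90.0] * 8 + [180.0] * 2

if __name__ == "__main__":
    print(guess_five_coordinate(IDEAL_TBP))
    print(guess_five_coordinate(IDEAL_SP))
```

    Real structures lie between these two extremes, which is why the plots show spread-out clusters rather than points.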



  • Tetrahedral or square planar? A ten minute exploration.

    I love experiments where the insight-to-time-taken ratio is high. This one pertains to exploring the coordination chemistry of the transition metal region of the periodic table; specifically the tetra-coordination of the series headed by Mn-Ni. Is the geometry tetrahedral, square planar, or other? One can get a statistical answer in about ten minutes.
    Tet-SP.jpg

    The (CCDC database) search definition required is shown above. The central atom defines the column of the periodic table; it is specified to have precisely four other atoms bonded to it, which can be any other element. These four bonds are specified as acyclic (to avoid any bias introduced by rings), and two angles are defined, subtended at the central atom. And off we go, defining on the way that the hits must be refined to an R-factor of < 0.05, have no disorder, and no errors.

    Mn, (Tc), Re
    Fe, Ru, Os
    Co, Rh, Ir
    Ni, Pd, Pt

    Square planar coordination will manifest with pairs of angles of either 90° or 180°, whilst tetrahedral coordination will reveal only 109°.

    1. Both the Mn and the Fe series show a (red) hotspot at the tetrahedral value.
    2. The Co series shows a tetrahedral hot spot AND a somewhat less abundant square planar double-hot spot for the combination 90/180 and 180/90.
    3. The Ni series reveals the hottest spots to correspond to square planar, but with a significant tetrahedral cluster.
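    The reading of these plots rests on a simple rule for each pair of angles: both near 109° means tetrahedral, whilst each near 90° or 180° means square planar. A sketch of that rule follows; the 10° tolerance and the names `near` and `classify_pair` are my own choices for illustration, not anything taken from the search software.

```python
def near(angle, target, tol):
    """True if angle (degrees) lies within tol of target."""
    return abs(angle - target) <= tol


def classify_pair(angle1, angle2, tol=10.0):
    """Label a pair of angles (degrees) subtended at a four-coordinate centre."""
    if near(angle1, 109.5, tol) and near(angle2, 109.5, tol):
        return "tetrahedral"
    if all(near(a, 90.0, tol) or near(a, 180.0, tol) for a in (angle1, angle2)):
        return "square planar"
    return "other"


if __name__ == "__main__":
    for pair in [(109.0, 110.0), (90.0, 180.0), (178.0, 92.0), (130.0, 130.0)]:
        print(pair, classify_pair(*pair))
```

    Pairs that fall into "other" are the outliers worth inspecting individually.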

    This quick survey can be followed up by more detailed explorations of the clusters. For example, can one go to the literature and find out the typical spin state for e.g. the Ni series in each of the geometries? Unfortunately, the CCDC database does not record what the spin state of any individual compound is; one will have to go to the original literature to find out. What a shame that the linkage between two quite different properties is (as far as I know) not available in any easily searchable form. Alternatively, one can narrow down the searches to individual searches of row 1, 2 or 3 of the transition series and then compare the behaviour. The possibilities are considerable.

    Then there are the outliers in each plot. Some (many?) may prove to be due to faulty data (whilst we have specified no errors, they can still occur) but others may be due to an unusual structural feature, or perhaps even an as yet unrecognized phenomenon! Set as a student experiment, one might ask each student to explore say 3 outliers and express an opinion as to what causes them to deviate. Enjoy!



  • Artemisinin: are stereo-electronics at the core of its (re)activity?

    Around 100 tons of the potent antimalarial artemisinin is produced annually; a remarkable quantity given its very unusual and fragile looking molecular structure (below). When I looked at this, I was immediately struck by a thought: surely this is a classic molecule for analyzing stereoelectronic effects (anomeric and gauche). Here this aspect is explored.

    artemisinin

    I start by listing the bonds around which interesting things might happen:

    1. C3-C4 has the gauche motif of a 1,2-diol
    2. Carbons 7 and 4 are anomeric centres, with the focus on bonds 1-7/7-6 and 6-4/4-5
    3. Bond 1-2 has the potential for a so-called α-effect, where the lone pairs on adjacent hetero-atoms are buttressed.

    The crystal structure is shown below, annotated with pertinent bond lengths (trivial atom numbering). The dihedral 2-3-4-6 and 2-3-4-6 are respectively -51 and 72° (hence a double gauche at the 3-4 bond).

    Click for 3D

    First, an exploration of what might be happening around C4. The following is a search of the Cambridge crystal structure database, plotted for the two C-O bond lengths common to C4.

    artemisinin1

    Here, DIST1 is C4-O6 and DIST2 is C4-O5. Notice the very pronounced asymmetry; at the red hotspot above, the most frequent occurrence is ~1.39 and 1.46Å respectively; artemisinin is more or less at that hotspot. This can be quantified by the NBO E(2) energies for the interaction of an oxygen lone pair antiperiplanar to the C-O σ* bond:

    1. Lp(O6)-σ*(C4-O5) = 21.2 kcal/mol, which helps to account for the short C4-O6 and the long C4-O5 bonds.
    2. The reverse donation, Lp(O5)-σ*(C4-O6), is merely 4.8 kcal/mol (normally the two donations are more or less equal, and hence so are the two C-O bond lengths).
    3. At the second anomeric centre, C7, Lp(O1)-σ*(C7-O6) = 19.9 kcal/mol.
    4. The reverse donation there, Lp(O6)-σ*(C7-O1), is 5.7 kcal/mol, again highly asymmetric, as are the C-O bond lengths (1.413/1.441Å).
    5. Next, the gauche effect at C3-C4. The C4-H to C3-O2 donation is 6.4 kcal/mol, again contributing to the longer C-O length of 1.447Å.
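    The asymmetry at the two anomeric centres can be summarised as the difference between the two donations at each carbon. The few lines below simply do that subtraction on the E(2) values listed above; the labels "forward" and "reverse" and the function name are mine, chosen only for illustration.

```python
# NBO E(2) energies (kcal/mol) quoted in the list above.
ANOMERIC_CENTRES = {
    # Lp(O6)->sigma*(C4-O5) versus Lp(O5)->sigma*(C4-O6)
    "C4": {"forward": 21.2, "reverse": 4.8},
    # Lp(O1)->sigma*(C7-O6) versus Lp(O6)->sigma*(C7-O1)
    "C7": {"forward": 19.9, "reverse": 5.7},
}


def asymmetry(forward: float, reverse: float) -> float:
    """Difference between the two donations at one centre, in kcal/mol."""
    return forward - reverse


if __name__ == "__main__":
    for centre, e2 in ANOMERIC_CENTRES.items():
        print(f"{centre}: {asymmetry(e2['forward'], e2['reverse']):.1f} kcal/mol")
```

    For a symmetric anomeric centre this difference would be close to zero.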

    Where such stereoelectronic interactions are asymmetric, one might expect enhanced reactivity. A good example is a pair of stereoisomers of a 7-ring herbicide[cite]10.1039/P29890001929[/cite], where one anomer with equal anomeric C-O lengths is a stable soil-persistent species, whereas the other with asymmetric lengths has a very short soil residency due to rapid hydrolysis. It might be tempting to speculate that some aspect of the activity of artemisinin may be due to such stereoelectronic asymmetries.

    Finally, because it is virtually free to do so in a computational sense, I show the computed VCD spectrum[cite]10.6084/m9.figshare.997360[/cite] (covering the possibility that it is measured at some point). The calculated[cite]10.6084/m9.figshare.997463[/cite] optical rotation [α]589 is +93° (obs ~+76°). Whilst the absolute configuration is not in any doubt, it is always nice to have further confirmations.

    artemisinin


    This post has DOI: 10.14469/hpc/12766


  • The conformational preference of s-cis amides.

    Amides with an H-N group are a component of the peptide linkage (O=C-NH). Here I ask what the conformation (it could also be called a configuration) about the C-N bond is. A search of the following type can be defined:

    cis-amide

    The dihedral shown is for H-N-C=O (but this is equivalent to the C-C-N-C dihedral, which is also often called the dihedral angle associated with the peptide group). I have also added a distance, from a C-H to the carbonyl oxygen. Other search constraints include T ≤ 175K, R < 0.05, no disorder, no errors, that neither N-C bond is part of a ring and that the two carbons marked T4 both have four connected bonds. The search results in 619 hits (January 2013 version of the CCDC database), and these are displayed below.

    cis-amide-search-heat

    The horizontal axis reveals the highest concentration (red) at ~2.4Å due to a syn-co-planar alignment of the C-H bond with the plane of the C=O bond in the s-cis conformer (the significantly smaller hot-spot at ~3.9Å may be due to an anti-co-planar alignment of this C-H bond).

    s-cis-amide

    The vertical axis shows a clear preference for a dihedral of 179° (in fact no hits with a dihedral of less than 140° were found) and this can only arise from the s-cis conformation in which the H-N bond is oriented antiperiplanar to the axis of the C=O bond. This preference can be rationalised by filled/empty NBO-orbital interactions, which include:

    1. Antiperiplanar interaction between the N-H as donor and the C=O as a σ-acceptor (E(2) = 4.1 kcal/mol)
    2. Antiperiplanar interaction between the N-H as acceptor and C-H as donor (E(2) = 4.7 kcal/mol)
    H-N/C=O. Click for 3D.
    Click for 3D.

    This latter overlap conspires to bring the C-H hydrogen close to the oxygen (~2.35Å, DIST1 in the diagram above). So one might be entitled to ask: is this a hydrogen bond? There are (at least) two ways of testing this.

    1. The NBO E(2) interaction energy between the oxygen in-plane lone pair and the H-C as acceptor is 0.8 kcal/mol. For hydrogen bonds, such E(2) energies more or less resemble the actual H-bond strengths, i.e. a strong H-bond has an E(2) energy of ~ 8 kcal/mol; and a medium O…H-C hydrogen bond weighs in at around 3 kcal/mol.  So this one is very weak. This is due to poor overlap resulting from the small ring size (5).
    2. The NCI (non-covalent-interaction) surface does reveal a feature in the CH…O region, but the colour coding (which indicates how attractive/repulsive this is) is both pale blue (attractive) and yellow (repulsive). Again this is only consistent with a very weak overall H-bond.
    NCI surface. Click for 3D.

    I end by reminding that the s-cis H-N-C=O conformation is a very common feature in peptides (the CCDC database comprises mostly small molecules, not larger peptides and proteins) arising from really quite subtle orbital interactions.



  • The conformation of acetaldehyde: a simple molecule, a complex explanation?

    Consider acetaldehyde (ethanal for progressive nomenclaturists). What conformation does it adopt, and why? This question was posed of me by a student at the end of a recent lecture of mine. Surely, an easy answer to give? Read on …

    acetaldehyde

    There really are only two possibilities, the syn and anti. Well, I have discovered it is useful to start with a search of the Cambridge database. With R=H or C, X unspecified, acyclic and T ≤ 175K, two searches were performed. The first identified the torsion around O=C-C-H. This clearly shows a maximum at 120° (with twice the probability), and a smaller one at 0°. This matches syn; the anti conformation above would be expected to have peaks at 60° and 180°; the latter in particular is singularly missing.

    acetaldehyde-180
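    The expected peak positions follow from simple geometry: the three methyl hydrogens sit about 120° apart around the C-C bond, and the search reports torsions folded into the range 0-180°. A short sketch makes the 2:1 weighting explicit; it assumes an ideal 120° spacing, and the function name `folded_torsions` is my own.

```python
def folded_torsions(first_h_torsion: float) -> list:
    """Return the three O=C-C-H torsions folded into 0-180 degrees.

    The three methyl hydrogens are assumed to sit exactly 120 degrees apart.
    """
    folded = []
    for k in range(3):
        t = (first_h_torsion + 120.0 * k) % 360.0
        folded.append(360.0 - t if t > 180.0 else t)
    return sorted(folded)


if __name__ == "__main__":
    print("syn :", folded_torsions(0.0))    # one C-H eclipsing the C=O
    print("anti:", folded_torsions(180.0))  # one C-H anti to the C=O
```

    The syn form gives one torsion at 0° and two at 120°, whilst the anti form gives two at 60° and one at 180°, which is the pattern being compared with the search result.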

    An alternative search is to define the distance between the oxygen and the H. For the syn conformer, distances of ~2.5 and 3.1Å are expected; for the anti conformer, 2.7 and 3.3Å. Again, syn matches better. Remember, searches based on the position of a hydrogen are less reliable than most, so these distributions provide only a statistical indication.

    acetaldehyde-dist

    Now for a ωB97XD/6-311G(d,p) calculation of the rotational barrier. The minima occur at torsions of 0, 120 and 240°, matching syn, although the barrier is very low.

    acet-rot

    Now to try to find explanations. The standard one finds this in three effects:

    1. Donation from two C-H bonds (R=H above) into the π*C=O NBO orbital (in the manner that was used to explain the cis-orientation of the two methyl groups in cis-butene). 
    2. Donation from the single co-planar C-H bond into the σ*C=O NBO orbital (blue bonds above)
    3. Pauli bond-bond repulsions between two filled NBOs. 

    Effect 1 has an NBO perturbation energy E(2) of 7.0 kcal/mol for the syn conformer and 6.45 for the anti. The explanation is that the π*C=O NBO "leans outward", overlapping better with the C-H bonds in the syn than in the anti. One up to the syn! Effect 2 has values of 1.3 for the syn and 4.1 for the anti. The latter now has the edge. But wait, there are other (smaller) interactions. The syn has an antiperiplanar orientation of the two C-H bonds shown above (X=H, red), E(2) = 3.3 vs 0.6 for the corresponding syn-planar orientation in the anti-conformation. It's now a tie; neck-and-neck.

    Effect three suggests that the disjoint NLMO steric exchange energy is 54.34 for the anti and 53.88 (i.e. lower) for the syn. It is vaguely disappointing that no absolutely clear-cut explanation emerges. But then the difference (in total free energy) is only 1.4 kcal/mol. But even this small difference in energy can manifest in fairly clear-cut conformational preferences obtained from crystal structures. Ultimately of course, all effects in chemistry are reducible to the sum of lots of small effects (in other words unpredictable until one does the sum). 
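    To put the 1.4 kcal/mol into perspective, one can convert it into a population using a two-state Boltzmann estimate. This is a rough sketch only: it assumes the syn form is the lower of the two (as the searches and the computed minima suggest), ignores any degeneracy, and takes 298.15 K as an assumed temperature; the function name is mine.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def lower_state_fraction(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Two-state Boltzmann fraction of the lower-energy conformer."""
    ratio = math.exp(-delta_g_kcal / (R_KCAL * temperature_k))
    return 1.0 / (1.0 + ratio)


if __name__ == "__main__":
    print(f"{lower_state_fraction(1.4):.2f}")
```

    At the assumed temperature this gives a fraction of about 0.91 for the favoured form, and the fraction rises further at the lower temperatures used for the crystal searches.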

    I cannot end without mentioning the largest of all the NBO interactions, namely the in-plane lone pair on the oxygen as donor and the aldehyde proton C-H as acceptor (X=H). This has values of 29.3 for syn and 28.8 kcal/mol for anti. This manifests (inter alia) in a greatly reduced C-H vibrational wavenumber (ν 2982 for syn, 2900 cm-1 for anti) compared to the methyl C-H values (~3043-3164).

    So this tiny little molecule ended up a little less obvious than might have seemed at the outset. One can find interesting things in even the tiniest of things! 


    HC…C-H alignment. Click for 3D.
    O=C*…C-H alignment. Click for 3D.


  • σ-π-Conjugation: seeking evidence by a survey of crystal structures.

    The electronic interaction between a single bond and an adjacent double bond is often called σ-π-conjugation (an older term for this is hyperconjugation), and the effect is often used to e.g. explain why more highly substituted carbocations are more stable than less substituted ones. This conjugation is more subtle in neutral molecules, but following my use of crystal structures to explore the so-called gauche effect (which originates from σ-σ-conjugation), I thought I would have a go here at seeing what the crystallographic evidence actually is for the σ-π-type.

    sigma-pi-conjugation

    The basic two molecules are shown above; in effect propene 1 and butene 2. The latter was in fact the topic of another post, in which I attempted to show that the close H…H contact in cis-butene (2.1Å) was in effect an unwelcome consequence of the σ-π-conjugation of any of the four "outward leaning" C-H bonds of the methyl groups acting as donors (red-blue below) overlapping with the similarly "outward leaning" π* orbital of the alkene (purple-orange below; blue and purple overlap positively).

    NBO orbitals for C-H/alkene interaction. Click for 3D.

    So how general might this be? To find out, I performed the following search on the Cambridge crystal database:

    cis-butene-search

    1. The search defines an alkene, bearing two cis-substituents each with at least one C-H bond. The substituents are both sp3 carbon, and the attachment bond to the alkene is defined as acyclic
    2. The H…H distance uses normalised terminal hydrogen positions (to try to correct for the normally over-short C-H bond lengths found by X-ray).
    3. Other constraints were R factor < 0.05, no disorder, no errors and (perhaps most importantly) T < 150K to try to reduce thermal libration.
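    The constraints listed above amount to a filter applied to each candidate structure. As a sketch of the logic only (the real search is done inside the database software), here it is applied to a few invented records; the `Hit` type, its field names and the example entries are all hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One search hit; every field name here is hypothetical."""
    refcode: str
    r_factor: float
    temperature_k: float
    disordered: bool
    has_errors: bool
    hh_distance: float  # normalised H…H distance in Å


def passes_filters(hit: Hit, max_r: float = 0.05, max_t: float = 150.0) -> bool:
    """Apply the listed constraints: R < 0.05, T < 150 K, no disorder, no errors."""
    return (
        hit.r_factor < max_r
        and hit.temperature_k < max_t
        and not hit.disordered
        and not hit.has_errors
    )


# Invented records for illustration; they are not real database entries.
EXAMPLE_HITS = [
    Hit("EXAMPLE01", 0.041, 120.0, False, False, 2.10),
    Hit("EXAMPLE02", 0.062, 100.0, False, False, 2.15),
    Hit("EXAMPLE03", 0.035, 293.0, False, False, 2.40),
    Hit("EXAMPLE04", 0.030, 110.0, True, False, 2.05),
]

kept = [h for h in EXAMPLE_HITS if passes_filters(h)]

if __name__ == "__main__":
    print([h.refcode for h in kept])
```

    Only the first invented record survives: the others fail on R factor, temperature and disorder respectively.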

    I should qualify all of this by reminding that hydrogen positions in crystal structures are notoriously prone to errors. Nevertheless, with 624 hits using the above search, one might hope for statistical significance of a real effect.

    Search result for close H…H contacts in cis-butenes.

    For this sample, the most frequent H…H distance emerged as 2.1Å. This can only result from having the C-H bonds lie coplanar with the C=C alkene, as is shown above. The value is also remarkably close to the H…H distance for cis-butene itself (both computationally and as determined using electron diffraction). This does I feel provide a strong indication that σ-π-conjugation is manifesting in these systems.

    Re-defining the search for propenes 1 as above gives 1656 hits, with a maximum in the distribution at 2.35Å corresponding to a syn-orientation of the C=C and the C-H bonds. The smaller maximum at about 2.75Å arises from a gauche-orientation between the C=C and C-H (in effect you have to halve this number, since there are twice as many possibilities for this to occur than for the syn). The "inward leaning" gauche C-H bond overlaps less well with the "outward leaning" π* orbital of the alkene.

    Search result for close H…H contacts in propenes.

    These aspects are perhaps better seen in the orbital overlaps shown below.

    Click for 3D.

    I will follow-up this theme with esters and amides next.



  • The gauche effect: seeking evidence by a survey of crystal structures.

    I previously blogged about anomeric effects involving π electrons as donors, and my post on the conformation of 1,2-difluoroethane turned out to be one of the most popular. Here I thought I would present the results of searching the Cambridge crystal database for examples of the gauche effect. The basic search is defined below.

    CCDC-search

    Here, we define a four-atom torsion (TOR1), the two central carbon atoms having two groups R which can be only H or C. These two carbons are also defined as acyclic. The restrictions of the search as defined above also include R-factor < 0.05, not disordered and no errors. These combine to reduce the number of hits significantly (although not dissimilar distributions are obtained for less restricted searches). Each search takes only a few seconds, and one can rattle through many permutations very quickly.
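    For anyone wanting to reproduce a torsion such as TOR1 from raw coordinates, the standard vector formula is short. The sketch below computes a signed four-atom torsion from Cartesian positions and attaches a coarse label; the function names and the 30/90/150° label boundaries are my own illustrative choices, not part of the search definition.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, from Cartesian coordinates."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals to the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def label_torsion(torsion):
    """Coarse label; the 30/90/150 degree boundaries are illustrative choices."""
    t = abs(torsion)
    if 30.0 <= t <= 90.0:
        return "gauche"
    if t >= 150.0:
        return "anti"
    return "other"


if __name__ == "__main__":
    # Idealised coordinates with a 60 degree twist about the central bond.
    gauche = torsion_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (0.5, 1, math.sqrt(3) / 2))
    print(round(abs(gauche), 1), label_torsion(gauche))
```

    The sign of the torsion depends on the handedness of the twist, so only its magnitude is used for the label.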

    So here come the results. First, QA=4M=F. All but one of the examples has a torsion in the region of 60°, the classic gauche effect!

    F-C-C-F

    Next, QA=O, 4M=F. Rather more hits, and the effect is almost as clear-cut. I should point out that the apparent "exceptions" to the gauche conformation may arise from structural restrictions, and each really would have to be inspected individually for the reasons (which I do not attempt here). 

    OCCF

    With QA=4M=O,  one has many more instances. The effect is pretty convincing (it may be that hydrogen bonding may also control the conformation).

    O-C-C-O

    Now for QA=4M=Cl. The distribution is slanted more to the anti conformation, but there are still quite a few gauche.

    Cl-CC-Cl

    With QA=4M=S, the conformations are now almost all anti; the gauche effect is no more! 

    S-C-C-S

    And for QA=4M=Br, it has also almost vanished (there is only one instance for I, and that too is antiperiplanar).

    Br-C-C-Br

    I now return to an earlier post in which I speculated that a cyano group might participate in the anomeric effect. Well here it is in the gauche effect; QA=CN, 4M = any of N,O,F,Cl,S. Quite a few gauche orientations for this pseudo-halogen!

    Neg-C-C-CN

    Another group that can act as a powerful acceptor of electrons from a donor is QA=N(Me)3+. With 4M = N, O, F, Cl, the population of gauche conformers is large. QA=CF3 is a similar group.

    Neg-C-C-NMe3
    Neg-C-C-CF3

    One can envisage other combinations. Thus QA= C=C, 4M = any of  N, O, F, Cl. An alkene seems one of the more powerful gauche effect participants!

    alkene-C-C-Neg

    And alkynes, perhaps slightly less so.

    Alkyne-C-C-Neg

    What about metals (QA = any metal, 4M = any of N, O, F, Cl, S). Well, not particularly biased either way, but clearly one in which the identity of the metal may matter.

    Metal-C-C-electronegative

    I should end by inverting the model. If QA is electropositive (any group to the left of carbon, or below it in the periodic table) and 4M is electronegative, then they align almost exclusively anti-periplanar and not gauche. But notice how relatively few examples there are. Synthetic chemists, please make more such molecules!

    Electropositive-C-C-Electronegative

    If you thought the gauche effect was restricted to just a few molecules, think again!

