User Tools

Site Tools


crawdad:programming:project11

An "Out of Core" SCF Procedure

By far the greatest memory hog in the SCF procedure is the storage of the two-electron repulsion integrals. In earlier SCF programming projects, we have chosen to read these integrals from a file on disk and store them in a one-dimensional array. However, even if we take full advantage of the inherent eight-fold permutational symmetry of the Mulliken-notation integrals, we encounter a memory bottleneck even for relatively small basis sets. The table below summarizes the required size of the one-dimensional integral array for different basis sets for a water molecule (ignoring spatial symmetry):

Basis Set # Basis Functions Memory (MB)
STO-3G 7 0.003248
cc-pVDZ 24 0.3612
aug-cc-pVDZ 41 2.969
cc-pVTZ 58 11.72
aug-cc-pVTZ 92 73.22
cc-pVQZ 115 118.0
aug-cc-pVQZ 172 885.5
cc-pV5Z 201 1649
aug-cc-pV5Z 287 6832

Thus, even a relatively small number of basis functions outstrips the typical physical memory available on workstations or supercomputer nodes.

The purpose of this project is to consider and implement an “out-of-core” SCF algorithm, that is, an approach that minimizes the core-memory requirements of the program by reading the two-electron repulsion integrals from disk in batches when only they are needed.1)

The Fock Matrix Build

At the heart of the SCF procedure is the expensive Fock-matrix term:

$$F_{ij} = H^{\rm core}_{ij} + \sum_{kl}^{\rm AO} D_{kl} \left[ 2 (ij|kl) - (ik|jl) \right],
$$

where we use i, j, k, and l to denote AO-basis indices. As described in Project #3, a simple algorithm for evaluating this matrix is:

for(i=0; i < nao; i++)
  for(j=0; j < nao; j++) {
    F[i][j] = H[i][j];
    for(k=0; k < nao; k++)
      for(l=0; l < nao; l++) {
        ij = INDEX(i,j);
        kl = INDEX(k,l);
        ijkl = INDEX(ij,kl);
        ik = INDEX(i,k);
        jl = INDEX(j,l);
        ikjl = INDEX(ik,jl);
 
        F[i][j] += D[k][l] * (2.0 * TEI[ijkl] - TEI[ikjl]);
     }
  }

This algorithm hinges on the fact that all the two-electron integrals are immediately available in the TEI array (which takes advantage of permutational symmetry). In Project #7 you made use of the following code, which reads the integrals into the TEI array in batches (sometimes referred to in PSI as “buffers”):

  iwl_buf_init(&InBuf, PSIF_SO_TEI, 1e-14, 1, 0);
  do {
    iwl_buf_fetch(&InBuf);
    lblptr = InBuf.labels;
    valptr = InBuf.values;
    lastbuf = InBuf.lastbuf;
    for(idx=4*InBuf.idx; InBuf.idx < InBuf.inbuf; InBuf.idx++) {
      i = abs((int) lblptr[idx++]);
      j = (int) lblptr[idx++];
      k = (int) lblptr[idx++];
      l = (int) lblptr[idx++];
      ij = INDEX(i,j);
      kl = INDEX(k,l);
      ijkl = INDEX(ij,kl);
      TEI[ijkl] = (double) valptr[InBuf.idx];
    }
  } while (!lastbuf);
  iwl_buf_close(&InBuf, 1);

The basic idea behind an out-of-core Fock-build code is to read the two-electron integrals in batches and make use of each batch immediately before moving to the next batch. Thus, the above code would be moved into the SCF iterative procedure and modified appropriately to compute the contribution of each individual integral to the Fock matrix:

  iwl_buf_init(&InBuf, PSIF_SO_TEI, 1e-14, 1, 0);
  do {
    iwl_buf_fetch(&InBuf);
    lblptr = InBuf.labels;
    valptr = InBuf.values;
    lastbuf = InBuf.lastbuf;
    for(idx=4*InBuf.idx; InBuf.idx < InBuf.inbuf; InBuf.idx++) {
      i = abs((int) lblptr[idx++]);
      j = (int) lblptr[idx++];
      k = (int) lblptr[idx++];
      l = (int) lblptr[idx++];
 
      integral = (double) valptr[InBuf.idx];
 
      /* Compute the contribution of this integral to the Fock matrix */
 
    }
  } while (!lastbuf);
  iwl_buf_close(&InBuf, 1);

Thus a given integral, (ij|kl), would contribute to at least two Fock matrix elements as:

$$F_{ij} \leftarrow +2 * D_{kl} (ij|kl)
$$

and

$$F_{ik} \leftarrow - D_{jl} (ij|kl).
$$

Handling Permutational Symmetry

The most difficult aspect of the out-of-core algorithm is the fact that file contains only the permutationally unique integrals,, such that:

$$i \geq j,\ \ \ \ \ k \geq l,\ \ \ \ \ \ {\rm and}\ \ \ \ \ \ ij \geq kl,
$$

where

$$ij \equiv i(i+1)/2 + j\ \ \ \ \ \ {\rm and} \ \ \ \ \ \ kl \equiv k(k+1)/2 + l.
$$

Thus, when determining the contribution of a given integral to various elements of the Fock matrix, one must consider all possible unique permutations of the indices, i, j, k, and l. Note, however, that coincidences among the indices can limit the number of possibilities. For example, if one encountered the integral (22|11), it would contribute to a total of four Fock matrix elements, viz.

$$F_{22} \leftarrow 2 D_{11} (22|11),\ \ \ F_{21} \leftarrow - D_{21} (22|11),\ \ \ F_{11} \leftarrow 2 D_{22} (22|11),\ \ \ {\rm and}\ \ \ F_{12} \leftarrow -D_{12} (22|11).
$$

All such cases must be included in the algorithm to obtain a correct Fock matrix.

Additional Reading

  • J. Almlof, K. Faegri, and K. Korsell, “Principles for a Direct SCF Approach to LCAO-MO Ab Initio Calculations,” J. Comp. Chem. 3, 385-399 (1982).
1) An alternative approach is the so-called “direct” SCF, in which the two-electron integrals are re-computed in each SCF iteration rather than stored on disk.
crawdad/programming/project11.txt · Last modified: 2011/07/05 21:40 by crawdad