Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
4acf8ca
2025PKUCourseHW5: Case: 1 - Change rank_seed_offset to static const
Silver-Moon-Over-Snow Mar 28, 2026
c54aac5
Merge pull request #1 from Silver-Moon-Over-Snow/Silver-Moon-Over-Sno…
Silver-Moon-Over-Snow Mar 28, 2026
902a6a3
Merge branch 'deepmodeling:develop' into develop
Silver-Moon-Over-Snow Apr 7, 2026
42417ad
Merge branch 'deepmodeling:develop' into develop
Silver-Moon-Over-Snow Apr 8, 2026
7649894
Merge branch 'deepmodeling:develop' into develop
Silver-Moon-Over-Snow May 30, 2026
f93ba7c
feat: add DiagoPPCG solver (Projection Preconditioned Conjugate Gradi…
Silver-Moon-Over-Snow May 30, 2026
0496c6c
fix: stabilize DiagoPPCG - potrf retry, sygvd double-call, and orthon…
Silver-Moon-Over-Snow May 30, 2026
30276e4
fix: stabilize PPCG for nband>1 - Krylov fallback, M save/restore, CG…
Silver-Moon-Over-Snow May 30, 2026
f06ad8a
fix: use full complex inner product for gradient/vector projections
Silver-Moon-Over-Snow May 30, 2026
a4729c8
fix: remove extra chol_qr_active after update_one_block in BLOCK_SUBS…
Silver-Moon-Over-Snow May 31, 2026
49f70c2
revert: remove unintended 'static' from rank_seed_offset in sto_wf.cpp
Silver-Moon-Over-Snow May 31, 2026
3713712
fix: use real-only initial wavefunctions in PPCG unit test
Silver-Moon-Over-Snow May 31, 2026
c2a32db
fix: disable 3-block [psi,w,p] subspace to prevent M-matrix ill-condi…
Silver-Moon-Over-Snow Jun 3, 2026
8340fd7
fix: use rr_step=1 for CG to prevent Cholesky band mixing
Silver-Moon-Over-Snow Jun 3, 2026
126fdc8
fix: remove orth_cholesky before rayleigh_ritz in CG RR path
Silver-Moon-Over-Snow Jun 5, 2026
5c287c5
fix: add initial Rayleigh-Ritz to CG strategy
Silver-Moon-Over-Snow Jun 5, 2026
22ce9d2
temp: remove CG test, keep only BLOCK_SUBSPACE
Silver-Moon-Over-Snow Jun 5, 2026
c7bba69
chore: remove .claude/settings.json, add to .gitignore
Silver-Moon-Over-Snow Jun 5, 2026
f7a1ea0
fix: use exact quadratic root in line_minimize for CG strategy
Silver-Moon-Over-Snow Jun 5, 2026
187c1f0
fix: use subspace diagonalization for CG non-RR eigenvalues; re-enabl…
Silver-Moon-Over-Snow Jun 6, 2026
c1840ef
fix: revert BLOCK_SUBSPACE to use_p=false to avoid M singularity
Silver-Moon-Over-Snow Jun 6, 2026
765064a
Merge branch 'develop' into ppcg
Silver-Moon-Over-Snow Jun 6, 2026
30577b9
fix: normalize w/p to unit S-norm before building small subspace
Silver-Moon-Over-Snow Jun 6, 2026
2204715
fix: fall back to 2-block subspace when p is bad instead of Krylov Hw
Silver-Moon-Over-Snow Jun 6, 2026
5a428fc
test: expand PPCG unit tests with 6 test cases across diverse matrices
Silver-Moon-Over-Snow Jun 11, 2026
2bfe9c3
Merge branch 'develop' into ppcg
Silver-Moon-Over-Snow Jun 11, 2026
2047224
test: add 23 PPCG unit tests + 2 performance benchmarks
Silver-Moon-Over-Snow Jun 11, 2026
6ab66f1
Merge branch 'develop' into ppcg
Silver-Moon-Over-Snow Jun 16, 2026
c8e34da
Stabilize DiagoPPCG LAPACK fallback paths
Silver-Moon-Over-Snow Jun 16, 2026
23ed821
Trigger CI rerun
Silver-Moon-Over-Snow Jun 16, 2026
7fa4262
Remove local Claude ignore rule
Silver-Moon-Over-Snow Jun 17, 2026
2824e76
Fix PPCG Hermitian subspace LAPACK usage
Silver-Moon-Over-Snow Jun 17, 2026
4643b43
Make PPCG usable from PW solver
Silver-Moon-Over-Snow Jun 22, 2026
8dac3d7
Link PPCG into hsolver PW tests
Silver-Moon-Over-Snow Jun 22, 2026
698df04
Use complex Rayleigh-Ritz in PPCG line minimization
Silver-Moon-Over-Snow Jun 22, 2026
7518187
Preserve band identity in PPCG line minimization
Silver-Moon-Over-Snow Jun 23, 2026
fb4f10d
Restore stable PPCG line minimization
Silver-Moon-Over-Snow Jun 25, 2026
a9791f5
Allow complex PPCG line-search steps
Silver-Moon-Over-Snow Jun 25, 2026
be4034d
Add optional PPCG residual trace
Silver-Moon-Over-Snow Jun 25, 2026
5ca458d
Document PPCG PW integration
Silver-Moon-Over-Snow Jun 26, 2026
3e77c35
Unify PPCG LAPACK calls through ct::kernels layer
Silver-Moon-Over-Snow Jun 26, 2026
3aff00c
Link PPCG test with container kernels
Silver-Moon-Over-Snow Jun 26, 2026
79607ca
Avoid PPCG dependency on hegvd wrapper change
Silver-Moon-Over-Snow Jun 26, 2026
d6407cd
Make PPCG tests C++11 compatible
Silver-Moon-Over-Snow Jun 26, 2026
8377710
Revert PPCG ATen LAPACK refactor
Silver-Moon-Over-Snow Jun 26, 2026
38d287c
Preserve hsolver test CMake EOF style
Silver-Moon-Over-Snow Jul 1, 2026
ef80502
Refactor PPCG LAPACK calls through ATen kernels
Silver-Moon-Over-Snow Jul 2, 2026
f32fd11
Refactor PPCG block subspace solver
Silver-Moon-Over-Snow Jul 2, 2026
da1f555
Link EXX info in hsolver PW test
Silver-Moon-Over-Snow Jul 2, 2026
e305e77
Quiet PPCG quick benchmark output
Silver-Moon-Over-Snow Jul 2, 2026
d7627b2
Harden PPCG generalized eigensolver fallback
Silver-Moon-Over-Snow Jul 2, 2026
2c8bf32
Use Cholesky reduction for PPCG generalized solves
Silver-Moon-Over-Snow Jul 2, 2026
7f566b5
Use BLAS and pool reductions in PPCG projections
Silver-Moon-Over-Snow Jul 2, 2026
b75d9d8
Complete PPCG pool reductions
Silver-Moon-Over-Snow Jul 2, 2026
377fe68
Merge branch 'develop' into ppcg
Silver-Moon-Over-Snow Jul 3, 2026
7e2c8b5
Parallelize local PPCG vector operations
Silver-Moon-Over-Snow Jul 3, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions docs/advanced/input_files/input-main.md
Original file line number Diff line number Diff line change
Expand Up @@ -980,7 +980,7 @@
### pw_diag_thr

- **Type**: Real
- **Description**: Only used when you use ks_solver = cg/dav/dav_subspace/bpcg. It indicates the threshold for the first electronic iteration, from the second iteration the pw_diag_thr will be updated automatically. For nscf calculations with planewave basis set, pw_diag_thr should be <= 1e-3.
- **Description**: Only used when you use ks_solver = cg/dav/dav_subspace/bpcg/ppcg. It indicates the threshold for the first electronic iteration, from the second iteration the pw_diag_thr will be updated automatically. For nscf calculations with planewave basis set, pw_diag_thr should be <= 1e-3.
- **Default**: 0.01

### diago_smooth_ethr
Expand All @@ -999,8 +999,8 @@
### pw_diag_nmax

- **Type**: Integer
- **Availability**: *basis_type==pw, ks_solver==cg/dav/dav_subspace/bpcg*
- **Description**: Only useful when you use ks_solver = cg/dav/dav_subspace/bpcg. It indicates the maximal iteration number for cg/david/dav_subspace/bpcg method.
- **Availability**: *basis_type==pw, ks_solver==cg/dav/dav_subspace/bpcg/ppcg*
- **Description**: Only useful when you use ks_solver = cg/dav/dav_subspace/bpcg/ppcg. It indicates the maximal iteration number for cg/david/dav_subspace/bpcg/ppcg method.
- **Default**: 50

### pw_diag_ndim
Expand Down Expand Up @@ -1115,6 +1115,7 @@
- bpcg: The BPCG method, which is a block-parallel Conjugate Gradient (CG) method, typically exhibits higher acceleration in a GPU environment.
- dav: The Davidson algorithm.
- dav_subspace: The Davidson algorithm without orthogonalization operation, this method is the most recommended for efficiency. pw_diag_ndim can be set to 2 for this method.
- ppcg: The projection preconditioned conjugate-gradient method, currently available for CPU plane-wave calculations.

For numerical atomic orbitals basis,

Expand Down
2 changes: 1 addition & 1 deletion docs/advanced/scf/hsolver.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

Method of explicit solving KS-equation can be chosen by variable "ks_solver" in INPUT file.

When "basis_type = pw", `ks_solver` can be `cg`, `bpcg` or `dav`. The default setting `cg` is recommended, which is band-by-band conjugate gradient diagonalization method. There is a large probability that the use of setting of `dav` , which is block Davidson diagonalization method, can be tried to improve performance.
When "basis_type = pw", `ks_solver` can be `cg`, `bpcg`, `dav`, `dav_subspace`, or `ppcg`. The default setting `cg` is recommended, which is a band-by-band conjugate-gradient diagonalization method. The `dav` and `dav_subspace` settings use Davidson-style subspace diagonalization and can be tried to improve performance. The `ppcg` setting uses the projection preconditioned conjugate-gradient method and is currently available for CPU plane-wave calculations.

When "basis_type = lcao", `ks_solver` can be `genelpa` or `scalapack_gvx`. The default setting `genelpa` is recommended, which is based on ELPA (EIGENVALUE SOLVERS FOR PETAFLOP APPLICATIONS) (https://elpa.mpcdf.mpg.de/) and the kernel is auto choosed by GENELPA(https://github.com/pplab/GenELPA), usually faster than the setting of "scalapack_gvx", which is based on ScaLAPACK(Scalable Linear Algebra PACKage)

Expand Down
7 changes: 4 additions & 3 deletions docs/parameters.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -521,6 +521,7 @@ parameters:
* bpcg: The BPCG method, which is a block-parallel Conjugate Gradient (CG) method, typically exhibits higher acceleration in a GPU environment.
* dav: The Davidson algorithm.
* dav_subspace: The Davidson algorithm without orthogonalization operation, this method is the most recommended for efficiency. `pw_diag_ndim` can be set to 2 for this method.
* ppcg: The projection preconditioned conjugate-gradient method, currently available for CPU plane-wave calculations.

For numerical atomic orbitals basis,

Expand Down Expand Up @@ -942,7 +943,7 @@ parameters:
category: Plane wave related variables
type: Real
description: |
Only used when you use ks_solver = cg/dav/dav_subspace/bpcg. It indicates the threshold for the first electronic iteration, from the second iteration the pw_diag_thr will be updated automatically. For nscf calculations with planewave basis set, pw_diag_thr should be <= 1e-3.
Only used when you use ks_solver = cg/dav/dav_subspace/bpcg/ppcg. It indicates the threshold for the first electronic iteration, from the second iteration the pw_diag_thr will be updated automatically. For nscf calculations with planewave basis set, pw_diag_thr should be <= 1e-3.
default_value: "0.01"
unit: ""
availability: ""
Expand All @@ -966,10 +967,10 @@ parameters:
category: Plane wave related variables
type: Integer
description: |
Only useful when you use ks_solver = cg/dav/dav_subspace/bpcg. It indicates the maximal iteration number for cg/david/dav_subspace/bpcg method.
Only useful when you use ks_solver = cg/dav/dav_subspace/bpcg/ppcg. It indicates the maximal iteration number for cg/david/dav_subspace/bpcg/ppcg method.
default_value: "50"
unit: ""
availability: "basis_type==pw, ks_solver==cg/dav/dav_subspace/bpcg"
availability: "basis_type==pw, ks_solver==cg/dav/dav_subspace/bpcg/ppcg"
- name: pw_diag_ndim
category: Plane wave related variables
type: Integer
Expand Down
1 change: 1 addition & 0 deletions source/source_hsolver/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ list(APPEND objects
diago_david.cpp
diago_dav_subspace.cpp
diago_bpcg.cpp
diago_ppcg.cpp
para_linear_transform.cpp
hsolver_pw.cpp
hsolver_lcaopw.cpp
Expand Down
18 changes: 18 additions & 0 deletions source/source_hsolver/diago_ppcg.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#include "diago_ppcg.h"

#include "ppcg/diago_ppcg_lapack.hpp"
#include "ppcg/diago_ppcg_ops.hpp"
#include "ppcg/diago_ppcg_subspace.hpp"
#include "ppcg/diago_ppcg_orth.hpp"
#include "ppcg/diago_ppcg_cg.hpp"
#include "ppcg/diago_ppcg_diag.hpp"

namespace hsolver {

// =============================================================================
// Explicit template instantiation (CPU only; extend for GPU as needed)
// =============================================================================
template class DiagoPPCG<std::complex<float>, base_device::DEVICE_CPU>;
template class DiagoPPCG<std::complex<double>, base_device::DEVICE_CPU>;

} // namespace hsolver
228 changes: 228 additions & 0 deletions source/source_hsolver/diago_ppcg.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,228 @@
#ifndef DIAGO_PPCG_H
#define DIAGO_PPCG_H

#include "source_base/module_device/types.h"

#include <vector>
#include <functional>
#include <cmath>
#include <algorithm>
#include <limits>
#include <stdexcept>
#include <numeric>
#include <complex>
#include <type_traits>

namespace hsolver {

// -----------------------------------------------------------------------------
// DiagoPPCG: Projection Preconditioned Conjugate Gradient solver
// -----------------------------------------------------------------------------
//
// Supports two algorithmic strategies:
// CONJUGATE_GRADIENT — band-by-band Polak-Ribiere CG with line minimization
// (File 2 approach).
// BLOCK_SUBSPACE — block subspace diagonalization (File 1 approach).
//
// BLOCK_SUBSPACE is the production path used by ks_solver=ppcg.
// CONJUGATE_GRADIENT is kept as an explicit fallback strategy.
// -----------------------------------------------------------------------------

enum class PpcgStrategy { BLOCK_SUBSPACE, CONJUGATE_GRADIENT };

namespace base_device = ::base_device;

template <typename T, typename Device>
class DiagoPPCG
{
public:
// -------------------------------------------------------------------------
// Type aliases
// -------------------------------------------------------------------------
using Real = typename std::conditional<
std::is_same<T, std::complex<double>>::value, double,
float>::type;
using HPsiFunc = std::function<void(T*, T*, int, int)>;
using SPsiFunc = std::function<void(T*, T*, int, int)>;

// -------------------------------------------------------------------------
// Constructor
// -------------------------------------------------------------------------
DiagoPPCG(const Real& diag_thr,
const int& diag_iter_max,
const int& sbsize,
const int& rr_step,
const bool gamma_g0_real,
const PpcgStrategy strategy = PpcgStrategy::BLOCK_SUBSPACE);

// -------------------------------------------------------------------------
// Main entry point
//
// Returns average number of subspace iterations per band.
// -------------------------------------------------------------------------
double diag(const HPsiFunc& hpsi_func,
const SPsiFunc& spsi_func,
int ld_psi,
int nband,
int dim,
T* psi_in,
Real* eigenvalue_in,
const std::vector<double>& ethr_band,
const Real* prec);

private:
// -------------------------------------------------------------------------
// Data members
// -------------------------------------------------------------------------
int maxiter_;
int sbsize_;
int rr_step_;
Real diag_thr_;
bool gamma_g0_real_;
PpcgStrategy strategy_;

// Problem dimensions (set in diag())
int ld_psi_ = 0;
int n_band_ = 0;
int n_dim_ = 0;

// Cached S-operator (null if identity).
SPsiFunc spsi_func_;

// Working storage (column-major: ld_psi_ rows, n_band_ columns).
std::vector<T> hpsi_;
std::vector<T> spsi_;
std::vector<T> w_; // residual / preconditioned residual
std::vector<T> sw_; // S * w
std::vector<T> hw_; // H * w
std::vector<T> p_; // previous search direction (for block subspace)
std::vector<T> sp_; // S * p
std::vector<T> hp_; // H * p

// Polak-Ribiere state (CONJUGATE_GRADIENT strategy)
std::vector<T> grad_old_; // previous gradient
std::vector<T> z_old_; // previous preconditioned residual
std::vector<Real> beta_denom_;

// -------------------------------------------------------------------------
// Internal helpers
// -------------------------------------------------------------------------
static inline int idx(int row, int col, int ld)
{
return row + col * ld;
}

void validate_input(const T* psi_in, const Real* eigenvalue_in,
const std::vector<double>& ethr_band,
const Real* prec) const;

void force_g0_real(T* x, int ncol) const;

// S-application (identity fallback if spsi_func is null).
void apply_h(const HPsiFunc& hpsi_func, T* psi_in, T* hpsi_out,
int ncol) const;
void apply_s(const SPsiFunc& spsi_func, T* psi_in, T* spsi_out,
int ncol) const;
void apply_s_current(T* psi_in, T* spsi_out, int ncol) const;

// Inner product <x|y> (real part only).
Real gamma_dot(const T* x, const T* y) const;
T complex_dot(const T* x, const T* y) const;

// Gram matrix: out[i, j] = <a_i | b_j>.
void gram(const T* a, const T* b,
int ncol_a, int ncol_b,
std::vector<T>& out, int ld_out) const;

// Gather / scatter columns.
void copy_cols(const T* src, const std::vector<int>& cols,
std::vector<T>& dst) const;
void scatter_cols(T* dst, const std::vector<int>& cols,
const std::vector<T>& src) const;

// Project x onto vectors orthogonal to the S-orthonormal basis.
void project_against(const T* basis, const T* sbasis,
const std::vector<int>& basis_cols,
std::vector<T>& x, std::vector<T>& sx,
const std::vector<int>& x_cols) const;

// x[c] /= max(prec, eps) for each active column c.
void divide_by_preconditioner(const std::vector<int>& active_cols,
const Real* prec,
std::vector<T>& x) const;

// -------------------------------------------------------------------------
// Block-subspace strategy helpers (File 1 style)
// -------------------------------------------------------------------------
struct SmallSubspace
{
std::vector<T> k; // K matrix (projected H)
std::vector<T> m; // M matrix (projected S)
std::vector<Real> eval; // eigenvalues
std::vector<Real> w_scale;
std::vector<Real> p_scale;
};

void lock_epairs(const std::vector<T>& residual,
const std::vector<double>& ethr_band,
std::vector<int>& active_cols) const;

void build_small_subspace(const T* psi,
const std::vector<int>& cols,
bool use_p,
SmallSubspace& subspace) const;

void solve_small_generalized(int dim, SmallSubspace& subspace) const;

void update_one_block(T* psi,
const std::vector<int>& cols,
int l,
bool use_p,
const SmallSubspace& subspace);

void right_solve_upper(const std::vector<T>& r, int n,
std::vector<T>& x) const;

bool is_s_orthonormal(const T* psi, const T* spsi, int ncol) const;

void s_gram_schmidt(T* psi, T* hpsi, T* spsi, int ncol) const;

void chol_qr_active(T* psi, const std::vector<int>& active_cols);

void rayleigh_ritz(T* psi, Real* eigenvalue,
std::vector<int>& active_cols,
const std::vector<double>& ethr_band);

Real trace_of_active_projected(const T* psi,
const std::vector<int>& active_cols) const;

// -------------------------------------------------------------------------
// Conjugate-gradient strategy helpers (File 2 style)
// -------------------------------------------------------------------------
void calc_gradient(const Real* prec,
const T* hpsi,
const T* spsi,
const T* psi,
const Real* eigenvalue,
std::vector<T>& grad) const;

void orth_gradient(const T* psi, const T* spsi,
std::vector<T>& grad) const;

void update_polak_ribiere(const std::vector<T>& grad,
std::vector<T>& p,
std::vector<T>& grad_old,
std::vector<T>& z_old,
std::vector<Real>& beta_denom,
const Real* prec) const;

void line_minimize(T* psi, T* hpsi, T* spsi,
const T* p, const T* hp, const T* sp,
int ncol) const;

void orth_cholesky(T* psi, T* hpsi, T* spsi, int ncol) const;
};

} // namespace hsolver

#endif // DIAGO_PPCG_H
Loading
Loading