Open Access

Semi-algebraic geometry of common lines

Research in the Mathematical Sciences 2014, 1:14

DOI: 10.1186/s40687-014-0014-5

Received: 11 April 2014

Accepted: 8 October 2014

Published: 2 December 2014

Abstract

Purpose

Cryo-electron microscopy is a technique in structural biology for determining the 3D structure of macromolecules. A key step in this process is detecting common lines of intersection between unknown embedded image planes. We wish to characterize such common lines in terms of the unembedded geometric data detected in experiments.

Methods

We use techniques from spherical geometry, real algebraic geometry, and linear algebra.

Results

We show that common lines are the solutions to a system of polynomial equalities and inequalities, i.e., they form a semi-algebraic set. These polynomials are low degree, and we explicitly derive them in this paper.

Conclusions

The polynomials we derive provide the desired intrinsic characterization of common lines. We discuss possible applications of these polynomials to reconstruction algorithms that are robust to the high levels of noise present in cryo-electron images.

Keywords

Cryo-EM; Common lines; Semi-algebraic geometry

Background

Cryo-electron microscopy (cryo-EM) is a technique used to discover the structure of small molecules, usually proteins in the context of structural biology research [1].

A basic outline of cryo-EM is presented in Figure 1. First, a sample is prepared by freezing many different copies of the molecule in a thin layer of ice. A stream of electrons then passes through the sample and is detected by cameras that produce $N$ noisy 2D cryo-EM images $I_1,\dots,I_N$. The primary goal is to reconstruct the 3D structure of the molecule from the 2D images that are acquired. For a more detailed overview, see [2], Section 1.
Figure 1

Cryo-EM obtains a 3D structure from noisy 2D images $I_1,\dots,I_N$.

Problem 1 (Reconstruction problem: structural biology).

Given $N$ two-dimensional experimental cryo-EM images $I_1,\dots,I_N$, reconstruct a three-dimensional model of the original molecule.

Mathematical model

We briefly describe the mathematical model for cryo-EM, following [3], Section 0. We work in the three-dimensional space $\mathbb{R}^3$ equipped with the usual inner product. The molecule is modeled by a function $\phi : \mathbb{R}^3 \to \mathbb{R}$ that represents its electronic density at various spatial locations (Figure 2a). An actual cryo-EM experiment obtains a single image of many copies of the molecule, but we instead assume that each image is a picture of the same molecule from different microscope orientations (Figure 2b). To model a microscope orientation, we use the following concept:
Figure 2

Cryo-EM mathematical model.

Definition 1.

A frame $F$ for $\mathbb{R}^3$ is an ordered orthonormal basis $(a,b,c)$ such that the determinant of the matrix $[a\ b\ c]$ is $+1$ or, equivalently, such that $c = a \times b$, where $\times$ is the standard cross product on $\mathbb{R}^3$.

Remark 1.

A frame $F$ for $\mathbb{R}^3$ is uniquely determined by the vectors $(a,b)$. For the rest of the paper, we identify frames $(a,b,c)$ with pairs of orthonormal vectors $(a,b)$.
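To make Remark 1 concrete, the following minimal Python sketch completes an orthonormal pair (a, b) to a frame by setting c = a × b and checks that det[a b c] = +1. The helper names (`cross`, `det3`, `complete_frame`), the tolerance, and the sample vectors are illustrative assumptions, not part of the paper.

```python
import math

def cross(u, v):
    """Standard cross product on R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Usual inner product."""
    return sum(x * y for x, y in zip(u, v))

def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c (scalar triple product)."""
    return dot(cross(a, b), c)

def complete_frame(a, b, tol=1e-12):
    """Return the frame (a, b, c) with c = a x b; tol is an assumed tolerance."""
    if abs(dot(a, a) - 1) > tol or abs(dot(b, b) - 1) > tol or abs(dot(a, b)) > tol:
        raise ValueError("a and b must be orthonormal")
    return a, b, cross(a, b)

# Hypothetical example: the standard pair rotated by 0.3 rad about the z-axis.
t = 0.3
a = (math.cos(t), math.sin(t), 0.0)
b = (-math.sin(t), math.cos(t), 0.0)
_, _, c = complete_frame(a, b)
frame_det = det3(a, b, c)  # equals +1 up to rounding
```

Here the viewing direction c comes out as the z-axis, as expected for an image plane equal to the xy-plane.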

For us, a microscope orientation is a frame F=(a,b). We think of the span of the vectors a and b as the embedded image plane of this orientation, and the vector c=a×b as the ‘viewing’ direction (Figure 2c).

A cryo-EM experiment produces $N$ images which we denote $I_1,\dots,I_N$ (see Figure 2b). We will write $F_i = (a_i, b_i)$ for the microscope orientation of image $I_i$. The embedded image plane spanned by $a_i, b_i$ can be canonically identified with the plane $P_i = \mathbb{R}^2$. We think of $P_i$ as the unembedded image plane of $I_i$. We model the image $I_i$ as a real-valued function on $P_i = \mathbb{R}^2$. The value of the image $I_i$ at the point $(x,y)$ is the integral of $\phi$ along a line perpendicular to the embedded image plane $\mathrm{span}\{a_i, b_i\}$ (see Figure 2d and Equation 1). This is the X-ray transform of $\phi$ onto the frame $F_i$, given by
$$I_i : P_i = \mathbb{R}^2 \to \mathbb{R}, \qquad I_i(x,y) = \int_{\mathbb{R}} \phi(x a_i + y b_i + z c_i)\, dz, \tag{1}$$

where $c_i = a_i \times b_i$. As in [3], to solve this reconstruction problem, we assume that the X-ray projections $I_i$ and $I_j$ of $\phi$ from different microscope orientations $F_i$ and $F_j$ are different. This is equivalent to requiring the molecule $\phi$ to admit no non-trivial symmetry as a function on $\mathbb{R}^3$.
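The X-ray transform in Equation 1 can be approximated numerically. The sketch below is an illustration under stated assumptions: it uses the trapezoid rule on a truncated interval, and it takes an isotropic Gaussian as a stand-in density only because its projection has the closed form √π·exp(−(x² + y²)). Such a density is fully symmetric, so it violates the no-symmetry assumption above and serves purely as a numerical check. The function names and parameters (`xray_projection`, `z_max`, `steps`) are hypothetical.

```python
import math

def xray_projection(phi, a, b, c, x, y, z_max=8.0, steps=4000):
    """Trapezoid-rule approximation of I(x, y) = integral of phi(x a + y b + z c) dz.

    The integral over the real line is truncated to [-z_max, z_max] (an assumption
    that is reasonable only for rapidly decaying densities).
    """
    h = 2.0 * z_max / steps
    total = 0.0
    for k in range(steps + 1):
        z = -z_max + k * h
        point = tuple(x * a[i] + y * b[i] + z * c[i] for i in range(3))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * phi(point)
    return total * h

def gaussian(p):
    """Stand-in density exp(-|p|^2); not a realistic molecule."""
    return math.exp(-(p[0] ** 2 + p[1] ** 2 + p[2] ** 2))

# Two hypothetical frames (a, b, c = a x b): the standard one and one tilted about the x-axis.
t = 0.9
frame_std = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
frame_tilt = ((1.0, 0.0, 0.0),
              (0.0, math.cos(t), math.sin(t)),
              (0.0, -math.sin(t), math.cos(t)))

x, y = 0.5, -0.2
value_std = xray_projection(gaussian, *frame_std, x, y)
value_tilt = xray_projection(gaussian, *frame_tilt, x, y)
closed_form = math.sqrt(math.pi) * math.exp(-(x ** 2 + y ** 2))
```

For this symmetric stand-in, both frames give the same image value, which is exactly the degenerate situation the no-symmetry assumption rules out for real molecules.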

In terms of this mathematical model, the goal of cryo-EM reconstruction (Problem 1) becomes to recover the function $\phi$ from the $N$ X-ray projections $I_1,\dots,I_N$. A commonly used approach for this problem is to first recover the $N$ projection orientations $F_1,\dots,F_N$ ([3], Section 0.1). Note that the detected image $I_i$ is a function on the plane $P_i = \mathbb{R}^2$, and a cryo-EM experiment does not directly provide information about the microscope orientation $F_i$ used to compute $I_i$.

Once the original microscope orientations are known, the unembedded image data $I_1,\dots,I_N$ can be placed in the original positions from where these X-ray projections were computed. Then the X-ray transform can be inverted to yield an approximation of $\phi$. Thus, although the ultimate goal is to solve Problem 1, we instead discuss solutions to the following problem.

Problem 2 (Reconstruction problem: microscope orientations).

Given $N$ X-ray projections $I_1,\dots,I_N$ of a molecule $\phi : \mathbb{R}^3 \to \mathbb{R}$, computed from the $N$ unknown microscope orientations $F_1,\dots,F_N$, recover these orientations up to global rotation.

Remark 2.

By ‘up to global rotation’ we mean that, rather than recovering the molecule $\phi$ itself, we recover $\phi$ rotated by an element $R$ of $O(3)$, the group of $3 \times 3$ orthogonal matrices. The matrix $R$ may be a proper ($\det R = +1$) or improper ($\det R = -1$) rotation, so we expect chiral ambiguity in the reconstructed molecule.

Common lines and reconstruction

One approach for solving Problem 2 is to exploit common lines of intersection between the embedded image planes, which we now describe. A cryo-EM experiment produces images $I_i$ and $I_j$ from orientations $F_i = (a_i, b_i)$ and $F_j = (a_j, b_j)$. These frames define isometric embeddings $\iota_i$ and $\iota_j$ (Figure 3) of the unembedded image planes $P_i$ and $P_j$ into $\mathbb{R}^3$, given by
$$\iota_i(x,y) = x a_i + y b_i, \qquad \iota_j(x,y) = x a_j + y b_j. \tag{2}$$
Figure 3

Common line of $F_i$ and $F_j$.

The images are functions on $P_i$ and $P_j$, and we know that they were obtained as X-ray projections onto the unknown embedded image planes $\iota_i(P_i)$ and $\iota_j(P_j)$ (Figure 3b). We assume that each embedded image plane $\iota_i(P_i)$ is distinct and, further, that each pair of such planes intersects in a distinct line. Such a configuration of microscope orientations is called generic. The microscope orientations will be generic if they are sampled uniformly from the space of all frames as, for example, assumed in the eigenvector relaxation algorithm developed in [2], Section 3.

The embedded image planes $\iota_i(P_i)$ and $\iota_j(P_j)$ intersect in a line $L$ (Figure 3b), and this line corresponds to the unembedded lines $\ell_{ij} \subset P_i$ and $\ell_{ji} \subset P_j$ (Figure 3a). Since these unembedded lines both came from $L \subset \mathbb{R}^3$, we have a natural choice $\psi_{ij} : \ell_{ij} \to \ell_{ji}$ of one of the two possible isometries between $\ell_{ij}$ and $\ell_{ji}$. Proceeding in this fashion, the $N$ microscope orientations $F_1,\dots,F_N$ produce $\binom{N}{2} = N(N-1)/2$ common line pairs $\{(\ell_{ij}, \ell_{ji}, \psi_{ij})\}$. This is the common lines data realized by the frames $F_1,\dots,F_N$. It will be useful for us to distinguish such common lines data obtained from frames.

Definition 2.

A common line pair for $P_i$ and $P_j$ is a pair of lines $\ell_{ij} \subset P_i$ and $\ell_{ji} \subset P_j$, together with a choice of isometry $\psi_{ij} : \ell_{ij} \to \ell_{ji}$. A collection of common line pairs $\{(\ell_{ij}, \ell_{ji}, \psi_{ij})\}$, for every $P_i$ and $P_j$, is common lines data for $P_1,\dots,P_N$. We say common lines data is valid if it is realized by some generic frames $F_1,\dots,F_N$.

Although common lines data consists only of information in the unembedded planes $P_i$, valid common lines data determines its realizing frames up to global rotation whenever $N \geq 3$. Further, algorithms have long been known (e.g., [4], Section 2.1) that recover a set of realizing frames from valid common lines data.

This is relevant to cryo-EM reconstruction because, although the microscope orientations are unknown, it is possible to detect the common lines data the orientations realize from the images $I_1,\dots,I_N$ [5]. Thus, we have the following common lines approach for the cryo-EM reconstruction problem (Problem 2). We first detect the common lines data realized by the unknown microscope orientations. Next, from the valid common lines data, we reconstruct a set of realizing frames. Since valid common lines data determines its realizing frames up to global rotation, the reconstructed frames are related to the original microscope orientations by a global rotation, and so in principle, one has solved the reconstruction problem.

Angular reconstitution

In this section, we describe the angular reconstitution algorithm, due to van Heel [4], and also independently Vainshtein and Goncharov [6], which recovers a set of realizing frames from valid common lines data.

Our input is valid common lines data $\{(\ell_{ij}, \ell_{ji}, \psi_{ij})\}$ for $P_1,\dots,P_N$ (Figure 4). Note that recovering a frame $F_i$ is equivalent to recovering the embedding $\iota_i$ of $P_i$, which will be easier to visualize. Since we are only reconstructing up to global rotation, the first step is to embed $P_1$ in an arbitrary position in $\mathbb{R}^3$ (Figure 5a). Next, we use the isometry $\psi_{12}$ between $\ell_{12}$ and $\ell_{21}$ to dock $P_2$ to $\iota_1(P_1)$ (Figure 5b). This docking is ambiguous (Figure 5c) since we are free to rotate $\iota_2(P_2)$ about its line of intersection with $\iota_1(P_1)$. We resolve this ambiguity by docking $P_3$ with $\iota_1(P_1)$ and matching up $\ell_{23}$ and $\ell_{32}$ in $\iota_2(P_2)$ and $\iota_3(P_3)$, respectively (Figure 5d). We continue in this fashion, docking each subsequent plane $P_i$ with $\iota_1(P_1)$ and resolving the rotational ambiguity by comparing against the remaining frames.
Figure 4

Common lines data for $N = 3$ planes.

Figure 5

Angular reconstitution. (a) Place $P_1$. (b) Dock $P_2$ via $\psi_{12}$. (c) Rotational ambiguity for $\iota_2(P_2)$. (d) Resolve ambiguity by docking $P_3$ to $\iota_1(\ell_{13})$.

Noise and valid common lines data

We discussed in the ‘Common lines and reconstruction’ section that valid common lines data determines its realizing frames up to global rotation. Common lines based approaches for cryo-EM reconstruction (Problem 2) assume that we can accurately detect the valid common lines realized by the unknown microscope orientations. Unfortunately, cryo-EM images are very noisy (Figure 6), so we cannot expect to correctly identify common lines data.
Figure 6

Raw cryo-EM images of β -galactosidase. Image by Richard Henderson, personal communication.

Misdetected common lines pose a problem because they lead to inconsistencies when attempting to recover realizing frames. For example, in Figure 5, we resolved the ambiguity of $\iota_2(P_2)$ by docking $P_3$ to $\iota_1(P_1)$ and using the common lines $\ell_{23}$ and $\ell_{32}$ (Figure 5d). However, we could equally well have resolved the ambiguity of $\iota_2(P_2)$ by docking $P_4$ and using the common lines $\ell_{24}$ and $\ell_{42}$. Thus, if we, for example, incorrectly identify the common lines in $P_4$, we will have two contradictory embeddings $\iota_2(P_2)$ with no obvious way of determining which is correct. More generally, the angular reconstitution algorithm makes many choices: for example, which plane to begin reconstruction with and how to resolve docking ambiguities. The final reconstructed frames depend on all these choices. By definition, valid common lines data is precisely the data which has a single consistent (up to global rotation) set of realizing frames. The development of common lines reconstruction algorithms that are robust to this kind of error is an active area of research.

Methods

We wish to understand the set $C_N$ of all valid common lines data for $N$ planes $P_1,\dots,P_N$. First, we derive necessary and sufficient conditions for common lines data to be valid. These conditions are polynomial equations and inequalities, which means that $C_N$ is a semi-algebraic set and allows us to study $C_N$ as a geometric space. In particular, we compute the dimension of $C_N$ and show that there is a bijection between $C_N$ and the space of generic frames, up to global rotation.

Main Theorem.

The set $C_N$ of all valid common lines data for $N$ frames is a $(3N-3)$-dimensional semi-algebraic subset of the $2\binom{N}{2}$-dimensional space of all common lines data and is in bijection with the space of $N$ generic frames modulo $O(3)$. The defining equations for $C_N$ are given by $\binom{N}{3}$ polynomial inequalities arising from the spherical triangle inequalities and $6\binom{N}{4}$ polynomial equalities arising from the spherical law of cosines.

The meaning of this theorem is as follows: as we discussed in the ‘Common lines and reconstruction’ section, one way to obtain valid common lines data is from the embedded frames $F_1,\dots,F_N$. The theorem provides an intrinsic definition of this valid common lines data, namely, the defining polynomials for $C_N$. This is a definition for valid common lines only in terms of the data $\{(\ell_{ij}, \ell_{ji}, \psi_{ij})\}$ on unembedded planes $P_1,\dots,P_N$ and without reference to any embedded frames $F_1,\dots,F_N$.

We briefly describe the idea behind our proofs. Suppose we have valid common lines data
$$(\ell_{12}, \ell_{21}, \psi_{12}), \quad (\ell_{13}, \ell_{31}, \psi_{13}), \quad (\ell_{23}, \ell_{32}, \psi_{23}). \tag{3}$$
The angles between these unembedded common lines determine a triangle on the unit sphere in $\mathbb{R}^3$ (Figure 7), and so the angles $\alpha$ between $\ell_{12}$ and $\ell_{13}$, $\beta$ between $\ell_{21}$ and $\ell_{23}$, and $\gamma$ between $\ell_{31}$ and $\ell_{32}$ must satisfy the spherical triangle inequalities. These inequalities are analogs of the plane triangle inequality, i.e., necessary and sufficient conditions for a spherical triangle to exist with the specified edge lengths. In other words, a necessary and sufficient condition for common lines data to be valid for $N = 3$ is that it satisfies the spherical triangle inequality, a fact already observed both by the cryo-EM ([7], pp. 198–199) and mathematics ([8], Equations 11 and 12) communities.
Figure 7

Common lines in $P_1$, $P_2$, and $P_3$ determine a spherical triangle.

We prove our results for $N > 3$ by similarly appealing to spherical trigonometry. Specifically, given common lines data $\{(\ell_{ij}, \ell_{ji}, \psi_{ij})\}$ for $N$ planes, we require that for each triple $1 \leq i < j < k \leq N$, the common lines data $(\ell_{ij}, \ell_{ji}, \psi_{ij})$, $(\ell_{ik}, \ell_{ki}, \psi_{ik})$, and $(\ell_{jk}, \ell_{kj}, \psi_{jk})$ satisfy the spherical triangle inequalities. Now, reducing to the $N = 3$ case gives us realizing embeddings $\iota_i$, $\iota_j$, $\iota_k$ for each triple $(i,j,k)$ of indices. To reconstruct a collection of $N$ consistent frames, all these triple reconstructions must be compatible. We show that this compatibility condition is a polynomial condition arising from the spherical law of cosines. These defining equations are given by polynomials which are explicitly derived and listed in the ‘Defining polynomials’ section.

Results and discussion

We proceed to describe in detail the results in the Main Theorem. We will derive the necessary and sufficient conditions for common lines data for $N \geq 3$ to be valid. These will be explicit polynomial equations and inequalities only in the unembedded information $\{(\ell_{ij}, \ell_{ji}, \psi_{ij})\}$ and will provide an intrinsic definition for valid common lines without reference to the frames $F_1,\dots,F_N$. We defer all proofs to Appendix 1 (Proofs).

Projective coordinates

To obtain defining equations for $C_N$, it will be convenient for us to work with projective coordinates, which we briefly review. Suppose $V$ is a vector space and $\ell$ is a line in $V$ through the origin. We can represent $\ell$ by choosing any non-zero vector $v \in \ell$. In other words, lines can be identified with equivalence classes of vectors under scaling. We denote the equivalence class of a vector $v$ by $[v]$, and by definition, $[v] = [w]$ if and only if $v = \lambda w$ for some $\lambda \neq 0$. The space of all lines through the origin in $V$ is the projective space $\mathbb{P}(V)$. If $V = U \times W$ and $(u,w) \in V$, then we write $[u:w]$ for the corresponding equivalence class in $\mathbb{P}(U \times W)$.

Coordinates for common lines

Suppose now that $(\ell_{ij}, \ell_{ji}, \psi_{ij})$ is a common line pair for $P_i$ and $P_j$. Choose a non-zero vector $v_{ij} = (x_{ij}, y_{ij})$ on the line $\ell_{ij} \subset P_i$ and consider the pair $(v_{ij}, \psi_{ij}(v_{ij})) \in P_i \times P_j$. Note that different choices of a vector along $\ell_{ij}$ will simply scale $(v_{ij}, \psi_{ij}(v_{ij}))$ by a non-zero multiple, so the projective pair $[v_{ij} : \psi_{ij}(v_{ij})]$ in $\mathbb{P}(P_i \times P_j)$ is uniquely determined by $(\ell_{ij}, \ell_{ji}, \psi_{ij})$.

Conversely, if $[v_{ij} : v_{ji}] \in \mathbb{P}(P_i \times P_j)$ satisfies $\|v_{ij}\|^2 = \|v_{ji}\|^2$, then choosing representatives $(v_{ij}, v_{ji})$, we obtain a common line pair $(\mathrm{span}\{v_{ij}\}, \mathrm{span}\{v_{ji}\}, \psi_{ij})$, where $\psi_{ij}$ is the unique isometry that sends $v_{ij} \mapsto v_{ji}$. Note that we obtain the same common line pair regardless of which representing vectors we choose.

Thus, from now on, we identify common line pairs with elements $[v_{ij} : v_{ji}] \in \mathbb{P}(P_i \times P_j)$ satisfying $\|v_{ij}\|^2 = \|v_{ji}\|^2$. We also apply this identification to the following common lines data:

Remark 3.

We identify common lines data for $P_1,\dots,P_N$ with collections
$$([v_{ij} : v_{ji}]) \in \prod_{1 \leq i < j \leq N} \mathbb{P}(P_i \times P_j) = (\mathbb{P}^3)^{\binom{N}{2}}$$

that satisfy $\|v_{ij}\|^2 = \|v_{ji}\|^2$ for all pairs.

In coordinates, we say that the frames $F_i$ and $F_j$ realize the common line pair $[v_{ij} : v_{ji}]$ if the associated embeddings (Equation 2) bring together this common line pair, i.e., for any choice of representative $(v_{ij}, v_{ji})$, we have
$$\iota_i(v_{ij}) = \iota_j(v_{ji}).$$

By definition, valid common lines data is a collection $([v_{ij} : v_{ji}])$ of common lines data for which there exist frames $F_1,\dots,F_N$ such that for all $1 \leq i < j \leq N$, the frames $F_i$ and $F_j$ realize $[v_{ij} : v_{ji}]$.

Necessary and sufficient conditions

In this section, we derive equations and inequalities that are necessary and sufficient for common lines data $([v_{ij} : v_{ji}])$ to be valid. We first discuss necessary conditions. Recall from the ‘Methods’ section that for any triple of indices $i,j,k$ the angles between the common line pairs $[v_{ij} : v_{ji}]$, $[v_{ik} : v_{ki}]$, and $[v_{jk} : v_{kj}]$ determine a spherical triangle (Figure 7), and so these angles must satisfy the spherical triangle inequalities. The spherical triangle inequalities state that a non-degenerate spherical triangle of edge lengths $\alpha$, $\beta$, and $\gamma$, all in $(0,\pi)$, exists if and only if
$$\beta + \gamma > \alpha, \qquad \alpha + \gamma > \beta, \qquad \alpha + \beta > \gamma, \qquad \alpha + \beta + \gamma < 2\pi. \tag{4}$$

Remark 4.

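Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and a triple of indices $(i,j,k)$. If we choose representatives $(v_{ij}, v_{ji})$, $(v_{ik}, v_{ki})$, and $(v_{jk}, v_{kj})$, we can write
$$\alpha_{ijk} = \cos^{-1}\frac{v_{ij} \cdot v_{ik}}{\|v_{ij}\|\,\|v_{ik}\|}, \qquad \beta_{ijk} = \cos^{-1}\frac{v_{ji} \cdot v_{jk}}{\|v_{ji}\|\,\|v_{jk}\|}, \qquad \gamma_{ijk} = \cos^{-1}\frac{v_{ki} \cdot v_{kj}}{\|v_{ki}\|\,\|v_{kj}\|}.$$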

The angles $\alpha_{ijk}$, $\beta_{ijk}$, and $\gamma_{ijk}$ depend on the representatives we have chosen; however, whether or not the spherical triangle inequalities (Equation 4) are satisfied by $\alpha_{ijk}$, $\beta_{ijk}$, $\gamma_{ijk}$ is independent of this choice. Thus, we can make the following definition:

Definition 3.

Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and a triple of indices $(i,j,k)$. We say $(i,j,k)$ satisfies the triangle inequalities if, for any choice of representatives $(v_{ij}, v_{ji})$, $(v_{ik}, v_{ki})$, and $(v_{jk}, v_{kj})$, the angles $\alpha_{ijk}$, $\beta_{ijk}$, and $\gamma_{ijk}$ satisfy Equation 4.

This definition allows us to state our first result.

Proposition 1.

Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and suppose that the triple $(i,j,k)$ satisfies the spherical triangle inequalities. Then, there exist generic frames $F_i$, $F_j$, and $F_k$ that realize the common line pairs $[v_{ij} : v_{ji}]$, $[v_{ik} : v_{ki}]$, and $[v_{jk} : v_{kj}]$. Moreover, if $G_i$, $G_j$, and $G_k$ are another set of frames that realize these same pairs, then there exists an isometry in $O(3)$ that maps $(F_i, F_j, F_k) \mapsto (G_i, G_j, G_k)$.

For a proof of this proposition, see Appendix 1 (Proofs). When $([v_{ij} : v_{ji}])$ is fixed, and the common lines $[v_{ij} : v_{ji}]$, $[v_{ik} : v_{ki}]$, and $[v_{jk} : v_{kj}]$ are realized by $F_i$, $F_j$, and $F_k$, we will say that these frames realize the triple $(i,j,k)$.

This proposition gives a necessary and sufficient condition for realizing frames to exist for a triple $(i,j,k)$, and so we have recovered the necessary and sufficient condition for $N = 3$. For $N > 3$, this proposition states that each triple of indices $(i,j,k)$ must satisfy the spherical triangle inequality, but this condition is no longer sufficient.
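The test in Definition 3 can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's implementation: the function names are hypothetical, and the sample vectors are chosen only to show one triple that passes, one that fails, and the fact that flipping the sign of a representative pair changes the angles but not the outcome.

```python
import math

def angle(u, v):
    """Angle in [0, pi] between two non-zero vectors in the plane."""
    c = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding

def triangle_inequalities_hold(alpha, beta, gamma):
    """The four spherical triangle inequalities of Equation 4."""
    return (beta + gamma > alpha and alpha + gamma > beta
            and alpha + beta > gamma and alpha + beta + gamma < 2 * math.pi)

def triple_satisfies(v_ij, v_ji, v_ik, v_ki, v_jk, v_kj):
    """Check Equation 4 for the angles alpha_ijk, beta_ijk, gamma_ijk of Remark 4."""
    alpha = angle(v_ij, v_ik)
    beta = angle(v_ji, v_jk)
    gamma = angle(v_ki, v_kj)
    return triangle_inequalities_hold(alpha, beta, gamma)

def neg(v):
    return (-v[0], -v[1])

s = math.sqrt(2) / 2
e1, d, e2 = (1.0, 0.0), (s, s), (0.0, 1.0)
g = (math.cos(math.pi / 8), math.sin(math.pi / 8))

# All three angles equal pi/4: the inequalities hold.
ok = triple_satisfies(e1, e1, d, e1, d, d)
# Same data with the other representative (-v_ij, -v_ji): angles change, verdict does not.
ok_flipped = triple_satisfies(neg(e1), neg(e1), d, e1, d, d)

# Angles (pi/2, pi/8, pi/8): beta + gamma < alpha, so the inequalities fail.
bad = triple_satisfies(e1, e1, e2, e1, g, g)
bad_flipped = triple_satisfies(neg(e1), neg(e1), e2, e1, g, g)
```

Replacing the representative of one pair sends two of the three angles to their supplements, which merely permutes the four inequalities of Equation 4; this is why the verdict is well defined on projective classes.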

Example 1.

Consider the common lines data for $P_1$, $P_2$, $P_3$, and $P_4$ given by
$$(v_{12}, v_{13}, v_{14}) = (v_{21}, v_{23}, v_{24}) = (v_{31}, v_{32}, v_{34}) = (v_{41}, v_{42}, v_{43}) = \left( (1,0)^T,\ (\sqrt{2}/2, \sqrt{2}/2)^T,\ (0,1)^T \right).$$
The angles between these common lines are given by
$$(\alpha_{123}, \beta_{123}, \gamma_{123}) = \left(\tfrac{\pi}{4}, \tfrac{\pi}{4}, \tfrac{\pi}{4}\right), \quad (\alpha_{124}, \beta_{124}, \gamma_{124}) = \left(\tfrac{\pi}{2}, \tfrac{\pi}{2}, \tfrac{\pi}{4}\right), \quad (\alpha_{134}, \beta_{134}, \gamma_{134}) = \left(\tfrac{\pi}{4}, \tfrac{\pi}{2}, \tfrac{\pi}{2}\right), \quad (\alpha_{234}, \beta_{234}, \gamma_{234}) = \left(\tfrac{\pi}{4}, \tfrac{\pi}{4}, \tfrac{\pi}{4}\right).$$
Observe that each of these triples satisfies the spherical triangle inequality. However, this data cannot be realized by frames $F_1$, $F_2$, $F_3$, and $F_4$, and so this common line data is not valid. To see why, suppose such frames existed and, for each pair $i,j$, set $\Lambda_{ij} = \iota_i(v_{ij}) = \iota_j(v_{ji})$. The points $\Lambda_{12}$, $\Lambda_{13}$, and $\Lambda_{23}$ determine a spherical triangle with edge lengths $(\alpha_{123}, \beta_{123}, \gamma_{123})$ (Figure 8a), and the angle of this spherical triangle at the vertex between edges $\alpha_{123}$ and $\beta_{123}$ is exactly the angle $\theta_{12}$ between the planes $\iota_1(P_1)$ and $\iota_2(P_2)$. From the spherical law of cosines, we can compute this angle:
$$\cos\theta_{12} = \frac{\cos\gamma_{123} - \cos\alpha_{123}\cos\beta_{123}}{\sin\alpha_{123}\sin\beta_{123}} = \sqrt{2} - 1.$$
Figure 8

Inconsistent reconstruction from invalid common lines data.

Similarly, the points $\Lambda_{12}$, $\Lambda_{14}$, and $\Lambda_{24}$ determine a spherical triangle with edge lengths $(\alpha_{124}, \beta_{124}, \gamma_{124})$ (Figure 8b), and the angle of this triangle between edges $\alpha_{124}$ and $\beta_{124}$ is again the angle $\theta_{12}$ between the planes $\iota_1(P_1)$ and $\iota_2(P_2)$. However, in this triangle we have $\cos\theta_{12} = \sqrt{2}/2$, which is a contradiction.
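The two conflicting values of the cosine of θ12 in Example 1 can be checked numerically. The sketch below simply evaluates the spherical law of cosines on the edge lengths of the triples (1,2,3) and (1,2,4); the function name `cos_vertex_angle` is an illustrative choice.

```python
import math

def cos_vertex_angle(alpha, beta, gamma):
    """Cosine of the angle between the sides of length alpha and beta
    in a spherical triangle with side lengths alpha, beta, gamma."""
    return ((math.cos(gamma) - math.cos(alpha) * math.cos(beta))
            / (math.sin(alpha) * math.sin(beta)))

quarter, half = math.pi / 4, math.pi / 2

# Triple (1,2,3): edge lengths (pi/4, pi/4, pi/4).
cos_theta_from_123 = cos_vertex_angle(quarter, quarter, quarter)
# Triple (1,2,4): edge lengths (pi/2, pi/2, pi/4).
cos_theta_from_124 = cos_vertex_angle(half, half, quarter)

# The two triangles disagree about the angle between the planes of P1 and P2.
mismatch = abs(cos_theta_from_123 - cos_theta_from_124)
```

The first value is √2 − 1 and the second is √2/2, so no single pair of embedded planes can serve both triangles.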

We now provide an explanation for why the contradiction in Example 1 arose, which will lead us to necessary and sufficient conditions for reconstruction when $N > 3$. Suppose the frames $F_1,\dots,F_N$ realize the common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and choose unit vector representatives $(v_{ij}, v_{ji})$ for all the common line pairs. If we consider the intersection of the embedded planes $\iota_i(P_i)$ with the unit sphere in $\mathbb{R}^3$, we obtain $N$ great circles. Each pair of these great circles has a distinguished point of intersection $\iota_i(v_{ij}) = \iota_j(v_{ji})$, which we denote by $\Lambda_{ij}$. Denote by $T(i,j,k)$ the triangle obtained by taking $\Lambda_{ij}$, $\Lambda_{ik}$, and $\Lambda_{jk}$ as vertices (Figure 9).
Figure 9

Triangle $T(i,j,k)$ on the surface of the sphere.

Consider a second triangle $T(i,j,m)$ (Figure 10). The two triangles $T(i,j,k)$ and $T(i,j,m)$ share a vertex, $\Lambda_{ij}$, and the edges of both triangles at this vertex lie in $\iota_i(P_i)$ and $\iota_j(P_j)$. It follows that the angle $Z$ in $T(i,j,k)$ and the angle $Z'$ in $T(i,j,m)$ at this common vertex must be compatible: the angles are either the same (Figure 10a) or supplementary (Figure 10b), depending on the arrangement of the vertices. We can express this requirement in terms of the common lines data by using the spherical law of cosines
$$(\cos\gamma_{123} - \cos\alpha_{123}\cos\beta_{123}) \sin\alpha_{124}\sin\beta_{124} = \sigma\,(\cos\gamma_{124} - \cos\alpha_{124}\cos\beta_{124}) \sin\alpha_{123}\sin\beta_{123}, \tag{5}$$
Figure 10

$T(i,j,k)$, in green, shares edges with $T(i,j,m)$.

where $\sigma = \pm 1$ determines whether $Z = Z'$ or $Z = \pi - Z'$. In this light, the contradiction in Example 1 arose because the angles at $\Lambda_{12}$ in $T(1,2,3)$ and $T(1,2,4)$ were not compatible.

Remark 5.

Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and two triples $(i,j,k)$ and $(i,j,m)$ that agree in two indices. If we choose representatives $(v_{ij}, v_{ji})$, $(v_{ik}, v_{ki})$, $(v_{jk}, v_{kj})$, $(v_{im}, v_{mi})$, and $(v_{jm}, v_{mj})$ for these common lines, the necessary angle equality (Equation 5) described above is $L_{ijk,ijm} = 0$, where
$$L_{ijk,ijm} = \big( (v_{ij} \cdot v_{ij})(v_{ki} \cdot v_{kj}) - (v_{ij} \cdot v_{ik})(v_{ji} \cdot v_{jk}) \big)\, |\det[v_{ij}, v_{im}]|\, |\det[v_{ji}, v_{jm}]| - \sigma \big( (v_{ij} \cdot v_{ij})(v_{mi} \cdot v_{mj}) - (v_{ij} \cdot v_{im})(v_{ji} \cdot v_{jm}) \big)\, |\det[v_{ij}, v_{ik}]|\, |\det[v_{ji}, v_{jk}]|, \tag{6}$$
and
$$\sigma = \operatorname{sign}\big( \det[v_{ij}, v_{ik}]\, \det[v_{ij}, v_{im}]\, \det[v_{ji}, v_{jk}]\, \det[v_{ji}, v_{jm}] \big).$$

Whether or not $L_{ijk,ijm} = 0$ is independent of the representatives we choose, so we can make the following definition.

Definition 4.

Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and two triples $(i,j,k)$ and $(i,j,m)$ that agree in two indices. We say $(i,j,k)$ and $(i,j,m)$ satisfy the spherical law of cosines compatibility if $L_{ijk,ijm} = 0$.
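As a hedged illustration of Definition 4, the sketch below evaluates the compatibility expression of Equation 6 on the data of Example 1 for the triples (1,2,3) and (1,2,4). One point is an assumption of this sketch: the 2×2 determinant factors are taken in absolute value, so that they play the role of the (positive) sines in Equation 5 while σ carries the sign. The function and variable names are hypothetical.

```python
import math

def dot2(u, v):
    return u[0] * v[0] + u[1] * v[1]

def det2(u, v):
    """Determinant of the 2x2 matrix with columns u, v."""
    return u[0] * v[1] - u[1] * v[0]

def compatibility_residual(v_ij, v_ji, v_ik, v_ki, v_jk, v_kj,
                           v_im, v_mi, v_jm, v_mj):
    """Residual L_{ijk,ijm}; zero means the two triples are compatible.

    Assumption: determinant factors enter in absolute value (they stand in
    for sin(alpha) sin(beta) in Equation 5), and sigma is the sign of their product.
    """
    d_k = det2(v_ij, v_ik) * det2(v_ji, v_jk)
    d_m = det2(v_ij, v_im) * det2(v_ji, v_jm)
    sigma = math.copysign(1.0, d_k * d_m)
    n2 = dot2(v_ij, v_ij)
    a_k = n2 * dot2(v_ki, v_kj) - dot2(v_ij, v_ik) * dot2(v_ji, v_jk)
    a_m = n2 * dot2(v_mi, v_mj) - dot2(v_ij, v_im) * dot2(v_ji, v_jm)
    return a_k * abs(d_m) - sigma * a_m * abs(d_k)

s = math.sqrt(2) / 2
e1, d, e2 = (1.0, 0.0), (s, s), (0.0, 1.0)

# Example 1 with (i, j, k, m) = (1, 2, 3, 4).
residual = compatibility_residual(
    v_ij=e1, v_ji=e1,   # v_12, v_21
    v_ik=d, v_ki=e1,    # v_13, v_31
    v_jk=d, v_kj=d,     # v_23, v_32
    v_im=e2, v_mi=e1,   # v_14, v_41
    v_jm=e2, v_mj=d,    # v_24, v_42
)

# For contrast: if plane 4 carried the same lines as plane 3, the residual would vanish.
residual_same = compatibility_residual(e1, e1, d, e1, d, d, d, e1, d, d)
```

On the data of Example 1 the residual is (√2/2 − 1)/2, which is non-zero and matches the contradiction found there.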

The spherical law of cosines compatibility is necessary for common lines data to be valid, and we will see it is sufficient as well. We first show that if this law of cosines compatibility between (i,j,k) and (i,j,m) is satisfied, then we can glue together realizing frames for these triples in a compatible fashion.

Lemma 1.

Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and suppose that the triples $(i,j,k)$ and $(i,j,m)$ satisfy the spherical law of cosines compatibility. Then, if $F_i$, $F_j$, and $F_k$ are any realizing frames for $(i,j,k)$, and $G_i$, $G_j$, $G_m$ are any realizing frames for $(i,j,m)$, there exists a unique isometry in $O(3)$ that sends $F_i \mapsto G_i$ and $F_j \mapsto G_j$.

For a proof, see Appendix 1 (Proofs).

We now can show that the law of cosines compatibility is sufficient for reconstruction.

Theorem 1.

Fix common lines data $([v_{ij} : v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ and suppose that every triple $(i,j,k)$ satisfies the spherical triangle inequality and, further, that every pair of triples $(i,j,k)$ and $(i,j,m)$ that agree in two indices satisfies the spherical law of cosines compatibility. Then there exist generic frames $F_1,\dots,F_N$, unique up to isometry in $O(3)$, realizing $([v_{ij} : v_{ji}])$.

For a proof, see Appendix 1 (Proofs).

Geometry of valid common lines

We now use the necessary and sufficient conditions derived above to deduce some geometric properties of the set $C_N$ of all valid common lines. The main result in this section is that $C_N$ is in bijection with the space of generic frames, up to global rotation. In particular, this implies that the dimension of $C_N$ is $3N-3$.

We first explicitly describe how to obtain valid common lines from a set of generic realizing frames $F_1,\dots,F_N$, as in the ‘Common lines and reconstruction’ section. For each pair $i,j$, choose a vector $\Lambda_{ij}$ in the one-dimensional vector space $\iota_i(P_i) \cap \iota_j(P_j)$. Since $\mathbb{R}^3$ has the canonical structure of an inner product space, we have the corresponding orthogonal projections $\iota_i^T : \mathbb{R}^3 \to P_i$ and $\iota_j^T : \mathbb{R}^3 \to P_j$. Consider the vectors
$$(v_{ij}, v_{ji}) = (\iota_i^T \Lambda_{ij},\ \iota_j^T \Lambda_{ij}) \in P_i \times P_j.$$
By construction the pair $[v_{ij} : v_{ji}] = [x_{ij} : y_{ij} : x_{ji} : y_{ji}]$ is a common line pair realized by the frames $F_i$ and $F_j$. In coordinates, we have
$$x_{ij} a_i + y_{ij} b_i = \Lambda_{ij} = x_{ji} a_j + y_{ji} b_j. \tag{7}$$

Repeating this process for all pairs $1 \leq i < j \leq N$, we obtain valid common lines data $([v_{ij} : v_{ji}]) \in C_N$ that is realized by $F_1,\dots,F_N$. This algorithmically gives a map $\mathcal{G} \to C_N$, where $\mathcal{G}$ is the subset of $N$ generic frames in $\mathcal{F}^N$. It will be useful to express this function via explicit polynomial mappings. We first describe a set of coordinates on the Grassmannian $\mathrm{Gr}(3,2N)$, whose points are the three-dimensional subspaces of $\mathbb{R}^{2N}$.

Grassmannian and Plücker coordinates

If $W \subset \mathbb{R}^{2N}$ is a three-dimensional subspace of $\mathbb{R}^{2N}$, and we choose a basis $w_1, w_2, w_3 \in \mathbb{R}^{2N}$ for $W$, we can represent the point in $\mathrm{Gr}(3,2N)$ corresponding to $W$ by the vector of all $3 \times 3$ minors of the $3 \times 2N$ matrix
$$[w_1, w_2, w_3]^T.$$

These minors are the Plücker coordinates of the subspace $W$. If we choose a different basis for $W$, the vector of $3 \times 3$ minors will only change by a non-zero scalar. Since Plücker coordinates are only defined up to scaling, we interpret the Grassmannian $\mathrm{Gr}(3,2N)$ as a subvariety of the projective space $\mathbb{P}^{\binom{2N}{3}-1}$.
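The behavior of Plücker coordinates under a change of basis can be illustrated with a small Python sketch. The subspace W, the change-of-basis matrix A, and all function names below are made-up examples; integer entries are used so that the comparison is exact.

```python
from itertools import combinations

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def pluecker_coordinates(rows):
    """All 3x3 minors of a 3 x n matrix whose rows are a basis w1, w2, w3 of W."""
    n = len(rows[0])
    return [det3([[row[col] for col in cols] for row in rows])
            for cols in combinations(range(n), 3)]

def change_basis(matrix, rows):
    """Left-multiply the 3 x n matrix of basis vectors by an invertible 3x3 matrix."""
    n = len(rows[0])
    return [[sum(matrix[r][s] * rows[s][col] for s in range(3)) for col in range(n)]
            for r in range(3)]

# Hypothetical 3-dimensional subspace of R^4 (so N = 2) and a change of basis.
W = [[1, 0, 0, 2],
     [0, 1, 0, -1],
     [0, 0, 1, 3]]
A = [[2, 1, 0],
     [0, 1, 0],
     [1, 0, 3]]

coords = pluecker_coordinates(W)
coords_new_basis = pluecker_coordinates(change_basis(A, W))
scale = det3(A)  # every minor is multiplied by det(A)
```

The new coordinate vector is the old one multiplied by det(A), so both represent the same point of projective space, and the number of coordinates is the binomial coefficient of 2N over 3 (here 4).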

Given a collection of $N$ frames $F_1,\dots,F_N$, we can form the $3 \times 2N$ matrix
$$F = [F_1\ \cdots\ F_N] = [a_1, b_1, \dots, a_N, b_N].$$

We consider the rational map $\rho : \mathcal{F}^N \dashrightarrow \mathrm{Gr}(3,2N)$ that takes a collection of frames $F_1,\dots,F_N$ to the Plücker coordinates of $F$. A rational map is a map that is defined almost everywhere in the domain. In this case, $\rho$ is not defined if the rank of $F$ is at most 2, since then the rows of $F$ do not determine a three-dimensional subspace of $\mathbb{R}^{2N}$.

Plücker coordinates for common lines

As described above, given a pair of frames $F_i, F_j$ for $i < j$, we can compute the associated common line pair $[v_{ij} : v_{ji}]$ by choosing any vector $\Lambda_{ij}$ in $\iota_i(P_i) \cap \iota_j(P_j)$. In particular, we can choose $\Lambda_{ij} = (a_i \times b_i) \times (a_j \times b_j)$, where $\times$ is the standard vector cross product on $\mathbb{R}^3$. Then the following identity from vector algebra, called the vector quadruple product, expresses $\Lambda_{ij}$ in terms of the frames $F_i$ and $F_j$:
$$\det[a_j, b_j, a_i]\, b_i - \det[a_j, b_j, b_i]\, a_i = (a_i \times b_i) \times (a_j \times b_j) = \det[a_i, b_i, b_j]\, a_j - \det[a_i, b_i, a_j]\, b_j.$$
Comparing this with Equation 7, we see that the coordinates of the common line pair $[v_{ij} : v_{ji}]$ are given by determinants of certain $3 \times 3$ matrices. Explicitly, we have
$$v_{ij} = \begin{pmatrix} -\det[a_j, b_j, b_i] \\ \det[a_j, b_j, a_i] \end{pmatrix}, \qquad v_{ji} = \begin{pmatrix} \det[a_i, b_i, b_j] \\ -\det[a_i, b_i, a_j] \end{pmatrix}.$$
Observe that these $3 \times 3$ determinants are certain $3 \times 3$ minors of the matrix $F$. The minors that appear are those that belong to only two frames $F_i$ and $F_j$, in other words, minors that choose any three of $\{a_i, b_i, a_j, b_j\}$ for columns. The minors not appearing as coordinates of a common line pair are those that choose three columns from three distinct frames
$$\det[\{a_i, b_i\}, \{a_j, b_j\}, \{a_k, b_k\}]. \tag{8}$$
Thus, the coordinates on the Grassmannian $\mathrm{Gr}(3,2N)$ are the common line coordinates, together with these ‘bad’ minors (Equation 8). If we consider the projection where we discard the ‘bad’ minors, we obtain the rational map
$$\mathrm{Gr}(3,2N) \dashrightarrow \prod_{1 \leq i < j \leq N} \mathbb{P}(P_i \times P_j) = (\mathbb{P}^3)^{\binom{N}{2}}.$$
Explicitly, for $i < j$, this projection maps
$$[\cdots : -\det[a_j, b_j, b_i] : \det[a_j, b_j, a_i] : \det[a_i, b_i, b_j] : -\det[a_i, b_i, a_j] : \cdots] \mapsto [v_{ij} : v_{ji}].$$

Note that this rational map is not defined whenever the four $3 \times 3$ minors appearing in the common line pair $[v_{ij} : v_{ji}]$ simultaneously vanish. This cannot happen with generic frames, so this projection is an actual map when restricted to $\rho(\mathcal{G}) \subset \mathrm{Gr}(3,2N)$. The image of this map is the set of valid common lines $C_N$, and the map is in fact a bijection.
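The determinant formulas for the common line coordinates can be checked on a concrete pair of frames. The following Python sketch is an illustration only: the two frames are made up, the helper names are hypothetical, and the signs of the coordinates are the ones obtained by comparing the vector quadruple product identity with Equation 7.

```python
import math

def cross(u, v):
    """Standard cross product on R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(a, b, c):
    """det[a, b, c] for column vectors, computed as (a x b) . c."""
    n = cross(a, b)
    return n[0] * c[0] + n[1] * c[1] + n[2] * c[2]

def embed(frame, v):
    """The embedding iota(x, y) = x a + y b of Equation 2."""
    a, b = frame
    return tuple(v[0] * a[k] + v[1] * b[k] for k in range(3))

def common_line_pair(frame_i, frame_j):
    """Representatives (v_ij, v_ji) for Lambda_ij = (a_i x b_i) x (a_j x b_j)."""
    (a_i, b_i), (a_j, b_j) = frame_i, frame_j
    v_ij = (-det3(a_j, b_j, b_i), det3(a_j, b_j, a_i))
    v_ji = (det3(a_i, b_i, b_j), -det3(a_i, b_i, a_j))
    return v_ij, v_ji

# Hypothetical frames: the standard one, and one rotated about the z-axis, then the x-axis.
t, u = 0.7, 0.4
F1 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
F2 = ((math.cos(u), math.sin(u) * math.cos(t), math.sin(u) * math.sin(t)),
      (-math.sin(u), math.cos(u) * math.cos(t), math.cos(u) * math.sin(t)))

v12, v21 = common_line_pair(F1, F2)
lam_from_1 = embed(F1, v12)                      # iota_1(v_12)
lam_from_2 = embed(F2, v21)                      # iota_2(v_21)
lam_direct = cross(cross(*F1), cross(*F2))       # (a_1 x b_1) x (a_2 x b_2)
```

Both embeddings land on the same vector, which is the realizing condition for the pair, and the two planar representatives have equal norm because the embeddings are isometric.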

Theorem 2.

The restriction $\pi$ of the projection $\mathrm{Gr}(3,2N) \dashrightarrow (\mathbb{P}^3)^{\binom{N}{2}}$ to $\rho(\mathcal{G}) \subset \mathrm{Gr}(3,2N)$ is a bijection onto $C_N$.

For a proof, see Appendix 1 (Proofs).

As we discussed above, the point $\rho(F) \in \mathrm{Gr}(3,2N)$ only determines the row space of the matrix $F = [F_1, \dots, F_N]$. A different basis for this row space is given by multiplying $F$ on the left by a matrix $A$ in $O(3)$ or, equivalently, by the following action
$$A \cdot (F_1, \dots, F_N) = (AF_1, \dots, AF_N).$$

This is the diagonal action of $O(3)$ on the space of frames $\mathcal{F}^N$. We observe that this $O(3)$ action is the only ambiguity between the space of frames and the Plücker embedding of these frames in $\mathrm{Gr}(3,2N)$. Since common lines data corresponds to points in $\mathrm{Gr}(3,2N)$, we have recovered the fact that common lines data only determines its realizing frames up to $O(3)$.

Corollary 1.

The dimension of $C_N$ as a semi-algebraic set is $3N-3$.

For a proof, see Appendix 1 (Proofs).

Conclusions

The polynomial equations defining $C_N$ provide the intrinsic definition for valid common lines we set out to find. We briefly discuss potential applications.

Future work

Thinking of valid common lines data in geometric terms provides some insight about inconsistencies during reconstruction due to noise. The space of all common lines data has dimension $N(N-1)$, and since valid common lines are in bijection with the space of $N$ frames up to global rotation, we have that the dimension of $C_N$ is $3N-3$. Since $C_N$ is a space of small dimension in the ambient space, it follows that the reconstruction inconsistencies described in the ‘Noise and valid common lines data’ section are guaranteed to occur. In effect, the most basic version of the angular reconstitution algorithm reconstructs the microscope orientations $F_1,\dots,F_N$ using only $2N-3$ out of the $\binom{N}{2}$ common line pairs and arbitrarily ignores inconsistencies within these pairs. The set $C_N$ is precisely the set of common lines data for which this algorithm will produce the same output regardless of which common line pairs are used, but as described above, we do not expect experimental data to lie in $C_N$.
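The dimension counts in the preceding paragraph are easy to tabulate. The short Python sketch below only restates the formulas from the text; the function names are illustrative.

```python
from math import comb

def ambient_dimension(n):
    """Dimension of the space of all common lines data: 2 * C(n, 2) = n (n - 1)."""
    return 2 * comb(n, 2)

def valid_dimension(n):
    """Dimension of the set C_N of valid common lines data: 3 n - 3."""
    return 3 * n - 3

def pairs_used_by_basic_reconstitution(n):
    """Number of common line pairs used by the most basic angular reconstitution."""
    return 2 * n - 3

# (N, ambient dimension, dimension of C_N, pairs used, pairs available)
table = [(n, ambient_dimension(n), valid_dimension(n),
          pairs_used_by_basic_reconstitution(n), comb(n, 2))
         for n in (3, 4, 10, 100)]
```

For N = 3 both dimensions equal 6, consistent with the triangle inequalities alone being sufficient in that case. For N = 10 the valid data form a 27-dimensional set inside a 90-dimensional space, and the basic algorithm uses 17 of the 45 available pairs.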

Developing common lines reconstruction algorithms that are robust to noise is an active area of research. We are interested in exploring a geometric approach to noise reduction, which we briefly describe. In principle, noisy experimental data $l_{ij}$, $l_{ji}$, $\psi_{ij}$ that lies outside of $C_N$ ‘came from’ some noiseless valid common lines data in $C_N$. Since the set $C_N$ is the set of solutions of a system of polynomials, it is theoretically possible to project noisy common lines to the set of noiseless common lines $C_N$ via constrained polynomial optimization. We hope to develop effective projection algorithms along these lines to reduce the impact of noise in reconstruction.

In the ‘Results and discussion’ section, we obtain defining polynomials for valid common lines data by appealing to spherical geometry. It is also possible to interpret valid common lines in terms of Gram matrices, as in [8] for the case $N=3$. With this interpretation, for $N>3$, one can attempt to find defining polynomials by eliminating certain variables from the defining equations of low rank Gram matrices. The algebraic set corresponding to this elimination is the quotient of $C_N$ by the natural action of $SO(2)^N$ in each image plane. We have not yet been able to solve this elimination problem using direct approaches available in the computational algebra software Macaulay2 [9]. We are interested in further studying these related defining polynomials, since they suggest the possibility of applying matrix completion techniques to the denoising projection described above.

Defining polynomials

In the ‘Necessary and sufficient conditions’ section, we derived the defining equations for C N in terms of spherical geometry. For the benefit of the reader, we now explicitly describe these conditions as multi-homogeneous polynomials in the variables ([v ij :v ji ]).

Suppose $([v_{ij}:v_{ji}]) \in (\mathbb{P}^3)^{\binom{N}{2}}$ is fixed, and that $\|v_{ij}\|^2=\|v_{ji}\|^2$ for all $1\le i<j\le N$. The spherical triangle inequalities for the common line pairs $[v_{ij}:v_{ji}]$, $[v_{ik}:v_{ki}]$ and $[v_{jk}:v_{kj}]$ described in Equation 4 are equivalent ([8], Equation 11) to
$$\|v_{ij}\|^2\|v_{ik}\|^2\|v_{jk}\|^2-\|v_{jk}\|^2(v_{ij}\cdot v_{ik})^2-\|v_{ik}\|^2(v_{ji}\cdot v_{jk})^2-\|v_{ij}\|^2(v_{ki}\cdot v_{kj})^2+2(v_{ij}\cdot v_{ik})(v_{ji}\cdot v_{jk})(v_{ki}\cdot v_{kj})>0.$$
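As a sanity check on the inequality above, the following sketch builds three illustrative frames, computes their common line pairs, and evaluates the left-hand side. It assumes each frame is an orthonormal pair $(x_i, y_i)$ spanning the image plane, and that the common line of planes $i$ and $j$ is taken along the cross product of their normals; these conventions, the example frames, and all function names are assumptions made for illustration. Under these assumptions the polynomial is the Gram determinant of the three embedded common lines, so it should agree with the squared determinant of those vectors.

```python
import math

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def frame(a, b):
    """Orthonormal pair (x, y) spanning the plane of a and b (Gram-Schmidt)."""
    na = math.sqrt(dot(a, a))
    x = tuple(c / na for c in a)
    t = dot(b, x)
    r = tuple(bc - t * xc for bc, xc in zip(b, x))
    nr = math.sqrt(dot(r, r))
    return x, tuple(c / nr for c in r)

def common_line_pair(Fi, Fj):
    """Return (v_ij, v_ji, Lambda_ij): the common line in each plane's coordinates."""
    lam = cross(cross(*Fi), cross(*Fj))   # lies in both planes
    v_ij = (dot(lam, Fi[0]), dot(lam, Fi[1]))
    v_ji = (dot(lam, Fj[0]), dot(lam, Fj[1]))
    return v_ij, v_ji, lam

def triangle_polynomial(v_ij, v_ji, v_ik, v_ki, v_jk, v_kj):
    """Left-hand side of the spherical triangle inequality for the triple (i, j, k)."""
    n_ij, n_ik, n_jk = dot(v_ij, v_ij), dot(v_ik, v_ik), dot(v_jk, v_jk)
    p, q, r = dot(v_ij, v_ik), dot(v_ji, v_jk), dot(v_ki, v_kj)
    return (n_ij * n_ik * n_jk - n_jk * p * p - n_ik * q * q
            - n_ij * r * r + 2 * p * q * r)

# Three illustrative frames in generic position.
F1 = frame((1, 0, 0), (0, 1, 0))
F2 = frame((1, 1, 0), (0, 1, 2))
F3 = frame((0, 1, 1), (2, 0, 1))

v12, v21, lam12 = common_line_pair(F1, F2)
v13, v31, lam13 = common_line_pair(F1, F3)
v23, v32, lam23 = common_line_pair(F2, F3)

poly = triangle_polynomial(v12, v21, v13, v31, v23, v32)
gram_det = dot(lam12, cross(lam13, lam23)) ** 2
print(poly, gram_det)
```

The value is strictly positive exactly when the three embedded common lines are linearly independent, which matches the non-degeneracy condition used in the proof of Proposition 1.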
To express the spherical law of cosines compatibilities $L_{ijk,ijm}$ (Equation 6), set
$$\begin{aligned}
a &= \|v_{ij}\|^2(v_{ki}\cdot v_{kj})-(v_{ij}\cdot v_{ik})(v_{ji}\cdot v_{jk}), & b &= \|v_{ij}\|^2(v_{mi}\cdot v_{mj})-(v_{ij}\cdot v_{im})(v_{ji}\cdot v_{jm}),\\
d_1 &= \det[v_{ij},v_{im}]\,\det[v_{ji},v_{jm}], & d_2 &= \det[v_{ij},v_{ik}]\,\det[v_{ji},v_{jk}].
\end{aligned}$$
Then, $L_{ijk,ijm}=0$ if and only if
$$a^2d_1^2-2d_1d_2ab+b^2d_2^2=0.$$
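The compatibility polynomial can be evaluated in the same way. The sketch below uses four illustrative frames with $(i,j,k,m)=(1,2,3,4)$ and the same assumed conventions as before (orthonormal frames, common line along the cross product of the plane normals); none of the names or example values come from the paper. It also rotates one in-plane vector by a hypothetical 0.3 radians to imitate noisy data.

```python
import math

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def det2(u, w):
    return u[0] * w[1] - u[1] * w[0]

def frame(a, b):
    """Orthonormal pair (x, y) spanning the plane of a and b (Gram-Schmidt)."""
    na = math.sqrt(dot(a, a))
    x = tuple(c / na for c in a)
    t = dot(b, x)
    r = tuple(bc - t * xc for bc, xc in zip(b, x))
    nr = math.sqrt(dot(r, r))
    return x, tuple(c / nr for c in r)

def common_line_pair(Fi, Fj):
    """Return (v_ij, v_ji): the common line in each plane's coordinates."""
    lam = cross(cross(*Fi), cross(*Fj))
    return ((dot(lam, Fi[0]), dot(lam, Fi[1])),
            (dot(lam, Fj[0]), dot(lam, Fj[1])))

def compatibility(v_ij, v_ji, v_ik, v_ki, v_jk, v_kj, v_im, v_mi, v_jm, v_mj):
    """Evaluate a^2 d1^2 - 2 d1 d2 a b + b^2 d2^2 from the text."""
    n = dot(v_ij, v_ij)
    a = n * dot(v_ki, v_kj) - dot(v_ij, v_ik) * dot(v_ji, v_jk)
    b = n * dot(v_mi, v_mj) - dot(v_ij, v_im) * dot(v_ji, v_jm)
    d1 = det2(v_ij, v_im) * det2(v_ji, v_jm)
    d2 = det2(v_ij, v_ik) * det2(v_ji, v_jk)
    return a * a * d1 * d1 - 2 * d1 * d2 * a * b + b * b * d2 * d2

def rot2(v, t):
    """Rotate a 2D vector by angle t (used here to imitate noise)."""
    return (math.cos(t) * v[0] - math.sin(t) * v[1],
            math.sin(t) * v[0] + math.cos(t) * v[1])

F1 = frame((1, 0, 0), (0, 1, 0))
F2 = frame((1, 1, 0), (0, 1, 2))
F3 = frame((0, 1, 1), (2, 0, 1))
F4 = frame((1, 0, 1), (0, 1, 0))

v12, v21 = common_line_pair(F1, F2)
v13, v31 = common_line_pair(F1, F3)
v23, v32 = common_line_pair(F2, F3)
v14, v41 = common_line_pair(F1, F4)
v24, v42 = common_line_pair(F2, F4)

valid = compatibility(v12, v21, v13, v31, v23, v32, v14, v41, v24, v42)
noisy = compatibility(v12, v21, v13, v31, v23, v32, v14, v41, v24, rot2(v42, 0.3))
print(valid, noisy)
```

Note that the left-hand side factors as $(a d_1 - b d_2)^2$, so it is never negative; it vanishes up to rounding for the data generated from actual frames and becomes strictly positive for the perturbed data in this example.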
Thus, the set $C_N$ is defined as a semi-algebraic subset of $(\mathbb{P}^3)^{\binom{N}{2}}$ by the following equations and inequalities:

  1. The $\binom{N}{2}$ equations $\|v_{ij}\|^2=\|v_{ji}\|^2$, see the ‘Coordinates for common lines’ section.

  2. For each of the $\binom{N}{3}$ triples $(i,j,k)$, the spherical triangle inequality, see Proposition 1.

  3. For each of the $6\binom{N}{4}$ ways to choose two triples of distinct indices $(i,j,k)$ and $(i,j,m)$, the spherical law of cosines compatibility, see Lemma 1.
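The three counts in this list can be confirmed by direct enumeration. The sketch below is illustrative only; it treats a pair of triples $(i,j,k)$, $(i,j,m)$ as a choice of four distinct indices together with the shared pair $\{i,j\}$, which is one reading of the count $6\binom{N}{4}$.

```python
from itertools import combinations
from math import comb

def compatibility_pairs(n):
    """Enumerate pairs of triples (i, j, k), (i, j, m) sharing the pair {i, j}."""
    out = []
    for four in combinations(range(1, n + 1), 4):
        for i, j in combinations(four, 2):
            k, m = [t for t in four if t not in (i, j)]
            out.append(((i, j, k), (i, j, m)))
    return out

def constraint_counts(n):
    """Numbers of norm equations, triangle inequalities and compatibilities."""
    indices = range(1, n + 1)
    return (len(list(combinations(indices, 2))),
            len(list(combinations(indices, 3))),
            len(compatibility_pairs(n)))

for n in (3, 4, 5, 8):
    print(n, constraint_counts(n))
```

For $N=3$ there are no compatibility conditions at all, and for $N=5$ there are 10 equations, 10 inequalities and 30 compatibilities.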

     

Appendix 1 Proofs

Proof of Proposition 1.

Fix representatives (v ij ,v ji ), (v ik ,v ki ), and (v jk ,v kj ). Since the lengths α ijk , β ijk , and γ ijk strictly satisfy the triangle inequalities, there is a non-degenerate spherical triangle with these edge lengths. Denote the vertex of this triangle opposite the edge of length α ijk by Λ jk , the vertex opposite the edge β ijk by Λ ik and the vertex opposite the edge γ ijk by Λ ij . Since this triangle is non-degenerate, we know that Λ ij ,Λ ik , and Λ jk are linearly independent. Thus, we have embeddings ι i ,ι j ,ι k given by
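$$\begin{array}{lll}
\iota_i : P_i \to \mathbb{R}^3, & \iota_j : P_j \to \mathbb{R}^3, & \iota_k : P_k \to \mathbb{R}^3,\\
v_{ij} \mapsto \Lambda_{ij}, & v_{ji} \mapsto \Lambda_{ij}, & v_{ki} \mapsto \Lambda_{ik},\\
v_{ik} \mapsto \Lambda_{ik}, & v_{jk} \mapsto \Lambda_{jk}, & v_{kj} \mapsto \Lambda_{jk}.
\end{array}$$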
Observe that these embeddings are isometric by construction, and so F i =(ι i (x),ι i (y)), F j =(ι j (x),ι j (y)), and F k =(ι k (x),ι k (y)) are frames. Since Λ ij ,Λ ik , and Λ jk are vertices of a non-degenerate spherical triangle, these three frames are in generic position. Moreover, by construction we have
$$\iota_i(v_{ij})=\iota_j(v_{ji}),\qquad \iota_i(v_{ik})=\iota_k(v_{ki}),\qquad \iota_j(v_{jk})=\iota_k(v_{kj}),$$

and so F i ,F j , and F k realize the required common line pairs.

Now, suppose $G_i$, $G_j$, and $G_k$ also realize the common line pairs $[v_{ij}:v_{ji}]$, $[v_{ik}:v_{ki}]$ and $[v_{jk}:v_{kj}]$. Let $\iota_i^G$, $\iota_j^G$, and $\iota_k^G$ be the embeddings corresponding to these frames and set $\Lambda_{ij}^G=\iota_i^G(v_{ij})$, $\Lambda_{ik}^G=\iota_i^G(v_{ik})$, and $\Lambda_{jk}^G=\iota_j^G(v_{jk})$. Since $(i,j,k)$ strictly satisfies the triangle inequalities, these three vectors are linearly independent and thus define a spherical triangle with edge lengths $(\alpha_{ijk},\beta_{ijk},\gamma_{ijk})$. This triangle is congruent to the triangle with vertices $\Lambda_{ij}$, $\Lambda_{ik}$, and $\Lambda_{jk}$ constructed above, and so there exists an element in $O(3)$ that maps $\Lambda_{ij}^G\mapsto\Lambda_{ij}$, $\Lambda_{ik}^G\mapsto\Lambda_{ik}$, and $\Lambda_{jk}^G\mapsto\Lambda_{jk}$ and thus maps $(G_i,G_j,G_k)\mapsto(F_i,F_j,F_k)$. □

Proof of Lemma 1.

Fix unit length representatives $(v_{ij},v_{ji})$, $(v_{ik},v_{ki})$, $(v_{jk},v_{kj})$, $(v_{im},v_{mi})$, and $(v_{jm},v_{mj})$. Let $\iota_i^F$, $\iota_j^F$, and $\iota_k^F$ be the embeddings corresponding to $F_i$, $F_j$, and $F_k$ and let $\iota_i^G$, $\iota_j^G$, and $\iota_m^G$ be the embeddings corresponding to $G_i$, $G_j$, and $G_m$. Write
$$\begin{aligned}
\Lambda_{ij}^F&=\iota_i^F(v_{ij}), & \Lambda_{ik}^F&=\iota_i^F(v_{ik}), & \Lambda_{jk}^F&=\iota_j^F(v_{jk}),\\
\Lambda_{ij}^G&=\iota_i^G(v_{ij}), & \Lambda_{ik}^G&=\iota_i^G(v_{ik}), & \Lambda_{jk}^G&=\iota_j^G(v_{jk}).
\end{aligned}$$
We wish to show that the map $A:\mathbb{R}^3\to\mathbb{R}^3$ defined by
$$\Lambda_{ij}^F\mapsto\Lambda_{ij}^G,\qquad \Lambda_{ik}^F\mapsto\Lambda_{ik}^G,\qquad \Lambda_{jk}^F\mapsto\Lambda_{jk}^G,$$
which sends $F_i\mapsto G_i$ and $F_j\mapsto G_j$, is an isometry in $O(3)$. Since $(i,j,k)$ is realized by $F_i$, $F_j$, and $F_k$, it satisfies the spherical triangle inequality, and thus the vectors $\Lambda_{ij}^F$, $\Lambda_{ik}^F$, $\Lambda_{jk}^F$ are linearly independent and the map $A$ is uniquely determined. We have that
$$\Lambda_{ij}^F\cdot\Lambda_{ij}^F=\Lambda_{ij}^G\cdot\Lambda_{ij}^G,\qquad \Lambda_{ik}^F\cdot\Lambda_{ik}^F=\Lambda_{ik}^G\cdot\Lambda_{ik}^G,\qquad \Lambda_{jk}^F\cdot\Lambda_{jk}^F=\Lambda_{jk}^G\cdot\Lambda_{jk}^G,$$
and further that
$$\begin{aligned}
\Lambda_{ij}^F\cdot\Lambda_{ik}^F&=\iota_i^F(v_{ij})\cdot\iota_i^F(v_{ik})=v_{ij}\cdot v_{ik}=\iota_i^G(v_{ij})\cdot\iota_i^G(v_{ik})=\Lambda_{ij}^G\cdot\Lambda_{ik}^G,\\
\Lambda_{ij}^F\cdot\Lambda_{jk}^F&=\iota_j^F(v_{ji})\cdot\iota_j^F(v_{jk})=v_{ji}\cdot v_{jk}=\iota_j^G(v_{ji})\cdot\iota_j^G(v_{jk})=\Lambda_{ij}^G\cdot\Lambda_{jk}^G.
\end{aligned}$$

It follows that we only need to show that $\Lambda_{ik}^F\cdot\Lambda_{jk}^F=\Lambda_{ik}^G\cdot\Lambda_{jk}^G$ to conclude that $A$ is an isometry.

We first discuss the relative orientation of the common line pairs. The product det [ v ij ,v ik ]det [ v ij ,v im ] is positive if the shortest rotation from v ij to v ik in P i is in the same direction as the shortest rotation from v ij to v im . In this case, we say v ik and v im lie on the same side of v ij . This product is negative if the shortest rotation from v ij to v ik is in the opposite direction of the shortest rotation of v ij to v im , and in this case, we say v ik and v im lie on opposite sides of v ij . Similarly, the sign of the product det [v ji ,v jk ]det [v ji ,v jm ] determines if v jk and v jm lie on the same, or opposite, sides of v ji in P j .
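The sign test in this paragraph is simple to state in code. The sketch below is illustrative; the function names and the example vectors are assumptions, and `det2(u, w)` stands for the determinant $\det[u,w]$ of the $2\times 2$ matrix with columns $u$ and $w$.

```python
def det2(u, w):
    """Determinant of the 2x2 matrix with columns u and w."""
    return u[0] * w[1] - u[1] * w[0]

def same_side(v_ij, v_ik, v_im):
    """True if v_ik and v_im lie on the same side of v_ij (positive product)."""
    return det2(v_ij, v_ik) * det2(v_ij, v_im) > 0

v_ij = (1.0, 0.0)
# Both vectors are reached by a counterclockwise rotation from v_ij.
print(same_side(v_ij, (0.0, 1.0), (1.0, 1.0)))
# One vector is counterclockwise from v_ij, the other clockwise.
print(same_side(v_ij, (0.0, 1.0), (1.0, -1.0)))
```

The product is zero when one of the vectors is parallel to $v_{ij}$; this sketch reports that degenerate case as not lying on the same side.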

Since we consider isometric embeddings of $P_i$ and $P_j$, we can make the same statements for the embedded versions of these vectors: $\Lambda_{ik}^G$ and $\Lambda_{im}^G=\iota_i^G(v_{im})$ lie on the same side of $\Lambda_{ij}^G$ in the plane $\iota_i^G(P_i)$ if $\det[v_{ij},v_{ik}]\det[v_{ij},v_{im}]$ is positive, and these vectors lie on opposite sides of $\Lambda_{ij}^G$ if this product is negative. We can similarly say whether the vectors $\Lambda_{jk}^G$ and $\Lambda_{jm}^G=\iota_j^G(v_{jm})$ lie on the same or opposite sides of $\Lambda_{ij}^G$ in the plane $\iota_j^G(P_j)$.

Next, consider the spherical triangle $T$, with vertices $\Lambda_{ij}^G$, $\Lambda_{ik}^G$, and $\Lambda_{jk}^G$, and the triangle $T'$, with vertices $\Lambda_{ij}^G$, $\Lambda_{im}^G$, and $\Lambda_{jm}^G$. The triangles $T$ and $T'$ share the vertex $\Lambda_{ij}^G$, and we write $Z$ for the angle of $T$ at this vertex and $Z'$ for the angle of $T'$ at this vertex.

Suppose first that $\Lambda_{ik}^G$ and $\Lambda_{im}^G$ both lie on the same side of $\Lambda_{ij}^G$ in $\iota_i^G(P_i)$, and $\Lambda_{jk}^G$ and $\Lambda_{jm}^G$ both lie on the same side of $\Lambda_{ij}^G$ in $\iota_j^G(P_j)$. In this case, the triangles $T$ and $T'$ sit inside each other, so $Z$ and $Z'$ are the same (cf. Figure 10a). On the other hand, if $\Lambda_{ik}^G$ and $\Lambda_{im}^G$ lie on opposite sides of $\Lambda_{ij}^G$, and $\Lambda_{jk}^G$ and $\Lambda_{jm}^G$ also lie on opposite sides of $\Lambda_{ij}^G$, then the triangle $T$ lies opposite $T'$ across $\Lambda_{ij}^G$, so the vertical angles $Z$ and $Z'$ are equal. These two cases occur if and only if the quantity
$$\sigma=\operatorname{sign}\big(\det[v_{ij},v_{ik}]\,\det[v_{ij},v_{im}]\,\det[v_{ji},v_{jk}]\,\det[v_{ji},v_{jm}]\big)$$

is $+1$. Similarly, $\sigma=-1$ if and only if one of the pairs $\Lambda_{ik}^G$, $\Lambda_{im}^G$ or $\Lambda_{jk}^G$, $\Lambda_{jm}^G$ lies on the same side of $\Lambda_{ij}^G$, while the other pair lies on opposite sides of $\Lambda_{ij}^G$. In this case the triangles $T$ and $T'$ sit side by side, so the angles $Z$ and $Z'$ are supplementary (cf. Figure 10b).

It follows that $\cos Z=\sigma\cos Z'$, and so applying the spherical law of cosines in $T$ yields
$$\frac{\Lambda_{ik}^G\cdot\Lambda_{jk}^G-(v_{ij}\cdot v_{ik})(v_{ji}\cdot v_{jk})}{\det[v_{ij},v_{ik}]\,\det[v_{ji},v_{jk}]}=\sigma\cos Z'.$$
On the other hand, the law of cosines in $T'$ gives
$$\cos Z'=\frac{v_{mi}\cdot v_{mj}-(v_{ij}\cdot v_{im})(v_{ji}\cdot v_{jm})}{\det[v_{ij},v_{im}]\,\det[v_{ji},v_{jm}]},$$
and finally, since $L_{ijk,ijm}=0$ we have
$$\sigma\cos Z'=\frac{v_{ki}\cdot v_{kj}-(v_{ij}\cdot v_{ik})(v_{ji}\cdot v_{jk})}{\det[v_{ij},v_{ik}]\,\det[v_{ji},v_{jk}]}.$$

Thus, we have that $\Lambda_{ik}^G\cdot\Lambda_{jk}^G=v_{ki}\cdot v_{kj}=\Lambda_{ik}^F\cdot\Lambda_{jk}^F$, and so $A$ is an isometry, as desired. □

Proof of Theorem 1.

By Proposition 1, we first obtain realizing frames $F_1$, $F_2$, and $F_3$ for the triple $(1,2,3)$. For all remaining indices $i$, we construct realizing frames $G_1$, $G_2$, and $G_i$ from the triple $(1,2,i)$. By Lemma 1, there exists a unique map $A_i\in O(3)$ that maps $F_1\mapsto G_1$ and $F_2\mapsto G_2$. If $\det A_i=-1$, we can replace the realizing frames $G_1$, $G_2$, and $G_i$ by $L(G_1)$, $L(G_2)$, and $L(G_i)$, respectively, where $L$ is an arbitrary isometry in $O(3)$ with $\det L=-1$, and replace $A_i$ by $LA_i$. It follows that we can assume $A_i$ has $\det A_i=+1$. We set $F_i=A_i^{-1}G_i$.

Now we need to check that the $F_i$ are realizing frames. We will write $\iota_i^F$, $\iota_j^F$, and $\iota_k^F$ for the embeddings determined by $F_i$, $F_j$, and $F_k$ and similarly for other sets of reconstructed frames. Thus, we need to verify that $\iota_i^F(v_{ij})=\iota_j^F(v_{ji})$ for all pairs $i,j$. To this end, suppose that $F_i=A_i^{-1}G_i$ was reconstructed from $G_1$, $G_2$, and $G_i$ and $F_j=A_j^{-1}D_j$ was reconstructed from $D_1$, $D_2$, and $D_j$. The triple $(1,i,j)$ also strictly satisfies the triangle inequality, so we have generic realizing frames $H_1$, $H_i$, and $H_j$. By Lemma 1, we have isometries $B_i:(G_1,G_i)\mapsto(H_1,H_i)$ and $B_j:(D_1,D_j)\mapsto(H_1,H_j)$. These maps and frames fit into the following diagram:

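First, note that $\det B_i=\pm1$ and $\det B_j=\pm1$, and in fact, we claim that $\det B_i=\det B_j$. To see why, write $\Lambda_{12}^G=\iota_1^G(v_{12})=\iota_2^G(v_{21})$ and similarly for all other common line pairs. Then, we have
$$\begin{aligned}
\det B_i\,\operatorname{sign}\det[\Lambda_{12}^G,\Lambda_{1i}^G,\Lambda_{2i}^G]&=\operatorname{sign}\det[\Lambda_{12}^H,\Lambda_{1i}^H,\Lambda_{2i}^H],\\
\det B_j\,\operatorname{sign}\det[\Lambda_{12}^D,\Lambda_{1j}^D,\Lambda_{2j}^D]&=\operatorname{sign}\det[\Lambda_{12}^H,\Lambda_{1j}^H,\Lambda_{2j}^H].
\end{aligned}$$
Further, if $\sigma=\operatorname{sign}(\det[v_{12},v_{1i}]\det[v_{12},v_{1j}]\det[v_{21},v_{2i}]\det[v_{21},v_{2j}])$, we have
$$\begin{aligned}
\operatorname{sign}\det[\Lambda_{12}^H,\Lambda_{1i}^H,\Lambda_{2i}^H]&=\sigma\,\operatorname{sign}\det[\Lambda_{12}^H,\Lambda_{1j}^H,\Lambda_{2j}^H],\\
\operatorname{sign}\det[\Lambda_{12}^G,\Lambda_{1i}^G,\Lambda_{2i}^G]&=\sigma\,\operatorname{sign}\det[\Lambda_{12}^G,\Lambda_{1j}^G,\Lambda_{2j}^G],\\
\operatorname{sign}\det[\Lambda_{12}^D,\Lambda_{1i}^D,\Lambda_{2i}^D]&=\sigma\,\operatorname{sign}\det[\Lambda_{12}^D,\Lambda_{1j}^D,\Lambda_{2j}^D].
\end{aligned}$$
On the other hand, $\det(A_iA_j^{-1})=1$, so
$$\operatorname{sign}\det[\Lambda_{12}^G,\Lambda_{1i}^G,\Lambda_{2i}^G]=\sigma\,\operatorname{sign}\det[\Lambda_{12}^D,\Lambda_{1j}^D,\Lambda_{2j}^D],$$
and thus $\det B_i=\det B_j$. Note that the diagram above commutes, since both the top path and bottom path are morphisms in $O(3)$ of the same determinant that send $F_1\mapsto H_1$. Then, since $H_1$, $H_i$, and $H_j$ realize the common line pair $(v_{ij},v_{ji})$, we have
$$\iota_i^F(v_{ij})=A_i^{-1}\iota_i^G(v_{ij})=A_i^{-1}B_i^{-1}\iota_i^H(v_{ij})=A_i^{-1}B_i^{-1}\iota_j^H(v_{ji})=A_j^{-1}B_j^{-1}\iota_j^H(v_{ji})=A_j^{-1}\iota_j^D(v_{ji})=\iota_j^F(v_{ji})$$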

and thus F1,…,F N realize the common lines data ([v ij :v ji ]).

Finally, suppose that $F'_1,\ldots,F'_N$ is another collection of realizing frames for $([v_{ij}:v_{ji}])$. Fix a triple $(i,j,k)$ and observe that since both $F_i$, $F_j$, $F_k$ and $F'_i$, $F'_j$, $F'_k$ are realizing frames for $(i,j,k)$, by Proposition 1 there is an isometry $R_{ijk}$ that sends $(F_i,F_j,F_k)\mapsto(F'_i,F'_j,F'_k)$. Note that for any $i,j,k,m$, the two isometries $R_{ijk}$ and $R_{ijm}$ are equal since they agree on $F_i$ and $F_j$. This implies that $R_{ijk}F_m=F'_m$ for all $m$, and thus there is a single isometry sending $(F_1,\ldots,F_N)\mapsto(F'_1,\ldots,F'_N)$. □

Proof of Theorem 2.

First, observe that the minors corresponding to a common line pair $[v_{ij}:v_{ji}]$ are non-zero for points in $\rho(\mathcal{G})$, since otherwise $F_i$ and $F_j$ would define the same plane. It follows that the rational projection $\mathrm{Gr}(3,2N)\dashrightarrow(\mathbb{P}^3)^{\binom{N}{2}}$ is defined everywhere on $\rho(\mathcal{G})$.

By definition, any valid common lines data $([v_{ij}:v_{ji}])\in C_N$ has some realizing frames $F_1,\ldots,F_N$, and so is the image $\pi(\rho(F))$ of the point $\rho(F)$; thus $\pi(\rho(\mathcal{G}))=C_N$. It only remains to verify that this projection is injective. This follows from Theorem 1. If $\pi(\rho(F))=\pi(\rho(G))$, then we know that the realizing frames $F$ and $G$ are related by an isometry in $O(3)$. But then the rows of the matrices $F$ and $G$ define the same linear subspace, and so $\rho(F)=\rho(G)$. □

Proof of Corollary 1.

We will compute dimensions with respect to a dense subset of $\mathcal{G}$ and a dense subset of $\rho(\mathcal{G})\times O(3)$. Let $V\subset\mathcal{G}$ be the complement of the […] semi-algebraically homeomorphic to an open subset of $SO(3)^N$ we have $\dim V=\dim\mathcal{G}=3N$, and thus $\dim\rho(V)=\dim\rho(\mathcal{G})=3N-3$. By Theorem 2, we have a semi-algebraic bijection between $\rho(\mathcal{G})$ and $C_N$, so we conclude that $\dim C_N=3N-3$. □

Declarations

Acknowledgments

The author is greatly thankful to Shamgar Gurevich, for initially suggesting the cryo-EM problem and for his continued support, as well as to Bernd Sturmfels, who suggested studying defining equations in cryo-EM, provided helpful guidance, and invited the author to the Mathematical Sciences Research Institute (MSRI) in Berkeley, California. Much of this work took place at MSRI during the spring of 2013, and the author greatly appreciates helpful technical discussions with Luke Oeding, Kristian Ranestad, Yoel Shkolnisky, Amit Singer, and Frank Sottile. The author’s visit to MSRI was supported by the National Science Foundation under grants DMS-0838210 and DMS-0932078.

Authors’ Affiliations

(1)
Department of Mathematics, University of Wisconsin

References

  1. Wang L, Sigworth FJ: Cryo-EM and single particles. Physiology 2006, 21(1):13–18. doi:10.1152/physiol.00045.2005
  2. Singer A, Shkolnisky Y: Three-dimensional structure determination from common lines in cryo-EM by eigenvectors and semidefinite programming. SIAM J. Imag. Sci 2011, 4(2):543–572. doi:10.1137/090767777
  3. Hadani R, Singer A: Representation theoretic patterns in three dimensional Cryo-Electron Microscopy I: the intrinsic reconstitution algorithm. Ann. Math 2011, 174(2):1219–1241. doi:10.4007/annals.2011.174.2.11. Accessed 14 January 2013
  4. Van Heel M: Angular reconstitution: a posteriori assignment of projection directions for 3D reconstruction. Ultramicroscopy 1987, 21(2):111–123. doi:10.1016/0304-3991(87)90078-7
  5. De Rosier D, Klug A: Reconstruction of three dimensional structures from electron micrographs. Nature 1968, 217(5124):130–134. doi:10.1038/217130a0
  6. Vainshtein B, Goncharov A: Determination of the spatial orientation of arbitrarily arranged identical particles of unknown structure from their projections. In Soviet Physics Doklady, vol. 31. American Institute of Physics, New York; 1986.
  7. Van Heel M, Orlova E, Harauz G, Stark H, Dube P, Zemlin F, Schatz M: Angular reconstitution in three-dimensional electron microscopy: historical and theoretical aspects. Scanning Microsc 1997, 11:195–210.
  8. Singer A, Coifman RR, Sigworth FJ, Chester DW, Shkolnisky Y: Detecting consistent common lines in cryo-EM by voting. J. Struct. Biol 2010, 169(3):312–322. doi:10.1016/j.jsb.2009.11.003. Accessed 25 February 2014
  9. Grayson DR, Stillman ME: Macaulay2, a software system for research in algebraic geometry. Available at http://www.math.uiuc.edu/Macaulay2/

Copyright

© Dynerman; licensee Springer. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.