Polyhedral Complex Derivation from Piecewise Trilinear Networks

Jin-Hwa Kim¹,²

¹NAVER AI Lab, ²AI Institute of Seoul National University

Figure: Stanford Bunny and Dragon meshes from our analytical extraction from InstantNGP (Müller et al., 2022).

Abstract

Recent advancements in visualizing deep neural networks provide insights into their structures and mesh extraction from Continuous Piecewise Affine (CPWA) functions. Meanwhile, developments in neural surface representation learning incorporate non-linear positional encoding, addressing issues like spectral bias; however, this poses challenges in applying mesh extraction techniques based on CPWA functions. Focusing on trilinear interpolating methods as positional encoding, we present theoretical insights and an analytical mesh extraction, showing the transformation of hypersurfaces to flat planes within the trilinear region under the eikonal constraint. Moreover, we introduce a method for approximating intersection points among three hypersurfaces, contributing to broader applications. We empirically validate correctness and parsimony through chamfer distance, efficiency, and angular distance, while examining the correlation between the eikonal loss and the planarity of the hypersurfaces.

TL;DR

Analytical mesh extraction from an SDF's neural surfaces with trilinear-interpolating positional encoding methods (e.g., InstantNGP), using the eikonal constraint, which enforces hypersurface planarity (Theorem 4.5), together with an \( \epsilon \)-error tolerance technique (Theorem 4.7).

Motivation

Previous works assume that meshes are extracted from Continuous Piecewise Affine (CPWA) functions, i.e., that the networks are piecewise linear, as ReLU neural networks are. However, once trilinear interpolation is used for positional encoding, the networks are no longer piecewise linear. Within a unit cubic grid, the trilinear interpolation function \( \tau \) maps a line to a curve, except for lines on the grid. Notably, the diagonal line \( \mathrm{t} = (t,t,t)^\intercal \), where \( t \in [0, 1] \), is mapped to a cubic Bézier curve (ref. Proposition A.2). Note that the intersection of two curved hypersurfaces is not a line; multiple curved intersection lines may even exist. Hence, Thales's theorem (Section 3.4) can no longer be employed to identify the point of intersection between a curved edge and a curved hypersurface. We address this problem using the planarity constraint discovered from the eikonal equation (Theorem 4.5) and an analytical approximation (Theorem 4.7).

The above interactive figure depicts a hypersurface \(\tau(\mathrm{x})=0\), where \(\tau(\mathrm{x}_0)=\tau(\mathrm{x}_7)=0\), with some curvature. The trilinear coefficients are \((0, 0.4 + \delta, 0.1 - \delta, 0.5, -0.5, -0.1, -0.4, 0)\) for the grid features \(P_{0 \dots 7}\), with a distortion of \(\delta = 1\).
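To make the cubic Bézier claim concrete, here is a small SymPy check; the corner-indexing convention (bit \(k\) of the corner index gives coordinate \(k\)) is an assumption for illustration, not taken from the paper's code:

import sympy as sp

t = sp.symbols('t')
P = sp.symbols('P0:8')  # symbolic grid features P_0..P_7

# Trilinear interpolation; we assume bit k of the corner index gives
# coordinate k, e.g., index 5 = 0b101 sits at (x, y, z) = (1, 0, 1).
def tau(x, y, z):
    total = 0
    for i in range(8):
        wx = x if i & 1 else 1 - x
        wy = y if i & 2 else 1 - y
        wz = z if i & 4 else 1 - z
        total += P[i] * wx * wy * wz
    return sp.expand(total)

# Along the diagonal (t, t, t) the interpolant is cubic in t; grouping by
# Bernstein bases gives a cubic Bezier curve with control values
# P0, (P1 + P2 + P4)/3, (P3 + P5 + P6)/3, and P7 (cf. Proposition A.2).
print(sp.Poly(tau(t, t, t), t).degree())  # -> 3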

Overview

A schematic illustration of our mesh extraction algorithm. (1) The first step finds the sets of vertices and edges from the grid of the positional embedding, on which piecewise trilinear interpolation is performed. As demonstrated in our paper (Section 3), each neuron in the MLP corresponds to a hypersurface that contributes to the formation of the mesh. By applying the eikonal constraint, we ensure that each hypersurface is sufficiently smooth. We then systematically subdivide the existing set of edges using these hypersurfaces, iterating through each neuron until we reach the network's output; you can think of this process as akin to sculpting or carving. (2) The curved-edge subdivision iterates over each neuron of the MLP, dealing with the linear (2a), bilinear (2b), and trilinear (2c) cases. These are schematic illustrations: imagine that a trilinear subdivision occurs in a later iteration, after the initial 2a or 2b cases on the grid edges. The empty circles indicate new vertices subdividing the corresponding edges. (3) Skeletonization finds the vertices and edges constituting the zero level-set, and (4) face extraction sorts each polygon's vertices using its face normal.
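For the simplest case (2a), where a neuron's pre-activation is affine along an edge, the subdivision point follows from linear interpolation (Thales's theorem). A minimal sketch with illustrative names, not the paper's implementation:

import numpy as np

def split_edge_linear(v0, v1, g0, g1):
    """New vertex where a neuron's zero level-set crosses the edge (v0, v1),
    given pre-activations g0 = g(v0) and g1 = g(v1), assuming g is affine
    along the edge; returns None when there is no sign change."""
    if g0 * g1 >= 0:
        return None  # the hypersurface misses this edge
    t = g0 / (g0 - g1)  # zero crossing of the affine function along the edge
    return (1 - t) * v0 + t * v1

# Example: a neuron changing sign along an axis-aligned grid edge.
print(split_edge_linear(np.zeros(3), np.array([1.0, 0.0, 0.0]), -0.3, 0.5))
# -> [0.375 0.    0.   ]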

Theory

Theorem 4.5 (Hypersurface and eikonal constraint)

Let \( \tau(\mathrm{x}; \Theta_\mathrm{x}) \) be a trilinear interpolation function parameterized by \( \Theta_\mathrm{x} \), the neighboring features in a cubic grid, for a given coordinate \( \mathrm{x} \). Suppose the hypersurface \( \tau(\mathrm{x}) = 0 \) passes through the two diagonal points, \( \tau(\mathrm{x}_0)=\tau(\mathrm{x}_7)=0 \), while \( \tau(\mathrm{x}_{1\dots 6}) \ne 0 \) for the other six points of the cube. Assume that it satisfies the eikonal constraint \( \|\nabla \tau(\mathrm{x}) \|_2 = 1 \) for all \( \mathrm{x} \in [0, 1]^3 \). Then, the hypersurface \( \tau(\mathrm{x})=0 \) is a plane.

Proof sketch. Since the eikonal equation is a first-order partial differential equation, the eikonal constraint forces the hypersurface to be smooth, and in this setting, a plane. For the second derivatives to vanish, we can show that \( \tau(\mathrm{x}_1) + \tau(\mathrm{x}_2) + \tau(\mathrm{x}_4) = 0 \), \( \tau(\mathrm{x}_3) + \tau(\mathrm{x}_5) + \tau(\mathrm{x}_6) = 0 \), \( \tau(\mathrm{x}_1) + \tau(\mathrm{x}_6) = 0 \), \( \tau(\mathrm{x}_2) + \tau(\mathrm{x}_5) = 0 \), and \( \tau(\mathrm{x}_3) + \tau(\mathrm{x}_4) = 0 \), which constitute the planarity condition on the six points. Notice that the 3-bit representations of the integer indices indicate relative positions in the cube. For the derivation, please see the proof in Appendix A.

The above interactive figure depicts a hyperplane \(\tau(\mathrm{x})=0\), where \(\tau(\mathrm{x}_0)=\tau(\mathrm{x}_7)=0\), with no curvature. The trilinear coefficients of this hyperplane are \((0, 0.4, 0.1, 0.5, -0.5, -0.1, -0.4, 0)\) for the grid features \(P_{0 \dots 7}\), satisfying the planarity condition.
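As a quick numerical sanity check of Theorem 4.5 (a sketch, not the paper's code): the coefficients above satisfy the planarity conditions, under which the interpolant collapses to an exactly linear function, so its zero set is a plane. The corner-indexing convention is assumed as before:

import numpy as np

# Grid features P_0..P_7 from the figure above; they satisfy the planarity
# conditions, e.g., P1 + P6 = 0, P2 + P5 = 0, P3 + P4 = 0, P1 + P2 + P4 = 0.
P = np.array([0.0, 0.4, 0.1, 0.5, -0.5, -0.1, -0.4, 0.0])

def trilinear(p, x, y, z):
    """Trilinear interpolation; bit k of the corner index gives coordinate k."""
    out = 0.0
    for i in range(8):
        wx = x if i & 1 else 1.0 - x
        wy = y if i & 2 else 1.0 - y
        wz = z if i & 4 else 1.0 - z
        out += p[i] * wx * wy * wz
    return out

# Under the planarity conditions, every bilinear and trilinear term cancels,
# leaving tau(x, y, z) = P1*x + P2*y + P4*z, whose zero set is a plane.
rng = np.random.default_rng(0)
for x, y, z in rng.random((5, 3)):
    assert np.isclose(trilinear(P, x, y, z), P[1]*x + P[2]*y + P[4]*z)
print("tau is exactly linear: 0.4x + 0.1y - 0.5z")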


Theorem 4.7 (Intersection of two hypersurfaces and a diagonal plane)

Let \( \tilde{\nu} \) be a (piecewise) trilinear network, and let \( \tilde{\nu}_0(\mathrm{x})=0 \) and \( \tilde{\nu}_1(\mathrm{x})=0 \) be two hypersurfaces passing through the two points \( \mathrm{x}_0 \) and \( \mathrm{x}_7 \), i.e., \( \tilde{\nu}_0(\mathrm{x}_0)=\tilde{\nu}_1(\mathrm{x}_0)=0 \) and \( \tilde{\nu}_0(\mathrm{x}_7)=\tilde{\nu}_1(\mathrm{x}_7)=0 \). Define \( \mathrm{P}_i := \tilde{\nu}_0(\mathrm{x}_i) \), \( \mathrm{P}_\alpha = \big[\mathrm{P}_0;~\mathrm{P}_{1};~\mathrm{P}_{4};~\mathrm{P}_{5} \big] \), \( \mathrm{P}_\beta = \big[\mathrm{P}_2;~\mathrm{P}_{3};~\mathrm{P}_{6};~\mathrm{P}_{7}\big] \), \( \mathrm{Q}_i := \tilde{\nu}_1(\mathrm{x}_i) \), likewise \( \mathrm{Q}_\alpha \) and \( \mathrm{Q}_\beta \), and \( \mathrm{X} = [(1-x)^2;~ x(1-x);~ (1-x)x;~ x^2] \). Then, the coordinate \( x = z \in [0, 1] \) of the intersection point of the two hypersurfaces and the diagonal plane \( x=z \) is a root of the quartic equation \[ \mathrm{X}^\intercal \big( \mathrm{P}_\alpha \mathrm{Q}_\beta^\intercal -\mathrm{P}_\beta \mathrm{Q}_\alpha^\intercal \big) \mathrm{X} = 0, \] while \[ y = \frac{\mathrm{X}^\intercal\mathrm{P}_\alpha}{\mathrm{X}^\intercal(\mathrm{P}_\alpha - \mathrm{P}_\beta)} \hspace{2em}(\mathrm{P}_\alpha \neq \mathrm{P}_\beta). \]

Proof sketch. Theorem 4.7 handles the case where the eikonal constraint does not yield perfect hyperplanes. First, we observed the characteristics of trilinear hypersurfaces as illustrated in Figure 6 in the Appendix. Among the 64 edge scenarios within a trilinear region, a diagonal plane (e.g., \(x=z\)) intersects the hypersurface in most cases. Moreover, assuming the curved edge lies on the diagonal plane lets us eliminate one variable. In this light, Theorem 4.7 rearranges the two trilinear equations to obtain the solution above. As mentioned in the paper, the new vertices lie on at least two hypersurfaces, and the new edges lie on the same hypersurface, while the eikonal constraint minimizes any associated error. This error is tolerated by the hyperparameter \(\epsilon=10^{-4}\) (Section 6.1) using the \(\epsilon\)-tolerate sign vector in Definition 3.4.
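Since the quartic has closed-form coefficients, it can be solved directly. Below is a minimal NumPy sketch of Theorem 4.7's construction (an illustrative re-implementation under the stated indexing, not the paper's reference code):

import numpy as np

# X(x) = [(1-x)^2, x(1-x), (1-x)x, x^2]; each row holds a basis entry's
# polynomial coefficients in increasing powers of x.
BASIS = np.array([[1., -2., 1.], [0., 1., -1.], [0., 1., -1.], [0., 0., 1.]])

def diagonal_intersections(P, Q, tol=1e-9):
    """Intersections (x, y, z), with z = x, of the two hypersurfaces with
    corner values P and Q and the diagonal plane x = z (Theorem 4.7)."""
    Pa, Pb = P[[0, 1, 4, 5]], P[[2, 3, 6, 7]]
    Qa, Qb = Q[[0, 1, 4, 5]], Q[[2, 3, 6, 7]]
    M = np.outer(Pa, Qb) - np.outer(Pb, Qa)
    # Quartic coefficients of X^T M X: sum_ij M[i, j] * basis_i(x) * basis_j(x).
    quartic = np.zeros(5)
    for i in range(4):
        for j in range(4):
            quartic += M[i, j] * np.convolve(BASIS[i], BASIS[j])
    coeffs = np.trim_zeros(quartic, 'b')
    if coeffs.size <= 1:
        return []  # degenerate quartic
    points = []
    for r in np.polynomial.polynomial.polyroots(coeffs):
        if abs(r.imag) < tol and -tol <= r.real <= 1 + tol:
            x = float(np.clip(r.real, 0.0, 1.0))
            X = np.array([(1 - x)**2, x * (1 - x), (1 - x) * x, x**2])
            y = (X @ Pa) / (X @ (Pa - Pb))  # assumes a nonzero denominator
            points.append((x, float(y), x))
    return points

Complex roots and roots outside \([0, 1]\) are discarded, since the intersection must lie within the trilinear region; \(y\) then follows the theorem's formula.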

Chamfer distances

Chamfer distances (CD) for the Stanford dragon, varying the number of vertices and the model sizes. Given that our method extracts vertices from the intersection points of hypersurfaces, it enables parsimonious representations, using fewer vertices while achieving better CD. As depicted, ours are located in the lower-left region compared with the baselines.
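For reference, a brute-force version of the chamfer distance under one common convention (a sketch, not the paper's evaluation code; practical evaluations sample points on the surfaces and use a KD-tree):

import numpy as np

def chamfer_distance(A, B):
    """Symmetric chamfer distance between point sets A (N, 3) and B (M, 3):
    the mean nearest-neighbor distance in both directions. O(N*M) memory."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()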

Our method inherently holds an advantage over Marching Cubes (MC) in capturing optimal vertices and faces, especially where the target surfaces exhibit planar characteristics. When dealing with curved surfaces, the finer grid of the Large model sees any curved surface over a smaller interval, where it approaches a plane. Moving from the Small to the Large model in the plot, our CDs (colored crosses) decrease toward zero, with consistently better CEs, confirming this speculation. The complete results of the Large models for the Stanford 3D Scanning repository can be found in Tables 6 to 8 in Appendix F.

Discussions

Q1. Is it possible to perfectly learn an SDF that fully satisfies the eikonal equation?

The proof sketch of Theorem 4.7 above describes the rationale behind our \(\epsilon\)-tolerate approach. Moreover, empirically, the flatness error (if it goes to zero, the hypersurface must be a plane), derived from the proof of Theorem 4.5, is effectively controlled by the learning rate of the eikonal loss. Note that, in practice, the eikonal loss should be applied primarily to positions near the surface for effective learning; if a target position is clearly distant from any surface, it can be disregarded. This focus is crucial because our interest lies in the surfaces themselves rather than in exact signed distance values over the entire space, and it is sufficient for our extraction method.
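A minimal PyTorch sketch of this near-surface weighting, where `model` maps (N, 3) points to (N, 1) SDF values and the band width is an illustrative assumption rather than a value from the paper:

import torch

def eikonal_loss_near_surface(model, x, band=0.01):
    """Penalize ||grad sdf|| deviating from 1, but only near the surface."""
    x = x.detach().requires_grad_(True)
    sdf = model(x)
    (grad,) = torch.autograd.grad(sdf.sum(), x, create_graph=True)
    penalty = (grad.norm(dim=-1) - 1.0).square()
    near = sdf.detach().abs().squeeze(-1) < band  # keep near-surface samples
    return penalty[near].mean() if near.any() else x.new_zeros(())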

Q2. What is the significance of tropical geometry analysis for the algorithm?

While Berzins (2023) did not explicitly mention it, the use of tropical geometry in deep neural networks (Zhang et al., 2018) allows us to describe each neuron's contribution mathematically as a folded hyperplane. A key implication of this concept is that it enables for-loop iterations over neurons across layers in Algorithm 3, avoiding the need to visit each of the exponentially many linear regions.
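For intuition, the standard tropical-geometry identity behind the folded-hyperplane view (general background, not a result specific to this paper): splitting the weights \( w = w_+ - w_- \) into positive and negative parts,

\[ \max(w^\intercal \mathrm{x} + b,\ 0) \;=\; \max\big(w_+^\intercal \mathrm{x} + b,\ w_-^\intercal \mathrm{x}\big) - w_-^\intercal \mathrm{x}, \]

so each ReLU neuron is a difference of two tropical (max-plus) polynomials, folding the input space along the single hyperplane \( w^\intercal \mathrm{x} + b = 0 \). This is what permits iterating per neuron instead of per linear region.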

References

  • Berzins, A. (2023). Polyhedral Complex Extraction from ReLU Networks Using Edge Subdivision. ICML.
  • Humayun, A. I., et al. (2023). SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundary. CVPR.
  • Kim, H., et al. (2018). NSML: Meet the MLaaS Platform with a Real-World Case Study. arXiv preprint arXiv:1810.09957.
  • Lei, J., & Jia, K. (2020). Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks. ICML.
  • Müller, T., Evans, A., Schied, C., & Keller, A. (2022). Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. ACM Transactions on Graphics (SIGGRAPH).
  • Zhang, L., Naitzat, G., & Lim, L.-H. (2018). Tropical Geometry of Deep Neural Networks. ICML.

Acknowledgments

I would like to express my sincere appreciation to my brilliant colleagues, Sangdoo Yun, Dongyoon Han, and Injae Kim, for their contributions to this work. Their constructive feedback and guidance have been instrumental in shaping it. The NAVER Smart Machine Learning (NSML) platform (Kim et al., 2018) was used for the experiments.

BibTeX

@article{Kim2024tropical,
  author  = {Kim, Jin-Hwa},
  title   = {{Polyhedral Complex Derivation from Piecewise Trilinear Networks}},
  journal = {arXiv preprint arXiv:2402.10403},
  year    = {2024}
}