Relighting a 3DGS scene — material decomposition for fuzzy points

§ 0 · The setup

SH bakes lighting in; we need it out

A baseline 3DGS scene encodes per-Gaussian view-dependent color as 48 SH coefficients (foundations §4). Those coefficients are fits to the radiance you happened to observe under the original lighting. If the apple was photographed under an overcast sky, its SH stores "apple lit by overcast." Put the same scene under a sunset, and the SH still says "overcast" — the asset doesn't know what materials the apple is made of.

Computer graphics has known how to keep material and lighting separate for decades. The rendering equation, written here without an emission term, is

\[ L_o(\mathbf{x},\,\boldsymbol{\omega}_o) \;=\; \int_{\Omega} f_r(\mathbf{x},\,\boldsymbol{\omega}_i,\,\boldsymbol{\omega}_o)\, L_i(\mathbf{x},\boldsymbol{\omega}_i)\,(\boldsymbol{\omega}_i \cdot \mathbf{n})\,\mathrm{d}\boldsymbol{\omega}_i \]

where \(L_o\) is the outgoing radiance you see, \(L_i\) is the incident lighting, \(f_r\) is the surface's BRDF (a function of the surface point and the incoming and outgoing directions), \(\mathbf{n}\) is the surface normal, and the integral runs over the hemisphere \(\Omega\) around \(\mathbf{n}\). Inverse rendering, the step that makes relighting possible, means recovering \(f_r\), \(\mathbf{n}\), and \(L_i\) separately from a set of training views.
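To make the integral concrete, here is a minimal Monte Carlo sketch under simplifying assumptions that are not part of the text above: a purely Lambertian BRDF \(f_r = \mathbf{a}/\pi\), a normal fixed at +z, and an environment that is constant over the hemisphere. The function names `sample_hemisphere` and `outgoing_radiance` are illustrative only. With these assumptions the integral collapses to albedo times light color, so swapping the environment changes the result while the albedo stays put.

```python
import numpy as np

def sample_hemisphere(num_samples, rng):
    """Uniform directions on the hemisphere around n = (0, 0, 1)."""
    cos_theta = rng.random(num_samples)            # uniform in z gives uniform solid angle
    phi = 2.0 * np.pi * rng.random(num_samples)
    sin_theta = np.sqrt(1.0 - cos_theta ** 2)
    return np.stack([sin_theta * np.cos(phi), sin_theta * np.sin(phi), cos_theta], axis=1)

def outgoing_radiance(albedo, incident_radiance, num_samples=200_000, seed=0):
    """Monte Carlo estimate of L_o for a Lambertian BRDF f_r = albedo / pi."""
    rng = np.random.default_rng(seed)
    normal = np.array([0.0, 0.0, 1.0])
    w_i = sample_hemisphere(num_samples, rng)      # (N, 3) incoming directions
    cos_term = w_i @ normal                        # (omega_i . n)
    f_r = albedo / np.pi                           # constant in direction
    L_i = incident_radiance(w_i)                   # (N, 3) RGB radiance per direction
    pdf = 1.0 / (2.0 * np.pi)                      # uniform hemisphere density
    integrand = f_r[None, :] * L_i * cos_term[:, None]
    return integrand.mean(axis=0) / pdf

albedo = np.array([0.8, 0.2, 0.1])                 # a reddish "apple"
overcast = lambda w: np.ones((len(w), 3))          # constant white sky (assumed)
sunset = lambda w: np.ones((len(w), 1)) * np.array([1.0, 0.5, 0.2])  # constant warm sky (assumed)

L_overcast = outgoing_radiance(albedo, overcast)   # close to albedo
L_sunset = outgoing_radiance(albedo, sunset)       # close to albedo * sunset color
```

Baked SH would store only `L_overcast`; keeping `albedo` and the environment apart is what lets the second call exist.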

For meshes this is a 30-year-old problem with a solid playbook. For 3DGS the question becomes: what does it mean to give a fuzzy point a normal, an albedo, a roughness? And how do you separate baked lighting from material without ambiguity? The next sections walk through it.

Why the ambiguity bites harder for 3DGS. A mesh has a defined surface — its triangles. A 3DGS scene is a soup of overlapping fuzzy ellipsoids. "What is the surface normal of this Gaussian?" is not a well-posed question until you've imposed extra structure. Most relighting work begins by giving each Gaussian a learnable normal vector — and adding regularization to keep neighboring Gaussians' normals consistent.


§ 1 · Decomposition

Five buffers instead of one color

A relightable 3DGS replaces each Gaussian's SH with a tuple of physically meaningful parameters. The common choice (GaussianShader, Relightable-3DGS) is:

\(\mathbf{a} \in [0,1]^3\) — albedo, the diffuse "true color" under white light. Three floats.
\(\mathbf{n} \in S^2\) — surface normal. Two floats (stored as azimuth/elevation or stereographic).
\(r \in [0,1]\) — roughness. Specular's "sharpness" — low = mirror, high = matte. One float.
\(m \in \{0,1\}\) — metallic flag. Metallic surfaces have colored specular and no diffuse. Roughly one float.
\(\mathbf{E}\) — a shared environment map across the whole scene. Typically a small (16×8 or 32×16) HDR latitude–longitude image, or 9 SH coefficients of irradiance. The only piece not per-Gaussian.

Per Gaussian: 7 floats instead of SH's 48. The environment map is shared, so the total budget for a typical scene is only a few KB extra. That is less storage than baseline SH, and unlike SH the parameters stay meaningful when the lighting changes.
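As a rough illustration of that budget, the sketch below packs one Gaussian's material into 7 floats, storing the normal as azimuth and elevation. The class `MaterialParams`, the helper names, and the float32 storage are assumptions made for this example, not the layout used by any particular paper.

```python
import numpy as np
from dataclasses import dataclass

def encode_normal(n):
    """Unit vector -> (azimuth, elevation) in radians: 2 floats instead of 3."""
    n = n / np.linalg.norm(n)
    azimuth = np.arctan2(n[1], n[0])
    elevation = np.arcsin(np.clip(n[2], -1.0, 1.0))
    return np.array([azimuth, elevation])

def decode_normal(angles):
    """(azimuth, elevation) -> unit vector."""
    azimuth, elevation = angles
    cos_el = np.cos(elevation)
    return np.array([cos_el * np.cos(azimuth), cos_el * np.sin(azimuth), np.sin(elevation)])

@dataclass
class MaterialParams:                      # hypothetical container for one Gaussian
    albedo: np.ndarray                     # 3 floats in [0, 1]
    normal_angles: np.ndarray              # 2 floats (azimuth, elevation)
    roughness: float                       # 1 float
    metallic: float                        # 1 float

    def pack(self):
        """Flatten to the 7-float vector counted in the text."""
        return np.concatenate(
            [self.albedo, self.normal_angles, [self.roughness, self.metallic]]
        ).astype(np.float32)

n = np.array([1.0, 2.0, 2.0]) / 3.0
params = MaterialParams(np.array([0.8, 0.2, 0.1]), encode_normal(n), 0.4, 0.0)
packed = params.pack()                     # shape (7,)
env_bytes = 32 * 16 * 3 * 4                # 32x16 RGB float32 environment map: 6144 bytes
```

The angle encoding has a singularity at the poles (azimuth is undefined when the normal points straight up or down), which is one reason a stereographic parameterization is sometimes preferred.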

Interactive · the five buffers on a sphere

One sphere, decomposed into albedo, normal, roughness, metallic, and the incident environment. Dragging any slider recomposes the final render in real time, which is essentially what a Cook-Torrance shader does on the GPU.

Controls: Albedo R, Albedo G, Albedo B, Roughness, Metallic, Light direction

§ 2 · The BRDF

Cook-Torrance in two lines

Nearly every 3DGS relighting paper uses some variant of the same microfacet model. The diffuse term is Lambertian; the specular term is Cook-Torrance with a GGX distribution:

\[ f_r \;=\; \frac{(1 - m)\,\mathbf{a}}{\pi} \;+\; \frac{D(\mathbf{h},\,r)\,F(\boldsymbol{\omega}_o,\mathbf{h})\,G(\boldsymbol{\omega}_i,\boldsymbol{\omega}_o, r)}{4\,(\boldsymbol{\omega}_i \cdot \mathbf{n})\,(\boldsymbol{\omega}_o \cdot \mathbf{n})} \]

\(D\) is the GGX normal distribution function. \(F\) is the Schlick Fresnel. \(G\) is a geometric self-shadowing term. \(\mathbf{h}\) is the half-vector between incoming and outgoing direction. None of this is new to graphics — what's new is computing it differentiably, per Gaussian, with gradients flowing back to the parameters \((\mathbf{a}, \mathbf{n}, r, m)\) and to the environment map \(\mathbf{E}\).
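The three terms can be sketched directly in NumPy. This is one common set of choices and not necessarily what any given paper uses: it assumes the remapping \(\alpha = r^2\), the Schlick-GGX form of Smith's \(G\) with the direct-lighting constant \(k = (r+1)^2/8\), and a base reflectance \(F_0\) interpolated between 0.04 and the albedo by \(m\). All function names are illustrative.

```python
import numpy as np

def ggx_D(n_dot_h, roughness):
    """GGX / Trowbridge-Reitz normal distribution, with alpha = roughness^2 (assumed)."""
    a2 = (roughness ** 2) ** 2
    denom = n_dot_h ** 2 * (a2 - 1.0) + 1.0
    return a2 / (np.pi * denom ** 2)

def schlick_F(F0, v_dot_h):
    """Schlick's approximation to Fresnel reflectance."""
    return F0 + (1.0 - F0) * (1.0 - v_dot_h) ** 5

def smith_G(n_dot_l, n_dot_v, roughness):
    """Smith shadowing-masking using the Schlick-GGX G1 for both directions."""
    k = (roughness + 1.0) ** 2 / 8.0
    g1 = lambda x: x / (x * (1.0 - k) + k)
    return g1(n_dot_l) * g1(n_dot_v)

def cook_torrance(albedo, roughness, metallic, n, w_i, w_o):
    """Evaluate f_r = (1 - m) a / pi + D F G / (4 (w_i.n) (w_o.n)) for unit vectors."""
    h = w_i + w_o
    h = h / np.linalg.norm(h)                         # half-vector
    n_dot_l = max(float(n @ w_i), 1e-4)               # clamp to avoid division by zero
    n_dot_v = max(float(n @ w_o), 1e-4)
    n_dot_h = max(float(n @ h), 0.0)
    v_dot_h = max(float(w_o @ h), 0.0)
    F0 = 0.04 * (1.0 - metallic) + albedo * metallic  # dielectric base vs tinted metal
    specular = (ggx_D(n_dot_h, roughness) * schlick_F(F0, v_dot_h)
                * smith_G(n_dot_l, n_dot_v, roughness)) / (4.0 * n_dot_l * n_dot_v)
    diffuse = (1.0 - metallic) * albedo / np.pi
    return diffuse + specular

n = np.array([0.0, 0.0, 1.0])
albedo = np.array([0.8, 0.2, 0.1])
f_head_on = cook_torrance(albedo, 1.0, 0.0, n, n, n)  # light and view both along the normal
```

In a differentiable pipeline the same arithmetic would be written in an autodiff framework so gradients reach the albedo, normal, roughness, and metallic parameters; NumPy is used here only to keep the sketch self-contained.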

The integral over the upper hemisphere \(\Omega\) is approximated by a split-sum technique borrowed from real-time graphics (Karis 2013): pre-filter the environment by roughness once, then evaluate as a product of two lookups. This is what keeps relightable 3DGS real-time at render-time despite the per-Gaussian shading work.
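Written out, the split-sum approximation looks roughly as follows. The notation here is an assumption made for this note: \(f_s\) is the specular term of \(f_r\), \(\boldsymbol{\omega}_r = 2(\mathbf{n}\cdot\boldsymbol{\omega}_o)\mathbf{n} - \boldsymbol{\omega}_o\) is the mirror direction used to index the pre-filtered environment \(\bar{L}\), and \(A\), \(B\) are the two channels of the pre-baked 2D table.

```latex
\[
\int_{\Omega} L_i(\boldsymbol{\omega}_i)\, f_s\, (\boldsymbol{\omega}_i \cdot \mathbf{n})\, \mathrm{d}\boldsymbol{\omega}_i
\;\approx\;
\underbrace{\bar{L}(\boldsymbol{\omega}_r,\, r)}_{\text{pre-filtered environment}}
\;\cdot\;
\underbrace{\bigl(F_0\, A(\mathbf{n}\cdot\boldsymbol{\omega}_o,\, r) + B(\mathbf{n}\cdot\boldsymbol{\omega}_o,\, r)\bigr)}_{\text{2D BRDF lookup}}
\]
\[
A = \int_{\Omega} \frac{f_s}{F}\,\bigl(1 - (1 - \boldsymbol{\omega}_o\cdot\mathbf{h})^5\bigr)\,(\boldsymbol{\omega}_i \cdot \mathbf{n})\,\mathrm{d}\boldsymbol{\omega}_i,
\qquad
B = \int_{\Omega} \frac{f_s}{F}\,(1 - \boldsymbol{\omega}_o\cdot\mathbf{h})^5\,(\boldsymbol{\omega}_i \cdot \mathbf{n})\,\mathrm{d}\boldsymbol{\omega}_i
\]
```

The \(F_0 A + B\) form follows from rewriting Schlick's Fresnel as \(F = F_0\,(1 - (1-\boldsymbol{\omega}_o\cdot\mathbf{h})^5) + (1-\boldsymbol{\omega}_o\cdot\mathbf{h})^5\), which pulls \(F_0\) out of the integral. Implementations vary in whether they plug in \(F_0\) itself or an already-evaluated Fresnel term.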

Interactive · light direction → specular response

A sphere with fixed material; only the light direction moves. The bright "highlight" is the GGX specular lobe. Crank roughness up and the lobe spreads. This is exactly the per-Gaussian response GaussianShader computes, projected into the 2D splat.

Controls: Light direction, Roughness

§ 3 · GaussianShader

The minimal relightable system

Jiang, Tu, Liu, Gao, Long, Wang, Ma · CVPR 2024 · arXiv:2311.17977

GaussianShader was the first widely-cited 3DGS relighting paper. Its design choices are now the default:

Per-Gaussian normal, learned with smoothness regularization. Each Gaussian stores a normal direction \(\mathbf{n}_i\) as part of its parameters. A regularizer encourages nearby Gaussians (within a small radius) to have similar normals — without this, the optimizer happily makes every Gaussian have a different normal.
Per-Gaussian diffuse + specular split. The forward rendering equation is \(c_i = \text{diffuse}_i + \text{specular}_i(\boldsymbol{\omega}_o)\). The diffuse part is constant across views; the specular part depends on viewing direction (and is the only view-dependent piece, replacing SH).
Shared learned environment. A small (32×16) HDR latlong image, optimized jointly with the Gaussians. This is the "light source" of the scene.
Specular-only SH residual. For finer-grained view dependence not captured by the GGX lobe (subsurface effects, microfacet anisotropy), a low-degree SH residual sits on top of the analytical specular. Optional, but the paper shows it helps.
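The normal-smoothness idea from the first item can be sketched as a simple pairwise penalty. This is an illustrative stand-in, not the loss from the paper: it assumes a brute-force O(N²) neighbor search, a hard radius cutoff, and a penalty of \(1 - \cos\) between neighboring unit normals. The name `normal_smoothness_loss` is hypothetical.

```python
import numpy as np

def normal_smoothness_loss(positions, normals, radius):
    """Mean (1 - cosine similarity) over all ordered pairs of Gaussians closer than `radius`.

    positions: (N, 3) Gaussian centers; normals: (N, 3) unit normals.
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)                        # (N, N) pairwise distances
    neighbors = (dist < radius) & ~np.eye(len(positions), dtype=bool)
    cos_sim = normals @ normals.T                               # (N, N) dot products
    penalty = (1.0 - cos_sim) * neighbors                       # zero when normals agree
    return penalty.sum() / max(int(neighbors.sum()), 1)

# Four Gaussians on a small planar patch, all facing +z.
positions = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.1, 0.1, 0.0]])
normals = np.tile(np.array([0.0, 0.0, 1.0]), (4, 1))
loss_consistent = normal_smoothness_loss(positions, normals, radius=0.5)

flipped = normals.copy()
flipped[0] = [0.0, 0.0, -1.0]                                   # one normal points the wrong way
loss_flipped = normal_smoothness_loss(positions, flipped, radius=0.5)
```

In training this term would be weighted and added to the photometric loss inside an autodiff framework, with a spatial index replacing the dense distance matrix for scenes with millions of Gaussians.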

Training cost: ~2× baseline 3DGS, because the GGX evaluation is non-trivial per pixel. Render cost at inference: ~1.5× baseline. Quality: matches baseline 3DGS on the captured views and looks plausible when you swap the environment map. "Plausible" is doing work — there's no ground-truth relit version of a real captured scene, so the field benchmarks against synthetic Blender scenes where ground truth is available, and on those GaussianShader hits 26+ dB PSNR under arbitrary new lighting.

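import numpy as np

def render_relightable(g, view_dir, env_map):
    # view_dir: unit vector from the Gaussian toward the camera (omega_o).
    # irradiance, prefiltered_env, schlick_fresnel and brdf_lut are assumed helpers, not defined here.
    n_dot_v = max(float(np.dot(g.normal, view_dir)), 0.0)
    # ---- 1. diffuse term ----
    diffuse = (1 - g.metallic) * g.albedo / np.pi * irradiance(g.normal, env_map)
    # ---- 2. specular term (Cook-Torrance GGX) ----
    F0 = (1 - g.metallic) * 0.04 + g.metallic * g.albedo               # lerp: dielectric 0.04 -> metal albedo
    refl_dir = 2.0 * np.dot(g.normal, view_dir) * g.normal - view_dir  # mirror of view_dir about the normal
    L_spec = prefiltered_env(refl_dir, g.roughness, env_map)           # split-sum approx, indexed by reflection
    F = schlick_fresnel(F0, n_dot_v)
    A, B = brdf_lut(n_dot_v, g.roughness)                              # 2D pre-baked scale and bias
    specular = L_spec * (F * A + B)
    return diffuse + specular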
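One way the `irradiance` lookup in the sketch above could be filled in, assuming the environment is stored as 9 SH lighting coefficients per color channel, is the closed-form quadratic from Ramamoorthi and Hanrahan (2001). The function name `irradiance_sh9` and the coefficient ordering are assumptions for this example.

```python
import numpy as np

# Constants from the Ramamoorthi-Hanrahan irradiance formula.
C1, C2, C3, C4, C5 = 0.429043, 0.511664, 0.743125, 0.886227, 0.247708

def irradiance_sh9(normal, sh):
    """Diffuse irradiance E(n) from 9 SH lighting coefficients.

    sh: (9, 3) RGB coefficients, assumed ordered
        L00, L1-1, L10, L11, L2-2, L2-1, L20, L21, L22.
    """
    x, y, z = normal / np.linalg.norm(normal)
    L00, L1m1, L10, L11, L2m2, L2m1, L20, L21, L22 = sh
    return (C1 * L22 * (x * x - y * y) + C3 * L20 * z * z + C4 * L00 - C5 * L20
            + 2.0 * C1 * (L2m2 * x * y + L21 * x * z + L2m1 * y * z)
            + 2.0 * C2 * (L11 * x + L1m1 * y + L10 * z))

# A constant white environment of radiance 1 has only the L00 term: 4*pi * Y00 = sqrt(4*pi).
sh_white = np.zeros((9, 3))
sh_white[0] = np.sqrt(4.0 * np.pi)
E = irradiance_sh9(np.array([0.0, 0.0, 1.0]), sh_white)   # close to pi per channel

albedo = np.array([0.8, 0.2, 0.1])
diffuse = albedo / np.pi * E                               # close to albedo under unit white light
```

Under unit white light the diffuse term returns the albedo itself, which matches the "true color under white light" reading of \(\mathbf{a}\) in § 1.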

§ 4 · Inverse rendering

Pushing further: GS-IR and Relightable-3DGS

GaussianShader's decomposition accounts only for direct light from a single shared environment map. GS-IR (Liang et al., CVPR 2024) and Relightable-3DGS (Gao et al., 2024) go further:

Indirect illumination. Real scenes bounce light: a red wall lights up the white floor next to it. GS-IR adds a screen-space approximation of one-bounce indirect via a baked irradiance volume. The volume's voxels are jointly optimized.
Visibility. Self-shadowing matters. A point in the shadow of another part of the scene gets less direct light. Relightable-3DGS computes a per-Gaussian visibility function (precomputed by ray-marching once after the geometry stabilizes), used as a multiplier on direct lighting.
Material disambiguation priors. The diffuse-vs-specular split is notoriously ambiguous: a bright pixel can be a bright diffuse surface in dim light or a dim specular surface with a strong highlight. The full inverse-rendering systems add sparsity priors (most surfaces are weakly specular) and chromatic priors (specular highlights are typically white-ish on dielectrics) to pin down the decomposition.

The shared design principle across the family: more buffers, more priors, more passes. Relighting is fundamentally an under-constrained inverse problem; you bring it to a unique solution by adding regularization until you have one. The 3DGS-specific contribution is that all of this regularization can be applied per-Gaussian, and most of it composes nicely with the rasterizer's existing kernels.
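A toy example of why the priors are needed: two different diffuse/specular splits can reproduce the same pixel exactly, so the photometric loss alone cannot choose between them. The shading model below (a single scalar pixel, `albedo * light + specular`) and the L1 sparsity weight are deliberate simplifications invented for this illustration.

```python
import numpy as np

def shade(albedo, light, specular):
    """Toy single-pixel shading: diffuse plus an additive specular contribution."""
    return albedo * light + specular

def loss(pred, target, specular, sparsity_weight):
    """Photometric MSE plus an L1 sparsity prior on the specular term."""
    photometric = np.mean((pred - target) ** 2)
    return photometric + sparsity_weight * np.mean(np.abs(specular))

target = 0.75                                              # observed pixel value

# Candidate A: bright diffuse surface, no highlight.
pred_a = shade(albedo=0.75, light=1.0, specular=0.0)
# Candidate B: darker surface whose brightness comes mostly from a highlight.
pred_b = shade(albedo=0.25, light=1.0, specular=0.5)

# Without the prior, both explanations fit perfectly.
loss_a_plain = loss(pred_a, target, 0.0, sparsity_weight=0.0)
loss_b_plain = loss(pred_b, target, 0.5, sparsity_weight=0.0)

# With the prior, the weakly specular explanation wins.
loss_a = loss(pred_a, target, 0.0, sparsity_weight=0.1)
loss_b = loss(pred_b, target, 0.5, sparsity_weight=0.1)
```

Real systems resolve the tie with multi-view evidence as well, since a true highlight moves with the camera while diffuse color does not; the prior matters most where view coverage is thin.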

~26 dB · PSNR on synthetic relit benchmarks (TensoIR)
~2× train · Cost vs baseline 3DGS
~1.5× render · Cost vs baseline 3DGS
~70 FPS · Render at 1080p with relight on RTX 4090

§ 5 · Systems

A year of 3DGS inverse rendering

§ 6 · Open

What's still hard

Subsurface scattering. Skin, marble, wax: light that enters the surface and scatters before exiting violates the BRDF assumption that light leaves from the point where it arrived. A BSSRDF (bidirectional scattering-surface reflectance distribution function) is the classical fix; nobody has cleanly integrated one with 3DGS yet.

Multi-source lighting. Environment maps assume distant lighting (sun, sky). A nearby lamp violates that — its irradiance varies with position. Some 2025 work uses small learned local-light volumes, but they don't generalize across scenes.
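A small numeric sketch of why a nearby lamp breaks the distant-lighting assumption: for an idealized point light, irradiance falls off with squared distance and with the angle to the light, so two points with the same normal receive different amounts of light. An environment map, which is indexed by direction only, would return the same value for both. The function name and the point-light model are assumptions for this example.

```python
import numpy as np

def point_light_irradiance(surface_point, normal, light_pos, intensity):
    """Irradiance from an idealized point light: intensity * cos(theta) / distance^2."""
    to_light = light_pos - surface_point
    dist2 = float(to_light @ to_light)
    cos_theta = max(float(normal @ to_light) / np.sqrt(dist2), 0.0)
    return intensity * cos_theta / dist2

lamp = np.array([0.0, 0.0, 1.0])                    # lamp one unit above the floor
up = np.array([0.0, 0.0, 1.0])                      # both floor points share this normal

E_below = point_light_irradiance(np.array([0.0, 0.0, 0.0]), up, lamp, intensity=1.0)
E_aside = point_light_irradiance(np.array([1.0, 0.0, 0.0]), up, lamp, intensity=1.0)
# E_below is 1.0; E_aside is about 0.354, even though the normals are identical.
```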

Editing. Once the decomposition is done, can a user move a single light? Some proof-of-concept editors exist; production-quality lighting design on a 3DGS scene is still out of reach.
