2. Paper introduced
• J. Lu, K. Shi, D. Min, L. Lin, and M. N. Do, "Cross-Based Local Multipoint Filtering," CVPR2012.
• Contents: an extension of a fast edge-preserving filter, plus example applications
– Note: fine details are omitted
– It will take a while to reach the main topic.
3. Related papers
• The Bilateral Filter paper
– C. Tomasi and R. Manduchi, "Bilateral Filtering for Gray and Color Images," ICCV'98.
• Denoising with the Joint/Cross Bilateral Filter
– G. Petschnigg, M. Agrawala, H. Hoppe, R. Szeliski, M. Cohen, and K. Toyama, "Digital Photography with Flash and No-Flash Image Pairs," SIGGRAPH'04.
– E. Eisemann and F. Durand, "Flash Photography Enhancement via Intrinsic Relighting," SIGGRAPH'04.
• Upsampling and denoising by applying the Joint Bilateral Filter to various maps
– J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele, "Joint Bilateral Upsampling," SIGGRAPH'07.
• Denoising with NLF (non-local filtering)
– A. Buades, B. Coll, and J. M. Morel, "A Non-local Algorithm for Image Denoising," CVPR'05.
• The Guided Filter paper
– K. He, J. Sun, and X. Tang, "Guided Image Filtering," ECCV'10.
• Extending the range of applications by applying the Guided Filter to cost volumes
– C. Rhemann, A. Hosni, M. Bleyer, C. Rother, and M. Gelautz, "Fast Cost-Volume Filtering for Visual Correspondence and Beyond," CVPR'11.
• Adaptive integral images from cross skeletons
– K. Zhang, J. Lu, and G. Lafruit, "Cross-Based Local Stereo Matching Using Orthogonal Integral Images," IEEE Trans. CSVT'09.
15. • The real effect of Joint/Cross Filters
– Denoising... is that all?
Graph cuts? Belief propagation?
You may not need such heavyweight tools!?
– Ex: C. Rhemann, A. Hosni, M. Bleyer, C. Rother, and M. Gelautz, "Fast Cost-Volume Filtering for Visual Correspondence and Beyond," CVPR'11.
37. Guided Filter details
• Assume that, within each kernel, the input image p is a linear transform of the guide image I with appropriate coefficients
– At most one edge within a kernel
– $\nabla q = a \nabla I$
– Used for matting, super-resolution, and so on
– (the input image p is used to fit the coefficients a, b)

$$q_i = a_k I_i + b_k, \quad \forall i \in \omega_k$$

Linearly transforming one image (ax + b) turns it into the target image.
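The slides jump straight from the model to the fit; for reference, He et al. (ECCV'10) determine the coefficients of each kernel $\omega_k$ by minimizing a ridge-regularized least-squares cost (the $\epsilon a_k^2$ term keeps the slope from exploding in flat regions):

$$E(a_k, b_k) = \sum_{i \in \omega_k} \left( (a_k I_i + b_k - p_i)^2 + \epsilon \, a_k^2 \right)$$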
38–43. Illustration of $q_i = a_k I_i + b_k, \ \forall i \in \omega_k$
[Figure, repeated across slides 38–43 with the kernel $\omega_k$ sliding over the image: output image q = coefficient image a × guidance image I + coefficient image b.]
44. Determine the coefficients so that q = aI + b holds for all the coefficients within the kernel; that is, a linear least squares within each kernel:

$$\begin{bmatrix} q_i \\ q_i \\ \vdots \\ q_i \\ q_i \end{bmatrix} \approx \begin{bmatrix} a_1 I_i + b_1 \\ a_2 I_i + b_2 \\ \vdots \\ a_{k-1} I_i + b_{k-1} \\ a_k I_i + b_k \end{bmatrix} = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ \vdots & \vdots \\ a_{k-1} & b_{k-1} \\ a_k & b_k \end{bmatrix} \begin{bmatrix} I_i \\ 1 \end{bmatrix}$$
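As a concrete sketch: this least squares has a closed form in every window (a = cov(I, p) / (var(I) + ε), b = mean(p) − a·mean(I), from He et al.), so the whole filter reduces to a handful of mean filters. A minimal NumPy/SciPy version, my own illustration rather than the paper's code:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, r=4, eps=1e-3):
    """Grayscale guided filter: per-window least-squares fit of q = a*I + b.

    I, p: float arrays. Closed form from He et al. (ECCV'10):
      a_k = cov_w(I, p) / (var_w(I) + eps),  b_k = mean_w(p) - a_k * mean_w(I),
    with all statistics taken over (2r+1)x(2r+1) windows.
    """
    mean = lambda x: uniform_filter(x, size=2 * r + 1)
    mu_I, mu_p = mean(I), mean(p)
    var_I = mean(I * I) - mu_I ** 2           # per-window variance of the guide
    cov_Ip = mean(I * p) - mu_I * mu_p        # per-window covariance
    a = cov_Ip / (var_I + eps)                # least-squares slope
    b = mu_p - a * mu_I                       # least-squares intercept
    # every pixel lies in many overlapping windows: average their coefficients
    return mean(a) * I + mean(b)
```

Averaging the overlapping windows' coefficients at the end is exactly the aggregation that the stacked system above describes.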
65. Determine the coefficients so that q = aI + b holds for all the coefficients within the kernel, using only the locations that satisfy the condition: a weighted linear least squares within each kernel, with weight w_i^k (the weight for pixel position i at kernel position k).
* Not checked rigorously, but this is almost the same formula as WLS?
A. Levin, D. Lischinski, and Y. Weiss, "Colorization Using Optimization," SIGGRAPH'04.
Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski, "Edge-Preserving Decompositions for Multi-Scale Tone and Detail Manipulation," SIGGRAPH'08.

$$\begin{bmatrix} q_i \\ q_i \\ \vdots \\ q_i \\ q_i \end{bmatrix} \approx \begin{bmatrix} w_i^1 (a_1 I_i + b_1) \\ w_i^2 (a_2 I_i + b_2) \\ \vdots \\ w_i^{k-1} (a_{k-1} I_i + b_{k-1}) \\ w_i^k (a_k I_i + b_k) \end{bmatrix} = \left( \begin{bmatrix} w_i^1 & w_i^1 \\ w_i^2 & w_i^2 \\ \vdots & \vdots \\ w_i^{k-1} & w_i^{k-1} \\ w_i^k & w_i^k \end{bmatrix} \odot \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \\ \vdots & \vdots \\ a_{k-1} & b_{k-1} \\ a_k & b_k \end{bmatrix} \right) \begin{bmatrix} I_i \\ 1 \end{bmatrix}$$

(⊙ denotes the elementwise product.)
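The slide leaves the solution implicit; as a general fact about weighted least squares (not a formula taken from the slides), stacking the pixels of one kernel as rows gives the usual normal-equation solution:

$$\begin{bmatrix} a_k \\ b_k \end{bmatrix} = (A^\top W A)^{-1} A^\top W q, \qquad A = \begin{bmatrix} I_1 & 1 \\ \vdots & \vdots \\ I_n & 1 \end{bmatrix}, \quad W = \mathrm{diag}(w_1, \dots, w_n)$$

which is the sense in which this matches WLS-style formulations.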
67. Determine the coefficients so that q = aI + b holds for all the coefficients within the kernel, using only the locations that satisfy the condition: a weighted linear least squares within each kernel.
* Here the weight w_i^k (the weight for pixel position i at kernel position k) is a 0-or-1 hard thresholding.

The stacked system is the same as on slide 65, now with each $w_i^k \in \{0, 1\}$.
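With 0/1 weights the fit is the same closed form computed over the selected pixels only. A minimal sketch of such a hard-thresholded fit (my own illustration of the idea, not the paper's estimator; `mask` stands in for the thresholded weights):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def masked_local_fit(I, p, mask, r=4, eps=1e-3):
    """Local linear fit q = a*I + b using only pixels with weight 1 in each window.

    I, p: float arrays; mask: 0/1 array of hard-thresholded weights w_i^k.
    Dividing mask-weighted window means by the mean of the mask turns them
    into means over the selected pixels only.
    """
    mask = mask.astype(float)
    mean = lambda x: uniform_filter(x, size=2 * r + 1)
    m = mean(mask) + 1e-12            # fraction of selected pixels per window
    mu_I = mean(mask * I) / m         # masked means
    mu_p = mean(mask * p) / m
    var_I = mean(mask * I * I) / m - mu_I ** 2
    cov_Ip = mean(mask * I * p) / m - mu_I * mu_p
    a = cov_Ip / (var_I + eps)
    b = mu_p - a * mu_I
    return a, b
```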
68. Cross based local filter (CLF)
• A shape-adaptive joint box filter
Uses the integral image.
An ordinary box filter is a rectangular filter with constant weights; with an integral image it can be computed in O(1) per pixel.
[Figure: image p]
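For reference, the O(1) trick: after one pass to build the summed-area table, any box sum costs four lookups regardless of the radius. A small NumPy sketch (function names are mine):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero top row/column for easy differencing."""
    return np.pad(img.cumsum(0).cumsum(1), ((1, 0), (1, 0)))

def box_sum(img, r):
    """Sum over a (2r+1)x(2r+1) window at every pixel, O(1) per pixel.

    Four table lookups per pixel regardless of r, which is the property
    the slide refers to; borders are handled by zero padding here.
    """
    h, w = img.shape
    s = integral_image(np.pad(img, r, mode='constant'))
    d = 2 * r + 1
    return s[d:d+h, d:d+w] - s[d:d+h, :w] - s[:h, d:d+w] + s[:h, :w]

# A box (mean) filter is then box_sum(img, r) / (2*r + 1)**2.
```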
69. Cross based local filter (CLF)
• A shape-adaptive joint box filter
– Uses the Cross-Based Orthogonal Integral Image.
An adaptive box filter built on the OII: a filter whose weights vary adaptively between 0 and 1.
Normally every coefficient within the filter radius would have to be computed, but...
70. Cross based local filter (CLF)
• A shape-adaptive joint box filter
– Uses the Cross-Based Orthogonal Integral Image.
Made O(1) per pixel by the following procedure (sketched in code below):
Build a cross (a plus-shaped skeleton) at each pixel from edge information in the filtering target.
First, build a horizontal integral image.
Compute the integrated value along each pixel's horizontal cross arms.
Finally, apply the same processing in the vertical direction.
The principle is the same as that of a separable filter.
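A compact sketch of that two-pass aggregation, assuming the per-pixel arm lengths (left/right/up/down) have already been derived from the cross construction; all names here are my own, not the paper's API:

```python
import numpy as np

def cross_aggregate(img, left, right, up, down):
    """Two-pass O(1)-per-pixel aggregation over cross-based supports.

    left/right/up/down: integer arrays of per-pixel arm lengths from the
    cross construction. Pass 1 sums along each pixel's horizontal arms via
    a horizontal integral image; pass 2 repeats the trick vertically on
    those horizontal sums, like a separable filter.
    """
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Pass 1: horizontal integral image, then sums along each pixel's horizontal arms.
    ih = np.pad(img.cumsum(1), ((0, 0), (1, 0)))        # zero column in front
    horiz = ih[ys, np.minimum(xs + right, w - 1) + 1] - ih[ys, np.maximum(xs - left, 0)]

    # Pass 2: vertical integral image of those sums, then sums along vertical arms.
    iv = np.pad(horiz.cumsum(0), ((1, 0), (0, 0)))      # zero row on top
    return iv[np.minimum(ys + down, h - 1) + 1, xs] - iv[np.maximum(ys - up, 0), xs]
```

Dividing by `cross_aggregate(np.ones_like(img), ...)` with the same arms gives the adaptive box average over each cross-shaped support.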