
Learning Spatial Common Sense with
Geometry-Aware Recurrent Networks

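Novel View Synthesis
• The task of predicting the image seen from another viewpoint, given several observations
• Closely related to mental rotation, a phenomenon studied in brain science
• Humans can rotate an image that they picture in their mind
• The cvpaperchallenge presentation on Novel View Synthesis was interesting, so this talk introduces a related paper
Figure: Shepard and Metzler's experiment
Source: "Yoku Wakaru Ninchi Kagaku" (Inui, Yoshikawa, Kawaguchi, eds., 2011, Minerva Shobo), p. 61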

Example: Generative Query Network (GQN)
• Through the novel view synthesis task, the model acquires a scene representation that compresses the spatial information
• Eslami et al. proposed a method that realizes this within a conditional VAE framework
• For Japanese readers, the two explanatory slide decks below are easy to follow
• /MasayaKaneko/neural-scene-representation-and-rendering-33d
• /DeepLearningJP2016/dlgqn-111725780
S. Eslami et al. "Neural Scene Representation and Rendering", Science, 2018.

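Novel View Synthesis
• The cvpaperchallenge presentation covered an overview of the task, an introduction to the papers, and the open problems
• The following open problems were raised:
  • Category-independent novel view image generation
  • Novel view synthesis for multiple objects
  • Novel view synthesis on real data
  • Generalization to unseen viewpoints
• My impression after the presentation: in recent work, the key question seems to be how to model the scene representation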

Novel View Synthesis papers
• Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
• Visual Object Networks: Image Generation with Disentangled 3D Representations
• Transformable Bottleneck Networks
• DeepVoxels: Learning Persistent 3D Feature Embeddings
• Geometry-Aware Recurrent Neural Networks for Active Visual Recognition
• Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence
• Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
• View Synthesis by Appearance Flow
• Learning Spatial Common Sense with Geometry-Aware Recurrent Networks

Novel View Synthesis papers
• The paper presented in this talk: Learning Spatial Common Sense with Geometry-Aware Recurrent Networks

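Bibliographic information
• A paper from a research team at CMU
• The last author, Fragkiadaki, reportedly aims to make machines understand movies
• This work may be positioned as a technique needed toward that goal
• Accepted to CVPR 2019 as an oral
• The work proposes a method that integrates information from multiple viewpoints into a 3D latent representation
• Brings knowledge of geometry into the deep learning model
• Robust to occlusion
• Unless otherwise noted, figures are taken from the paper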

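Overview
• Proposes a method that integrates features extracted from multiple 2D images into a 3D latent representation
• The method builds differentiable geometric operations (projection, reprojection, ego-motion estimation, etc.) into deep learning
• The positional correspondence between the real world and the 3D features is preserved
• The model is trained on the task of predicting a novel view from a short image sequence
• 3D segmentation and 3D object detection can also be learned
• For detection in particular, the detections account for object permanence (robust to occlusion)
• The authors conclude that this is a necessary technique for giving embodied visual agents spatial common sense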

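Background and motivation
• Recent image recognition models lack the sense of object permanence and the spatial understanding that humans have
• When objects pass each other in a video, an object should still exist in the part that is hidden (permanence)
• This ability is not acquired by supervised learning on image + label data
• A new model needs to be proposed
• Proposes a Geometry-aware RNN that integrates a sequence of 2D images into a 3D feature
1. Unproject the 2D features into 3D space (unprojection)
2. Predict the ego-motion
3. Update the 3D feature of the scene with a GRU
• Much of the method is inspired by SLAM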

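Advantages of the proposed method
• High generalization performance on the novel view prediction task
• Greatly outperforms a method that does not consider geometry (GQN)
• Note, however, that this is the case where ground-truth (GT) ego-motion is used instead of estimated ego-motion
• Also applicable to 3D segmentation and 3D object detection
• Obtains detections that are robust to temporary occlusion caused by viewpoint changes
• A recognition method that understands object permanence
• The implementation is public
• https://github.com/ricsonc/grnn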

Proposed method
• The proposed method (figure) consists of four components
1. Unprojection
2. Egomotion estimation and stabilization
3. Recurrent map update
4. Projection and decoding

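Unprojection
1. Extract a 2D feature map with a CNN (2D U-net)
2. Unproject the 2D feature map into 3D space
3. Build a 3D occupancy grid of the same size from the depth map (a tensor that is 1 where an object is present and 0 elsewhere)
4. Extract the 3D feature tensor with a 3D U-net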
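Step 3 above can be illustrated with a small NumPy sketch. It assumes a pinhole camera with known intrinsics and a grid covering an axis-aligned box in camera coordinates. The function name `depth_to_occupancy` and all of its parameters are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def depth_to_occupancy(depth, fx, fy, cx, cy, grid_size, lo, hi):
    """Unproject a depth map (H, W) into a binary occupancy grid.

    lo, hi: length-3 corners of the box (camera coordinates) covered
    by the grid. Voxels hit by at least one depth pixel are set to 1.
    """
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]            # pixel rows (v) and columns (u)
    z = depth
    x = (u - cx) / fx * z                # pinhole back-projection
    y = (v - cy) / fy * z
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pts = pts[pts[:, 2] > 0]             # drop pixels without valid depth

    # Map metric coordinates to integer voxel indices.
    idx = np.floor((pts - lo) / (hi - lo) * grid_size).astype(int)
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    idx = idx[inside]

    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid
```

The same index computation could be reused to scatter 2D feature vectors into a feature grid instead of writing ones.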

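Egomotion estimation and stabilization
• Assumption: only the viewing angle changes between viewpoints (the camera's distance is held fixed)
• The 3D tensor built from the new viewpoint is rotated by several different angles
• One rotated tensor is created for each candidate angle
• Each is scored by its inner product with the 3D feature memory at that time step, and the highest-scoring angle is taken as the estimated ego-motion
• In practice, a score-weighted average is used
• The tensor is rotated again by the estimated angle and then fed to the GRU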
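The scoring step above can be sketched in NumPy as follows. This is a simplification under stated assumptions: only four candidate angles (multiples of 90 degrees, via `np.rot90`) are tried, and the names `estimate_rotation` and `temperature` are hypothetical rather than taken from the paper or its code.

```python
import numpy as np

def estimate_rotation(memory, feat, temperature=1.0):
    """Score rotated copies of `feat` against `memory`.

    memory, feat: arrays of shape (C, X, Y, Z).
    Candidates are rotations by k * 90 degrees in the X-Y plane.
    Returns the best k, the softmax weights over candidates,
    and the aligned feature tensor.
    """
    candidates = [np.rot90(feat, k, axes=(1, 2)) for k in range(4)]
    # Inner product between the memory and each rotated candidate.
    scores = np.array([np.sum(memory * c) for c in candidates])
    # Softmax weights: what a score-weighted average would use
    # in place of a hard argmax.
    weights = np.exp((scores - scores.max()) / temperature)
    weights /= weights.sum()
    best = int(np.argmax(scores))
    return best, weights, candidates[best]
```

With finer angular steps the candidates would need interpolated rotations instead of `np.rot90`, but the scoring logic stays the same.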

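Recurrent map update
• The 3D feature tensor, with its orientation aligned by the egomotion estimation, is fed to a 3D convolutional Gated Recurrent Unit (GRU) layer
• The hidden state, the 3D feature memory, is updated step by step
• The memory is initialized to 0
• On the novel view prediction task, similar performance was obtained by simply averaging instead of using the GRU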
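A minimal sketch of the two update rules mentioned above, assuming 1x1x1 kernels so that each "convolution" reduces to a per-voxel linear map over channels. The parameter names (`Wz`, `Uz`, and so on) and both function names are illustrative assumptions. The layer described in the slides uses real 3D convolutions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_update(m, v, params):
    """One GRU step on a 3D memory.

    m: previous memory, v: aligned input feature, both (C, X, Y, Z).
    params: dict of (C, C) matrices acting as 1x1x1 convolutions.
    """
    mix = lambda W, t: np.einsum("oc,cxyz->oxyz", W, t)
    z = sigmoid(mix(params["Wz"], v) + mix(params["Uz"], m))      # update gate
    r = sigmoid(mix(params["Wr"], v) + mix(params["Ur"], m))      # reset gate
    h = np.tanh(mix(params["Wh"], v) + mix(params["Uh"], r * m))  # candidate
    return (1.0 - z) * m + z * h

def running_average(m, v, t):
    """Averaging alternative: mean of the first t + 1 aligned features
    (t counts from 0, with the memory initialized to zeros)."""
    return (m * t + v) / (t + 1)
```

Starting from a zero memory, `running_average` reproduces the plain mean of the inputs seen so far, which is the baseline the last bullet refers to.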

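Projection and decoding
• The part of the network that performs the task using the resulting 3D feature memory
• Given a query viewpoint, the memory is rotated
• It is projected to 2D features for each depth value, and these are stacked
• The stacked features are decoded by a convLSTM into the RGB image for the query viewpoint
• Object visibility is not given explicitly and is left to the network's computation
• The tasks performed are view prediction and 3D MaskRCNN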
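The rotate-then-stack idea can be sketched very roughly as below. The sketch assumes an orthographic camera and query rotations restricted to multiples of 90 degrees, so that "projecting each depth" reduces to slicing along the depth axis. The method in the slides uses a proper projection, and `project_memory` is a hypothetical name.

```python
import numpy as np

def project_memory(memory, k):
    """Rotate the memory to the query view and stack its depth slices.

    memory: (C, X, Y, Z), with Z treated as depth after rotation.
    k: query rotation in units of 90 degrees (X-Z plane).
    Returns a 2D feature map of shape (C * Z, X, Y).
    """
    rotated = np.rot90(memory, k, axes=(1, 3))
    c, x, y, z = rotated.shape
    # Move depth next to channels, then fold it into the channel axis.
    return rotated.transpose(0, 3, 1, 2).reshape(c * z, x, y)
```

The stacked map is what a 2D decoder would then consume. Occlusion reasoning is left to that decoder, as the slide notes.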

Projection and decoding
• For view prediction, the memory is decoded as shown in the figure to render the image
• For 3D MaskRCNN, the candidate regions of the memory are ROI-pooled and decoded to generate object masks

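Experiments
• The questions to be verified are:
1. Do GRNNs learn spatial common sense?
2. Is a geometry-aware network architecture necessary for acquiring spatial common sense?
3. How well do GRNNs perform?
• Spatial common sense refers to the spatial understanding that humans have in general (in a broad sense)
  • 3D shapes can be generated by inflating 2D planes
  • A scene is composed of objects
  • 3D objects do not intersect
  • Objects do not simply cease to exist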

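View prediction
• The task of predicting the image at a new viewpoint from multiple input images
• Datasets used in the experiments
  • ShapeNet: the training data uniformly uses a setting in which two objects are observed
  • Shepard-Metzler: Tetris-like shapes
  • Rooms-ring-camera dataset: objects placed at random inside a room
• Baseline: GQN
• To match conditions, the proposed method did not use GT depth maps but did use GT ego-motion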

View prediction
• The reconstruction error is smaller for the proposed method
• Its predictions are more accurate

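View prediction
• Case where the number of objects is increased to four at test time only
• The results in the figure demonstrate the high generalization performance of the proposed method (it predicts well even in an unseen setting)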

View prediction
• Addition and subtraction of feature representations

3D object detection and segmentation
• Concretely, the task performed is instance segmentation
• The dataset is built from ShapeNet
• Evaluated by mean Average Precision (mAP)
• Verified in four settings
  • Non-geometry-aware model + GT ego-motion + GT depth
  • GRNN + GT ego-motion + estimated depth
  • GRNN + GT ego-motion + GT depth
  • GRNN + estimated ego-motion + GT depth
• Open question: why is there no setting in which both are estimated?

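• GRNN + GT ego-motion + GT depth gives the best results (as expected)
• At mAP0.75, GRNN outperforms the non-geometry-aware model
• This shows the advantage of the geometry-aware proposed method
• It would be even better if both ego-motion and depth could be estimated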
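For readers unfamiliar with the 0.75 threshold in mAP0.75, the sketch below shows the overlap test that decides whether a detection counts as correct. It assumes axis-aligned 3D boxes for simplicity. The slides do not say how overlap is measured, and `iou_3d` and `is_match` are hypothetical names.

```python
import numpy as np

def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes.

    Each box is (xmin, ymin, zmin, xmax, ymax, zmax).
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    lo = np.maximum(a[:3], b[:3])          # lower corner of the overlap
    hi = np.minimum(a[3:], b[3:])          # upper corner of the overlap
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return inter / (vol_a + vol_b - inter)

def is_match(pred, gt, threshold=0.75):
    """A prediction counts as a true positive if its IoU reaches the threshold."""
    return iou_3d(pred, gt) >= threshold
```

Average precision is then computed from the true and false positives obtained this way, ranked by detection confidence.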

• Integrating observations from multiple viewpoints yields detections that are robust to occlusion

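Summary
• To acquire spatial common sense, the paper proposes a network architecture that generates a 3D feature from a sequence of 2D images
• This is achieved with differentiable geometric operations such as unprojection and ego-motion estimation
• On the novel view prediction task, it shows low reconstruction error and high generalization performance
• For 3D object detection & segmentation, detection robust to occlusion was confirmed
• Future works
  • A model applicable to real data, dynamic scenes, and similar settings
  • Better computational efficiency by exploiting the sparsity of the 4D tensor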

Related work
• Accepted to NeurIPS 2018
• A paper from the same research group
• Applies a similar system to an agent that actively changes its observation position
• It learned a policy that moves the viewpoint in more informative directions
R. Cheng et al. "Geometry-Aware Recurrent Neural Networks for Active Visual Recognition", NeurIPS, 2018.
R. Cheng et al. Supplemental materials of "Geometry-Aware Recurrent Neural Networks for Active Visual Recognition", NeurIPS, 2018.

Bonus: Novel View Synthesis survey

View Synthesis by Appearance Flow
• Solves the novel view synthesis task by estimating a flow from the 2D input image.
• The overall framework is shown in the figure. It can be trained end to end.
T. Zhou et al. "View Synthesis by Appearance Flow", in ECCV, 2016.
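The core operation behind a flow-based approach like this, sampling the source image at predicted coordinates, can be sketched as follows. This is a generic bilinear warp under stated assumptions: the flow is given as absolute (row, col) source coordinates per target pixel, and `warp_by_flow` is a hypothetical name rather than the paper's API.

```python
import numpy as np

def warp_by_flow(src, coords):
    """Build a target image by bilinearly sampling `src`.

    src:    (H, W, C) source image.
    coords: (H, W, 2) source (row, col) position for each target pixel.
    """
    h, w = src.shape[:2]
    r = np.clip(coords[..., 0], 0, h - 1)
    c = np.clip(coords[..., 1], 0, w - 1)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    wr = (r - r0)[..., None]               # vertical interpolation weight
    wc = (c - c0)[..., None]               # horizontal interpolation weight
    top = (1 - wc) * src[r0, c0] + wc * src[r0, c1]
    bottom = (1 - wc) * src[r1, c0] + wc * src[r1, c1]
    return (1 - wr) * top + wr * bottom
```

Because every output pixel is a weighted sum of source pixels, the warp is differentiable with respect to the coordinates, which is what makes end-to-end training of the flow predictor possible.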

Transformation-Grounded Image Generation Network for Novel 3D View Synthesis
• For the novel view synthesis task, proposes a framework in which the parts of the object at the new viewpoint that are visible in the source image are copied from it, and the remaining parts are generated with a GAN. The network consists of a disocclusion-aware appearance flow network (DOAFN) and a completion network.
• Obtains better results than the earlier Appearance Flow Network (AFN).
E. Park et al. "Transformation-Grounded Image Generation Network for Novel 3D View Synthesis", in CVPR, 2017.

Visual Object Networks: Image Generation with Disentangled 3D Representations
• Proposes a 3D-aware image generation method
• Generate a 3D shape → rotate it into a depth image and mask for the target viewpoint → given a texture code, render the image with a CNN
J. Zhu et al. "Visual Object Networks: Image Generation with Disentangled 3D Representations", in NeurIPS, 2018.

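Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence
• Proposes a method that generates the image at a new viewpoint from images at multiple viewpoints. The framework consists of a FlowPredictor and a Recurrent Pixel Generator: the former estimates the flow from the source image to the target image, and the latter tries to reconstruct the image directly from the input. Finally, the two are combined, weighted by confidence.
• Experiments on 3DCG objects achieved the state of the art at the time.
S. Sun et al. "Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence", in ECCV, 2018.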
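The final confidence-weighted combination can be illustrated with the sketch below. It assumes the confidences are normalized per pixel with a softmax across the candidate predictions. The slides do not specify the normalization, and `aggregate` is a hypothetical name.

```python
import numpy as np

def aggregate(predictions, confidences):
    """Combine candidate images using per-pixel confidence weights.

    predictions: (N, H, W, C) candidate images.
    confidences: (N, H, W) unnormalized confidence maps.
    """
    # Softmax over the N candidates at every pixel (numerically stable).
    shifted = confidences - confidences.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)
    return np.sum(weights[..., None] * predictions, axis=0)
```

With equal confidences this reduces to a plain average, and as one candidate's confidence grows the output approaches that candidate alone.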

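Transformable Bottleneck Networks
• Proposes a method that lets a CNN perform 3D manipulation of 2D images.
• Extracts a 3D feature from the image, applies a transformation for the target pose to it, projects it to 2D, and then performs downstream tasks such as image reconstruction.
• This enables 3D-aware image manipulation that is not limited to rigid rotation.
K. Olszewski et al. "Transformable Bottleneck Networks", 2019.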

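DeepVoxels: Learning Persistent 3D Feature Embeddings
• Proposes a method that condenses an image sequence into a single voxel representation.
• The framework processes data in the following order.
• Extract 2D features from the images → reproject the 2D features to 3D features → integrate them over the image sequence with a GRU → project the 3D features to the target viewpoint and reconstruct the image
• The whole framework is trained with this reconstruction error.
• The method performs well on novel view synthesis.
V. Sitzmann et al. "DeepVoxels: Learning Persistent 3D Feature Embeddings", in CVPR, 2019.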


References
• S. Eslami et al. "Neural Scene Representation and Rendering", Science, 2018.
• T. Zhou et al. "View Synthesis by Appearance Flow", in ECCV, 2016.
• E. Park et al. "Transformation-Grounded Image Generation Network for Novel 3D View Synthesis", in CVPR, 2017.
• J. Zhu et al. "Visual Object Networks: Image Generation with Disentangled 3D Representations", in NeurIPS, 2018.
• S. Sun et al. "Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence", in ECCV, 2018.
• K. Olszewski et al. "Transformable Bottleneck Networks", 2019.
• V. Sitzmann et al. "DeepVoxels: Learning Persistent 3D Feature Embeddings", in CVPR, 2019.
• R. Cheng et al. "Geometry-Aware Recurrent Neural Networks for Active Visual Recognition", NeurIPS, 2018.
• H. Tung et al. "Learning Spatial Common Sense with Geometry-Aware Recurrent Networks", in CVPR, 2019.
