Deep neural networks for semantic segmentation
| dc.contributor.author | Bojja, Abhishake Kumar | |
| dc.contributor.supervisor | Yi, Kwang Moo | |
| dc.contributor.supervisor | Tagliasacchi, Andrea | |
| dc.date.accessioned | 2020-04-29T03:43:07Z | |
| dc.date.available | 2020-04-29T03:43:07Z | |
| dc.date.copyright | 2020 | en_US |
| dc.date.issued | 2020-04-28 | |
| dc.degree.department | Department of Computer Science | |
| dc.degree.level | Master of Science M.Sc. | en_US |
| dc.description.abstract | Segmenting an image into multiple meaningful regions is an essential task in Computer Vision. Deep Learning has been highly successful for segmentation, benefiting from the availability of annotated datasets and deep neural network architectures. However, depth-based hand segmentation, an important application area of semantic segmentation, has yet to benefit from rich, large datasets. In addition, while deep methods provide robust solutions, they are often not efficient enough for low-powered devices. In this thesis, we focus on these two problems. To tackle the lack of rich data, we propose an automatic method for generating high-quality annotations and introduce a large-scale hand segmentation dataset. By exploiting the visual cues given by an RGBD sensor and a pair of colored gloves, we automatically generate dense annotations for two-hand segmentation. Our automatic annotation method lowers the cost and complexity of creating high-quality datasets and makes it easy to expand the dataset in the future. To reduce the computational requirements and allow real-time segmentation on low-powered devices, we propose a new representation and architecture for deep networks that predict segmentation maps based on Voronoi Diagrams. Voronoi Diagrams split space into discrete regions based on proximity to a set of points, making them a powerful representation of regions, which we can then use to represent our segmentation outcomes. Specifically, we propose to estimate the location and class of these sets of points, which are then rasterized into an image. Notably, we use a differentiable definition of the Voronoi Diagram based on the softmax operator, enabling its use as a decoder layer in an end-to-end trainable network. As rasterization can take place at any given resolution, our method especially excels at rendering high-resolution segmentation maps from a low-resolution input. We believe that our new HandSeg dataset will open new frontiers in hand segmentation research, and our cost-effective automatic annotation pipeline can benefit other relevant labeling tasks. Our newly proposed segmentation network enables high-quality segmentation representations that are not practical on low-powered devices with existing approaches. | en_US |
| dc.description.scholarlevel | Graduate | en_US |
| dc.identifier.bibliographicCitation | Abhishake Kumar Bojja, Franziska Mueller, Sri Raghu Malireddi, Markus Oberweger, Vincent Lepetit, Christian Theobalt, Kwang Moo Yi, and Andrea Tagliasacchi. HandSeg: An automatically labeled dataset for hand segmentation from depth images. In 2019 16th Conference on Computer and Robot Vision (CRV), pages 151–158. IEEE, 2019. | en_US |
| dc.identifier.uri | http://hdl.handle.net/1828/11696 | |
| dc.language | English | eng |
| dc.language.iso | en | en_US |
| dc.rights | Available to the World Wide Web | en_US |
| dc.subject | Deep Learning | en_US |
| dc.subject | Computer Vision | en_US |
| dc.subject | Semantic Segmentation | en_US |
| dc.subject | Dataset | en_US |
| dc.subject | Hands | en_US |
| dc.subject | Hand Segmentation | en_US |
| dc.subject | Automatic Labelling | en_US |
| dc.subject | Voronoi | en_US |
| dc.subject | Implicit Representation | en_US |
| dc.subject | Rendering | en_US |
| dc.subject | Cityscapes | en_US |
| dc.subject | HandSeg | en_US |
| dc.title | Deep neural networks for semantic segmentation | en_US |
| dc.type | Thesis | en_US |
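The abstract's differentiable Voronoi decoder can be sketched as a softmax over (negative) distances from each pixel to a set of learned sites. The function below is an illustrative reconstruction of that idea, not the thesis's exact formulation: the parameter names, the normalized coordinate grid, and the inverse-temperature `beta` are assumptions made for the sketch.

```python
import numpy as np

def soft_voronoi_rasterize(points, class_logits, height, width, beta=50.0):
    """Rasterize a soft (differentiable) Voronoi diagram.

    points:       (N, 2) site locations in normalized [0, 1] coordinates
    class_logits: (N, C) per-site class scores
    returns:      (H, W, C) per-pixel class probabilities

    Note: `beta` (softmax sharpness) and the [0, 1] grid are illustrative
    assumptions; as beta -> infinity this approaches a hard Voronoi diagram.
    """
    ys, xs = np.meshgrid(np.linspace(0, 1, height),
                         np.linspace(0, 1, width), indexing="ij")
    pix = np.stack([ys, xs], axis=-1).reshape(-1, 2)             # (H*W, 2)
    # Squared distance from every pixel to every Voronoi site.
    d2 = ((pix[:, None, :] - points[None, :, :]) ** 2).sum(-1)   # (H*W, N)
    # Softmax over sites: a smooth, differentiable "nearest site" assignment.
    w = np.exp(-beta * d2)
    w /= w.sum(axis=1, keepdims=True)                            # (H*W, N)
    # Per-site class distribution, blended by the soft assignments.
    cls = np.exp(class_logits)
    cls /= cls.sum(axis=1, keepdims=True)                        # (N, C)
    probs = w @ cls                                              # (H*W, C)
    return probs.reshape(height, width, -1)
```

Because the sites live in continuous coordinates, the same set of points can be rasterized at any output resolution simply by changing `height` and `width`, which mirrors the abstract's point about rendering high-resolution maps from low-resolution inputs; implemented with a framework's softmax instead of NumPy, every step is differentiable with respect to both site locations and class logits.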
Files
Original bundle
- Name: Bojja_AbhishakeKumar_MSc_2020.pdf
- Size: 7.06 MB
- Format: Adobe Portable Document Format
- Description: Thesis document
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission