Jonathan Masci

Google Scholar profile



Conference



Deep Networks with Internal Selective Attention through Feedback Connections.
M. F. Stollenga*, J. Masci*, F. Gomez, J. Schmidhuber,
Advances in Neural Information Processing Systems (NIPS), 2014.
Abstract. Traditional convolutional neural networks (CNN) are stationary and feedforward. They neither change their parameters during evaluation nor use feedback from higher to lower layers. Real brains, however, do. So does our Deep Attention Selective Network (dasNet) architecture. DasNet's feedback structure can dynamically alter its convolutional filter sensitivities during classification. It harnesses the power of sequential processing to improve classification performance, by allowing the network to iteratively focus its internal attention on some of its convolutional filters. Feedback is trained through direct policy search in a huge million-dimensional parameter space, using scalable natural evolution strategies (SNES). On the CIFAR-10 and CIFAR-100 datasets, dasNet outperforms the previous state-of-the-art model on unaugmented datasets.


multimodal similarity preserving hashing
Sparse Similarity-Preserving Hashing.
J. Masci, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro,
International Conference on Learning Representations (ICLR), 2014.
Abstract. In recent years, a lot of attention has been devoted to efficient nearest neighbor search by means of similarity-preserving hashing. One of the drawbacks of existing hashing techniques is the intrinsic trade-off between performance and computational complexity: while longer hash codes allow for lower false positive rates, it is very difficult to increase the embedding dimensionality without incurring very high false negative rates or prohibitive computational costs. In this paper, we propose a way to overcome this limitation by enforcing the hash codes to be sparse. Sparse high-dimensional codes enjoy the low false positive rates typical of long hashes, while keeping the false negative rates similar to those of a shorter dense hashing scheme with an equal number of degrees of freedom. We use a tailored feed-forward neural network for the hashing function. Extensive experimental evaluation involving visual and multi-modal data shows the benefits of the proposed method.


local competitive units LWTA
Compete to Compute.
R. K. Srivastava, J. Masci, S. Kazerounian, F. Gomez, J. Schmidhuber,
Advances in Neural Information Processing Systems (NIPS), 2013.
Abstract. Local competition among neighboring neurons is common in biological neural networks (NNs). In this paper, we apply the concept to gradient-based, backprop-trained artificial multilayer NNs. NNs with competing linear units tend to outperform those with non-competing nonlinear units, and avoid catastrophic forgetting when training sets change over time.
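As a rough illustration of the local winner-take-all (LWTA) mechanism this line of work builds on — the function name and block size below are my own, not taken from the paper:

```python
import numpy as np

def lwta(activations, block_size=2):
    """Local winner-take-all: units are grouped into small blocks; within
    each block only the maximally active (linear) unit passes its value
    through unchanged, while the others output zero."""
    a = activations.reshape(-1, block_size)       # group units into blocks
    mask = a == a.max(axis=1, keepdims=True)      # winner(s) per block
    return (a * mask).reshape(activations.shape)

x = np.array([0.3, -1.2, 2.0, 0.5])
print(lwta(x))  # [0.3, 0.0, 2.0, 0.0] — only the block winners survive
```

Note that the winning unit is linear (its value is copied, not squashed), which is the "competing linear units" property the abstract refers to.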


Fast Learning Algorithm for Image Segmentation with Deep MPCNN
A Fast Learning Algorithm for Image Segmentation with Max-Pooling Convolutional Networks.
J. Masci, A. Giusti, D. C. Ciresan, G. Fricout, J. Schmidhuber,
International Conference on Image Processing (ICIP), 2013.
Abstract. We present a fast algorithm for training Max-Pooling Convolutional Networks to segment images. This type of network yields record-breaking performance in a variety of tasks, but is normally trained on a computationally expensive patch-by-patch basis. Our new method processes each training image in a single pass, which is vastly more efficient. We validate the approach in different scenarios and report a 1500-fold speed-up. In an application to automated steel defect detection and segmentation, we obtain excellent performance with short training times.


Quantifying Challenging Images of Fiber-like Structures.
A. Giusti, J. Masci, P. M. V. Rancoita,
International Conference on Image Processing (ICIP), 2013.
Abstract. We present a practical, parameter-free, general computational-statistical technique for quantitative analysis of 2D images representing fiber-like structures (vessels, neurons, elongated objects, cell boundaries...), which is a common task in many experimental biomedicine scenarios. Our approach does not require segmentation or tracing of fibers; instead, it relies on a learned detector of intersections between fibers and arbitrary segments. The detector's probabilistic outputs are used to compute an estimate of the density of fibers and of its uncertainty; the latter accounts for several factors, including the intrinsic difficulty of the problem, i.e., the inaccuracy of the detector. After a few minutes of training by the user, the procedure performs well in a variety of challenging scenarios, and compares favorably even with problem-specific algorithms.


Fast Image Scanning with Deep MPCNN
Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks.
A. Giusti, D. C. Ciresan, J. Masci, J. Schmidhuber,
International Conference on Image Processing (ICIP), 2013.
Abstract. Deep Neural Networks now excel at image classification, detection and segmentation. When used to scan images by means of a sliding window, however, their high computational complexity can bring even the most powerful hardware to its knees. We show how dynamic programming can speed up the process by orders of magnitude, even when max-pooling layers are present.


Multi-Scale Pyramidal Pooling Network
Multi-Scale Pyramidal Pooling Network for Generic Steel Defect Classification.
J. Masci, U. Meier, G. Fricout, J. Schmidhuber,
International Joint Conference on Neural Networks (IJCNN), 2013.
Abstract. We introduce a Multi-Scale Pyramidal Pooling Network tailored to generic steel defect classification, featuring a novel pyramidal pooling layer at multiple scales and a novel encoding layer. Thanks to the former, the network does not require all images of a given classification task to be of equal size. The latter narrows the gap to bag-of-features approaches. On various benchmark datasets, we evaluate and compare our system to convolutional neural networks and state-of-the-art computer vision methods. We also present results on a real industrial steel defect classification problem, where existing architectures are not applicable as they require equally sized input images. Our method substantially outperforms previous methods based on engineered features. It can be seen as a fully supervised hierarchical bag-of-features extension that is trained online and can be fine-tuned for any given task.


Learning Morphological Operators
A Learning Framework for Morphological Operators Using Counter-Harmonic Mean.
J. Masci, J. Angulo, J. Schmidhuber,
Mathematical Morphology and Its Applications to Signal and Image Processing, 11th International Symposium, ISMM, 2013.
Abstract. We present a novel framework for learning morphological operators using the counter-harmonic mean. It combines concepts from mathematical morphology and convolutional neural networks. A thorough experimental validation analyzes the basic morphological operators (dilation and erosion, opening and closing) as well as the much more complex top-hat transform, for which we report a real-world application from the steel industry. Using online learning and stochastic gradient descent, our system learns both the structuring element and the composition of operators. It scales well to large datasets and online settings.
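For the reader unfamiliar with the counter-harmonic mean, the idea (written here from memory, so treat the exact notation as a sketch rather than a quote from the paper) is a single parametric layer that interpolates between linear convolution and morphological operators:

```latex
% Counter-harmonic mean layer: for a (nonnegative) image f, kernel w and order p,
%   \mathrm{PConv}(f, w; p)(x) = \frac{(f^{p+1} * w)(x)}{(f^{p} * w)(x)}
% where * is ordinary convolution and powers are taken pixelwise. Then:
%   p = 0          -> standard linear convolution,
%   p \to +\infty  -> approximates a dilation by the structuring element induced by w,
%   p \to -\infty  -> approximates the corresponding erosion.
% Since the expression is differentiable in w (and p), both the structuring
% element and the operator type can be learned by gradient descent.
```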


Steel Defect Classification with Max-Pooling Convolutional Neural Networks.
J. Masci, U. Meier, D. C. Ciresan, J. Schmidhuber,
International Joint Conference on Neural Networks (IJCNN), 2012.
Abstract. We present a Max-Pooling Convolutional Neural Network approach for supervised steel defect classification. On a classification task with 7 defects, collected from a real production line, an error rate of 7% is obtained. Compared to SVM classifiers trained on commonly used feature descriptors, our best net performs at least two times better. Not only do we obtain much better results, but the proposed method also works directly on raw pixel intensities of detected and segmented steel defects, avoiding further time-consuming and hard-to-optimize ad hoc preprocessing.


Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction.
J. Masci, U. Meier, D. C. Ciresan, J. Schmidhuber,
International Conference on Artificial Neural Networks (ICANN), 2011.
Abstract. We present a novel convolutional auto-encoder (CAE) for unsupervised feature learning. A stack of CAEs forms a convolutional neural network (CNN). Each CAE is trained using conventional on-line gradient descent without additional regularization terms. A max-pooling layer is essential to learn biologically plausible features consistent with those found by previous approaches. Initializing a CNN with filters of a trained CAE stack yields superior performance on a digit (MNIST) and an object recognition (CIFAR10) benchmark.
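The greedy layer-wise pretraining scheme can be sketched as follows — using plain dense auto-encoders with tied weights for brevity instead of the paper's convolutional ones, and with all names and hyperparameters of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, n_hidden, lr=0.1, epochs=200):
    """One-layer auto-encoder with tied weights (decoder = W.T),
    trained by plain gradient descent on squared reconstruction error."""
    W = rng.normal(0, 0.1, (X.shape[1], n_hidden))
    for _ in range(epochs):
        H = np.tanh(X @ W)            # encode
        R = H @ W.T                   # decode with tied weights
        E = R - X                     # reconstruction error
        dH = (E @ W) * (1 - H ** 2)   # backprop through tanh
        W -= lr * (X.T @ dH + E.T @ H) / len(X)
    return W

# Greedy stacking: each layer is trained on the codes of the previous one;
# the resulting weights would then initialize a feedforward net for fine-tuning.
X = rng.normal(size=(64, 8))
W1 = train_autoencoder(X, 6)
W2 = train_autoencoder(np.tanh(X @ W1), 4)
codes = np.tanh(np.tanh(X @ W1) @ W2)
print(codes.shape)  # (64, 4)
```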


Flexible, High Performance Convolutional Neural Networks for Image Classification.
D. C. Ciresan, U. Meier, J. Masci, J. Schmidhuber,
International Joint Conference on Artificial Intelligence (IJCAI), 2011.
Abstract. We present a fast, fully parameterizable GPU implementation of Convolutional Neural Network variants. Our feature extractors are neither carefully designed nor pre-wired, but rather learned in a supervised way. Our deep hierarchical architectures achieve the best published results on benchmarks for object classification (NORB, CIFAR10) and handwritten digit recognition (MNIST), with error rates of 2.53%, 19.51%, 0.35%, respectively. Deep nets trained by simple back-propagation perform better than more shallow ones. Learning is surprisingly rapid. NORB is completely trained within five epochs. Test error rates on MNIST drop to 2.42%, 0.97% and 0.48% after 1, 3 and 17 epochs, respectively.


A Committee of Neural Networks for Traffic Sign Classification.
D. C. Ciresan, U. Meier, J. Masci, J. Schmidhuber,
International Joint Conference on Neural Networks (IJCNN), 2011.
Abstract. We describe the approach that won the preliminary phase of the German traffic sign recognition benchmark with a better-than-human recognition rate of 98.98%. We obtain an even better recognition rate of 99.15% by further training the nets. Our fast, fully parameterizable GPU implementation of a Convolutional Neural Network does not require careful design of pre-wired feature extractors, which are rather learned in a supervised way. A CNN/MLP committee further boosts recognition performance.


On Fast Deep Nets for AGI Vision.
J. Schmidhuber, D. C. Ciresan, U. Meier, J. Masci, A. Graves,
Artificial General Intelligence (AGI), 2011.


AutoIncSFA and vision-based developmental learning for humanoid robots.
V. R. Kompella, L. Pape, J. Masci, M. Frank, J. Schmidhuber,
Humanoids, 2011.



Journal



multimodal similarity preserving hashing
Multimodal Similarity-Preserving Hashing.
J. Masci, M. M. Bronstein, A. M. Bronstein, J. Schmidhuber,
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36/4, pp. 824-830, April 2014.
Abstract. We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.
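To give a flavor of the similarity-preserving objective behind siamese hashing networks, here is a minimal contrastive loss of the general kind such architectures optimize — this is my own illustrative sketch, not the paper's exact coupled-siamese formulation, and all names and the margin value are assumptions:

```python
import numpy as np

def siamese_hash_loss(h1, h2, similar, margin=2.0):
    """Contrastive loss on two batches of embeddings: pull similar pairs
    together (minimize squared L2 distance), push dissimilar pairs at
    least `margin` apart. `similar` is 1.0 for similar pairs, 0.0 otherwise."""
    d2 = np.sum((h1 - h2) ** 2, axis=1)                          # squared distances
    pos = similar * d2                                           # similar: attract
    neg = (1 - similar) * np.maximum(0.0, margin - np.sqrt(d2)) ** 2  # dissimilar: repel
    return np.mean(pos + neg)

h1 = np.array([[1.0, -1.0], [1.0, 1.0]])
h2 = np.array([[1.0, -1.0], [-1.0, -1.0]])
sim = np.array([1.0, 0.0])   # first pair similar, second dissimilar
loss = siamese_hash_loss(h1, h2, sim)
```

In the multimodal setting, each modality gets its own encoder branch, but the two branches feed the same loss so their outputs land in one shared space.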


Multi-Column Deep Neural Network for Traffic Sign Classification [invited].
D. C. Ciresan, U. Meier, J. Masci, J. Schmidhuber,
Neural Networks, 2012.
Abstract. We describe the approach that won the final phase of the German traffic sign recognition benchmark. Our method is the only one that achieved a better-than-human recognition rate of 99.46%. We use a fast, fully parameterizable GPU implementation of a Deep Neural Network (DNN) that does not require careful design of pre-wired feature extractors, which are rather learned in a supervised way. Combining various DNNs trained on differently preprocessed data into a Multi-Column DNN (MCDNN) further boosts recognition performance, making the system insensitive to variations in contrast and illumination.



Workshop



multimodal similarity preserving hashing
Sparse Similarity-Preserving Hashing.
J. Masci, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro,
International BASP Frontiers workshop, 2015.
Abstract. In recent years, a lot of attention has been devoted to efficient nearest neighbor search by means of similarity-preserving hashing. One of the drawbacks of existing hashing techniques is the intrinsic trade-off between performance and computational complexity: while longer hash codes allow for lower false positive rates, it is very difficult to increase the embedding dimensionality without incurring very high false negative rates or prohibitive computational costs. In this paper, we propose a way to overcome this limitation by enforcing the hash codes to be sparse. Sparse high-dimensional codes enjoy the low false positive rates typical of long hashes, while keeping the false negative rates similar to those of a shorter dense hashing scheme with an equal number of degrees of freedom. We use a tailored feed-forward neural network for the hashing function. Extensive experimental evaluation involving visual and multi-modal data shows the benefits of the proposed method.


Understand Locally Competitive Networks
Understanding Locally Competitive Networks.
R. K. Srivastava, J. Masci, F. Gomez, J. Schmidhuber,
Advances in Neural Information Processing Systems (NIPS), 2014.
Abstract. Recently proposed neural network activation functions such as rectified linear, maxout, and local winner-take-all have allowed for faster and more effective training of deep neural architectures on large and complex datasets. The common trait among these functions is that they implement local competition between small groups of units within a layer, so that only part of the network is activated for any given input pattern. In this paper, we attempt to visualize and understand this self-modularization, and suggest a unified explanation for the beneficial properties of such networks. We also show how our insights can be directly useful for efficiently performing retrieval over large datasets using neural networks.



Book Chapter



descriptor learning for omnidirectional image matching
Descriptor learning for omnidirectional image matching.
J. Masci, D. Migliore, M. M. Bronstein, J. Schmidhuber,
Registration and Recognition in Images and Videos, 2014.
Abstract. Feature matching in omnidirectional vision systems is a challenging problem, mainly because complicated optical systems make the theoretical modelling of invariance and construction of invariant feature descriptors hard or even impossible. In this paper, we propose learning invariant descriptors using a training set of similar and dissimilar descriptor pairs. We use the similarity-preserving hashing framework, in which we try to map the descriptor data to the Hamming space while preserving the descriptor similarity on the training set. A neural network is used to solve the underlying optimization problem. Our approach outperforms not only straightforward descriptor matching, but also state-of-the-art similarity-preserving hashing methods.



Technical Report



Object Recognition with Multi-Scale Pyramidal Pooling Networks.
J. Masci, U. Meier, G. Fricout, J. Schmidhuber,


Multimodal similarity-preserving hashing.
J. Masci, M. M. Bronstein, A. M. Bronstein, J. Schmidhuber,


High-Performance Neural Networks for Visual Object Classification.
D. C. Ciresan, U. Meier, J. Masci, J. Schmidhuber,