Speech Emotion Recognition (SER) has emerged as a critical component of the next generation of human-machine interfacing technologies. In this work, we propose a new dual-level model that predicts emotions based on both MFCC features and mel-spectrograms produced from raw audio signals. Each utterance is preprocessed into MFCC features and two mel-spectrograms at different time-frequency resolutions. A standard LSTM processes the MFCC features, while a novel LSTM architecture, denoted as Dual-Sequence LSTM (DSLSTM), processes the two mel-spectrograms simultaneously. The outputs are later averaged to produce a final classification of the utterance. Our proposed model achieves, on average, a weighted accuracy of 72.7% and an unweighted accuracy of 73.3%, a 6% improvement over current state-of-the-art unimodal models, and is comparable with multimodal models that leverage textual information as well as audio signals.
Dual-Sequence LSTM
Mel-spectrogram
Speech emotion recognition
Time series
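The final averaging step and the two reported metrics can be illustrated with a small sketch. It assumes that each branch emits one class-probability vector per utterance, which the abstract does not specify, and it uses the convention common in the SER literature: weighted accuracy is overall accuracy, and unweighted accuracy is the mean of the per-class recalls. All function names and numbers below are hypothetical.

```python
from collections import defaultdict


def average_fusion(probs_a, probs_b):
    """Average two lists of class-probability vectors and return predicted labels."""
    predictions = []
    for pa, pb in zip(probs_a, probs_b):
        fused = [(x + y) / 2.0 for x, y in zip(pa, pb)]
        predictions.append(max(range(len(fused)), key=fused.__getitem__))
    return predictions


def weighted_accuracy(y_true, y_pred):
    """Overall accuracy: correctly classified utterances over all utterances."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (made-up numbers): 4 utterances, 2 emotion classes.
mfcc_branch = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
mel_branch = [[0.7, 0.3], [0.5, 0.5], [0.4, 0.6], [0.4, 0.6]]
labels = [0, 0, 0, 1]

predictions = average_fusion(mfcc_branch, mel_branch)
wa = weighted_accuracy(labels, predictions)
ua = unweighted_accuracy(labels, predictions)
```

On imbalanced data the two metrics differ, which is why both are usually reported.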
The Whittaker 2d growth model is a triangular continuous Markov diffusion process that appears in many scientific contexts. It has been of theoretical interest to establish a large deviation principle for this 2d process with a scaling factor. The main challenge lies in the spatiotemporal interactions and dynamics, which may depend on potential sample-path intersections. We develop such a principle with a novel rate function. Our approach is based on Schilder’s theorem, the contraction principle, and a special treatment of intersecting sample paths.
Large deviation principle
Markov diffusion process
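For reference, the two standard tools named in the abstract can be stated in their textbook form. This is background only, written for a standard Brownian motion B on [0, T]; it is not the paper's rate function, which the abstract does not give.

```latex
% Schilder's theorem: the family {\sqrt{\varepsilon} B} satisfies a large deviation
% principle on C_0([0,T]) (sup norm) with good rate function
\[
I(\varphi) =
\begin{cases}
\dfrac{1}{2} \displaystyle\int_0^T |\dot{\varphi}(t)|^2 \, dt,
  & \varphi \text{ absolutely continuous},\ \varphi(0) = 0,\ \dot{\varphi} \in L^2([0,T]), \\[1ex]
+\infty, & \text{otherwise.}
\end{cases}
\]

% Contraction principle: if f is continuous and {X_\varepsilon} satisfies an LDP with
% good rate function I, then {f(X_\varepsilon)} satisfies an LDP with rate function
\[
J(y) = \inf \{\, I(x) : f(x) = y \,\}.
\]
```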
This paper proves that a class of scaled Whittaker growth models converges in distribution to the Dyson Brownian motion. A Whittaker 2d growth model is a continuous-time Markov diffusion process embedded on a spatial triangular array. Our result is interesting because each particle in a Whittaker 2d growth model interacts only with its neighboring particles, whereas each particle in the Dyson Brownian motion interacts with all the other particles. We provide two different proofs of the main result.
Stochastic differential equations
Dyson Brownian motion
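The all-to-all interaction mentioned in the abstract is visible in the standard stochastic differential equation for Dyson Brownian motion, given here in one common normalization. The parameter β and the scaling follow the textbook convention and are assumptions; the abstract does not state which normalization the paper uses.

```latex
\[
d\lambda_i(t) = dB_i(t) + \frac{\beta}{2} \sum_{j \neq i} \frac{dt}{\lambda_i(t) - \lambda_j(t)},
\qquad i = 1, \dots, n,
\]
% B_1, ..., B_n are independent standard Brownian motions.
% The drift of particle i sums over every other particle j \neq i.
```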
In this work, we propose the concept of complementary lattice arrays in order to enable a broader range of designs for coded aperture imaging systems. We provide a general framework and methods that generate richer and more flexible designs than existing techniques. We also review and interpret the state-of-the-art uniformly redundant array designs, broaden the related concepts, and propose new design methods.
Combinatorial design
Complementary sequence
Imaging systems
X-ray coded apertures
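The keyword "Complementary sequence" refers to a classical one-dimensional notion: two sequences form a complementary pair when their aperiodic autocorrelations cancel at every nonzero shift, so that their sum is a single peak. The sketch below checks this for a well-known length-4 Golay pair. It only illustrates that one-dimensional property; the abstract does not define complementary lattice arrays, and the function names are hypothetical.

```python
def aperiodic_autocorrelation(seq, shift):
    """Sum of seq[i] * seq[i + shift] over all valid positions i."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))


def is_complementary_pair(a, b):
    """True if the autocorrelations of a and b cancel at every nonzero shift."""
    return all(
        aperiodic_autocorrelation(a, s) + aperiodic_autocorrelation(b, s) == 0
        for s in range(1, len(a))
    )


# A classical length-4 Golay pair over {+1, -1}.
a = [1, 1, 1, -1]
b = [1, 1, -1, 1]

pair_ok = is_complementary_pair(a, b)
# The zero-shift term is the only one that survives in the summed autocorrelation.
peak = aperiodic_autocorrelation(a, 0) + aperiodic_autocorrelation(b, 0)
```

A single sequence is generally not complementary with itself: `is_complementary_pair(a, a)` is false here because the sidelobes of `a` do not vanish on their own.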
With the development of wireless communication technologies, which have contributed considerably to wireless sensor networks (WSNs), we have witnessed an ever-increasing number of WSN-based applications, which in turn have prompted a host of research activities in both academia and industry. Since most target WSN applications are security-sensitive, security is one of the major challenges in deploying WSNs. One of the important building blocks in securing WSNs is key management. Traditional key management solutions developed for other networks are not suitable for WSNs, since WSNs are limited in resources (e.g., memory, computation, and energy). Key pre-distribution algorithms have recently evolved as efficient alternatives for key management in these networks. Secure communication between a pair of nodes is achieved either by a shared key, which allows direct communication, or by a chain of keys. This paper considers prior knowledge of network characteristics and application constraints in terms of communication needs between sensor nodes. We propose methods to design key pre-distribution schemes that provide better security and connectivity while requiring fewer resources. Our methods are based on casting prior information as a graph. Motivated by this idea, we also propose a class of quasi-symmetric designs named g-designs. Our proposed key pre-distribution schemes significantly improve upon the existing constructions based on unital designs. We give some examples and point out open problems for future research.
Balanced incomplete block design
Graph
Key pre-distribution
Quasi-symmetric design
Sensor networks
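The idea of building key rings from a block design can be sketched with the smallest nontrivial balanced incomplete block design, the (7, 3, 1) design (Fano plane). Points are key identifiers and each block is the key ring preloaded on one node. This is only an illustration of design-based key pre-distribution in general; it is not the g-design or unital construction discussed in the abstract, and all names are hypothetical.

```python
from itertools import combinations

# Blocks of the (7, 3, 1) design, generated as translates of the
# perfect difference set {0, 1, 3} modulo 7.
key_rings = [frozenset((start + d) % 7 for d in (0, 1, 3)) for start in range(7)]


def shared_keys(node_u, node_v):
    """Keys that two nodes hold in common and can use for direct communication."""
    return key_rings[node_u] & key_rings[node_v]


# Every pair of distinct nodes shares exactly one key in this design,
# so each node stores only 3 of the 7 keys yet can reach any other node directly.
pair_overlap = {
    (u, v): len(shared_keys(u, v)) for u, v in combinations(range(7), 2)
}

# Number of nodes that hold each key; a captured key exposes only these nodes' links.
holders_per_key = [sum(key in ring for ring in key_rings) for key in range(7)]
```

In designs where some pairs of blocks are disjoint, two such nodes would instead need a chain of intermediate nodes, each consecutive pair sharing a key.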
Orthogonal Matching Pursuit (OMP) is a canonical greedy pursuit algorithm for sparse approximation. Previous studies of OMP have considered the recovery of a sparse signal x through Φ and y = Φx + b, where Φ is a matrix with more columns than rows and b denotes the measurement noise. In this paper, based on the Restricted Isometry Property, the performance of OMP is analyzed under general perturbations, meaning that both y and Φ are perturbed. Although exact recovery of an almost sparse signal x is no longer feasible, the main contribution shows that the support set of the best k-term approximation of x can be recovered under reasonable conditions. The error bound between x and the OMP estimate is also derived. By constructing an example, it is also demonstrated that the sufficient conditions for support recovery of the best k-term approximation of x are rather tight. When x is strong-decaying, it is proved that the sufficient conditions for support recovery of the best k-term approximation of x can be relaxed, and the support can even be recovered in the order of the entries’ magnitude. Our results are also compared in detail with some related previous ones.
Compressed sensing
Orthogonal matching pursuit
Restricted isometry property
Strong-decaying signals
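The greedy procedure that the abstract analyzes can be sketched in a few lines. The version below is a minimal textbook OMP in pure Python, assuming nonzero columns and a fixed number of iterations k; it is not the paper's analysis, and the helper names, the small matrix, and the noise values are made up for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def omp(columns, y, k):
    """Run k greedy OMP steps. `columns` lists the (nonzero) columns of Phi."""
    residual, support, coeffs = y[:], [], []
    for _ in range(k):
        # Pick the unused column most correlated with the current residual.
        def score(j):
            return abs(dot(columns[j], residual)) / dot(columns[j], columns[j]) ** 0.5
        candidates = [j for j in range(len(columns)) if j not in support]
        support.append(max(candidates, key=score))
        # Least squares on the selected columns via the normal equations.
        selected = [columns[j] for j in support]
        gram = [[dot(u, v) for v in selected] for u in selected]
        coeffs = solve_linear_system(gram, [dot(u, y) for u in selected])
        # Residual is y minus its projection onto the selected columns.
        fit = [sum(c * col[i] for c, col in zip(coeffs, selected)) for i in range(len(y))]
        residual = [yi - fi for yi, fi in zip(y, fit)]
    x_hat = [0.0] * len(columns)
    for j, c in zip(support, coeffs):
        x_hat[j] = c
    return support, x_hat


# Toy problem: Phi has 4 rows and 5 unit-norm columns (more columns than rows).
phi_columns = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
# True signal: x[1] = 3, x[4] = 2; y = Phi x + b with small noise b = (0.05, 0, -0.05, 0).
y = [1.05, 4.0, 0.95, 1.0]

support, x_hat = omp(phi_columns, y, k=2)
```

In this toy case the noise is small enough that the selected support matches the true support {1, 4}, which is the kind of support recovery the abstract is concerned with; the paper's conditions additionally cover perturbations of Φ itself.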