
Journal Information

  • Journal title:

    Neural Computation

  • Chinese title: 神经计算
  • Impact factor: 2.175
  • ISSN: 0899-7667
  • Publisher: -
  • Description:
3,029 results
  • [MT] Face Representation by Tensorfaces of Various Complexities
    Abstract: Neurons selective for faces exist in humans and monkeys. However, characteristics of face cell receptive fields are poorly understood. In this theoretical study, we explore the effects of complexity, defined as algorithmic information (Kolmogorov complexity) and logical depth, on possible ways that face cells may be organized. We use tensor decompositions to decompose faces into a set of components, called tensorfaces, and their associated weights, which can be interpreted as model face cells and their firing rates. These tensorfaces form a high-dimensional representation space in which each tensorface forms an axis of the space. A distinctive feature of the decomposition algorithm is the ability to specify tensorface complexity. We found that low-complexity tensorfaces have blob-like appearances crudely approximating faces, while high-complexity tensorfaces appear clearly face-like. Low-complexity tensorfaces require a larger population to reach a criterion face reconstruction error than medium- or high-complexity tensorfaces and are thus inefficient by that criterion. Low-complexity tensorfaces, however, generalize better when representing statistically novel faces, which are faces falling beyond the distribution of face description parameters found in the tensorface training set. The degree to which face representations are parts based or global forms a continuum as a function of tensorface complexity, with low- and medium-complexity tensorfaces being more parts based. Given the computational load imposed in creating high-complexity face cells (in the form of algorithmic information and logical depth) and in the absence of a compelling advantage to using high-complexity cells, we suggest that face representations consist of a mixture of low- and medium-complexity face cells.
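    A toy illustration of the decomposition idea (components as model face cells, per-face weights as firing rates). The paper's complexity-controlled tensor decomposition is not reproduced here; a plain truncated SVD on random stand-in images plays its role, and all sizes are illustrative:

```python
# Simplified stand-in for the paper's tensor decomposition: decompose a
# stack of "face" images into k components (model face cells) and weights
# (firing rates), then measure reconstruction error for that population size.
import numpy as np

rng = np.random.default_rng(0)
n_faces, h, w = 200, 16, 16                         # toy images, not real faces
faces = rng.normal(size=(n_faces, h * w))

k = 32                                              # population size (no. of components)
mean = faces.mean(axis=0)
U, s, Vt = np.linalg.svd(faces - mean, full_matrices=False)
components = Vt[:k]                                 # tensorface-like axes
weights = (faces - mean) @ components.T             # model firing rates

recon = weights @ components + mean
err = np.mean((faces - recon) ** 2) / np.mean(faces ** 2)
print(f"relative reconstruction error with {k} components: {err:.3f}")
```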
  • [MT] Transition Scale-Spaces: A Computational Theory of the Discretized Entorhinal Cortex.
    Abstract: Although hippocampal grid cells are thought to be crucial for spatial navigation, their computational purpose remains disputed. Recently, they were proposed to represent spatial transitions and convey this knowledge downstream to place cells. However, a single scale of transitions is insufficient to plan long goal-directed sequences in behaviorally acceptable time. Here, a scale-space data structure is suggested to optimally accelerate retrievals from transition systems, called a transition scale-space (TSS). Remaining exclusively on an algorithmic level, the scale increment is proved to be ideally √2 for biologically plausible receptive fields. It is then argued that temporal buffering is necessary to learn the scale-space online. Next, two modes for retrieval of sequences from the TSS are presented: top-down and bottom-up. The two modes are evaluated in symbolic simulations (i.e., without biologically plausible spiking neurons). Additionally, a TSS is used for shortcut discovery in a simulated Morris water maze. Finally, the results are discussed in depth with respect to biological plausibility, and several testable predictions are derived. Moreover, relations to other grid cell models, multiresolution path planning, and scale-space theory are highlighted. In summary, reward-free transition encoding is shown here, in a theoretical model, to be compatible with the observed discretization along the dorsoventral axis of the medial entorhinal cortex. Because the theoretical model generalizes beyond navigation, the TSS is suggested to be a general-purpose cortical data structure for fast retrieval of sequences and relational knowledge. Source code for all simulations presented in this paper can be found at https://github.com/rochus/transitionscalespace.
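    A minimal sketch of a top-down TSS lookup on a 1D track, assuming only what the abstract states (step sizes growing by √2 per scale, coarse-to-fine retrieval). The function and its parameters are illustrative, not taken from the linked repository:

```python
# Top-down (coarse-to-fine) retrieval through a transition scale-space:
# plan coarse jumps toward the goal first, then refine at finer scales.
import math

def plan_tss(start, goal, n_scales=6):
    # receptive-field (step) size grows by sqrt(2) per scale: 1, 1, 2, 3, 4, 6, ...
    steps = [round(math.sqrt(2) ** k) for k in range(n_scales)]
    path, pos = [start], start
    for step in reversed(steps):            # coarsest scale first
        while abs(goal - pos) >= step:      # take jumps of this scale toward goal
            pos += step if goal > pos else -step
            path.append(pos)
    return path

print(plan_tss(0, 37))   # few coarse transitions, refined only near the goal
```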
  • [MT] From Synaptic Interactions to Collective Dynamics in Random Neural Network Models: The Critical Role of Eigenvectors and Transient Behavior
    Abstract: The study of neuronal interactions is at the center of several big collaborative neuroscience projects (including the Human Connectome Project, the Blue Brain Project, and the Brainome) that attempt to obtain a detailed map of the entire brain. Under certain constraints, mathematical theory can advance predictions of the expected neural dynamics based solely on the statistical properties of the synaptic interaction matrix. This work explores the application of free random variables to the study of large synaptic interaction matrices. Besides recovering in a straightforward way known results on eigenspectra in the types of neural network models proposed by Rajan and Abbott (2006), we extend them to heavy-tailed distributions of interactions. More importantly, we analytically derive the behavior of eigenvector overlaps, which determine the stability of the spectra. We observe that on imposing the neuronal excitation/inhibition balance, despite the eigenvalues remaining unchanged, their stability dramatically decreases due to the strong nonorthogonality of the associated eigenvectors. This leads us to the conclusion that understanding the temporal evolution of asymmetric neural networks requires considering the entangled dynamics of both eigenvectors and eigenvalues, which might bear consequences for learning and memory processes in these models. Considering the success of free random variables theory in a wide variety of disciplines, we hope that the results presented here foster the additional application of these ideas in the area of brain sciences.
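    A sketch of the central quantity, the diagonal eigenvector overlaps O_ii = (l_i·l_i*)(r_i·r_i*) with l_i r_i = 1, computed for a random connectivity matrix. The plain i.i.d. Gaussian ensemble here is a stand-in for the structured ensembles analyzed in the letter:

```python
# Eigenvalues and diagonal eigenvector overlaps of a random synaptic matrix.
# Large O_ii signals strong nonorthogonality of left/right eigenvectors and
# hence sensitivity of the spectrum and strong transient amplification.
import numpy as np

rng = np.random.default_rng(1)
N = 300
A = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))   # i.i.d. interaction matrix

eigvals, R = np.linalg.eig(A)      # right eigenvectors are columns of R
L = np.linalg.inv(R)               # left eigenvectors are rows of L, l_i r_j = delta_ij
O_diag = (np.einsum('ij,ij->i', L, L.conj()) *        # |l_i|^2
          np.einsum('ji,ji->i', R, R.conj()))         # |r_i|^2

print("spectral radius:", np.abs(eigvals).max())
print("mean diagonal overlap:", O_diag.real.mean())   # equals 1 for normal matrices
```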
  • [MT] Synaptic Scaling Improves the Stability of Neural Mass Models Capable of Simulating Brain Plasticity
    Abstract: Neural mass models offer a way of studying the development and behavior of large-scale brain networks through computer simulations. Such simulations are currently mainly research tools, but as they improve, they could soon play a role in understanding, predicting, and optimizing patient treatments, particularly in relation to the effects and outcomes of brain injury. To bring us closer to this goal, we took an existing state-of-the-art neural mass model capable of simulating connection growth through simulated plasticity processes. We identified and addressed some of the model's limitations by implementing biologically plausible mechanisms. The main limitation of the original model was its instability, which we addressed by incorporating a representation of the mechanism of synaptic scaling and examining the effects of optimizing parameters in the model. We show that the updated model retains all the merits of the original model while being more stable and capable of generating networks that are in several aspects similar to those found in real brains.
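    A hedged sketch of multiplicative synaptic scaling, the kind of homeostatic mechanism the letter adds (not the authors' model): each unit scales all of its incoming weights toward a target activity level. The rate dynamics and all parameters are toy values:

```python
# Multiplicative synaptic scaling on a toy rate network: units whose activity
# is below target up-scale their incoming weights, and vice versa.
import numpy as np

rng = np.random.default_rng(2)
N, target, eta = 50, 0.2, 0.05
W = rng.normal(scale=0.3, size=(N, N))      # recurrent weights
b = rng.normal(scale=0.5, size=N)           # external drive

for epoch in range(200):
    r = np.zeros(N)
    for _ in range(50):                     # relax firing-rate dynamics
        r = np.tanh(W @ r + b)
    # scale each unit's incoming weights (rows of W) toward the target rate
    W *= (1.0 + eta * (target - np.abs(r)) / target)[:, None]

print("mean |rate| after scaling:", float(np.abs(r).mean()))   # ~ target
```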
  • [MT] Scaled Coupled Norms and Coupled Higher-Order Tensor Completion
    Abstract: Recently, a set of tensor norms known as coupled norms has been proposed as a convex solution to coupled tensor completion. Coupled norms have been designed by combining low-rank-inducing tensor norms with the matrix trace norm. Though coupled norms have shown good performance, they have two major limitations: they do not have a method to control the regularization of coupled modes and uncoupled modes, and they are not optimal for couplings among higher-order tensors. In this letter, we propose a method that scales the regularization of coupled components against uncoupled components to properly induce low-rankness on the coupled mode. We also propose coupled norms for higher-order tensors by combining the square norm with coupled norms. Using excess-risk-bound analysis, we demonstrate that our proposed methods lead to lower risk bounds compared to existing coupled norms. We demonstrate the robustness of our methods through simulation and real-data experiments.
  • [MT] Improving Generalization via Attribute Selection on Out-of-the-Box Data
    Abstract: Zero-shot learning (ZSL) aims to recognize unseen objects (test classes) given some other seen objects (training classes) by sharing information about attributes between different objects. Attributes are artificially annotated for objects and treated equally in recent ZSL tasks. However, some inferior attributes with poor predictability or poor discriminability may have negative impacts on ZSL system performance. This letter first derives a generalization error bound for ZSL tasks. Our theoretical analysis verifies that selecting a subset of key attributes can improve the generalization performance of the original ZSL model, which uses all the attributes. Unfortunately, previous attribute selection methods have been conducted based on the seen data, and their selected attributes have poor generalization capability to the unseen data, which is unavailable in the training stage of ZSL tasks. Inspired by learning from pseudo-relevance feedback, this letter introduces out-of-the-box data (pseudo-data generated by an attribute-guided generative model) to mimic the unseen data. We then present an iterative attribute selection (IAS) strategy that iteratively selects key attributes based on the out-of-the-box data. Since the distribution of the generated out-of-the-box data is similar to that of the test data, the key attributes selected by IAS can be effectively generalized to test data. Extensive experiments demonstrate that IAS can significantly improve existing attribute-based ZSL methods and achieve state-of-the-art performance.
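    A sketch of the IAS loop's overall shape, under strong simplifications: the attribute-guided generative model is stubbed with Gaussian noise around unseen-class attribute signatures, and a between/within-class variance ratio stands in for the paper's attribute scoring. All names and parameters are illustrative:

```python
# Iterative attribute selection (IAS), heavily simplified: generate
# out-of-the-box pseudo-data for the current attribute subset, score
# attribute discriminability, keep the best, and repeat.
import numpy as np

rng = np.random.default_rng(3)
n_attr, n_unseen = 40, 5
signatures = rng.integers(0, 2, size=(n_unseen, n_attr)).astype(float)

def generate_out_of_the_box(sigs, n_per_class=100, noise=0.3):
    reps = np.repeat(sigs, n_per_class, axis=0)            # stubbed generator
    labels = np.repeat(np.arange(len(sigs)), n_per_class)
    return reps + rng.normal(scale=noise, size=reps.shape), labels

selected = np.arange(n_attr)
for it in range(5):
    X, y = generate_out_of_the_box(signatures[:, selected])
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_unseen)])
    within = np.stack([X[y == c].var(axis=0) for c in range(n_unseen)]).mean(axis=0)
    score = means.var(axis=0) / (within + 1e-9)            # discriminability proxy
    keep = np.argsort(score)[-max(5, len(selected) * 3 // 4):]
    selected = selected[np.sort(keep)]                     # drop weakest attributes

print("selected attributes:", selected)
```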
  • [MT] Training Recurrent Neural Networks for Lifelong Learning
    Abstract: Catastrophic forgetting and capacity saturation are the central challenges of any parametric lifelong learning system. In this work, we study these challenges in the context of sequential supervised learning with an emphasis on recurrent neural networks. To evaluate the models in the lifelong learning setting, we propose a curriculum-based, simple, and intuitive benchmark where the models are trained on tasks with increasing levels of difficulty. To measure the impact of catastrophic forgetting, the model is tested on all the previous tasks as it completes any task. As a step toward developing true lifelong learning systems, we unify gradient episodic memory (a catastrophic forgetting alleviation approach) and Net2Net (a capacity expansion approach). Both models are proposed in the context of feedforward networks, and we evaluate the feasibility of using them for recurrent networks. Evaluation on the proposed benchmark shows that the unified model is more suitable than the constituent models for the lifelong learning setting.
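    A sketch of the Net2Net "wider" operator, the published capacity-expansion half of the unified model: duplicate random hidden units and split their outgoing weights so the widened network computes the same function before training continues. Shown for a feedforward layer; the letter's recurrent variant is not reproduced here:

```python
# Net2Net function-preserving widening of a hidden layer.
import numpy as np

def net2wider(W1, b1, W2, new_width, rng):
    old = W1.shape[1]
    # each new unit copies a randomly chosen existing unit
    mapping = np.concatenate([np.arange(old),
                              rng.integers(0, old, size=new_width - old)])
    counts = np.bincount(mapping, minlength=old)        # replication factor per unit
    W1_new = W1[:, mapping]                             # copy incoming weights
    b1_new = b1[mapping]
    W2_new = W2[mapping, :] / counts[mapping][:, None]  # split outgoing weights
    return W1_new, b1_new, W2_new

rng = np.random.default_rng(4)
W1, b1, W2 = rng.normal(size=(8, 10)), rng.normal(size=10), rng.normal(size=(10, 3))
W1w, b1w, W2w = net2wider(W1, b1, W2, 16, rng)

x = rng.normal(size=8)
h, hw = np.maximum(x @ W1 + b1, 0), np.maximum(x @ W1w + b1w, 0)
print(np.allclose(h @ W2, hw @ W2w))   # True: the function is preserved
```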
  • [MT] A Continuous-Time Analysis of Distributed Stochastic Gradient
    Abstract: We analyze the effect of synchronization on distributed stochastic gradient algorithms. By exploiting an analogy with dynamical models of biological quorum sensing, where synchronization between agents is induced through communication with a common signal, we quantify how synchronization can significantly reduce the magnitude of the noise felt by the individual distributed agents and their spatial mean. This noise reduction is in turn associated with a reduction in the smoothing of the loss function imposed by the stochastic gradient approximation. Through simulations on model nonconvex objectives, we demonstrate that coupling can stabilize higher noise levels and improve convergence. We provide a convergence analysis for strongly convex functions by deriving a bound on the expected deviation of the spatial mean of the agents from the global minimizer for an algorithm based on quorum sensing, the same algorithm with momentum, and the elastic averaging SGD (EASGD) algorithm. We discuss extensions to new algorithms that allow each agent to broadcast its current measure of success and shape the collective computation accordingly. We supplement our theoretical analysis with numerical experiments on convolutional neural networks trained on the CIFAR-10 data set, where we note a surprising regularizing property of EASGD even when applied to the nondistributed case. This observation suggests alternative second-order in time algorithms for nondistributed optimization that are competitive with momentum methods.
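    A sketch of the EASGD update analyzed in the letter: each agent is elastically coupled to a center variable (the "common signal"), which in turn tracks the agents. A toy quadratic loss and all step sizes are illustrative:

```python
# Elastic averaging SGD on a toy strongly convex objective f(x) = |x - x*|^2 / 2.
import numpy as np

rng = np.random.default_rng(5)
p, d, eta, alpha, noise = 10, 5, 0.05, 0.05, 0.5
x_star = rng.normal(size=d)                       # global minimizer
agents = [rng.normal(size=d) for _ in range(p)]
center = np.zeros(d)                              # shared center variable

for t in range(1000):
    for i in range(p):
        grad = (agents[i] - x_star) + noise * rng.normal(size=d)  # stochastic gradient
        agents[i] -= eta * grad + alpha * (agents[i] - center)    # elastic coupling
    center += alpha * sum(a - center for a in agents)             # center tracks agents

print("spatial-mean error:", np.linalg.norm(np.mean(agents, axis=0) - x_star))
```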
  • [MT] Kernel Method-Based Connectionist Models and Supervised Deep Learning Without Backpropagation
    Abstract: We propose a novel family of connectionist models based on kernel machines and consider the problem of learning layer by layer a compositional hypothesis class (i.e., a feedforward, multilayer architecture) in a supervised setting. In terms of the models, we present a principled method to "kernelize" (partly or completely) any neural network (NN). With this method, we obtain a counterpart of any given NN that is powered by kernel machines instead of neurons. In terms of learning, when learning a feedforward deep architecture in a supervised setting, one needs to train all the components simultaneously using backpropagation (BP) since there are no explicit targets for the hidden layers (Rumelhart, Hinton, & Williams, 1986). We consider without loss of generality the two-layer case and present a general framework that explicitly characterizes a target for the hidden layer that is optimal for minimizing the objective function of the network. This characterization then makes possible a purely greedy training scheme that learns one layer at a time, starting from the input layer. We provide instantiations of the abstract framework under certain architectures and objective functions. Based on these instantiations, we present a layer-wise training algorithm for an l-layer feedforward network for classification, where l ≥ 2 can be arbitrary. This algorithm can be given an intuitive geometric interpretation that makes the learning dynamics transparent. Empirical results are provided to complement our theory. We show that the kernelized networks, trained layer-wise, compare favorably with classical kernel machines as well as other connectionist models trained by BP. We also visualize the inner workings of the greedy kernelized models to validate our claim on the transparency of the layer-wise algorithm.
  • [MT] Storing Object-Dependent Sparse Codes in a Willshaw Associative Network
    Abstract: Willshaw networks are single-layered neural networks that store associations between binary vectors. Using only binary weights, these networks can be implemented efficiently to store large numbers of patterns and allow for fault-tolerant recovery of those patterns from noisy cues. However, this is only the case when the involved codes are sparse and randomly generated. In this letter, we use a recently proposed approach that maps visual patterns into informative binary features. By doing so, we manage to transform MNIST handwritten digits into well-distributed codes that we then store in a Willshaw network in autoassociation. We perform experiments with both noisy and noiseless cues and verify a tenuous impact on the recovered pattern's relevant information. More specifically, we were able to perform retrieval after filling the memory to several factors of its number of units while preserving the information of the class to which the pattern belongs.
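    A minimal sketch of the underlying Willshaw memory with clipped Hebbian binary weights, on random sparse patterns rather than the letter's MNIST-derived codes; sizes are illustrative:

```python
# Willshaw associative memory: store sparse binary pattern pairs in binary
# weights W = clip(sum_p x_p y_p^T), retrieve with a firing threshold equal
# to the number of active cue units.
import numpy as np

rng = np.random.default_rng(6)
n, m, k, n_pat = 256, 256, 8, 100                 # k active units per pattern

def sparse_patterns(n_pat, n, k):
    P = np.zeros((n_pat, n), dtype=np.uint8)
    for row in P:
        row[rng.choice(n, size=k, replace=False)] = 1
    return P

X, Y = sparse_patterns(n_pat, n, k), sparse_patterns(n_pat, m, k)
W = (X.T @ Y > 0).astype(np.uint8)                # binary (clipped) weights

cue = X[0]
retrieved = (cue @ W >= cue.sum()).astype(np.uint8)   # threshold = active cue units
print("all target units recovered:", bool(((retrieved & Y[0]) == Y[0]).all()))
```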
  • [MT] A Robust Model of Gated Working Memory
    Abstract: Gated working memory is defined as the capacity of holding arbitrary information at any time in order to be used at a later time. Based on electrophysiological recordings, several computational models have tackled the problem using dedicated and explicit mechanisms. We propose instead to consider an implicit mechanism based on a random recurrent neural network. We introduce a robust yet simple reservoir model of gated working memory with instantaneous updates. The model is able to store an arbitrary real value at a random time over an extended period of time. The dynamics of the model is a line attractor that learns to exploit reentry and a nonlinearity during the training phase using only a few representative values. A deeper study of the model shows that there is actually a large range of hyperparameters for which the results hold (e.g., number of neurons, sparsity, global weight scaling) such that any large enough population, mixing excitatory and inhibitory neurons, can quickly learn to realize such gated working memory. In a nutshell, with a minimal set of hypotheses, we show that we can have a robust model of working memory. This suggests this property could be an implicit property of any random population, acquired through learning. Furthermore, considering working memory to be a physically open but functionally closed system, we account for some counterintuitive electrophysiological recordings.
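    A minimal reservoir sketch of the gated working-memory task, assuming only what the abstract describes: inputs are a value and a gate, the readout is trained with teacher forcing and fed back into the reservoir (reentry), and must hold the value sampled at the last gate-on time. Sizes and scalings are illustrative and not the authors' settings:

```python
# Echo-state-style reservoir with output feedback trained on a gating task.
import numpy as np

rng = np.random.default_rng(7)
N, T = 300, 5000
Win = rng.uniform(-1, 1, size=(N, 3))                 # inputs: value, gate, feedback
Wrec = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))

value = rng.uniform(-1, 1, size=T)
gate = (rng.random(T) < 0.01).astype(float)
target = np.zeros(T)                                  # value held since last gate-on
for t in range(1, T):
    target[t] = value[t] if gate[t] else target[t - 1]

states, x = np.zeros((T, N)), np.zeros(N)
for t in range(T):                                    # teacher-forced feedback
    u = np.array([value[t], gate[t], target[t - 1] if t else 0.0])
    x = np.tanh(Wrec @ x + Win @ u)
    states[t] = x

# ridge-regression readout, then closed-loop test with the model's own feedback
Wout = np.linalg.solve(states.T @ states + 1e-4 * np.eye(N), states.T @ target)
x, out, mse = np.zeros(N), 0.0, 0.0
for t in range(T):
    x = np.tanh(Wrec @ x + Win @ np.array([value[t], gate[t], out]))
    out = float(x @ Wout)
    mse += (out - target[t]) ** 2
print("closed-loop MSE:", mse / T)
```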
  • [MT] FPGA Implementation of Deep Spiking Neural Networks for Low-Power and Fast Classification
    Abstract: A spiking neural network (SNN) is a biologically plausible model that performs information processing based on spikes. Training a deep SNN effectively is challenging due to the nondifferentiability of spike signals. Recent advances have shown that high-performance SNNs can be obtained by converting convolutional neural networks (CNNs). However, large-scale SNNs are poorly served by conventional architectures due to the dynamic nature of spiking neurons. In this letter, we propose a hardware architecture to enable efficient implementation of SNNs. All layers in the network are mapped on one chip so that the computation of different time steps can be done in parallel to reduce latency. We propose a new spiking max-pooling method to reduce computation complexity. In addition, we apply approaches based on shift registers and coarse-grained parallelism to accelerate the convolution operation. We also investigate the effect of different encoding methods on SNN accuracy. Finally, we validate the hardware architecture on the Xilinx Zynq ZCU102. The experimental results on the MNIST data set show that it can achieve an accuracy of 98.94% with eight-bit quantized weights. Furthermore, it achieves 164 frames per second (FPS) under a 150 MHz clock frequency, a 41× speedup over a CPU implementation, and 22× lower power than a GPU implementation.
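    A sketch of the principle behind the CNN-to-SNN conversion the letter builds on (not the proposed hardware architecture or its spiking max-pooling method): an integrate-and-fire neuron driven by a constant input spikes at a rate approximating ReLU of that input, so CNN activations map onto spike counts:

```python
# Integrate-and-fire neuron with reset-by-subtraction: its firing rate over
# T steps approximates ReLU of the constant input current.
def if_spike_count(inp, T=200, v_th=1.0):
    v, spikes = 0.0, 0
    for _ in range(T):            # discrete simulation time steps
        v += inp                  # integrate input current
        if v >= v_th:             # fire and reset by subtraction
            spikes += 1
            v -= v_th
    return spikes / T             # firing rate

for a in [-0.5, 0.0, 0.25, 0.5, 0.75]:
    print(f"input {a:+.2f} -> rate {if_spike_count(a):.2f}  (ReLU = {max(a, 0):.2f})")
```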
  • [MT] Iterative Retrieval and Block Coding in Autoassociative and Heteroassociative Memories
    Abstract: Neural associative memories (NAM) are perceptron-like single-layer networks with fast synaptic learning typically storing discrete associations between pairs of neural activity patterns. Gripon and Berrou (2011) investigated NAM employing block coding, a particular sparse coding method, and reported a significant increase in storage capacity. Here we verify and extend their results for both heteroassociative and recurrent autoassociative networks. For this we provide a new analysis of iterative retrieval in finite autoassociative and heteroassociative networks that allows estimating storage capacity for random and block patterns. Furthermore, we have implemented various retrieval algorithms for block coding and compared them in simulations to our theoretical results and previous simulation data. In good agreement of theory and experiments, we find that finite networks employing block coding can store significantly more memory patterns. However, due to the reduced information per block pattern, it is not possible to significantly increase the stored information per synapse. Asymptotically, the information retrieval capacity converges to the known limits C = ln 2 ≈ 0.69 and C = (ln 2)/4 ≈ 0.17 also for block coding. We have also implemented very large recurrent networks of up to n = 2 × 10^6 neurons, showing that the maximal capacity C ≈ 0.2 bit per synapse occurs for finite networks of size n ≈ 10^5, similar to cortical macrocolumns.
  • [MT] Optimal Sampling of Parametric Families: Implications for Machine Learning
    Abstract: It is well known in machine learning that models trained on a training set generated by a probability distribution function perform far worse on test sets generated by a different probability distribution function. In the limit, it is feasible that a continuum of probability distribution functions might have generated the observed test set data; a desirable property of a learned model in that case is its ability to describe most of the probability distribution functions from the continuum equally well. This requirement naturally leads to sampling methods from the continuum of probability distribution functions that lead to the construction of optimal training sets. We study the sequential prediction of Ornstein-Uhlenbeck processes that form a parametric family. We find empirically that a simple deep network trained on optimally constructed training sets using the methods described in this letter can be robust to changes in the test set distribution.
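    A sketch of the parametric family under study: Euler-Maruyama simulation of an Ornstein-Uhlenbeck process dX_t = -θ X_t dt + σ dW_t; sweeping θ (or σ) yields the continuum of distributions from which training sets are sampled. The sampling-optimization method itself is not reproduced:

```python
# Simulate Ornstein-Uhlenbeck sample paths for several parameter values.
import numpy as np

def simulate_ou(theta, sigma, x0=0.0, dt=1e-2, n_steps=1000, rng=None):
    rng = rng or np.random.default_rng()
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):    # Euler-Maruyama step
        x[t] = x[t - 1] - theta * x[t - 1] * dt + sigma * np.sqrt(dt) * rng.normal()
    return x

rng = np.random.default_rng(8)
paths = [simulate_ou(theta, 0.5, rng=rng) for theta in (0.5, 1.0, 2.0)]
print([f"{p.std():.2f}" for p in paths])   # stationary std ~ sigma / sqrt(2 * theta)
```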
  • [MT] The Hidden Pitfalls of ADOS Research Are Bound to Affect Autism Science
    Abstract: The research-grade Autism Diagnostic Observational Schedule (ADOS) is a broadly used instrument that informs and steers much of the science of autism. Despite its broad use, little is known about the empirical variability inherently present in the scores of the ADOS scale or their appropriateness to define change and its rate, in order to repeatedly use this test to characterize neurodevelopmental trajectories. Here we examine the empirical distributions of research-grade ADOS scores from 1,324 records in a cross-section of the population comprising participants with autism between five and 65 years of age. We find that these empirical distributions violate the theoretical requirements of normality and homogeneous variance, essential for independence between bias and sensitivity. Further, we assess a subset of 52 typical controls versus those with autism and find a lack of proper elements to characterize neurodevelopmental trajectories in a coping nervous system changing at nonuniform, nonlinear rates. Repeating the assessments over four visits in a subset of the participants with autism, for whom verbal criteria retained the same appropriate ADOS modules over the time span of the four visits, reveals that switching the clinician changes the cutoff scores and consequently influences the diagnosis, despite maintaining fidelity in the same test's modules, room conditions, and tasks' fluidity per visit. Given the changes in probability distribution shape and dispersion of these ADOS scores, the lack of appropriate metric spaces to define similarity measures to characterize change, and the impact that these elements have on sensitivity-bias codependencies and on longitudinal tracking of autism, we invite a discussion on readjusting the use of this test for scientific purposes.
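    The normality and variance-homogeneity requirements the authors test can be checked with standard statistical tests; toy count-like score vectors stand in here for the ADOS records, which are not public:

```python
# Checking the two assumptions flagged in the abstract: normality of score
# distributions (Shapiro-Wilk) and homogeneity of variance (Levene).
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
scores_a = rng.poisson(6, size=300).astype(float)    # skewed, count-like scores
scores_b = rng.poisson(12, size=300).astype(float)

print("Shapiro-Wilk (normality) p-value:", stats.shapiro(scores_a).pvalue)
print("Levene (equal variance) p-value:", stats.levene(scores_a, scores_b).pvalue)
```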
  • [MT] Model-Free Robust Optimal Feedback Mechanisms for Motor Control
    Abstract: Sensorimotor tasks that humans perform are often affected by different sources of uncertainty. Nevertheless, the central nervous system (CNS) can gracefully coordinate our movements. Most learning frameworks rely on the internal model principle, which requires a precise internal representation in the CNS to predict the outcomes of our motor commands. However, learning a perfect internal model in a complex environment over a short period of time is a nontrivial problem. Indeed, achieving proficient motor skills may require years of training for some difficult tasks. Internal models alone may not be adequate to explain motor adaptation behavior during the early phase of learning. Recent studies investigating the active regulation of motor variability, the presence of suboptimal inference, and model-free learning have challenged some of the traditional viewpoints on the sensorimotor learning mechanism. As a result, it may be necessary to develop a computational framework that can account for these new phenomena. Here, we develop a novel theory of motor learning, based on model-free adaptive optimal control, which can bypass some of the difficulties in existing theories. This new theory is based on our recently developed adaptive dynamic programming (ADP) and robust ADP (RADP) methods and is especially useful for accounting for motor learning behavior when an internal model is inaccurate or unavailable. Our preliminary computational results are in line with experimental observations reported in the literature and can account for some phenomena that are inexplicable using existing models.
  • [MT] Evaluating the Potential Gain of Auditory and Audiovisual Speech-Predictive Coding Using Deep Learning
    Abstract: Sensory processing is increasingly conceived in a predictive framework in which neurons would constantly process the error signal resulting from the comparison of expected and observed stimuli. Surprisingly, few data exist on the accuracy of predictions that can be computed in real sensory scenes. Here, we focus on the sensory processing of auditory and audiovisual speech. We propose a set of computational models based on artificial neural networks (mixing deep feedforward and convolutional networks), which are trained to predict future audio observations from present and past audio or audiovisual observations (i.e., including lip movements). These predictions exploit purely local phonetic regularities with no explicit call to higher linguistic levels. Experiments are conducted on the multispeaker LibriSpeech audio speech database (around 100 hours) and on the NTCD-TIMIT audiovisual speech database (around 7 hours). They appear to be efficient in a short temporal range (25-50 ms), predicting 50% to 75% of the variance of the incoming stimulus, which could result in potentially saving up to three-quarters of the processing power. Then they quickly decrease and almost vanish after 250 ms. Adding information on the lips slightly improves predictions, with a 5% to 10% increase in explained variance. Interestingly, the visual gain vanishes more slowly, and the gain is maximum for a delay of 75 ms between image and predicted sound.
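    A sketch of the evaluation metric only: train a predictor of the signal Δ samples ahead from a short past window and report explained variance (the quantity behind the 50% to 75% figures). A linear predictor on a toy autoregressive signal stands in for the deep audio and audiovisual models:

```python
# Explained variance of a simple predictive model on a correlated toy signal.
import numpy as np

rng = np.random.default_rng(10)
T, win, delta = 20000, 20, 5
sig = np.zeros(T)
for t in range(1, T):                        # toy AR(1) "audio" signal
    sig[t] = 0.95 * sig[t - 1] + rng.normal(scale=0.1)

X = np.stack([sig[t - win:t] for t in range(win, T - delta)])  # past windows
y = sig[win + delta:T]                                         # future samples
w = np.linalg.lstsq(X, y, rcond=None)[0]                       # linear predictor

resid = y - X @ w
print("explained variance:", 1 - resid.var() / y.var())
```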
  • [MT] Switching in Cerebellar Stellate Cell Excitability in Response to a Pair of Inhibitory/Excitatory Presynaptic Inputs: A Dynamical Systems Perspective.
    Abstract: Cerebellar stellate cells form inhibitory synapses with Purkinje cells, the sole output of the cerebellum. Upon stimulation by a pair of varying inhibitory and fixed excitatory presynaptic inputs, these cells do not respond to excitation (i.e., do not generate an action potential) when the magnitude of the inhibition is within a given range, but they do respond outside this range. We previously used a revised Hodgkin-Huxley type of model to study the nonmonotonic first-spike latency of these cells and their temporal increase in excitability in whole-cell configuration (termed run-up). Here, we recompute these latency profiles using the same model by adapting an efficient computational technique, the two-point boundary value problem, that is combined with the continuation method. We then extend the study to investigate how switching in responsiveness, upon stimulation with presynaptic inputs, manifests itself in the context of run-up. A three-dimensional reduced model is initially derived from the original six-dimensional model and then analyzed to demonstrate that both models exhibit type 1 excitability possessing a saddle-node on an invariant cycle (SNIC) bifurcation when varying the amplitude of I_app. Using slow-fast analysis, we show that the original model possesses three equilibria lying at the intersection of the critical manifold of the fast subsystem and the nullcline of the slow variable h_A (the inactivation of the A-type K+ channel); that the middle equilibrium is of saddle type with a two-dimensional stable manifold (computed from the reduced model) acting as a boundary between the responsive and nonresponsive regimes; and that the (ghost of the) SNIC is formed when the h_A-nullcline is (nearly) tangential to the critical manifold. We also show that the slow dynamics associated with (the ghost of) the SNIC and the lower stable branch of the critical manifold are responsible for generating the nonmonotonic first-spike latency. These results thus provide important insight into the complex dynamics of stellate cells.
  • [MT] Classification from Triplet Comparison Data
    Abstract: Learning from triplet comparison data has been extensively studied in the context of metric learning, where we want to learn a distance metric between two instances, and ordinal embedding, where we want to learn an embedding in a Euclidean space of the given instances that preserves the comparison order as much as possible. Unlike fully labeled data, triplet comparison data can be collected in a more accurate and human-friendly way. Although learning from triplet comparison data has been considered in many applications, an important fundamental question, whether we can learn a classifier from triplet comparison data alone, without any labels, has remained unanswered. In this letter, we give a positive answer to this important question by proposing an unbiased estimator for the classification risk under the empirical risk minimization framework. Since the proposed method is based on the empirical risk minimization framework, it inherently has the advantage that any surrogate loss function and any model, including neural networks, can be easily applied. Furthermore, we theoretically establish an estimation error bound for the proposed empirical risk minimizer. Finally, we provide experimental results to show that our method empirically works well and outperforms various baseline methods.
  • [MT] Center Manifold Analysis of Plateau Phenomena Caused by Degeneration of a Three-Layer Perceptron
    Abstract: A hierarchical neural network usually has many singular regions in the parameter space due to the degeneration of hidden units. Here, we focus on a three-layer perceptron, which has one-dimensional singular regions comprising both attractive and repulsive parts. Such a singular region is often called a Milnor-like attractor. It is empirically known that in the vicinity of a Milnor-like attractor, several parameters converge much faster than the rest and that the dynamics can be reduced to smaller-dimensional ones. Here we give a rigorous proof of this phenomenon based on center manifold theory. As an application, we analyze the reduced dynamics near the Milnor-like attractor and study the stochastic effects of online learning.