Top Document: comp.ai.neuralnets FAQ, Part 7 of 7: Hardware
Previous Document: What are some applications of NNs?
Next Document: How to forecast time series (temporal sequences)?

The problem of missing data is very complex.

For unsupervised learning, conventional statistical methods for missing data are often appropriate (Little and Rubin, 1987; Schafer, 1997; Schafer and Olsen, 1998). There is a concise introduction to these methods in the University of Texas statistics FAQ at http://www.utexas.edu/cc/faqs/stat/general/gen25.html.

For supervised learning, the considerations are somewhat different, as discussed by Sarle (1998). The statistical literature on missing data deals almost exclusively with training rather than prediction (e.g., Little, 1992). For example, if only a small proportion of your cases have missing data, you can simply throw those cases out for purposes of training. But if you want to make predictions for cases with missing inputs, you don't have the option of throwing those cases out! In theory, Bayesian methods take care of everything, but a full Bayesian analysis is practical only with special models (such as multivariate normal distributions) or small sample sizes. The neural net literature contains a few good papers that cover prediction with missing inputs (e.g., Ghahramani and Jordan, 1997; Tresp, Neuneier, and Ahmad, 1995), but much research remains to be done.

References:

Donner, A. (1982), "The relative effectiveness of procedures commonly used in multiple regression analysis for dealing with missing values," American Statistician, 36, 378-381.

Ghahramani, Z. and Jordan, M.I. (1994), "Supervised learning from incomplete data via an EM approach," in Cowan, J.D., Tesauro, G., and Alspector, J. (eds.) Advances in Neural Information Processing Systems 6, San Mateo, CA: Morgan Kaufmann, pp. 120-127.

Ghahramani, Z. and Jordan, M.I. (1997), "Mixture models for learning from incomplete data," in Greiner, R., Petsche, T., and Hanson, S.J. (eds.) Computational Learning Theory and Natural Learning Systems, Volume IV: Making Learning Systems Practical, Cambridge, MA: The MIT Press, pp. 67-85.

Jones, M.P. (1996), "Indicator and stratification methods for missing explanatory variables in multiple linear regression," J. of the American Statistical Association, 91, 222-230.

Little, R.J.A. (1992), "Regression with missing X's: A review," J. of the American Statistical Association, 87, 1227-1237.

Little, R.J.A. and Rubin, D.B. (1987), Statistical Analysis with Missing Data, NY: Wiley.

McLachlan, G.J. (1992), Discriminant Analysis and Statistical Pattern Recognition, Wiley.

Sarle, W.S. (1998), "Prediction with Missing Inputs," in Wang, P.P. (ed.), JCIS '98 Proceedings, Vol II, Research Triangle Park, NC, 399-402, ftp://ftp.sas.com/pub/neural/JCIS98.ps.

Schafer, J.L. (1997), Analysis of Incomplete Multivariate Data, London: Chapman & Hall, ISBN 0 412 04061 1.

Schafer, J.L., and Olsen, M.K. (1998), "Multiple imputation for multivariate missing-data problems: A data analyst's perspective," http://www.stat.psu.edu/~jls/mbr.pdf or http://www.stat.psu.edu/~jls/mbr.ps

Tresp, V., Ahmad, S., and Neuneier, R. (1994), "Training neural networks with deficient data," in Cowan, J.D., Tesauro, G., and Alspector, J. (eds.) Advances in Neural Information Processing Systems 6, San Mateo, CA: Morgan Kaufmann, pp. 128-135.

Tresp, V., Neuneier, R., and Ahmad, S. (1995), "Efficient methods for dealing with missing data in supervised learning," in Tesauro, G., Touretzky, D.S., and Leen, T.K. (eds.) Advances in Neural Information Processing Systems 7, Cambridge, MA: The MIT Press, pp. 689-696.
Send corrections/additions to the FAQ Maintainer: saswss@unx.sas.com (Warren Sarle)
Last Update March 27 2014 @ 02:11 PM
