
Must-Read Classics for Getting Started with Artificial Intelligence

If you are interested in artificial intelligence, or plan to work in the field and want to master this technology but don't know how to get started quickly, this collection of "Must-Read Classics for Getting Started with Artificial Intelligence" is for you. These resources cover the basic concepts of artificial intelligence, algorithms, machine learning, deep learning, pattern recognition, the underlying mathematical theory, and more, and they suit readers at different levels. Bookmark them and start learning now~

Must-Read Classics for Getting Started with Artificial Intelligence: Document List

pdf
A Brief History of Artificial Intelligence
Points Required: 2 | Type: Technical Documentation | Uploader: 太白金星 | Date: 2020-09-29
Introduction: This book tells the full history of the development of artificial intelligence, covering almost every area of the field: its origins, automated theorem proving, expert systems, neural networks, natural language processing, genetic algorithms, deep learning, reinforcement learning, superintelligence, philosophical issues, and future trends. With a broad vision and vivid language, it offers both a comprehensive review of AI and incisive commentary on it.
pdf
The Nature of Intelligence: 64 Big Questions in Artificial Intelligence and Robotics
Points Required: 2 | Type: Technical Documentation | Uploader: 太白金星 | Date: 2020-09-29
Introduction: Drawing on common sense, the author puts forward many "surprising" and thought-provoking views on artificial intelligence and robots. If machines can create a better world, why would the world still need us? Deep learning is a technology for learning what humans have already done (past tense). For keeping the elderly company, the most advanced robots to date are still no match for a dog. Immortality will eventually become a service to be sold or rented, much like today's cloud computing. When we study how to create intelligent machines, do we mean real "intelligence", or "intelligence that serves humans in a stupid way"? The project of humanizing robots has not yet succeeded, while the mechanization of humans has made remarkable progress. No machine in the world can create a more advanced machine; it is we who have created better machines. There are two ways to pass the Turing test: make machines as smart as humans, or make humans as stupid as machines. ... This is a popular science book through which anyone can understand, and reflect on, the relationship between machine intelligence and their own lives.
pdf
Artificial Intelligence: A Modern Approach (4th Edition)
Points Required: 2 | Type: Technical Documentation | Uploader: 抛砖引玉 | Date: 2024-01-28
Introduction: This book comprehensively and deeply explores the theory and practice of artificial intelligence (AI). In a unified style, it weaves today's popular AI ideas and terminology into applications that have attracted wide attention, genuinely combining theory with practice. The book is divided into 7 parts with 28 chapters. The theoretical part introduces the main theories and methods of AI research, tracing related ideas back more than 2,000 years; it covers logic, probability and continuous mathematics, perception, reasoning, learning and action, and fairness, trust, social welfare, and safety. The practical part lives up to the "modern" in the title, drawing its applications from currently prominent areas: microelectronic devices, robotic planetary probes, online services with billions of users, AlphaZero, humanoid robots, autonomous driving, AI-assisted medicine, and more. The book is suitable as a textbook for undergraduate and graduate students in AI-related majors and as a reference for professionals in related fields.
Table of Contents:
Front matter: Copyright Information; Content Summary; Copyright Statement; Praise for This Book; Forewords ("Methods Are More Than Just Intelligence", "Only Thoughts Are Eternal"); Acknowledgments for the Chinese Edition; Preface; Author Profile; Resources and Services
Part I Foundations of Artificial Intelligence: Chapter 1 Introduction; Chapter 2 Agents
Part II Problem Solving: Chapter 3 Problem Solving by Search; Chapter 4 Search in Complex Environments; Chapter 5 Adversarial Search and Games; Chapter 6 Constraint Satisfaction Problems
Part III Knowledge, Reasoning, and Planning: Chapter 7 Logical Agents; Chapter 8 First-Order Logic; Chapter 9 Inference in First-Order Logic; Chapter 10 Knowledge Representation; Chapter 11 Automated Planning
Part IV Uncertain Knowledge and Reasoning: Chapter 12 Quantifying Uncertainty; Chapter 13 Probabilistic Reasoning; Chapter 14 Probabilistic Reasoning over Time; Chapter 15 Probabilistic Programming; Chapter 16 Making Simple Decisions; Chapter 17 Making Complex Decisions; Chapter 18 Multi-Agent Decision Making
Part V Machine Learning: Chapter 19 Learning from Examples; Chapter 20 Learning Probabilistic Models; Chapter 21 Deep Learning; Chapter 22 Reinforcement Learning
Part VI Communication, Perception, and Action: Chapter 23 Natural Language Processing; Chapter 24 Deep Learning for Natural Language Processing; Chapter 25 Computer Vision; Chapter 26 Robotics
Part VII Conclusion: Chapter 27 Philosophy, Ethics, and Safety of Artificial Intelligence; Chapter 28 The Future of Artificial Intelligence
Appendix A Mathematical Background; Appendix B Notes on Language and Algorithms; References; Index
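The search chapters in Part II are a natural entry point to the book. As a small taste of that material, here is a minimal, hypothetical sketch of breadth-first search over an explicit graph in Python; the graph, start, and goal are made-up examples, not taken from the book:

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: returns a path with the fewest edges, or None."""
    frontier = deque([[start]])          # queue of partial paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

# Toy example: a small map as an adjacency list
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(bfs(graph, "A", "E"))              # ['A', 'B', 'D', 'E']
```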
pdf
Foundations of Machine Learning (Mehryar Mohri)
Points Required: 2 | Type: Technical Documentation | Uploader: sigma | Date: 2024-07-04
Introduction: This book explores the basic theory and typical algorithms of machine learning from the perspective of Probably Approximately Correct (PAC) learning theory, covering the PAC learning framework, VC-dimension, support vector machines, kernel methods, online learning, multi-class classification, ranking, regression, dimensionality reduction, reinforcement learning, and more. An appendix briefly reviews the prerequisites most relevant to machine learning, such as probability theory, convex optimization, and matrices and norms. The book emphasizes the theoretical underpinnings of typical algorithms and the key points of applying them in practice, with close attention to theoretical detail and proofs. It can serve as a textbook for machine learning and statistics courses in universities, or as a reference for researchers in related fields.
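For a flavor of PAC-style analysis, the classic sample-complexity bound for a finite hypothesis class in the realizable setting says that m >= (1/epsilon)(ln|H| + ln(1/delta)) examples suffice for a consistent learner to reach true error at most epsilon with probability at least 1 - delta. A small sketch of that arithmetic (the numbers are illustrative, not from the book):

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite hypothesis
    class H to have true error <= epsilon with probability >= 1 - delta:
    m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# Example: |H| = 2**20 hypotheses, 5% error, 95% confidence
print(pac_sample_bound(2**20, epsilon=0.05, delta=0.05))  # about 338 samples
```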
pdf
A Practical Guide to Machine Learning with Python
Points Required: 2 | Type: Technical Documentation | Uploader: jujuyaya222 | Date: 2018-11-07
Introduction: Machine learning has become increasingly popular in recent years, and Python has grown into one of the mainstream programming languages. This book combines the two, explaining in detail how to build real machine learning applications through easy-to-understand projects. Of its 10 chapters, Chapter 1 covers the Python machine learning ecosystem, and the remaining 9 chapters introduce machine learning algorithms, including clustering algorithms and recommendation engines, applied to apartments, air fares, the IPO market, news feeds, content promotion, stock markets, images, chatbots, and recommendations. It is suitable for Python programmers, data analysts, readers interested in algorithms, machine learning practitioners, and researchers.
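As an illustration of the kind of clustering workflow a book like this walks through, here is a minimal scikit-learn sketch; the data are synthetic, and scikit-learn itself is an assumption here, since the book may use other libraries:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D data: three blobs around different centers
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2))
               for c in ([0, 0], [5, 5], [0, 5])])

# Fit k-means with k=3 and inspect the learned centers and labels
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)
print(km.labels_[:10])
```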
pdf
Machine Learning in Action (Peter Harrington)
Points Required: 2 | Type: Technical Documentation | Uploader: 抛砖引玉 | Date: 2023-06-29
Introduction: Machine learning is an extremely important research direction within artificial intelligence. In today's big-data era, capturing data and extracting valuable information or patterns from it has become a decisive means of survival and growth for every industry, and a field once reserved for analysts and mathematicians is attracting ever more attention. The first part of this book introduces the basics of machine learning and how to use algorithms for classification, stepping through classic supervised learning algorithms such as the k-nearest neighbors algorithm, naive Bayes, logistic regression, support vector machines, the AdaBoost ensemble method, tree-based regression, and classification and regression trees (CART). The second part applies regression to forecasting numeric values. The third part focuses on unsupervised learning and its main algorithms: k-means clustering, the Apriori algorithm, and FP-Growth. The fourth part introduces auxiliary tools for machine learning algorithms. Carefully arranged examples cut into everyday work tasks; the book avoids academic language and uses efficient, reusable Python code to show how to process statistical data and perform data analysis and visualization. Through these examples, readers learn the core algorithms of machine learning and apply them to tasks such as classification, prediction, and recommendation, as well as to more advanced functions such as aggregation and simplification.
Table of Contents:
Part I Classification
Chapter 1 Machine Learning Basics: what machine learning is (sensors and massive data; why it matters); key terms; the main tasks of machine learning; how to choose the right algorithm; steps in developing a machine learning application; advantages of Python (executable pseudocode, popularity, features, drawbacks); NumPy basics
Chapter 2 The k-Nearest Neighbors Algorithm: overview (importing and parsing data with Python; testing a classifier); example: improving match results on a dating site (parsing a text file, scatter plots with Matplotlib, normalizing values, validating and deploying the classifier); example: a handwriting recognition system (converting images to vectors; recognizing handwritten digits with k-NN)
Chapter 3 Decision Trees: tree construction (information gain, partitioning a dataset, recursive construction); plotting trees with Matplotlib annotations
Chapter 4 Naive Bayes: building word vectors from text; training: computing probabilities from word vectors; adapting the classifier to real-world conditions; the bag-of-words model; example: filtering spam (tokenizing text, cross-validation); example: extracting regional preferences from personal ads (importing RSS feeds, displaying regionally related words)
Chapter 5 Logistic Regression: classification with the sigmoid function; finding optimal regression coefficients with gradient ascent and stochastic gradient ascent; plotting the decision boundary; example: predicting horse mortality from colic symptoms (handling missing values)
Chapter 6 Support Vector Machines: separating data with the maximum margin; finding the maximum margin; the SMO optimization algorithm (Platt's SMO, a simplified SMO for small datasets, the full Platt SMO); kernel functions for complex data (mapping to higher dimensions, the radial basis kernel, kernels in testing); example: the handwriting recognition problem revisited
Chapter 7 Improving Classification with the AdaBoost Meta-Algorithm: classifiers built by resampling the dataset (bagging and boosting); training by focusing on errors; weak classifiers from decision stumps; the full AdaBoost algorithm; classifying with AdaBoost; example: AdaBoost on a difficult dataset; the class-imbalance problem (precision, recall, ROC curves; cost-sensitive decisions; data sampling methods)
Part II Forecasting Numeric Values with Regression
Chapter 8 Regression: shrinking coefficients to "understand" the data (ridge regression, the lasso, forward stagewise regression); balancing bias and variance; example: predicting the price of a LEGO set (collecting data with the Google Shopping API, building a model)
Chapter 9 Tree Regression: modeling local structure in complex data; building trees for continuous and discrete features; CART for regression; tree pruning (pre-pruning and post-pruning); model trees; example: comparing tree regression with standard regression; building a GUI with Python's Tkinter and integrating Matplotlib
Part III Unsupervised Learning
Chapter 10 Grouping Unlabeled Items with k-Means Clustering: the k-means algorithm; improving cluster performance with post-processing; bisecting k-means; example: clustering points on a map (the Yahoo! PlaceFinder API, clustering geographic coordinates)
Chapter 11 Association Analysis with the Apriori Algorithm: association analysis; the Apriori principle; finding frequent itemsets (generating candidate itemsets, the full algorithm); mining association rules from frequent itemsets; example: patterns in congressional voting records; example: finding similar features of poisonous mushrooms
Chapter 12 Efficiently Finding Frequent Itemsets with FP-Growth: mining frequent itemsets from an FP-tree (extracting conditional pattern bases, building conditional FP-trees); example: finding co-occurring words in a Twitter feed; example: mining a news-site clickstream
Part IV Additional Tools
Chapter 13 Using PCA to Simplify Data: dimensionality-reduction techniques; PCA (moving the coordinate axes, PCA in NumPy); example: reducing the dimensionality of semiconductor manufacturing data
Chapter 14 Using SVD to Simplify Data: applications of SVD (latent semantic indexing, recommendation systems); matrix factorization; SVD in Python; collaborative-filtering recommendation engines (similarity measures; item-based versus user-based similarity; evaluating recommendation engines); example: a restaurant dish recommendation engine (recommending untried dishes, improving recommendations with SVD, challenges in building recommendation engines); SVD-based image compression
Chapter 15 Big Data and MapReduce: MapReduce as a framework for distributed computing; Hadoop Streaming (distributed mean and variance mappers and reducers); running Hadoop programs on Amazon Web Services; mrjob and EMR (seamless integration, anatomy of an mrjob MapReduce script); example: the Pegasos algorithm for distributed SVM training with mrjob; do you really need MapReduce?
Appendix A Getting Started with Python; Appendix B Linear Algebra; Appendix C Probability Refresher; Appendix D Resources; Index
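Since Chapter 2 builds k-nearest neighbors directly on NumPy, a minimal sketch in that spirit (not the book's own code) looks like this:

```python
import numpy as np

def knn_classify(x, X_train, y_train, k=3):
    """Classify x by majority vote among its k nearest training points
    (Euclidean distance), in the spirit of the book's Chapter 2."""
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = y_train[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

# Toy data: two classes on a line
X_train = np.array([[1.0], [1.1], [0.9], [5.0], [5.2], [4.8]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_classify(np.array([1.2]), X_train, y_train))  # 0
print(knn_classify(np.array([4.9]), X_train, y_train))  # 1
```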
pdf
A Practical Guide to Machine Learning: Case Studies (2nd Edition)
Points Required: 2 | Type: Technical Documentation | Uploader: nkyqsl | Date: 2018-11-07
Introduction: "A Practical Guide to Machine Learning: Case Studies (2nd Edition)", by Mai Hao, published in 2016. The second edition adds more cases and algorithm analysis than the first. The book covers in detail the development and application prospects of machine learning; scientific computing platforms and the use of the Python and R computing platforms; production environment fundamentals; statistical analysis fundamentals; and practical cases spanning descriptive analysis, hypothesis testing and regression models, neural networks, statistical algorithms, Euclidean distance and cosine similarity, SVMs, regression algorithms, PCA dimensionality reduction, association rules, clustering and classification algorithms, data fitting, image algorithms, machine vision, and text classification.
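The Euclidean distance and cosine similarity the book covers come down to two short formulas; a minimal NumPy sketch (illustrative, not the book's code):

```python
import numpy as np

def euclidean(a, b):
    # Straight-line distance: sqrt(sum((a_i - b_i)^2))
    return np.linalg.norm(a - b)

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors: a.b / (|a||b|), in [-1, 1]
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
print(euclidean(a, b))           # 3.7416...
print(cosine_similarity(a, b))   # 1.0 (same direction)
```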
pdf
Introduction to Deep Learning with PyTorch (Liao Xingyu)
Points Required: 2 | Type: Technical Documentation | Uploader: sigma | Date: 2022-03-12
Introduction: Deep learning has become one of the hottest technologies in science and engineering, and this book helps you get started in the field. It begins with an introduction to artificial intelligence, builds up the basic theory of machine learning and deep learning, and teaches you to build models with the PyTorch framework. You will learn linear regression and logistic regression, optimization methods for deep learning, multi-layer fully connected neural networks, convolutional neural networks, recurrent neural networks, and generative adversarial networks. At the same time, you will learn PyTorch from scratch, understand its basics and how to use it to build models, and finally, through hands-on projects, see cutting-edge research results and how PyTorch is applied in real work. Intended readers: combining theory with code, the book suits anyone interested in entering the field of deep learning.
Contents: Chapter 1 Introduction to Deep Learning; Chapter 2 Deep Learning Frameworks; Chapter 3 Multi-layer Fully Connected Neural Networks; Chapter 4 Convolutional Neural Networks; Chapter 5 Recurrent Neural Networks; Chapter 6 Generative Adversarial Networks; Chapter 7 Deep Learning in Practice
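As a taste of what the book covers, a minimal PyTorch logistic-regression training loop might look like this; the data are synthetic and the code is a sketch, not the book's own:

```python
import torch
import torch.nn as nn

# Synthetic binary classification data: label 1 iff x1 + x2 > 0
torch.manual_seed(0)
X = torch.randn(200, 2)
y = (X.sum(dim=1) > 0).float().unsqueeze(1)

model = nn.Linear(2, 1)                     # logistic regression: linear layer + sigmoid
criterion = nn.BCEWithLogitsLoss()          # sigmoid is folded into the loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    acc = ((model(X) > 0).float() == y).float().mean()
print(f"final loss {loss.item():.3f}, accuracy {acc.item():.2f}")
```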
pdf
Neural Network Design (Hagan, USA)
Points Required: 2 | Type: Paper | Uploader: froglucky | Date: 2018-11-07
Introduction: A clear scan of Neural Network Design by Hagan (USA). The book introduces the design and application of neural networks in detail, from basic perceptrons through to deep neural networks, with practical guidance and case studies.
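The perceptron the book starts from can be written in a few lines; a minimal NumPy sketch, for illustration only:

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=20):
    """Rosenblatt perceptron: on each mistake, nudge w and b toward
    the misclassified point (labels y in {-1, +1})."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # misclassified
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Linearly separable toy data
X = np.array([[2.0, 2.0], [1.5, 2.5], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))   # matches y
```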
pdf
The Beauty of Mathematics (Third Edition)
Points Required: 2 | Type: Technical Documentation | Uploader: eisbergeisberg | Date: 2023-04-09
Introduction: This is a highly regarded popular science classic, recommended by many institutions as a stepping stone into mathematics and a good book for students in information-related fields. Mathematics is both a summary and induction of facts in nature and a product of abstract thinking. In "The Beauty of Mathematics", Dr. Wu Jun draws on his expertise in mathematics and information processing to show brilliantly how mathematics underlies the IT field, especially speech recognition, natural language processing, and information search, all hot topics of the intelligent era. The book also devotes considerable space to anecdotes from many fields; it is a popular science book that readers without a technical background can understand. Becoming a master in a field is partly accident, but more often inevitable, and that inevitability lies in the masters' ways of thinking. Through this book you can see both their ordinariness and their excellence, understand the reasons for their success, and glimpse the rewarding lives of those who truly appreciate the beauty of mathematics.
pdf
Statistical Learning Methods (2nd Edition) (Li Hang)
Points Required: 2 | Type: Technical Documentation | Uploader: 抛砖引玉 | Date: 2022-12-13
Introduction: This book systematically introduces the main methods of statistical learning, in two parts. The first part covers the important methods of supervised learning, including decision trees, the perceptron, support vector machines, maximum entropy models and logistic regression, boosting methods, multi-class classification, the EM algorithm, hidden Markov models, and conditional random fields. The second part covers unsupervised learning, including clustering, singular value decomposition, principal component analysis, latent semantic analysis, and more. In both parts, apart from the introductory and summary chapters, each chapter presents one or two methods.
Table of Contents:
Part 1 Supervised Learning
Chapter 1 Introduction to Statistical Learning and Supervised Learning: statistical learning and its classification (basic classification; by model, algorithm, and technique); the three elements of statistical learning methods (model, strategy, algorithm); model evaluation and model selection (training and test error, overfitting); regularization and cross-validation; generalization ability (generalization error and its upper bound); generative versus discriminative models; applications of supervised learning (classification, tagging, regression)
Chapter 2 The Perceptron: the model; the learning strategy (linear separability of datasets); the learning algorithm (primal form, convergence, dual form)
Chapter 3 The k-Nearest Neighbor Method: the algorithm; the model (distance metrics, choice of k, classification decision rule); kd-tree implementation (construction and search)
Chapter 4 The Naive Bayes Method: learning and classification; the meaning of maximizing the posterior probability; parameter estimation (maximum likelihood estimation, learning and classification algorithms, Bayesian estimation)
Chapter 5 Decision Trees: the decision tree model and learning (if-then rules, conditional probability distributions); feature selection (information gain, information gain ratio); tree generation (ID3, C4.5); pruning; the CART algorithm (generation and pruning)
Chapter 6 Logistic Regression and Maximum Entropy Models: the logistic distribution; binomial and multinomial logistic regression and parameter estimation; the maximum entropy principle, model definition, learning, and maximum likelihood estimation; optimization algorithms for model learning (improved iterative scaling, quasi-Newton methods)
Chapter 7 Support Vector Machines: linearly separable SVMs and hard-margin maximization (functional and geometric margins, margin maximization, the dual algorithm); linear SVMs and soft-margin maximization (the dual algorithm, support vectors, the hinge loss); nonlinear SVMs and kernel functions (the kernel trick, positive definite kernels, common kernels, nonlinear classification); the sequential minimal optimization (SMO) algorithm (solving the two-variable quadratic program, variable selection)
Chapter 8 Boosting Methods: the AdaBoost algorithm (basic idea, algorithm, worked example); analysis of AdaBoost's training error; AdaBoost as forward stagewise additive modeling; boosting trees (model, algorithm, gradient boosting)
Chapter 9 The EM Algorithm and Its Extensions: introduction and derivation; application in unsupervised learning; convergence; EM for learning Gaussian mixture models; generalizations (the F-function maximization-maximization algorithm, the GEM algorithm)
Chapter 10 Hidden Markov Models: basic concepts (definition, generating observation sequences, the three basic problems); probability computation (direct calculation, forward and backward algorithms, probabilities and expectations); learning (supervised methods, the Baum-Welch algorithm and its parameter estimation formulas); prediction (approximation, the Viterbi algorithm)
Chapter 11 Conditional Random Fields: probabilistic undirected graphical models (definition, factorization); definition and forms of CRFs (parameterized, simplified, matrix); probability computation (the forward-backward algorithm, probabilities, expectations); learning (improved iterative scaling, quasi-Newton methods); prediction
Chapter 12 Summary of Supervised Learning Methods
Part 2 Unsupervised Learning
Chapter 13 Introduction to Unsupervised Learning: basic principles; basic problems; the three elements of machine learning; unsupervised learning methods
Chapter 14 Clustering Methods: basic concepts (similarity and distance, classes and clusters, inter-cluster distance); hierarchical clustering; k-means clustering (model, strategy, algorithm, characteristics)
Chapter 15 Singular Value Decomposition: definition and properties (definition and theorem, compact and truncated SVD, geometric interpretation, main properties); computation; SVD and matrix approximation (the Frobenius norm, optimal approximation, outer product expansion)
Chapter 16 Principal Component Analysis: population PCA (basic idea, definition and derivation, main properties, number of components, principal components of normalized variables); sample PCA (definition and properties, eigenvalue decomposition of the correlation matrix, SVD of the data matrix)
Chapter 17 Latent Semantic Analysis: word vector space and topic vector space; the LSA algorithm (matrix singular value decomposition, example); non-negative matrix factorization (NMF, the LSA model, formalization, algorithm)
Chapter 18 Probabilistic Latent Semantic Analysis: the PLSA model (basic idea, generative model, co-occurrence model, properties); the algorithm
Chapter 19 Markov Chain Monte Carlo Methods: Monte Carlo methods (random sampling, estimating expectations, computing integrals); Markov chains (definition, discrete and continuous state, properties); MCMC (basic idea and steps, relation to statistical learning); the Metropolis-Hastings algorithm (basic principle, the algorithm, the single-component variant); Gibbs sampling (principle, algorithm, sampling computation)
Chapter 20 Latent Dirichlet Allocation: the Dirichlet distribution (definition, conjugate priors); the LDA model (basic idea, definition, probabilistic graphical model, exchangeability of random variable sequences, probability formula); the Gibbs sampling algorithm for LDA (idea, main parts, post-processing); the variational EM algorithm for LDA (variational inference, the algorithm, derivation, summary)
Chapter 21 The PageRank Algorithm: definition (basic idea, directed graphs and random walk models, basic and general definitions); computation (iterative algorithm, power method, algebraic algorithm)
Chapter 22 Summary of Unsupervised Learning Methods: relationships and characteristics of the methods; topic models
Appendix A The Gradient Descent Method; Appendix B Newton's Method and Quasi-Newton Methods; Appendix C Lagrange Duality; Appendix D Basic Subspaces of a Matrix; Appendix E Definition of KL Divergence and Properties of the Dirichlet Distribution; Index
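As one concrete example from the book's unsupervised part, Chapter 21's power method for PageRank iterates r <- d*M*r + (1 - d)/n until convergence, where M is the transition matrix and d the damping factor. A minimal sketch on a toy graph (the graph is made up, not from the book):

```python
import numpy as np

def pagerank(M, d=0.85, tol=1e-10, max_iter=1000):
    """Power method for PageRank. M is a column-stochastic transition
    matrix; d is the damping factor; returns the stationary rank vector."""
    n = M.shape[0]
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = d * M @ r + (1 - d) / n
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy web graph: column j holds the out-link distribution of page j
M = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
print(pagerank(M))   # ranks sum to 1
```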
