# Blog

## 2021

- Jul 9, 2021 Introduction to Deep Learning -- 170 Video Lectures from Adaptive Linear Neurons to Zero-shot Classification with Transformers

I just sat down this morning and organized all the deep learning-related videos I recorded in 2021. I am sure this will be a useful reference for my future self, but I am also hoping it might be useful for one or the other person out there. PS: All code examples are in PyTorch :)

- Feb 11, 2021 Datasets for Machine Learning and Deep Learning -- Some of the Best Places to Explore

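With the semester in full swing, I recently shared this set of dataset repositories with my deep learning class. However, I thought that beyond using this list to find inspiration for interesting student class projects, these are also good places to look for additional benchmark datasets for your models.

- Jan 21, 2021 Book Review: Deep Learning With PyTorch -- Practical Deep Learning Guide With a Computer Vision Focus and an Interesting Structure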

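Deep Learning With PyTorch had been sitting on my shelf since its release in August 2020 before I finally got the chance to read it during this winter break. It turned out to be the perfect easygoing reading material for a bit of productivity after the relaxing holidays. As promised last week, here are my thoughts.

- Jan 3, 2021 How I Keep My Projects Organized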

Since I started my undergraduate studies in 2008, I have been obsessed with productivity tips, notetaking solutions, and todo-list management. Over the years, I tried many, many workflows and hundreds of (mostly digital) tools to keep my life, projects, and notes organized. Occasionally, I exchange ideas with friends and colleagues, and upon request, I talked about my workflow a couple of times on Twitter. After today's 2021 edition of this discussion, I thought that writing a quick and informal blog post makes sense: it is easier to read, and it gives me a quick reference if someone asks about it again :).

## 2020

- Sep 27, 2020 Scientific Computing in Python: Introduction to NumPy and Matplotlib -- Including Video Tutorials

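Since many students in my STAT 451 (Introduction to Machine Learning and Statistical Pattern Classification) class are relatively new to Python and NumPy, I recently devoted a lecture to the latter. Since the course notes are based on an interactive Jupyter notebook file (which I used as the basis for the lecture videos), I thought it would be worthwhile to reformat it as a blog article with the embedded "narrated content" (the video recordings).

- Aug 26, 2020 Interpretable Machine Learning -- Book Review and Thoughts about Linear and Logistic Regression as Interpretable Models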

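In this blog post, I (briefly) review Christoph Molnar's *Interpretable Machine Learning* book. Then, I write about two classic generalized linear models, linear and logistic regression. Mainly, this blog post explains the relationship between feature weights and predictions and demonstrates how to construct confidence intervals in Python.

- Aug 5, 2020 Chapter 1: Introduction to Machine Learning and Deep Learning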

The first chapter (draft) of the Introduction to Deep Learning book, which is based on my lecture notes and slides.

- Jan 6, 2020 Book Review: Architects of Intelligence by Martin Ford

A brief review of Martin Ford's book that features interviews with 23 of the most well-known and brightest minds working on AI.

## 2019

- Dec 12, 2019 What's New in the 3rd Edition

A brief summary of what's new in the 3rd edition of Python Machine Learning.

- May 24, 2019 My First Year at UW-Madison and a Gallery of Awesome Student Projects

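Not too long ago, in the summer of 2018, I was excited to join the Department of Statistics at the University of Wisconsin-Madison after obtaining my Ph.D., following ~5 long and productive years. Now, two semesters later and after finals week, I finally found some quiet days to look back on what has happened since then. In this post, I share a short reflection as well as some of the exciting projects my students have been working on.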

## 2018

- Nov 10, 2018 Model evaluation, model selection, and algorithm selection in machine learning Part IV - Comparing the performance of machine learning models and algorithms using statistical tests and nested cross-validation

This final article in the series *Model evaluation, model selection, and algorithm selection in machine learning* presents overviews of several statistical hypothesis testing approaches, with applications to machine learning model and algorithm comparisons. This includes statistical tests based on target predictions for independent test sets (the downsides of using a single test set for model comparisons were discussed in previous articles) as well as methods for algorithm comparisons by fitting and evaluating models via cross-validation. Lastly, this article will introduce *nested cross-validation*, which has become a common and recommended method of choice for algorithm comparisons for small to moderately sized datasets.

- Aug 2, 2018 Generating Gender-Neutral Face Images with Semi-Adversarial Neural Networks to Enhance Privacy

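I thought it would be nice to share short and concise summaries of my recent projects with a more general audience, including colleagues and students. So, I challenged myself to use fewer than 1000 words without getting distracted by nitty-gritty details and technical jargon. In this post, I mainly cover some of my recent research in collaboration with the [iPRoBe Lab](http://iprobe.cse.msu.edu), which falls under the broad category of developing approaches to hide specific information in face images. The research discussed in this post is about "maximizing privacy while preserving utility."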

## 2016

- Oct 2, 2016 Model evaluation, model selection, and algorithm selection in machine learning Part III - Cross-validation and hyperparameter tuning

Almost every machine learning algorithm comes with a large number of settings that we, the machine learning researchers and practitioners, need to specify. These tuning knobs, the so-called hyperparameters, help us control the behavior of machine learning algorithms when optimizing for performance, finding the right balance between bias and variance. Hyperparameter tuning for performance optimization is an art in itself, and there are no hard-and-fast rules that guarantee best performance on a given dataset. In Part I and Part II, we saw different holdout and bootstrap techniques for estimating the generalization performance of a model. We learned about the bias-variance trade-off, and we computed the uncertainty of our estimates. In this third part, we will focus on different methods of cross-validation for model evaluation and model selection. We will use these cross-validation techniques to rank models from several hyperparameter configurations and estimate how well they generalize to independent datasets.

- Aug 13, 2016 Model evaluation, model selection, and algorithm selection in machine learning Part II - Bootstrapping and uncertainties

In this second part of the series, we will look at some advanced techniques for model evaluation and techniques to estimate the uncertainty of our estimated model performance as well as its variance and stability. Then, in the next article, we will shift the focus onto another task that is one of the main pillars of successful, real-world machine learning applications -- Model Selection.

- Jun 11, 2016 Model evaluation, model selection, and algorithm selection in machine learning Part I - The basics

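Machine learning has become a central part of our lives -- as consumers, customers, and hopefully as researchers and practitioners! Whether we are applying predictive modeling techniques to our research or business problems, I believe we have one thing in common: We want to make good predictions! Fitting a model to our training data is one thing, but how do we know that it generalizes well to unseen data? How do we know that it doesn't simply memorize the data we fed it and fail to make good predictions on future samples, samples that it hasn't seen before? And how do we select a good model in the first place? Maybe a different learning algorithm could be better suited for the problem at hand? Model evaluation is certainly not just the end point of our machine learning pipeline.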

Before we handle any data, we want to plan ahead and use techniques that are suited for our purposes. In this article, we will go over a selection of these techniques, and we will see how they fit into the bigger picture, a typical machine learning workflow.

## 2015

- Sep 24, 2015 Writing 'Python Machine Learning' – A Reflection on a Journey

It's been about time. I am happy to announce that "Python Machine Learning" was finally released today! Sure, I could just send an email around to all the people who were interested in this book. On the other hand, I could put down those 140 characters on Twitter (minus what it takes to insert a hyperlink) and be done with it. Even so, writing "Python Machine Learning" really was quite a journey for a few months, and I would like to sit down in my favorite coffeehouse once more to say a few words about this experience.

- Aug 24, 2015 Python, Machine Learning, and Language Wars - A Highly Subjective Point of View

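This has really been quite a journey for me lately. Regarding the frequently asked question "Why did you choose Python for machine learning?", I guess it is about time to write my script. In this article, I really don't mean to tell you why you or anyone else should use Python. But read on if you are interested in my opinion.

- Mar 24, 2015 Single-Layer Neural Networks and Gradient Descent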

This article offers a brief glimpse of the history and basic concepts of machine learning. We will take a look at the first algorithmically described neural network and the gradient descent algorithm in the context of adaptive linear neurons, which will not only introduce the principles of machine learning but also serve as the basis for modern multilayer neural networks in future articles.

- Jan 27, 2015 Principal Component Analysis in 3 Simple Steps

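Principal Component Analysis (PCA) is a simple yet popular and useful linear transformation technique that is used in numerous applications, such as stock market predictions, the analysis of gene expression data, and many more. In this tutorial, we will see that PCA is not just a "black box," and we are going to unravel its internals in 3 basic steps.

- Jan 11, 2015 Implementing a Weighted Majority Rule Ensemble Classifier in scikit-learn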

Here, I want to present a simple and conservative approach to implementing a weighted majority rule ensemble classifier in scikit-learn that yielded remarkably good results when I tried it in a Kaggle competition. For me personally, Kaggle competitions are just a nice way to try out and compare different approaches and ideas -- basically an opportunity to learn in a controlled environment with nice datasets.

## 2014

- Dec 5, 2014 Musicmood - A Machine Learning Model for Classifying Music Based on Song Lyrics

In this article, I want to share my experience with a recent data mining project, which was probably one of my favorite hobby projects so far. It's all about building a classification model that can automatically predict the mood of music based on song lyrics.

- Nov 28, 2014 Turn Your Twitter Timeline into a Word Cloud – using Python

Last week, I posted some visualizations in the context of the Happy Rock Song data mining project, and some people were curious about how I created the word clouds. Learn how to create YOUR personal Twitter Timeline!

- Oct 4, 2014 Naive Bayes and Text Classification – Introduction and Theory

Naive Bayes classifiers, a family of classifiers that are based on the popular Bayes' probability theorem, are known for creating simple yet well-performing models, especially in the fields of document classification and disease prediction. In this first part of a series, we will take a look at the theory of naive Bayes classifiers and introduce the basic concepts of text classification. In the following articles, we will implement those concepts to train a naive Bayes spam filter and apply naive Bayes to song classification based on lyrics.

- Sep 14, 2014 Kernel tricks and nonlinear dimensionality reduction via RBF kernel PCA

The focus of this article is to briefly introduce the idea of kernel methods and to implement a Gaussian radial basis function (RBF) kernel that is used to perform nonlinear dimensionality reduction via RBF kernel principal component analysis (kPCA).

- Aug 25, 2014 Predictive modeling, supervised machine learning, and pattern classification -- the big picture

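When I was working on my next pattern classification application, I realized that it might be worthwhile to take a step back and look at the big picture of pattern classification in order to put my previous topics into context and to provide an introduction to the future topics that are going to follow.

- Aug 3, 2014 Linear Discriminant Analysis – Bit by Bit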

I received a lot of positive feedback about the step-wise Principal Component Analysis (PCA) implemen