[Video Tutorial] Machine Learning for Social Scientists

Machine Learning for Social Scientists (video on bilibili)
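This workshop is aimed at social science researchers and uses Python to introduce the basic logic of machine learning (and deep learning); participants need to install Anaconda in advance. It covers three (or four) parts: 1. Introduction to machine learning: starting from the Titanic; 2. Machine learning basics: Naive Bayes and linear regression; 3. Advanced machine learning: support vector machines and random forests; 4. Extending machine learning: neural network models with PyTorch (optional). The slides, Python code, readings, and cases used in this workshop are available at https://github.com/computational-class/machine-learning
Note: This part introduces the basic logic and algorithms of machine learning using Python. Participants need to install Anaconda, be familiar with Jupyter Notebook, and install PyTorch in advance.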
https://github.com/computational-class/machine-learning
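Before the workshop, it may help to confirm that the required packages can be imported. The following is a minimal sketch using only the standard library; the package list and the helper name `check_environment` are assumptions for illustration, not part of the workshop materials.

```python
import importlib.util

# Packages the workshop description mentions or implies.
# This exact list is an assumption; adjust it to the course repository's needs.
REQUIRED = ["numpy", "pandas", "sklearn", "torch"]


def check_environment(packages=REQUIRED):
    """Return {package_name: bool}, True when the package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'installed' if ok else 'MISSING'}")
```

Running the script inside the Anaconda environment, or pasting it into a Jupyter Notebook cell, prints one line per package.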
Slides
- What Is Machine Learning?
- Introducing Scikit-Learn
- Hyperparameters and Model Validation
- Feature Engineering
- In Depth: Naive Bayes Classification
- In Depth: Linear Regression
- In-Depth: Support Vector Machines
- In-Depth: Decision Trees and Random Forests
- In Depth: Neural Networks 1
- In Depth: Neural Networks 2
- In Depth: Neural Networks 3
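The classifiers covered in the slides above (Naive Bayes, support vector machines, random forests) share the same scikit-learn estimator interface, so they can be compared with a single validation loop. The sketch below assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in; it is not one of the workshop's datasets, and the hyperparameters are illustrative defaults rather than recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic binary classification data (a stand-in, not a workshop dataset).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Three of the model families named in the slides, with illustrative settings.
models = {
    "naive_bayes": GaussianNB(),
    "svm": SVC(kernel="rbf", C=1.0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because every estimator exposes the same `fit`/`predict` methods, swapping in another model only requires adding an entry to the `models` dictionary.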
Cases
- House price prediction https://www.kaggle.com/c/house-prices-advanced-regression-techniques/
- Predicting whether bank customers subscribe to a term deposit http://www.dcjingsai.com/common/cmpt/ANZ%20Chengdu%20Data%20Science%20Competition_%E7%AB%9E%E8%B5%9B%E4%BF%A1%E6%81%AF.html?lang=en_US
- Predicting game players' payments http://www.dcjingsai.com/common/cmpt/%E6%B8%B8%E6%88%8F%E7%8E%A9%E5%AE%B6%E4%BB%98%E8%B4%B9%E9%87%91%E9%A2%9D%E9%A2%84%E6%B5%8B%E5%A4%A7%E8%B5%9B_%E7%AB%9E%E8%B5%9B%E4%BF%A1%E6%81%AF.html
- Fake news prediction https://www.kaggle.com/c/fake-news
Recommended Textbooks
- Whirlwind Tour Of Python https://jakevdp.github.io/WhirlwindTourOfPython/
- Python Data Science Handbook https://jakevdp.github.io/PythonDataScienceHandbook/
Reference Books
- Python for Data Analysis by Wes McKinney, published by O'Reilly Media https://github.com/data-science-lab/pydata-book
- Introduction to Machine Learning with Python: A Guide for Data Scientists https://github.com/amueller/introduction_to_ml_with_python
- Machine Learning in Action https://github.com/pbharrin/machinelearninginaction & https://github.com/RedstoneWill/MachineLearningInAction-Camp & https://github.com/TingNie/Machine-learning-in-action
- Zhou Zhihua. Machine Learning (《机器学习》). Beijing: Tsinghua University Press, 2016. (ISBN 978-7-302-42328-7)
- Easley, David and Jon Kleinberg. 2011. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. New York: Cambridge University. Chinese edition: 《网络、群体与市场:揭示高度互联世界的行为原理与效应机制》, Tsinghua University Press, 2011.
- Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman. 2011. Mining Massive Datasets (2nd) http://www.mmds.org/
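Related Courses
- Nanjing University course "Big Data Mining and Analysis" (《大数据挖掘与分析》) https://github.com/computational-class/bigdata
- "Playing with Data Using Python" (用Python玩转数据), China University MOOC http://www.icourse163.org/course/nju-1001571005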
- Advanced Machine Learning with scikit-learn, Andreas Müller http://bit.ly/advanced_machine_learning_scikit-learn & https://github.com/computational-class/PythonMachineLearning