Machine Learning

Machine Learning is making the computer learn from studying data and statistics.

Machine Learning is a step in the direction of artificial intelligence (AI).

Machine Learning is a program that analyses data and learns to predict the outcome.

Where To Start?

In this tutorial we will go back to mathematics and study statistics, and how to calculate important numbers based on data sets.

We will also learn how to use various Python modules to get the answers we need.

And we will learn how to make functions that are able to predict the outcome based on what we have learned.


Data Set

In the mind of a computer, a data set is any collection of data. It can be anything from an array to a complete database.

Example of an array:

[99,86,87,88,111,86,103,87,94,78,77,85,86]

Example of a database:

Carname  Color  Age  Speed  AutoPass
BMW      red    5    99     Y
Volvo    black  7    86     Y
VW       gray   8    87     N
VW       white  7    88     Y
Ford     white  2    111    Y
VW       white  17   86     Y
Tesla    red    2    103    Y
BMW      black  9    87     Y
Volvo    gray   4    94     N
Ford     white  11   78     N
Toyota   gray   12   77     N
VW       white  9    85     N
Toyota   blue   6    86     Y

By looking at the array, we can guess that the average value is probably around 80 or 90, and we are also able to determine the highest value and the lowest value, but what else can we do?
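As a rough check of those guesses, here is a minimal sketch that uses only Python's built-in functions and the standard statistics module. The variable names are assumptions chosen for illustration.

```python
import statistics

# The example array from above (the name `values` is assumed for illustration)
values = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

mean_value = statistics.mean(values)  # arithmetic average
highest = max(values)                 # largest value in the array
lowest = min(values)                  # smallest value in the array

print(round(mean_value, 2), highest, lowest)
```

The average comes out close to 90, which agrees with the estimate made by eye.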

And by looking at the database we can see that the most popular color is white, and the oldest car is 17 years old, but what if we could predict whether a car has an AutoPass just by looking at the other values?
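The two observations about the database can also be computed rather than read off by eye. The sketch below is one possible way to do it, assuming the rows are stored as plain tuples transcribed from the example table; the names `cars`, `color_counts`, and `oldest_age` are hypothetical.

```python
from collections import Counter

# Rows transcribed from the example database.
# Assumed column order: (Carname, Color, Age, Speed, AutoPass)
cars = [
    ("BMW", "red", 5, 99, "Y"),
    ("Volvo", "black", 7, 86, "Y"),
    ("VW", "gray", 8, 87, "N"),
    ("VW", "white", 7, 88, "Y"),
    ("Ford", "white", 2, 111, "Y"),
    ("VW", "white", 17, 86, "Y"),
    ("Tesla", "red", 2, 103, "Y"),
    ("BMW", "black", 9, 87, "Y"),
    ("Volvo", "gray", 4, 94, "N"),
    ("Ford", "white", 11, 78, "N"),
    ("Toyota", "gray", 12, 77, "N"),
    ("VW", "white", 9, 85, "N"),
    ("Toyota", "blue", 6, 86, "Y"),
]

# Count how often each color appears and pick the most frequent one
color_counts = Counter(color for _, color, _, _, _ in cars)
most_common_color, count = color_counts.most_common(1)[0]

# Find the age of the oldest car
oldest_age = max(age for _, _, age, _, _ in cars)

print(most_common_color, count, oldest_age)
```

Counting and taking a maximum only describe the data that is already there. Predicting the AutoPass column for a car that is not in the table is a different kind of task.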

That is what Machine Learning is for! Analyzing data and predicting the outcome!

In Machine Learning it is common to work with very large data sets. In this tutorial we will try to make the different concepts of machine learning as easy as possible to understand, and we will work with small, easy-to-understand data sets.


Data Types

To analyze data, it is important to know what type of data we are dealing with.

We can split the data types into three main categories:

  • Numerical
  • Categorical
  • Ordinal

Numerical data are numbers, and can be split into two numerical categories:

  • Discrete Data
    - numbers that are limited to integers. Example: The number of cars passing by.
  • Continuous Data
    - numbers that can take any value within a range, including fractions. Example: The price of an item, or the size of an item.

Categorical data are values that cannot be measured up against each other. Example: a color value, or any yes/no values.

Ordinal data are like categorical data, but can be measured up against each other. Example: school grades where A is better than B and so on.
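To make the three categories concrete, here is a small sketch of how each one might be handled in Python. The variable names and the `GRADE_RANK` mapping are assumptions made for illustration, not a standard encoding.

```python
# Numerical data: arithmetic and ordering both make sense
cars_passing = 12      # discrete (a count, always a whole number)
item_price = 19.99     # continuous (can take fractional values)

# Categorical data: only equality checks are meaningful
color_a, color_b = "red", "white"
same_color = color_a == color_b

# Ordinal data: categories with an order.
# The ranking below is an assumed encoding for school grades.
GRADE_RANK = {"A": 4, "B": 3, "C": 2, "D": 1}

def is_better(grade_x, grade_y):
    """Return True if grade_x ranks above grade_y."""
    return GRADE_RANK[grade_x] > GRADE_RANK[grade_y]

print(same_color)
print(is_better("A", "B"))
```

The explicit ranking is a deliberate choice: comparing the grade strings directly would use alphabetical order, which does not express that A is the better grade.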

By knowing the data type of your data source, you will be able to know what technique to use when analyzing it.

You will learn more about statistics and analyzing data in the next chapters.