[Linux] Stop 41: Thread Control

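I. Linux Threads vs. Processes

1. Processes and threads

A process is the basic unit of resource allocation.
A thread is the basic unit of scheduling.
Threads share the data of their process, but each thread also has some data of its own:
Thread ID
A set of registers (its context)
Stack
errno
Signal mask
Scheduling priority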

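2. What the threads of a process share

All threads of a process live in the same address space, so the Text Segment and Data Segment are shared: a function defined once can be called from any thread, and a global variable can be accessed from any thread. In addition, the threads share the following process resources and environment:

File descriptor table
The disposition of each signal (SIG_IGN, SIG_DFL, or a user-defined signal handler)
Current working directory
User ID and group ID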

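The relationship between processes and threads is shown in the figure below:

3. A question about processes and threads

How should we think about the single-process programs we studied earlier? Each was a process with exactly one thread of execution.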

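II. Thread Control

1. The POSIX thread library

Does the Linux kernel have a distinct notion of a thread? It does not. It only has the notion of a lightweight process (LWP).

So the kernel does not offer system calls that speak in terms of threads; what it offers are system calls for lightweight processes.

But what users want is a thread interface. That is why the application layer has the pthread library: it wraps the lightweight-process interface and gives users a direct thread interface. Practically every Linux platform ships this library by default. It is a user-space library rather than part of the kernel's system-call interface, and multithreaded code on Linux is written against it.

The thread functions form a complete family, and the vast majority of their names begin with "pthread_".
To use them, include the header file <pthread.h>.
To link against the thread library, pass the "-lpthread" option to the compiler.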

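2. A quick tour of the common interfaces

2.1 Creating a thread

#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);

thread: an output parameter that receives the ID of the new thread.
attr: the thread attributes; in most cases attr is NULL, which selects the default attributes.
start_routine: the address of the function the thread runs once it starts, i.e. the entry function of the new thread. It takes a void* and returns a void*. C has no templates, so void* is how it gets generic behavior: a void* can receive or return any object pointer type. Note that on 64-bit Linux a pointer is 8 bytes. sizeof(void) evaluating to 1 is a GCC extension rather than standard behavior, and you cannot define a variable such as void x.
arg: the argument passed to the thread's start function. Once the thread has been created and the new thread calls back into start_routine, this is the value it receives. Pass nullptr if there is nothing to pass.
Return value: 0 on success; an error number on failure.

The following code verifies this: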

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRountine(void* args)
{
    while (true)
    {
        cout << "new thread, pid" << getpid() << endl;
        sleep(2);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, nullptr);
    while (true)
    {
        cout << "main thread, pid" << getpid() << endl;
        sleep(1);
    }
    return 0;
}

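The output is:

Notice that there is only one pid, which shows that this really is a single process; it simply contains two execution flows.

What if we want to see both of them? We can use the following command:

ps -aL   # -L shows threads; as a mnemonic, think of L as "light", as in lightweight process

In the output, the LWP (light weight process) column is the ID of each lightweight process, since every lightweight process needs an identifier too.

One entry has PID equal to LWP: that one is the main thread. The others are the threads that were created.

The single-threaded programs we wrote before are simply the case where PID always equals LWP.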

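Beyond that, if any one thread of a process is killed by a signal, the whole process is killed.

So is the signal sent to the process or to the thread? It is sent to the process, because a thread is just one execution branch of the process. This is why multithreaded programs are less robust: when one thread is killed, all the other threads go down with it.

Now let's look at the following code: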

#include <iostream>
#include <pthread.h>
#include <string>
#include <unistd.h>
using namespace std;

void show(const string& name)
{
    cout << name << "say# " << "hello thread" << endl;
}

void* threadRountine(void* args)
{
    while (true)
    {
        cout << "new thread, pid" << getpid() << endl;
        show("[new thread]");
        sleep(2);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, nullptr);
    while (true)
    {
        cout << "main thread, pid" << getpid() << endl;
        show("[main thread]");
        sleep(1);
    }
    return 0;
}

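The output is:

Both the main thread and the new thread call this function, which shows that one function can be executed by several execution flows at the same time. In other words, show is being re-entered.

Next, consider this code: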

#include <cstdio>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    while (true)
    {
        // %p already prints the 0x prefix on glibc, and it expects a void*
        printf("new thread, pid:%d, g_val:%d, &g_val:%p\n", getpid(), g_val, (void*)&g_val);
        sleep(2);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, nullptr);
    while (true)
    {
        printf("main thread, pid:%d, g_val:%d, &g_val:%p\n", getpid(), g_val, (void*)&g_val);
        g_val++; // only the main thread modifies the global
        sleep(1);
    }
    return 0;
}

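The output is:

Both the main thread and the new thread observe the modified value, which shows that the two threads share this variable.

So communication between two threads is remarkably easy.

Let's test with the following code: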

#include <cstdio>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    while (true)
    {
        printf("new thread, pid:%d, g_val:%d, &g_val:%p\n", getpid(), g_val, (void*)&g_val);
        sleep(5);
        int a = 10;
        a = a / 0; // integer division by zero; on x86-64 Linux this typically raises SIGFPE
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, nullptr);
    while (true)
    {
        printf("main thread, pid:%d, g_val:%d, &g_val:%p\n", getpid(), g_val, (void*)&g_val);
        g_val++;
        sleep(1);
    }
    return 0;
}

The output is:

One thread hit an exception, and that brought down the entire process.

Next, let's look at what this tid actually is:

#include <cstdio>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    while (true)
    {
        printf("new thread, pid:%d, g_val:%d, &g_val:%p\n", getpid(), g_val, (void*)&g_val);
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, nullptr);
    while (true)
    {
        // pthread_t is an integer type on Linux, so cast it before printing with %p
        printf("main thread, pid:%d, g_val:%d, &g_val:%p, create new thread tid:%p\n",
               getpid(), g_val, (void*)&g_val, (void*)tid);
        g_val++;
        sleep(1);
    }
    return 0;
}

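The output is:

This tid is clearly not the LWP. The LWP only needs to be meaningful to the operating system, whereas the tid is what we use at user level. We will come back to how it is used.

Now let's look at the fourth parameter, which is passed as the argument to the function given as the third parameter.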

#include <cstdio>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    const char* name = (const char*)args; // the fourth argument of pthread_create
    while (true)
    {
        printf("%s, pid:%d, g_val:%d, &g_val:%p\n", name, getpid(), g_val, (void*)&g_val);
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, (void*)"thread 1");
    while (true)
    {
        printf("main thread, pid:%d, g_val:%d, &g_val:%p, create new thread tid:%p\n",
               getpid(), g_val, (void*)&g_val, (void*)tid);
        g_val++;
        sleep(1);
    }
    return 0;
}

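The output is:

As expected, the fourth argument was passed through.

2.2 Waiting for a thread

Which of the two threads exits first? Normally the new thread should exit first and the main thread after it: the main thread created the new thread and is responsible for managing it, and if main returns, the whole process exits and takes every thread with it.

If the new thread exits while the main thread keeps running and never waits for it, we get a problem similar to a zombie process (although threads have no such term). So a thread that has been created generally needs to be waited for as well; otherwise its resources may never be reclaimed. We cannot observe this directly, because once the new thread exits it no longer shows up in ps, but the problem is real.

More importantly, we created the new thread to get some work done, so we need to know how it went and what result it produced.

So waiting for a thread serves two purposes:

Preventing memory leaks
Retrieving the exit result of the new thread, if we need it

The function for waiting on a thread is:

#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);
//Compile and link with -pthread.

It returns 0 on success and an error number on failure. Note that the pthread functions do not report errors through errno; they return the error number directly. Every call therefore hands back its own error code, and the result never depends on a variable that other code might overwrite.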

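About the parameters:

The first parameter is the tid of the thread to wait for.

The second parameter receives the return value of that thread when it ends. Note that it is *retval that has type void*; in other words, *retval is where the thread function's return value ends up.

As the figure below shows, passing a void* out through pthread_join involves a temporary. Suppose we call the function with &x: the value of &x is copied into the parameter, which we will call retval here. Inside pthread_join, the step *retval = z is executed, and that assigns to x. So x acts as an output parameter.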

We can try this out with the following code:

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    const char* name = (const char*)args;
    int cnt = 5;
    while (true)
    {
        printf("%s, pid:%d, g_val:%d, &g_val:%p\n", name, getpid(), g_val, (void*)&g_val);
        sleep(1);
        cnt--;
        if (cnt == 0)
        {
            break;
        }
    }
    return nullptr; // returning from the start routine ends the thread
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, (void*)"thread 1");
    sleep(7); // the new thread finishes after about 5 seconds
    pthread_join(tid, nullptr);
    cout << "main thread quit..." << endl;
    return 0;
}

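The output is:

We can clearly see that the new thread exits first and the main thread exits about two seconds later. We cannot observe the new thread in a zombie-like state here, but such a state does exist until the thread is joined.

Now let's use the second parameter, retval: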

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    const char* name = (const char*)args;
    int cnt = 5;
    while (true)
    {
        printf("%s, pid:%d, g_val:%d, &g_val:%p\n", name, getpid(), g_val, (void*)&g_val);
        sleep(1);
        cnt--;
        if (cnt == 0)
        {
            break;
        }
    }
    return (void*)1; // returning from the start routine ends the thread
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, (void*)"thread 1");
    void* retval;
    pthread_join(tid, &retval); // the main thread blocks here by default
    cout << "main thread quit..., ret: " << (long long)retval << endl;
    return 0;
}

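The output is shown below; we did get the value 1:

Something may feel off here, though: why don't we consider abnormal termination when joining?

Because we cannot. Once a thread hits a fatal exception, the main thread dies too. So abnormal termination is not something thread-level code handles; it is handled at the process level.

2.3 Terminating a thread

If we want to terminate a thread, can we use the exit function, as we do to terminate a process?

We can check with the following code: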

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    const char* name = (const char*)args;
    int cnt = 5;
    while (true)
    {
        printf("%s, pid:%d, g_val:%d, &g_val:%p\n", name, getpid(), g_val, (void*)&g_val);
        sleep(1);
        cnt--;
        if (cnt == 0)
        {
            break;
        }
    }
    exit(11); // called from the new thread, not from main
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, (void*)"thread 1");
    void* retval;
    pthread_join(tid, &retval); // the main thread blocks here by default
    cout << "main thread quit..., ret: " << (long long)retval << endl;
    return 0;
}

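The output is:

Notice that the main thread never prints "main thread quit...". This shows that exit terminated the main thread as well.

That is, exit terminates the process; it cannot be used to terminate just one thread.

The interface for terminating a thread is:

#include <pthread.h>
void pthread_exit(void *retval);
//Compile and link with -pthread.

It terminates the calling thread: whoever calls it is the one that ends. Its parameter is a void* with the same meaning as the return value of the thread function.

We test it with the following code: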

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    const char* name = (const char*)args;
    int cnt = 5;
    while (true)
    {
        printf("%s, pid:%d, g_val:%d, &g_val:%p\n", name, getpid(), g_val, (void*)&g_val);
        sleep(1);
        cnt--;
        if (cnt == 0)
        {
            break;
        }
    }
    pthread_exit((void*)100); // ends only the calling thread
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, (void*)"thread 1");
    void* retval;
    pthread_join(tid, &retval); // the main thread blocks here by default
    cout << "main thread quit..., ret: " << (long long)retval << endl;
    return 0;
}

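The output is:

Above, the new thread calls pthread_exit, so only that thread exits. If the main thread calls pthread_exit, only the main thread ends as well: the other threads keep running, and the process terminates once its last thread has exited. What ends the whole process immediately is returning from main or calling exit.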

2.4 Canceling a thread

The interface for canceling a thread is shown below.

#include <pthread.h>
int pthread_cancel(pthread_t thread);
//Compile and link with -pthread.

We verify it with the following code:

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

int g_val = 100;

void* threadRountine(void* args)
{
    const char* name = (const char*)args;
    int cnt = 5;
    while (true)
    {
        printf("%s, pid:%d, g_val:%d, &g_val:%p\n", name, getpid(), g_val, (void*)&g_val);
        sleep(1);
        cnt--;
        if (cnt == 0)
        {
            break;
        }
    }
    return nullptr; // not reached when the thread is canceled first
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRountine, (void*)"thread 1");
    sleep(1); // make sure the new thread has started
    pthread_cancel(tid);
    void* retval;
    pthread_join(tid, &retval); // the main thread blocks here by default
    cout << "main thread quit..., ret: " << (long long)retval << endl;
    return 0;
}

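The output is:

Notice that the return value obtained by joining this thread is -1.

That is because a canceled thread's exit value is given by this macro:

#define PTHREAD_CANCELED ((void *) -1)

In other words, if a thread was canceled, the value obtained when joining it is -1, i.e. the macro above.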

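2.5 Putting the four interfaces together

A thread's argument and return value are not limited to simple values; they can also carry objects.

Consider the following code: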

#include <iostream>
#include <pthread.h>
#include <string>
#include <unistd.h>
using namespace std;

class Request
{
public:
    Request(int start, int end, const string& threadname)
        : _start(start), _end(end), _threadname(threadname)
    {}

public:
    int _start;
    int _end;
    string _threadname;
};

class Response
{
public:
    Response(int result, int exitcode)
        : _result(result), _exitcode(exitcode)
    {}

public:
    int _result;   // result of the computation
    int _exitcode; // whether the result is reliable
};

// A thread's argument and return value can carry objects, not just simple values.
void* sumCount(void* args)
{
    Request* rq = static_cast<Request*>(args);
    Response* rsp = new Response(0, 0);
    for (int i = rq->_start; i <= rq->_end; i++)
    {
        cout << rq->_threadname << " is running, calculating..., " << i << endl;
        rsp->_result += i;
        usleep(100000);
    }
    delete rq; // allocated by main, released by the new thread
    return (void*)rsp;
}

int main()
{
    pthread_t tid;
    Request* rq = new Request(1, 100, "thread 1");
    pthread_create(&tid, nullptr, sumCount, rq);
    void* ret;
    pthread_join(tid, &ret);
    Response* rsp = static_cast<Response*>(ret);
    cout << "rsp->result: " << rsp->_result << ", exitcode: " << rsp->_exitcode << endl;
    delete rsp; // allocated by the new thread, released by main
    return 0;
}

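The output is:

So this can be used to compute a sum: let each thread perform only part of the computation, and then merge the partial results ourselves.

Notice also that all of these objects are created on the heap and are used across threads: one thread allocates and another uses and frees. This shows that the heap is shared by the threads as well.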

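2.6 Threads in C++11

So far we have been using the native thread library (the pthread library).

C++11 itself supports multithreading at the language level. How does that relate to the native thread library?

C++11 threads need the following header:

#include<thread>

Consider the following code: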

#include <iostream>
#include <unistd.h>
#include <thread>
using namespace std;

void threadrun()
{
    while (true)
    {
        cout << "I am a new thread for C++" << endl;
        sleep(1);
    }
}

int main()
{
    thread t1(threadrun);
    t1.join();
    return 0;
}

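The output is:

Note that on Linux the C++11 thread library is itself built on top of the native pthread library, so we still need the -lpthread option when compiling.

C++11 threads are also cross-platform: the standard library has a separate implementation for each platform, so from our point of view the code we write is the same everywhere.

It is best to use C++ threads, because they are cross-platform.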

3. Thread IDs and the process address space layout

We still have not explained what the tid really is. Let's start with the following code:

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <string>
#include <unistd.h>
using namespace std;

std::string toHex(pthread_t tid)
{
    char hex[64];
    // pthread_t is an integer type on Linux, so cast it before printing with %p
    snprintf(hex, sizeof(hex), "%p", (void*)tid);
    return hex;
}

void* threadRoutine(void* args)
{
    while (true)
    {
        cout << "thread id : " << toHex(pthread_self()) << endl;
        sleep(1);
    }
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, threadRoutine, (void*)"thread 1");
    cout << "main thread create thread done, new thread id: " << toHex(tid) << endl;
    pthread_join(tid, nullptr);
    return 0;
}

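The output is:

As we know, the kernel has no distinct notion of a thread, only lightweight processes.

The lightweight-process interface is clone. Its glibc wrapper is documented in clone(2) as:

int clone(int (*fn)(void *), void *stack, int flags, void *arg, ...);

We normally do not call this interface ourselves; fork is also implemented on top of it.

Its first parameter is a function pointer, its second parameter is a user-supplied stack...

This interface is wrapped by the pthread library.

That is why we use interfaces such as pthread_create and pthread_join.

As the figure below shows, clone needs to be given a callback function, a separate stack, and so on, which it uses to set up the thread.

All of that is done by the thread library; in other words, the notion of a thread is maintained for us by the library. The native thread library must itself be loaded into memory. It is a dynamic library, so it is mapped through the page table into the shared region of the address space.

These stacks are all created in the shared region. The thread library only has to maintain the notion of a thread, not the execution flow itself. It does, however, have to keep a set of attributes for every thread, and it has to manage those threads: describe first, then organize.

Each thread's control block must be able to locate the callback function, the separate stack, and the LWP it holds internally. This thread control block is the user-level thread.

So we call the structure below the thread's tcb, and the start address of each tcb is that thread's tid.

With the tid in hand, we can find the thread's attributes inside the library.

The address we printed earlier is fairly large: it lies in the shared region, between the heap and the stack.

Every thread must have its own stack, because it has its own call chain and needs to push frames and so on. The main thread uses the stack of the address space. For every other lightweight process, the library first creates a tcb, whose start address serves as the thread's tid and which contains a default-sized area called the thread stack, and then calls clone to create the execution flow. The temporary data produced by that execution flow is pushed onto this stack inside the thread library.

That is, apart from the main thread, the stacks of all threads are in the shared region; more precisely, they are in the pthread library, in the user-level tcb that the tid points to.

So a Linux thread = a user-level thread + a kernel LWP.

Threads are often classified as user-level or kernel-level. What we program against in Linux is the user-level thread maintained by the library, and each one is backed by an LWP in the kernel.

In Linux, every user-level thread corresponds to one LWP in the kernel. User-level execution flow : kernel LWP = 1 : 1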

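pthread_create produces a thread ID and stores it at the address given by its first parameter. This thread ID is not the same thing as the thread ID discussed earlier.
The thread ID discussed earlier (the LWP) belongs to process scheduling: a thread is a lightweight process, the smallest unit the operating system scheduler works with, so it needs a number that uniquely identifies it.
The ID that pthread_create stores through its first parameter is the address of a memory area in the process's virtual address space. This ID belongs to NPTL (the native thread library), and the library's subsequent operations on the thread are based on it.
NPTL provides the pthread_self function, which returns the ID of the calling thread:
pthread_t pthread_self(void);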