Thread

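The threading library

Documentation: https://docs.python.org/3/library/threading.html

Common methods

  • start(): starts the thread.
  • setDaemon(): pass True to have the child thread terminated when the main thread exits. The default is False, in which case the process stays alive until the child thread finishes. Since Python 3.10 setDaemon() is deprecated in favor of the daemon attribute or the daemon=True constructor argument.
  • join(): blocks the calling thread (here, the main thread) until the child thread finishes, after which the caller continues. Only the caller is blocked; the child threads keep running concurrently with one another.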
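
A minimal sketch of how these calls fit together. The `worker` function, the `finished` list and the sleep durations are made up for illustration; `daemon=True` in the constructor is used here as the equivalent of `setDaemon(True)`.

```python
import threading
import time

finished = []  # names of workers that ran to completion

def worker(name, seconds):
    # Hypothetical task: sleep, then record that it finished.
    time.sleep(seconds)
    finished.append(name)

# A normal (non-daemon) thread: join() blocks the caller until it is done.
t = threading.Thread(target=worker, args=('normal', 0.1))
t.start()
t.join()
print(finished)  # ['normal']

# A daemon thread is terminated when the main thread exits, so the long
# sleep here does not keep the process alive.
d = threading.Thread(target=worker, args=('daemon', 60), daemon=True)
d.start()
d.join(timeout=0.1)  # with a timeout, join() returns even if the thread is still running
print(d.daemon, d.is_alive())  # True True
```

Run as a script, this exits right after the last print, even though the daemon thread still has almost a minute of sleep left.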

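Getting started

This creates the simplest possible multithreaded task. The 100 s sleep only keeps the threads alive long enough to inspect the threads of the process afterwards.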

import time
from threading import Thread

def test():
    print('---[{}] Test.---'.format(time.ctime()))
    time.sleep(100)

for i in range(4):
    t = Thread(target=test)
    t.start()

Inspect the threads inside the process: the total number of threads, how many are running or sleeping, and the CPU and memory usage.

root@iZ947mgy3c5Z:~# top -Hbn1 -p 24425
top - 13:50:51 up 54 days, 22:03,  5 users,  load average: 0.16, 0.10, 0.06
Threads:   5 total,   0 running,   5 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.0 us,  0.5 sy,  0.0 ni, 95.6 id,  2.7 wa,  0.0 hi,  0.1 si,  0.1 st
KiB Mem:   1016308 total,   937712 used,    78596 free,   121796 buffers
KiB Swap:        0 total,        0 used,        0 free.   134392 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
24425 root      20   0  322900   6404   2612 S  0.0  0.6   0:00.02 python3
24426 root      20   0  322900   6404   2612 S  0.0  0.6   0:00.00 python3
24427 root      20   0  322900   6404   2612 S  0.0  0.6   0:00.00 python3
24428 root      20   0  322900   6404   2612 S  0.0  0.6   0:00.00 python3
24429 root      20   0  322900   6404   2612 S  0.0  0.6   0:00.00 python3

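The code starts only 4 threads, but 5 show up here. The extra one is the main thread, the thread that is created by default when the process starts. In this example the main thread runs none of the tasks; after reaching the end of the script it waits for the 4 child threads and exits only once they have all finished. PS: the main thread lives as long as the whole process, whereas a child thread lives only as long as its task.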
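
The same count can be checked from inside Python. This is a small sketch using `threading.enumerate()` and `threading.main_thread()` from the standard library; the sleep is shortened to 0.5 s (an arbitrary value) so that it finishes quickly.

```python
import threading
import time

def test():
    time.sleep(0.5)  # shortened from the 100 s above so the sketch ends quickly

threads = [threading.Thread(target=test) for _ in range(4)]
for t in threads:
    t.start()

# enumerate() lists every live Thread object, including the main thread.
alive = threading.enumerate()
print(len(alive))  # 5 when run as a plain script: 4 children + the main thread
print(threading.main_thread() in alive)  # True

for t in threads:
    t.join()
```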

Subclassing Thread

The code:

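import threading
import time

class MyThread(threading.Thread):
    def __init__(self, index):
        super().__init__()
        # Keep the loop index on the instance instead of reading the global i,
        # which may already have changed by the time run() executes.
        self.index = index

    def run(self):
        print('i am {}@{}'.format(self.name, self.index))
        time.sleep(100)

if __name__ == '__main__':
    for i in range(4):
        t = MyThread(i)
        t.start()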

Output

[root@Da scripts]# python35 thread_subclass.py 
i am Thread-1@0
i am Thread-2@1
i am Thread-3@2
i am Thread-4@3

Check the threads in the process

[root@Da ~]# ps -T -p 2711
  PID  SPID TTY          TIME CMD
 2711  2711 pts/2    00:00:00 python35
 2711  2712 pts/2    00:00:00 python35
 2711  2713 pts/2    00:00:00 python35
 2711  2714 pts/2    00:00:00 python35
 2711  2715 pts/2    00:00:00 python35

Execution order

The order in which threads run is not deterministic.

[root@Da scripts]# cat thread_subclass.py 
import threading
import time

class MyThread(threading.Thread):
    def run(self):
        for i in range(3):
            time.sleep(1)
            print('i am {}@{}'.format(self.name, str(i)))

def main():
    for i in range(5):
        t = MyThread()
        t.start()

if __name__ == '__main__':
    main()

Output

[root@Da scripts]# python35 thread_subclass.py 
i am Thread-1@0
i am Thread-2@0
i am Thread-3@0
i am Thread-4@0
i am Thread-5@0
i am Thread-2@1
i am Thread-3@1
i am Thread-4@1
i am Thread-5@1
i am Thread-1@1
i am Thread-2@2
i am Thread-3@2
i am Thread-4@2
i am Thread-5@2
i am Thread-1@2

Global variables

Threads in the same process share global variables.

[root@Da scripts]# cat multi_thread_2.py 
from threading import Thread
import time

g_num = 100
def work1():
    global g_num
    for i in range(3):
        g_num += 1
    print('Work1 Current Number: {}'.format(g_num))


def work2():
    global g_num
    print('Work2 Current Number: {}'.format(g_num))

t1 = Thread(target=work1)
t1.start()

time.sleep(1)
t2 = Thread(target=work2)
t2.start()

Output

[root@Da scripts]# python35 multi_thread_2.py 
Work1 Current Number: 103
Work2 Current Number: 103

However, be careful when several threads modify the same data: the result may not be what you expect.

[root@Da scripts]# cat multi_thread_3.py 
from threading import Thread
import time

g_num = 100
def work1():
    global g_num
    for i in range(100000):
        g_num += 1
    print('Work1 Current Number: {}'.format(g_num))


def work2():
    global g_num
    for i in range(100000):
        g_num += 1
    print('Work2 Current Number: {}'.format(g_num))

t1 = Thread(target=work1)
t1.start()

#time.sleep(1)
t2 = Thread(target=work2)
t2.start()

print('Final num: {}'.format(g_num))

Output

[root@Da scripts]# python35 multi_thread_3.py 
Final num: 93080
Work1 Current Number: 113833
Work2 Current Number: 155357

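Why is the final result wrong?

Two things happen. First, g_num += 1 is not atomic: the thread reads g_num, adds 1, and writes the result back, and if the other thread runs between the read and the write, one of the two updates is lost. Second, the main thread prints "Final num" without waiting for the workers, so it sees whatever value g_num has at that moment.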
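
One way to look at this: the sketch below uses the standard `dis` module to list the bytecode behind `g_num += 1`. The `increment` function is a name made up here, and the exact instruction names differ between Python versions, but the global is loaded and stored by separate instructions. The interpreter may switch threads between them (how often depends on the Python version), and that is when an update gets lost.

```python
import dis

g_num = 100

def increment():
    global g_num
    g_num += 1

# Collect the names of the bytecode instructions for the function.
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)

# The read and the write of g_num are separate instructions.
has_load = any(op.startswith('LOAD_GLOBAL') for op in ops)
has_store = any(op.startswith('STORE_GLOBAL') for op in ops)
print(has_load, has_store)  # True True
```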

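How to avoid this kind of problem

  • Use a flag: thread A changes the flag when it has finished its work, and the other threads check the flag in a loop and proceed only once the condition holds
  • Use a mutex (mutual-exclusion lock)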

Mutex

from threading import Thread, Lock
import time

g_num = 100
def work1():
    global g_num
    # Acquire the lock
    mutex.acquire()
    for i in range(100000):
        g_num += 1
    mutex.release()
    print('Work1 Current Number: {}'.format(g_num))


def work2():
    global g_num
    # Acquire the lock. If work1 already holds it, work2 blocks here until it is released, then acquires it and continues
    mutex.acquire()
    for i in range(100000):
        g_num += 1
    mutex.release()
    print('Work2 Current Number: {}'.format(g_num))

mutex = Lock()

t1 = Thread(target=work1)
t1.start()

t2 = Thread(target=work2)
t2.start()
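# Acquire the lock before printing g_num. If work1 or work2 currently holds it, this waits for that thread to release it.
# Note: this does not by itself guarantee that both workers have finished; calling t1.join() and t2.join() first would.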
mutex.acquire()
print('Final num: {}'.format(g_num))
mutex.release()

Output

[root@Da scripts]# python35 multi_thread_3.py 
Work1 Current Number: 100100
Work2 Current Number: 200100
Final num: 200100

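Here the lock wraps the entire for loop, so the program effectively runs serially instead of concurrently: work2 has to wait until work1 releases the lock.

Can the lock go inside the for loop instead?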

        mutex.acquire()
        g_num += 1
        mutex.release()

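Output. The intermediate values look off (the threads compete for the lock, so by the time one of them prints, the other may already have incremented g_num many more times, and the main thread prints without waiting for the workers), but the final total is correct.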

[root@Da scripts]# python35 multi_thread_3.py 
Final num: 34313
Work1 Current Number: 177511
Work2 Current Number: 200100

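Some advice on locking

  • Keep the granularity of a lock as small as possible
  • Avoid locking when you can; lock only around the places where critical data is modified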
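
A minimal sketch of that advice, assuming the same counter as above: the lock protects only the increment, via the `with` statement (which releases the lock even if the body raises), and the main thread calls `join()` before reading the total. The single `work(n)` function replaces work1 and work2 for brevity.

```python
from threading import Thread, Lock

g_num = 100
mutex = Lock()

def work(n):
    global g_num
    for _ in range(n):
        # Only the update itself is protected by the lock.
        with mutex:
            g_num += 1

threads = [Thread(target=work, args=(100000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for both workers before reading the total

print('Final num: {}'.format(g_num))  # Final num: 200100
```

Taking the lock once per iteration costs more than taking it once around the whole loop, so a small granularity is a trade-off between concurrency and locking overhead.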

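Additional notes

  1. While a thread is blocked on a lock, does it poll or is it notified? It is notified: the waiting thread sleeps until the lock is released, so it does not burn CPU while waiting.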
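
A rough way to observe this, as a sketch rather than a benchmark: a thread blocks on a lock for about half a second while the process's CPU time is measured with `time.process_time()`. The `waiter` function and the 0.5 s duration are arbitrary choices for illustration.

```python
import threading
import time

lock = threading.Lock()
lock.acquire()  # the main thread holds the lock first

def waiter():
    # Blocks here until the main thread releases the lock.
    with lock:
        pass

t = threading.Thread(target=waiter)
cpu_start = time.process_time()   # CPU time used by the whole process
wall_start = time.monotonic()
t.start()
time.sleep(0.5)                   # waiter stays blocked during this time
lock.release()
t.join()
cpu_used = time.process_time() - cpu_start
wall_used = time.monotonic() - wall_start

# The CPU time should be a small fraction of the elapsed time.
print('wall: {:.2f}s, cpu: {:.2f}s'.format(wall_used, cpu_used))
```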

Deadlock

Thread A and thread B each hold a lock and wait for the other to release the lock it holds.
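
A minimal sketch of this situation, with hypothetical names throughout (`lock_a`, `lock_b`, `worker`, `results`). The two threads take the same two locks in opposite order. The `Barrier` only makes the demo deterministic, and `acquire(timeout=...)` is used so that the script reports the deadlock and exits instead of hanging.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
barrier = threading.Barrier(2)  # demo helper: keeps the two threads in step
results = {}

def worker(name, first, second):
    with first:
        barrier.wait()  # both threads now hold their first lock
        # A plain second.acquire() would block forever here.
        got_second = second.acquire(timeout=0.5)
        results[name] = got_second
        if got_second:
            second.release()
        barrier.wait()  # hold the first lock until the other thread has given up too

# A takes lock_a then lock_b; B takes them in the opposite order.
ta = threading.Thread(target=worker, args=('A', lock_a, lock_b))
tb = threading.Thread(target=worker, args=('B', lock_b, lock_a))
ta.start()
tb.start()
ta.join()
tb.join()

print(sorted(results.items()))  # [('A', False), ('B', False)]
```

If both threads acquired the locks in the same order (say, always `lock_a` first), neither could end up holding one lock while waiting for the other.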

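How to avoid deadlock

  • Design the locking scheme so that deadlock cannot occur (for example, always acquire locks in the same order)
  • Banker's algorithm
  • Acquire locks with a timeout

Synchronization: running threads in order

Each worker waits on its own lock and, when it is done, releases the lock of the next worker. lock2 and lock3 start out locked, so the workers run in the order work1, work2, work3 and then repeat.
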
[root@Da scripts]# cat multi_thread_4.py
from threading import Thread, Lock
from time import sleep

def work1():
    while True:
        if lock1.acquire():
            print('---work1---')
            sleep(0.1)
            lock2.release()

def work2():
    while True:
        if lock2.acquire():
            print('---work2---')
            sleep(0.1)
            lock3.release()


def work3():
    while True:
        if lock3.acquire():
            print('---work3---')
            sleep(0.1)
            lock1.release()


if __name__ == '__main__':
    lock1 = Lock()
    lock2 = Lock()
    lock2.acquire()
    lock3 = Lock()
    lock3.acquire()
    t1 = Thread(target=work1)
    t2 = Thread(target=work2)
    t3 = Thread(target=work3)
    t1.start()
    t2.start()
    t3.start()
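
The listing above loops forever. As a sketch of the same lock hand-off in a form that terminates, the variant below runs a fixed number of rounds and records the order in a list; `ROUNDS`, `order` and the shared `worker` function are names introduced here for illustration.

```python
from threading import Thread, Lock

ROUNDS = 3   # hypothetical bound so the sketch terminates
order = []   # records which worker ran, in sequence

def worker(name, my_lock, next_lock):
    for _ in range(ROUNDS):
        my_lock.acquire()      # wait for my turn
        order.append(name)
        next_lock.release()    # hand the turn to the next worker

lock1, lock2, lock3 = Lock(), Lock(), Lock()
lock2.acquire()  # only lock1 starts unlocked, so work1 goes first
lock3.acquire()

threads = [
    Thread(target=worker, args=('work1', lock1, lock2)),
    Thread(target=worker, args=('work2', lock2, lock3)),
    Thread(target=worker, args=('work3', lock3, lock1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(order == ['work1', 'work2', 'work3'] * ROUNDS)  # True
```

This relies on the fact that a `threading.Lock` may be released by a thread other than the one that acquired it.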

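Sharing data between functions within a thread

threading.local() gives every thread its own independent set of attributes, so functions running in the same thread can share data without passing it around as arguments.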

[root@Da scripts]# cat multi_thread_6.py
import threading

def get_name():
    print('Name: ', local.name)

def set_name(name):
    local.name = name
    get_name()

local = threading.local()
t1 = threading.Thread(target=set_name, args=('Da', ))
t2 = threading.Thread(target=set_name, args=('Yo', ))
t1.start()
t2.start()
t1.join()
t2.join()

Output

[root@Da scripts]# python35 multi_thread_6.py
Name:  Da
Name:  Yo
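
To make the isolation explicit, here is a small sketch in which the main thread also sets `local.name`. The `check` function and the `seen` dict are made up for illustration. The new thread does not see the main thread's value, and the main thread's value is unchanged afterwards.

```python
import threading

local = threading.local()
local.name = 'main'   # set in the main thread only
seen = {}

def check(name):
    # A new thread starts with an empty threading.local(): the value
    # set by the main thread is not visible here.
    seen[name + '_before'] = hasattr(local, 'name')
    local.name = name
    seen[name + '_after'] = local.name

t = threading.Thread(target=check, args=('Da',))
t.start()
t.join()

print(seen)        # {'Da_before': False, 'Da_after': 'Da'}
print(local.name)  # main
```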
