Java thread pools demystified, in full detail: this is the only guide you need!

Original | 李忠义, Sohu Video team | 搜狐技术产品 | 2021-01-15

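Overview

Threads are used all the time in real-world development, but they are a scarce resource and cannot be created without limit: doing so consumes system resources and hurts stability, and creating and destroying threads is expensive. A thread pool exists to discipline thread usage: it creates a bounded number of threads and manages them.

A good way to study source code is to start from the entry point a caller uses, so that you are not intimidated by thousands of lines of code. The entry point of ThreadPoolExecutor is execute, so that is where the analysis starts.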

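Thread pool states

While reading the ThreadPoolExecutor source I picked up a useful technique: packing several meanings into a single number. ThreadPoolExecutor keeps one 32-bit int, ctl. The high 3 bits hold runState (3 bits can encode up to 8 values, of which ThreadPoolExecutor uses 5), and the low 29 bits hold workerCount. Packing and unpacking are done with bitwise OR (|) and bitwise AND (&). The same technique shows up frequently in the Android source.

ThreadPoolExecutor has five states:

RUNNING: accepts new tasks and processes queued tasks
SHUTDOWN: does not accept new tasks, but still processes queued tasks
STOP: does not accept new tasks, does not process queued tasks, and interrupts tasks that are in progress
TIDYING: all threads have exited and the queue is empty; the pool enters this state and runs the terminated() hook
TERMINATED: entered after terminated() has completed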
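The packing of runState and workerCount into one int can be sketched as follows. This is a minimal illustration, not the library's code: the class name CtlSketch is hypothetical, and the constants follow the 3-bit/29-bit layout described above as I understand the JDK 8 source.

```java
// Sketch of packing a run state and a worker count into one int.
// CtlSketch is a hypothetical name; the layout is 3 high bits + 29 low bits.
final class CtlSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;     // 29 low bits for workerCount
    static final int CAPACITY = (1 << COUNT_BITS) - 1;  // mask selecting the low 29 bits

    // Run states live in the high 3 bits and are ordered numerically.
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    // Pack: bitwise OR of the two fields.
    static int ctlOf(int runState, int workerCount) { return runState | workerCount; }

    // Unpack: bitwise AND with the mask (or its complement).
    static int runStateOf(int ctl)    { return ctl & ~CAPACITY; }
    static int workerCountOf(int ctl) { return ctl & CAPACITY; }

    public static void main(String[] args) {
        int ctl = ctlOf(SHUTDOWN, 5);
        System.out.println(runStateOf(ctl) == SHUTDOWN); // true
        System.out.println(workerCountOf(ctl));          // 5
    }
}
```

Because the states are ordered numerically, comparisons such as rs >= SHUTDOWN in the listings below are plain integer comparisons.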

The execute method

    /**
     * Executes the given task sometime in the future.  The task
     * may execute in a new thread or in an existing pooled thread.
     *
     * If the task cannot be submitted for execution, either because this
     * executor has been shutdown or because its capacity has been reached,
     * the task is handled by the current {@code RejectedExecutionHandler}.
     *
     * @param command the task to execute
     * @throws RejectedExecutionException at discretion of
     *         {@code RejectedExecutionHandler}, if the task
     *         cannot be accepted for execution
     * @throws NullPointerException if {@code command} is null
     */
    public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        /*
         * Proceed in 3 steps:
         *
         * 1. If fewer than corePoolSize threads are running, try to
         * start a new thread with the given command as its first
         * task.  The call to addWorker atomically checks runState and
         * workerCount, and so prevents false alarms that would add
         * threads when it shouldn't, by returning false.
         *
         * 2. If a task can be successfully queued, then we still need
         * to double-check whether we should have added a thread
         * (because existing ones died since last checking) or that
         * the pool shut down since entry into this method. So we
         * recheck state and if necessary roll back the enqueuing if
         * stopped, or start a new thread if there are none.
         *
         * 3. If we cannot queue task, then we try to add a new
         * thread.  If it fails, we know we are shut down or saturated
         * and so reject the task.
         */
        int c = ctl.get();
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        // Tasks can be added to the queue only in the RUNNING state
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                reject(command);
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        else if (!addWorker(command, false))
            reject(command);
    }

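The execute method is short but important; it answers many questions that come up when using ThreadPoolExecutor. When are threads created? How many are created?

After a ThreadPoolExecutor is constructed, runState is RUNNING and workerCount (think of it as the thread count for now) is 0. execute first checks whether workerCount is less than corePoolSize; if so, it calls addWorker to create a new thread that runs the task. So as long as the number of threads in the pool has not reached the core size, submitting a task creates a new thread. Otherwise, if the pool is RUNNING, the task is offered to workQueue. Common queue choices are SynchronousQueue (direct handoff), ArrayBlockingQueue (bounded), LinkedBlockingQueue (unbounded by default) and PriorityBlockingQueue (priority-ordered):

Direct handoff queue (SynchronousQueue): offer returns false unless an idle worker is already waiting to take a task, so execute typically calls addWorker to create a thread for the task. Once the maximum pool size is reached (the second argument to addWorker is false, so maximumPoolSize is the bound; details below), the rejection policy runs.

Bounded queue (ArrayBlockingQueue): offer puts the task in the queue; when the queue is full it returns false, and execute calls addWorker to create a thread for the task. Once the maximum pool size is reached, the rejection policy runs. Note that when the core size is 0 and no worker exists yet, execute queues the task and then calls addWorker to start a thread whose first task is null.

Unbounded queue (LinkedBlockingQueue with its default capacity of Integer.MAX_VALUE): offer keeps succeeding because the queue practically never fills up. So if corePoolSize is not 0, at most corePoolSize threads are started; if corePoolSize is 0, only one thread is started.

Priority queue (PriorityBlockingQueue): behaves like the unbounded queue, except that tasks are taken in priority order.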
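The effect of the queue choice can be observed from outside the pool. The sketch below is an illustration under stated assumptions: QueueChoiceDemo and probe are hypothetical names, the pool is fixed at 2 core and 5 maximum threads, and every task blocks on a latch so that no worker becomes idle while tasks are being submitted.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueueChoiceDemo {
    /** Submits blocking tasks to a 2-core/5-max pool; returns {poolSize, queued, rejected}. */
    static int[] probe(BlockingQueue<Runnable> queue, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 5, 60L, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();              // keep the worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {  // default AbortPolicy
                rejected++;
            }
        }
        int[] result = { pool.getPoolSize(), pool.getQueue().size(), rejected };
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probe(new SynchronousQueue<>(), 16)));      // [5, 0, 11]
        System.out.println(Arrays.toString(probe(new ArrayBlockingQueue<>(10), 16)));  // [5, 10, 1]
        System.out.println(Arrays.toString(probe(new LinkedBlockingQueue<>(), 16)));   // [2, 14, 0]
    }
}
```

With the direct handoff queue only 5 of the 16 tasks are accepted, with the bounded queue 15 are, and with the unbounded queue all 16 are accepted but only the 2 core threads ever exist.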

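The addWorker method

Having analyzed execute, we now have a reasonable first understanding of how ThreadPoolExecutor is used. Above, we said that addWorker creates a thread to run a task; let's look at what addWorker actually does.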

    /**
     * Checks if a new worker can be added with respect to current
     * pool state and the given bound (either core or maximum). If so,
     * the worker count is adjusted accordingly, and, if possible, a
     * new worker is created and started, running firstTask as its
     * first task. This method returns false if the pool is stopped or
     * eligible to shut down. It also returns false if the thread
     * factory fails to create a thread when asked.  If the thread
     * creation fails, either due to the thread factory returning
     * null, or due to an exception (typically OutOfMemoryError in
     * Thread.start()), we roll back cleanly.
     *
     * @param firstTask the task the new thread should run first (or
     * null if none). Workers are created with an initial first task
     * (in method execute()) to bypass queuing when there are fewer
     * than corePoolSize threads (in which case we always start one),
     * or when the queue is full (in which case we must bypass queue).
     * Initially idle threads are usually created via
     * prestartCoreThread or to replace other dying workers.
     *
     * @param core if true use corePoolSize as bound, else
     * maximumPoolSize. (A boolean indicator is used here rather than a
     * value to ensure reads of fresh values after checking other pool
     * state).
     * @return true if successful
     */
    private boolean addWorker(Runnable firstTask, boolean core) {
        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            w = new Worker(firstTask);
            final Thread t = w.thread;
            if (t != null) {
                final ReentrantLock mainLock = this.mainLock;
                mainLock.lock();
                try {
                    // Recheck while holding lock.
                    // Back out on ThreadFactory failure or if
                    // shut down before lock acquired.
                    int rs = runStateOf(ctl.get());

                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        workers.add(w);
                        int s = workers.size();
                        if (s > largestPoolSize)
                            largestPoolSize = s;
                        workerAdded = true;
                    }
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }

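addWorker opens with a nested loop, which looks intimidating, but it can be understood with a careful read.

The outer loop checks the pool state. If the state is at least SHUTDOWN (that is, SHUTDOWN, STOP, TIDYING, or TERMINATED) and it is not the case that the state is SHUTDOWN, firstTask is null, and the queue is non-empty, then no new thread may be created and the method returns false. Put another way: a new thread is refused when the state is greater than SHUTDOWN, or when the state is SHUTDOWN and either firstTask is non-null or the queue is empty. Conversely, when the state is SHUTDOWN, firstTask is null, and the queue is non-empty, the pool still has work to do, which matches the comment on SHUTDOWN: Don't accept new tasks, but process queued tasks. The condition is convoluted, so it is worth working through slowly.

The inner loop checks the thread count. The core flag passed in selects the bound: true means corePoolSize, false means maximumPoolSize. If the count has reached the bound, the method returns false. Otherwise it updates ctl with a CAS (incrementing the stored value by 1, which increments workerCount). If the CAS succeeds, it breaks out of the outer loop: the state and the count have both been checked and a new thread may be created. If the CAS fails (another thread changed ctl concurrently), it re-reads ctl to see which part changed. If the state changed, the state check (the outer loop) runs again; if the state is unchanged, only workerCount changed, so only the count check (the inner loop) is repeated.

Why the complexity of two nested loops? This is optimistic concurrency using CAS, which is usually cheaper than pessimistic locking with synchronized when contention is low. That topic is left for the reader to explore.

To summarize this part: before a thread is created, the pool state must be valid, the thread count must be within the bound, and the thread count must have been incremented by 1.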
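The read, check, CAS, retry pattern of the inner loop can be shown in isolation. This is a minimal sketch on a plain AtomicInteger rather than on the packed ctl value; BoundedCounter and tryIncrement are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedCounter {
    private final AtomicInteger count = new AtomicInteger();

    /** Increments the counter unless it has reached bound; never blocks. */
    boolean tryIncrement(int bound) {
        for (;;) {
            int current = count.get();                 // read a snapshot
            if (current >= bound)
                return false;                          // bound reached: refuse
            if (count.compareAndSet(current, current + 1))
                return true;                           // nobody changed it in between
            // CAS failed: another thread changed the value; re-read and retry
        }
    }

    int get() { return count.get(); }

    public static void main(String[] args) {
        BoundedCounter workers = new BoundedCounter();
        System.out.println(workers.tryIncrement(2));   // true
        System.out.println(workers.tryIncrement(2));   // true
        System.out.println(workers.tryIncrement(2));   // false
    }
}
```

No thread ever waits for a lock here; a thread that loses the race simply repeats the check against the fresh value, which is what makes the scheme optimistic.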

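The thread-creation part is easier to follow. ThreadPoolExecutor does not hold references to Thread objects directly; it holds references to Worker objects. A Worker (which implements Runnable) wraps a Thread, so creating a Worker can be thought of as creating a Thread. The code instantiates a Worker w; if the thread it wraps is not null, w is added to workers under a lock, the thread is started, and the method returns true. The lock here is mainLock, a ReentrantLock, which is itself built on AQS.

This code teaches several things:

Do not hold a lock around code that does not need it; this shortens the time threads hold the lock. The global lock mainLock is also needed when shutting the pool down (shutdown/shutdownNow), so rechecking the pool state and adding the Worker to workers must happen under the lock, while starting the thread does not.
try { } finally { } lets you put code that must always run in the finally block.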

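The Worker class

With addWorker covered, the next thing to analyze is Worker, a private final inner class of ThreadPoolExecutor.

Worker implements Runnable. When the thread starts, it calls Worker's run method, which in turn calls ThreadPoolExecutor's runWorker method.

Worker extends AbstractQueuedSynchronizer, so a Worker also acts as a lock. Compared with synchronized, locks built on AQS are powerful and highly customizable; in particular, AQS lets us implement a non-reentrant lock. synchronized is reentrant: put simply, a thread that holds a lock and has not released it can acquire the same lock again. ReentrantLock is a reentrant lock, while Worker is a non-reentrant one. The implementation is simple: override tryAcquire, and for a thread that already holds the lock return true if re-entry should be allowed, or false if it should not.

A Worker is essentially a Runnable wrapped around the real task (the Runnable passed to execute) that also serves as a lock. The non-reentrant lock controls whether the thread may be interrupted; it is not there to prevent tasks from running concurrently.

To elaborate, the requirement is this: a thread must not be interrupted as an idle worker while it is running a real task, but it may be interrupted while it is blocked waiting for a task (shutdownNow is the exception: it interrupts every started worker). How could this be implemented? There are many options; all that is needed is a way to tell, before interrupting, whether the thread is currently running a real task, and a flag would do. ThreadPoolExecutor uses a non-reentrant lock instead: the thread locks before running a real task and unlocks afterwards. Code that wants to interrupt the thread first tries to acquire the lock; if it fails, the thread is running a real task and must not be interrupted. Rather neat, isn't it?

This also answers a question: why wrap the real Runnable in a Worker instead of handing it to the thread directly? The comment on the Worker class gives the answer: while a thread runs tasks, the non-reentrant Worker lock maintains its interrupt-control state. For example, during SHUTDOWN a thread that has not yet started running its Worker, or that is running a real task (which takes the lock; see runWorker), must not be interrupted; only threads blocked waiting for a task may be. I think there is one more benefit: the wrapping Runnable gives us a place to do extra work, such as reading tasks from the queue. Otherwise the thread would exit as soon as the real Runnable finished, it could not be reused, and the pool would be pointless. It is worth reading this alongside Android's Looper mechanism; the two are much alike.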
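The non-reentrant lock described above can be sketched on top of AQS in a few lines. This is an illustration modeled on the description of Worker, not a copy of it: NonReentrantLock and secondAcquireSucceeds are hypothetical names, and state 0 means unlocked while 1 means locked.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

/** A minimal non-reentrant lock on AQS; state 0 = unlocked, 1 = locked. */
final class NonReentrantLock extends AbstractQueuedSynchronizer {
    @Override
    protected boolean tryAcquire(int unused) {
        // Succeeds only when the lock is free, even if the caller already holds it.
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    @Override
    protected boolean isHeldExclusively() { return getState() != 0; }

    void lock()       { acquire(1); }
    boolean tryLock() { return tryAcquire(1); }
    void unlock()     { release(1); }

    /** Returns whether a thread that already holds the lock can take it a second time. */
    static boolean secondAcquireSucceeds(boolean reentrant) {
        if (reentrant) {
            ReentrantLock lock = new ReentrantLock();
            lock.lock();
            boolean again = lock.tryLock();   // re-entry is allowed
            if (again) lock.unlock();
            lock.unlock();
            return again;
        }
        NonReentrantLock lock = new NonReentrantLock();
        lock.lock();
        boolean again = lock.tryLock();       // state is already 1, so this fails
        lock.unlock();
        return again;
    }

    public static void main(String[] args) {
        System.out.println(secondAcquireSucceeds(true));   // true
        System.out.println(secondAcquireSucceeds(false));  // false
    }
}
```

A failed tryLock therefore reliably means "this worker is busy", regardless of which thread asks.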

The runWorker method

Next is a very important method in ThreadPoolExecutor, runWorker, which is called once the thread has started.

    /**
     * Main worker run loop.  Repeatedly gets tasks from queue and
     * executes them, while coping with a number of issues:
     *
     * 1. We may start out with an initial task, in which case we
     * don't need to get the first one. Otherwise, as long as pool is
     * running, we get tasks from getTask. If it returns null then the
     * worker exits due to changed pool state or configuration
     * parameters.  Other exits result from exception throws in
     * external code, in which case completedAbruptly holds, which
     * usually leads processWorkerExit to replace this thread.
     *
     * 2. Before running any task, the lock is acquired to prevent
     * other pool interrupts while the task is executing, and then we
     * ensure that unless pool is stopping, this thread does not have
     * its interrupt set.
     *
     * 3. Each task run is preceded by a call to beforeExecute, which
     * might throw an exception, in which case we cause thread to die
     * (breaking loop with completedAbruptly true) without processing
     * the task.
     *
     * Notes on exception and error handling:
     * 4. Assuming beforeExecute completes normally, we run the task,
     * gathering any of its thrown exceptions to send to afterExecute.
     * We separately handle RuntimeException, Error (both of which the
     * specs guarantee that we trap) and arbitrary Throwables.
     * Because we cannot rethrow Throwables within Runnable.run, we
     * wrap them within Errors on the way out (to the thread's
     * UncaughtExceptionHandler).  Any thrown exception also
     * conservatively causes thread to die.
     *
     * 5. After task.run completes, we call afterExecute, which may
     * also throw an exception, which will also cause thread to
     * die. According to JLS Sec 14.20, this exception is the one that
     * will be in effect even if task.run throws.
     *
     * The net effect of the exception mechanics is that afterExecute
     * and the thread's UncaughtExceptionHandler have as accurate
     * information as we can provide about any problems encountered by
     * user code.
     *
     * @param w the worker
     */
    final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
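        /**
         This unlock call is not really releasing a lock; it sets the AQS state to 0 so that the thread may be interrupted.
         Worker extends AQS and its constructor sets state to -1. See Worker.interruptIfStarted: a thread whose state is -1 is not interrupted.
         So if shutdownNow is called (the pool enters STOP and calls interruptIfStarted on every Worker) after the thread has started but before it reaches runWorker, this thread is not interrupted.
         shutdown moves the pool to SHUTDOWN and interrupts idle threads. An idle thread (one waiting in getTask) is one whose Worker lock can be acquired, and the lock cannot be acquired while state is -1.

         In short: until the thread reaches this point, neither shutdown nor shutdownNow can interrupt it.
        */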
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
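            // Loop, repeatedly taking tasks from the queue and running them
            while (task != null || (task = getTask()) != null) {
                w.lock(); // locked not for mutual exclusion, but to control whether the thread may be interrupted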
                // If pool is stopping, ensure thread is interrupted;
                // if not, ensure thread is not interrupted.  This
                // requires a recheck in second case to deal with
                // shutdownNow race while clearing interrupt
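                // That is: before the real task runs, the thread must be interrupted if the pool is at least STOP, and must not be interrupted if the pool is below STOP (RUNNING or SHUTDOWN)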
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            processWorkerExit(w, completedAbruptly);
        }
    }
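The exception path through runWorker can be observed through the afterExecute hook. The sketch below is an illustration under stated assumptions: HookDemo and messageSeenByAfterExecute are hypothetical names, and the thread factory installs a no-op uncaught-exception handler only to keep the output quiet when the worker dies.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class HookDemo {
    /** Runs one task that throws and returns the message afterExecute saw (or null). */
    static String messageSeenByAfterExecute() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(),
                r -> {
                    Thread t = new Thread(r, "hook-demo");
                    // Swallow the exception that propagates out of runWorker.
                    t.setUncaughtExceptionHandler((thread, e) -> { });
                    return t;
                }) {
            @Override
            protected void afterExecute(Runnable r, Throwable t) {
                super.afterExecute(r, t);
                seen.set(t);          // the Throwable captured as "thrown" in runWorker
                done.countDown();
            }
        };
        try {
            pool.execute(() -> { throw new IllegalStateException("boom"); });
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        Throwable t = seen.get();
        return t == null ? null : t.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(messageSeenByAfterExecute()); // boom
    }
}
```

This applies to tasks passed to execute. A task passed to submit is wrapped in a Future that captures the exception itself, so afterExecute receives null for the Throwable in that case.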

The getTask method

An important method called from runWorker is getTask, which takes a task from the queue. The key points are annotated in the code.

    /**
     * Performs blocking or timed wait for a task, depending on
     * current configuration settings, or returns null if this worker
     * must exit because of any of:
     * 1. There are more than maximumPoolSize workers (due to
     *    a call to setMaximumPoolSize).
     * 2. The pool is stopped.
     * 3. The pool is shutdown and the queue is empty.
     * 4. This worker timed out waiting for a task, and timed-out
     *    workers are subject to termination (that is,
     *    {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
     *    both before and after the timed wait, and if the queue is
     *    non-empty, this worker is not the last thread in the pool.
     *
     * @return task, or null if the worker must exit, in which case
     *         workerCount is decremented
     */
    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?

        // This loop serves both the business logic and CAS retries
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            /**
             * State check
             *
             * If the pool is at least STOP, or is SHUTDOWN with an empty queue, decrement the worker count and return null.
             * In the caller, runWorker, a null return makes the thread exit.
             * So in those states the thread is bound to exit.
             */
            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            int wc = workerCountOf(c);

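            // timed: whether this worker should wait with a timeout
            // true if core threads may time out (allowCoreThreadTimeOut, false by default) or the current thread count exceeds corePoolSize
            // Are workers subject to culling?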
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

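            /**
             * This part is harder to follow and should be read together with the code below. Its purpose is to control the number of live threads.
             * Reaching this point means the pool state check passed.
             * wc > maximumPoolSize can happen because setMaximumPoolSize may have been called while this method was running.
             * timed == true means this thread waits with a timeout; timedOut == true means its previous poll on the queue timed out. If wc > maximumPoolSize, or timed and timedOut are both true, and in addition (the thread count is greater than 1 or the queue is empty), try to decrement the thread count: on CAS failure loop and retry, on success return null.
             *
             * What does that mean in practice? Step by step:
             * Reaching this point means the pool is RUNNING, or SHUTDOWN with a non-empty queue.
             * If timed is false, the thread count is within corePoolSize and no timeout applies: block until the queue hands over a task.
             * If timed is true, the thread counts as surplus, so check whether its previous wait timed out. If it did and the thread count is greater than 1 (or the queue is empty), decrement the count and exit. If it did not, skip this branch and do a timed wait below.
             * Net effect: with a non-zero core size, all surplus threads exit. With a core size of 0, one thread is kept to drain the queue, and it too exits once the queue is empty.
             */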
            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                if (compareAndDecrementWorkerCount(c))
                    return null;
                continue;
            }

            try {
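                // If timed is true (core threads may time out, or the thread count exceeds corePoolSize), poll with a timeout
                // Otherwise, block in take until a task is available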
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
                timedOut = true;
            } catch (InterruptedException retry) {
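                // The thread was interrupted while waiting for a task: reset timedOut to false and loop again
                // This also shows that interrupting a thread does not mean it must exit; that depends on the handling logic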
                timedOut = false;
            }
        }
    }

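The logic inside getTask is somewhat intricate, but its structure is clear: check the pool state first, then decide from the timeout policy and the result of the previous wait whether this thread should go. If all checks pass, take a task from the queue; if not, decrement the thread count with a CAS and return null.

The processWorkerExit method

In runWorker, processWorkerExit runs when no task can be obtained or when a task throws.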

    /**
     * Performs cleanup and bookkeeping for a dying worker. Called
     * only from worker threads. Unless completedAbruptly is set,
     * assumes that workerCount has already been adjusted to account
     * for exit.  This method removes thread from worker set, and
     * possibly terminates the pool or replaces the worker if either
     * it exited due to user task exception or if fewer than
     * corePoolSize workers are running or queue is non-empty but
     * there are no workers.
     * The comment above already explains what this method does.
     * @param w the worker
     * @param completedAbruptly if the worker died due to user exception
     */
    private void processWorkerExit(Worker w, boolean completedAbruptly) {
        /**
         * completedAbruptly == true: a task threw, so the thread count must be decremented here.
         * completedAbruptly == false: getTask returned null and has already decremented the thread count.
         */
        if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
            decrementWorkerCount();

        final ReentrantLock mainLock = this.mainLock;
        // Modifying workers requires the global lock
        mainLock.lock();
        try {
            completedTaskCount += w.completedTasks;
            // Remove the worker from the workers set
            workers.remove(w);
        } finally {
            mainLock.unlock();
        }
        // Try to terminate the pool
        tryTerminate();

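        /**
         * The code below is harder to follow.
         */
        int c = ctl.get();
        // Pool state is RUNNING or SHUTDOWN
        if (runStateLessThan(c, STOP)) {
            // The thread exited normally
            if (!completedAbruptly) {
                // Minimum threads to keep: 0 if allowCoreThreadTimeOut is true, otherwise corePoolSize (which may itself be 0)
                int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
                // A minimum of 0 is wrong if the queue is non-empty: in RUNNING or SHUTDOWN at least one thread must remain to run the queued tasks
                if (min == 0 && ! workQueue.isEmpty())
                    min = 1;
                // If the current thread count is at least min, do nothing; otherwise start a replacement thread
                if (workerCountOf(c) >= min)
                    return; // replacement not needed
            }
            // (1) The thread exited abnormally: start a replacement
            // (2) The thread count is below the minimum to keep: start a replacement
            /**
             * But this raises a question:
             * if the pool is SHUTDOWN, corePoolSize is 3, workerCountOf(c) is 2 and workQueue is empty, should a new thread really be started?
             * Certainly not; a pool in SHUTDOWN will eventually retire all its threads.
             * addWorker handles this case: it returns false right away (see the addWorker source).
             */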
            addWorker(null, false);
        }
    }
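The replacement of a worker that died from a task exception can be checked by counting how often the thread factory is called. The sketch below is an illustration under stated assumptions: ReplacementDemo and threadsCreated are hypothetical names, and the pool is limited to a single thread so that the count is unambiguous.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ReplacementDemo {
    /** Returns how many threads a 1-thread pool created to run a failing task and then a normal one. */
    static int threadsCreated() {
        AtomicInteger created = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(),
                r -> {
                    created.incrementAndGet();                          // one call per new worker
                    Thread t = new Thread(r);
                    t.setUncaughtExceptionHandler((thread, e) -> { });  // keep stderr quiet
                    return t;
                });
        CountDownLatch secondRan = new CountDownLatch(1);
        try {
            // The first task kills its worker (completedAbruptly is true).
            pool.execute(() -> { throw new IllegalStateException("task failed"); });
            // The second task still runs, on a second thread.
            pool.execute(secondRan::countDown);
            secondRan.await(10, TimeUnit.SECONDS);
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println(threadsCreated()); // 2
    }
}
```

Whether the second thread is created by processWorkerExit or by the second execute call depends on timing, but with maximumPoolSize 1 exactly two threads are created in total.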

The tryTerminate method

    /**
     * Transitions to TERMINATED state if either (SHUTDOWN and pool
     * and queue empty) or (STOP and pool empty).  If otherwise
     * eligible to terminate but workerCount is nonzero, interrupts an
     * idle worker to ensure that shutdown signals propagate. This
     * method must be called following any action that might make
     * termination possible -- reducing worker count or removing tasks
     * from the queue during shutdown. The method is non-private to
     * allow access from ScheduledThreadPoolExecutor.
     */
    final void tryTerminate() {
        for (;;) {
            int c = ctl.get();
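            /**
             * RUNNING: must not terminate.
             * TIDYING or TERMINATED: nothing to do, termination is already under way.
             * SHUTDOWN with a non-empty queue: must not terminate.
             */
            if (isRunning(c) ||
                runStateAtLeast(c, TIDYING) ||
                (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
                return;
            /**
             * STOP, or SHUTDOWN with an empty queue: eligible to terminate.
             * But workers still exist, so termination cannot complete yet.
             * Interrupt one idle worker so that the shutdown signal propagates: it wakes up in getTask, gets null, and calls tryTerminate again from processWorkerExit (see the interruptIdleWorkers Javadoc below).
             */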
            if (workerCountOf(c) != 0) { // Eligible to terminate
                interruptIdleWorkers(ONLY_ONE);
                return;
            }

            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                // This part is straightforward: advance the state to TIDYING
                if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                    try {
                        terminated();
                    } finally {
                        // Advance the state to TERMINATED
                        ctl.set(ctlOf(TERMINATED, 0));
                        termination.signalAll();
                    }
                    return;
                }
            } finally {
                mainLock.unlock();
            }
            // else retry on failed CAS
        }
    }
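The TIDYING to TERMINATED transition can be observed by overriding the terminated() hook. This is a minimal sketch; TerminationDemo and shutDownAndObserve are hypothetical names.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class TerminationDemo {
    /** Returns {terminated() hook ran, isTerminated()} after an orderly shutdown. */
    static boolean[] shutDownAndObserve() {
        AtomicBoolean hookRan = new AtomicBoolean();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>()) {
            @Override
            protected void terminated() {
                // Called by tryTerminate after the state became TIDYING.
                hookRan.set(true);
                super.terminated();
            }
        };
        pool.execute(() -> { });
        pool.shutdown();
        try {
            // Returns once tryTerminate has set TERMINATED and signalled waiters.
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { hookRan.get(), pool.isTerminated() };
    }

    public static void main(String[] args) {
        boolean[] r = shutDownAndObserve();
        System.out.println(r[0] + " " + r[1]); // true true
    }
}
```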

shutdown

There are two commonly used methods for shutting a pool down, shutdown and shutdownNow. shutdown moves the pool to the SHUTDOWN state; shutdownNow moves it to the STOP state.

    /**
     * Initiates an orderly shutdown in which previously submitted
     * tasks are executed, but no new tasks will be accepted.
     * Invocation has no additional effect if already shut down.
     *
     * <p>This method does not wait for previously submitted tasks to
     * complete execution.  Use {@link #awaitTermination awaitTermination}
     * to do that.
     */
    // android-note: Removed @throws SecurityException
    public void shutdown() {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            checkShutdownAccess();
            // Advance the state to SHUTDOWN with a CAS loop
            advanceRunState(SHUTDOWN);
            // Interrupt idle threads
            interruptIdleWorkers();
            onShutdown(); // hook for ScheduledThreadPoolExecutor
        } finally {
            mainLock.unlock();
        }
        tryTerminate();
    }


    /**
     * Common form of interruptIdleWorkers, to avoid having to
     * remember what the boolean argument means.
     */
    private void interruptIdleWorkers() {
        interruptIdleWorkers(false);
    }

    /**
     * Interrupts threads that might be waiting for tasks (as
     * indicated by not being locked) so they can check for
     * termination or configuration changes. Ignores
     * SecurityExceptions (in which case some threads may remain
     * uninterrupted).
     *
     * @param onlyOne If true, interrupt at most one worker. This is
     * called only from tryTerminate when termination is otherwise
     * enabled but there are still other workers.  In this case, at
     * most one waiting worker is interrupted to propagate shutdown
     * signals in case all threads are currently waiting.
     * Interrupting any arbitrary thread ensures that newly arriving
     * workers since shutdown began will also eventually exit.
     * To guarantee eventual termination, it suffices to always
     * interrupt only one idle worker, but shutdown() interrupts all
     * idle workers so that redundant workers exit promptly, not
     * waiting for a straggler task to finish.
     */
    private void interruptIdleWorkers(boolean onlyOne) {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            for (Worker w : workers) {
                Thread t = w.thread;
                if (!t.isInterrupted() && w.tryLock()) {
                    try {
                        t.interrupt();
                    } catch (SecurityException ignore) {
                    } finally {
                        w.unlock();
                    }
                }
                if (onlyOne)
                    break;
            }
        } finally {
            mainLock.unlock();
        }
    }

shutdownNow

    /**
     * Attempts to stop all actively executing tasks, halts the
     * processing of waiting tasks, and returns a list of the tasks
     * that were awaiting execution. These tasks are drained (removed)
     * from the task queue upon return from this method.
     *
     * <p>This method does not wait for actively executing tasks to
     * terminate.  Use {@link #awaitTermination awaitTermination} to
     * do that.
     *
     * <p>There are no guarantees beyond best-effort attempts to stop
     * processing actively executing tasks.  This implementation
     * interrupts tasks via {@link Thread#interrupt}; any task that
     * fails to respond to interrupts may never terminate.
     */
    // android-note: Removed @throws SecurityException
    public List<Runnable> shutdownNow() {
        List<Runnable> tasks;
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            checkShutdownAccess();
            advanceRunState(STOP);
            // Interrupt all threads
            interruptWorkers();
            // Remove and return the tasks remaining in the queue
            tasks = drainQueue();
        } finally {
            mainLock.unlock();
        }
        tryTerminate();
        return tasks;
    }

    /**
     * Interrupts all threads, even if active. Ignores SecurityExceptions
     * (in which case some threads may remain uninterrupted).
     */
    private void interruptWorkers() {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            for (Worker w : workers)
                w.interruptIfStarted();
        } finally {
            mainLock.unlock();
        }
    }
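What shutdownNow does to a running task and to queued tasks can be seen in a small experiment. The sketch below is an illustration under stated assumptions: ShutdownNowDemo and run are hypothetical names, and Thread.sleep stands in for a long-running task that responds to interruption.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class ShutdownNowDemo {
    /** Returns {queued tasks handed back, 1 if the running task was interrupted else 0}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean();
        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);          // stands in for a long-running task
            } catch (InterruptedException e) {
                interrupted.set(true);         // shutdownNow interrupts active workers too
            }
        });
        for (int i = 0; i < 3; i++)
            pool.execute(() -> { });           // these wait in the queue behind the first task
        int drained = -1;
        try {
            started.await(10, TimeUnit.SECONDS);
            List<Runnable> neverRun = pool.shutdownNow();  // pool state becomes STOP
            drained = neverRun.size();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { drained, interrupted.get() ? 1 : 0 };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // 3 1
    }
}
```

Had shutdown been called instead, the sleeping task would not have been interrupted and the three queued tasks would have run after it.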

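Q&A with myself

Why does every task execution need to take a lock?

While the pool is RUNNING, each thread fetches tasks from the queue through getTask and blocks when none is available. Calling shutdown moves the pool to SHUTDOWN: it stops accepting new tasks, finishes the queued ones, and of course lets tasks that are already running continue. Threads that were already blocked, or were inside getTask, before shutdown was called (idle threads) must be interrupted on entry to SHUTDOWN, or they might never exit; that is why shutdown interrupts them. Threads that are running a task, however, must not be interrupted. To tell idle threads from busy ones, ThreadPoolExecutor has each thread take a lock when it starts running a task. Before interrupting a thread, shutdown tries to acquire that lock: if it cannot, the thread holds the lock and is running a task, so it must not be interrupted; if it can, the thread is idle and may be interrupted.

Since the lock is tried for each thread before interrupting it, each thread must have its own lock. If they all shared one, then until the thread holding it finished its task, every other thread would block on the lock and could not run anything. The lock must also be non-reentrant. With a reentrant lock, a task that calls a pool control method such as setCorePoolSize from its own worker thread could reacquire the lock it already holds, and "can the lock be acquired?" would no longer distinguish idle threads from busy ones. In other words, every thread needs its own non-reentrant lock, and making Worker itself that lock fits perfectly. That is why Worker extends AbstractQueuedSynchronizer.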

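Worked examples against the source

Example 1

Create a pool with corePoolSize 2, maximumPoolSize 5, and a bounded ArrayBlockingQueue of capacity 10.

Note that the pool only specifies how many core threads there are; it is just a number and does not designate particular threads as core. In other words, in the RUNNING state all that matters in the end is that corePoolSize threads stay alive, and they can be any of the threads.

We submit 16 long-running tasks in a row. Following the execute source, addWorker(command, true) first starts two threads to run tasks, workQueue.offer(command) then puts ten tasks in the queue, and once the queue is full addWorker(command, false) starts three more threads. When the 16th task is submitted, addWorker(command, false) finds wc at or above maximumPoolSize, returns false, and the rejection policy runs.

Suppose shutdown is called while all five threads are running tasks. The pool enters SHUTDOWN; no thread is idle, so none is interrupted, and each keeps running its current task. If new tasks are submitted now, execute finds that the pool is no longer RUNNING and calls addWorker(command, false); there rs equals SHUTDOWN and firstTask is non-null, so addWorker returns false and execute runs the rejection policy. When the five threads finish their current tasks they take more from the queue. Eventually some thread finds no task: from the getTask analysis, the state is SHUTDOWN and the queue is empty, so the thread count is decremented and null is returned. From the runWorker source, a null from getTask leads to processWorkerExit(w, completedAbruptly), which removes the current Worker w and tries to terminate the pool, and the thread exits. Eventually all threads exit. This confirms that in SHUTDOWN no new tasks can be added but queued tasks are still processed. The last thread to exit succeeds in terminating the pool: the state becomes TIDYING and finally TERMINATED.

Now suppose three threads are running tasks and two are waiting for a task; from the getTask analysis we know this is a timed wait. If shutdown is called now, the two waiting threads are interrupted, loop around in getTask, see SHUTDOWN with an empty queue, and get null, so they exit. The other three exit as well once they finish.

What happens if the pool is left running and all tasks complete? All five threads do a timed wait with poll. After the timeout, getTask goes around its loop once more, and this is the part to watch closely. Three of the threads satisfy timed == true, timedOut == true, and wc > 1; the thread count is decremented, getTask returns null, and they exit. For the other two, timed is now false, so they wait with take. So only corePoolSize threads stay alive and the rest exit. This also explains why getTask does not return null immediately on timeout but loops and checks again: not every thread that timed out has to exit, and those within corePoolSize must wait in take instead.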
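The shrink back to corePoolSize can be watched from outside with getPoolSize. The sketch below is an illustration under stated assumptions: KeepAliveDemo and run are hypothetical names, the keep-alive time is shortened to 100 ms so the demo finishes quickly, and 15 tasks are used so that none is rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class KeepAliveDemo {
    /** Returns {pool size under load, pool size after the idle timeout}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 5, 100L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(10));
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 15; i++) {
            pool.execute(() -> {
                try {
                    release.await();                  // block until all tasks are submitted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int underLoad = pool.getPoolSize();           // 2 core threads + 3 surplus threads
        release.countDown();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
        try {
            // Wait for the surplus threads to time out in getTask and exit.
            while (pool.getPoolSize() > 2 && System.nanoTime() < deadline)
                Thread.sleep(20);
            Thread.sleep(300);                        // the remaining threads wait in take and stay
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int afterIdle = pool.getPoolSize();
        pool.shutdown();
        return new int[] { underLoad, afterIdle };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " -> " + r[1]);     // 5 -> 2
    }
}
```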

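Example 2

Create a pool with corePoolSize 2, maximumPoolSize 5, and a LinkedBlockingQueue.

Tasks can be submitted to this pool indefinitely. We submit 15 tasks. Following the execute source, addWorker(command, true) first starts two threads to run tasks, and workQueue.offer(command) puts the other 13 tasks in the queue. The two threads keep taking tasks from the queue; when all 15 are done, both wait in take. This explains why, with a LinkedBlockingQueue, at most corePoolSize threads are ever created and maximumPoolSize has no effect.

What shutdown does to this kind of pool is straightforward; the analysis is left to the reader.

Other code bases also use corePoolSize 0 with maximumPoolSize 1, for example the cleanup thread in DiskLruCache: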

  final ThreadPoolExecutor executorService =
      new ThreadPoolExecutor(0, 1, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(),
            new DiskLruCacheThreadFactory());

  /**
   * A {@link java.util.concurrent.ThreadFactory} that builds a thread with a specific thread name
   * and with minimum priority.
   */
  private static final class DiskLruCacheThreadFactory implements ThreadFactory {
    @Override
    public synchronized Thread newThread(Runnable runnable) {
      Thread result = new Thread(runnable, "glide-disk-lru-cache-thread");
      result.setPriority(Thread.MIN_PRIORITY);
      return result;
    }
  }

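From the execute source, with corePoolSize 0 the task is queued first; then workerCountOf(recheck) == 0 holds, so addWorker(null, false) is called to create a thread, and only one thread is ever created. With this unbounded queue, a maximumPoolSize greater than 1 would make no difference. Once all tasks have run and the keep-alive time (60 seconds here) has elapsed, this thread exits.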
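The behaviour of a pool with corePoolSize 0 and maximumPoolSize 1 can be confirmed with getLargestPoolSize. The sketch below is an illustration under stated assumptions: SingleLazyThreadDemo and run are hypothetical names, and the keep-alive time is shortened from 60 seconds to 100 ms so that the thread's exit can be observed quickly.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SingleLazyThreadDemo {
    /** Returns {largest pool size ever reached, tasks completed, pool size after idling}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 1, 100L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < 20; i++)
            pool.execute(ran::incrementAndGet);       // queued first, then one thread is started
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
        try {
            // Wait until every task has run and the lone thread has timed out and exited.
            while ((ran.get() < 20 || pool.getPoolSize() > 0) && System.nanoTime() < deadline)
                Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int[] result = { pool.getLargestPoolSize(), ran.get(), pool.getPoolSize() };
        pool.shutdown();
        return result;
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 1 20 0
    }
}
```

A pool configured this way uses no threads while idle and never more than one while busy, which suits occasional background work such as cache cleanup.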
