A Deep Dive into ThreadLocal as a Concurrent Container

2023-09-15

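When multiple threads need to share a variable, there are generally two approaches. The first is a mutual-exclusion lock, so that only one thread can access the variable at any moment. The upside is that it is easy to code (just use the synchronized keyword for synchronized access); the downside is that it increases contention between threads and lowers efficiency. The second is ThreadLocal, the subject of this article. If synchronized trades time for space, then ThreadLocal trades space for time: for a variable that would otherwise be shared between threads, ThreadLocal gives every thread its own copy, making it a thread-level variable. Threads do not affect one another, so the variable can be accessed concurrently without concurrency problems.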

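Table of contents

Outline
Introduction to ThreadLocal
ThreadLocal methods
ThreadLocal usage scenarios
How ThreadLocal is implemented
ThreadLocalMap in detail
ThreadLocal summary
ThreadLocal related topics
ThreadLocal applications
- Spring
- RocketMQ
- Zuul
Summary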

Outline

[Image: outline of the article]

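Introduction to ThreadLocal

In multithreaded programming, thread-safety problems are usually solved with synchronized or a Lock, which controls the order in which threads enter the critical section. With locking, however, threads that fail to acquire the lock block and wait, which is clearly not very time-efficient. The root of a thread-safety problem is that several threads operate on the same shared resource in a critical section. If instead every thread uses its own "shared resource", each working on its own copy without affecting the others, the threads are isolated and the thread-safety problem does not arise. This is in fact a "space for time" scheme: giving every thread its own copy undoubtedly uses more memory, but since no synchronization is needed, threads no longer block waiting on each other, and time efficiency improves.

Although ThreadLocal lives in the java.lang package rather than java.util.concurrent, I prefer to classify it as a kind of concurrent container (even though the data is actually stored in ThreadLocalMap). As the class name suggests, a ThreadLocal is a thread's "local variable": every thread has its own copy of the variable, one per thread, so contention over a shared resource is avoided.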
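To make the "one copy per thread" idea concrete, here is a minimal sketch. The class name PerThreadCopyDemo and the worker names are made up for illustration; only ThreadLocal itself comes from the JDK. Each worker increments its own copy without any lock and still ends up with exactly its own count.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PerThreadCopyDemo {
    // One ThreadLocal object shared by all threads; each thread reads and writes its own copy.
    private static final ThreadLocal<Integer> COUNTER = ThreadLocal.withInitial(() -> 0);

    // Starts `threads` workers; each increments its own copy `increments` times
    // and records the final value it observed under its thread name.
    static Map<String, Integer> run(int threads, int increments) {
        Map<String, Integer> seen = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int n = 0; n < increments; n++) {
                        COUNTER.set(COUNTER.get() + 1); // no lock: the copy is private to this thread
                    }
                    seen.put(Thread.currentThread().getName(), COUNTER.get());
                } finally {
                    COUNTER.remove();
                }
            }, "worker-" + i);
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Every worker reports its own 1000, never a mixed total.
        System.out.println(run(4, 1000));
    }
}
```

The price is the "space" side of the trade: four threads hold four Integer copies instead of one shared field.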

ThreadLocal methods

ThreadLocal does not have many methods. They are listed below.

[Image: list of ThreadLocal methods]
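Since the method list survives only as an image, here is a short tour of the methods this article relies on: get, set, remove, and the static factory withInitial (available since Java 8). It is a sketch for orientation, not a complete API listing; the class name ThreadLocalApiTour is made up.

```java
class ThreadLocalApiTour {
    // Exercises get / set / remove on the calling thread and returns what was observed.
    static String tour() {
        ThreadLocal<String> name = ThreadLocal.withInitial(() -> "default");
        StringBuilder log = new StringBuilder();
        log.append(name.get()).append(','); // first get: the initial value
        name.set("custom");
        log.append(name.get()).append(','); // the value this thread set
        name.remove();
        log.append(name.get());             // after remove, get re-initializes
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(tour()); // prints "default,custom,default"
    }
}
```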

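ThreadLocal usage scenarios

With the basic concept in place, let's look at an example. Define a class that holds static ThreadLocal objects, have several threads call set and get on them in parallel, and print the values, to check whether the value each thread reads back is the one it set itself.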

package cn.wideth.util;

public class MyThreadLocal {

    static class ResourceClass {
        public final static ThreadLocal<String> RESOURCE_1 = new ThreadLocal<>();
        public final static ThreadLocal<String> RESOURCE_2 = new ThreadLocal<>();
    }

    public static void setOne(String value) {
        ResourceClass.RESOURCE_1.set(value);
    }

    public static void setTwo(String value) {
        ResourceClass.RESOURCE_2.set(value);
    }

    public static void display() {
        System.out.println(ResourceClass.RESOURCE_1.get() + ":" + ResourceClass.RESOURCE_2.get());
    }

    public static void main(String[] args) {
        for (int i = 0; i < 20; i++) {
            final String resource = " value = (" + i + ")";
            new Thread(() -> {
                try {
                    MyThreadLocal.setOne(Thread.currentThread().getName());
                    MyThreadLocal.setTwo(resource);
                    MyThreadLocal.display();
                } finally {
                    ResourceClass.RESOURCE_1.remove();
                    ResourceClass.RESOURCE_2.remove();
                }
            }).start();
        }
    }
}

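Run result
[Image: program output]

Analysis of the result

The threads do not print in the order they were created, which shows that they really do run concurrently. Yet within each thread the two values still correspond to each other, which shows that the values have become thread-private. But the ThreadLocal objects are shared static variables, so how do their values end up thread-private? To answer that we need a little of the underlying mechanism, otherwise it is hard to use ThreadLocal with confidence. Read on.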

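How ThreadLocal is implemented

To understand how ThreadLocal works, we need to understand its core methods: how values are stored, how they are read, and so on. Let's go through them one by one.

void set(T value)

The set method sets the value of the threadLocal variable for the current thread. Its source is: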

/**
 * Sets the current thread's copy of this thread-local variable
 * to the specified value.  Most subclasses will have no need to
 * override this method, relying solely on the {@link #initialValue}
 * method to set the values of thread-locals.
 *
 * @param value the value to be stored in the current thread's copy of
 *        this thread-local.
 */
public void set(T value) {
    // 1. Get the current thread instance
    Thread t = Thread.currentThread();
    // 2. Get the ThreadLocalMap held by the current thread
    ThreadLocalMap map = getMap(t);
    if (map != null)
        // 3. The map exists: store the value with this ThreadLocal instance as the key
        map.set(this, value);
    else
        // 4. The map is null: create a ThreadLocalMap and store the value
        createMap(t, value);
}

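The logic is clear; see the comments above. From the source we can see that value is stored in a ThreadLocalMap. For now, think of it as an ordinary map: the data is really stored in the ThreadLocalMap container, keyed by the current threadLocal instance. A rough idea of ThreadLocalMap is enough at this point; it is covered in detail below.

First, where does the ThreadLocalMap come from? The source makes it clear: it is obtained through getMap(t):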

/**
 * Get the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param  t the current thread
 * @return the map
 */
ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

The method simply returns threadLocals, a field of the current thread object t:

/* ThreadLocal values pertaining to this thread. This map is maintained
 * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

In other words, the reference to the ThreadLocalMap is a field of Thread and is maintained per Thread. Going back to set: when the map is null, createMap(t, value) is called:

/**
 * Create the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param t the current thread
 * @param firstValue value for the initial entry of the map
 */
void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

This method creates a new ThreadLocalMap instance, stores value in it with the current threadLocal instance as the key, and assigns the new map to the threadLocals field of the current thread.

To summarize the set method:

Get the threadLocalMap maintained by the current thread. If it is not null, store the key-value pair (threadLocal instance as key, value as value) in it. If it is null, create a new threadLocalMap and store the same key-value pair in it.
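The ownership structure in this summary (the thread owns the map, the ThreadLocal is only the key) can be modeled with a toy. ToyThread and ToyThreadLocal below are hypothetical stand-ins invented for this sketch, and a plain HashMap replaces the real ThreadLocalMap; the "current thread" is passed in explicitly so the flow is visible.

```java
import java.util.HashMap;
import java.util.Map;

class ToySetDemo {
    public static void main(String[] args) {
        ToyThread a = new ToyThread();
        ToyThread b = new ToyThread();
        ToyThreadLocal<String> user = new ToyThreadLocal<>();
        user.set(a, "alice");
        user.set(b, "bob");
        // Same key object, different owning threads, different values.
        System.out.println(user.get(a) + " " + user.get(b)); // prints "alice bob"
    }
}

// Hypothetical stand-in for java.lang.Thread: it owns the per-thread map.
class ToyThread {
    Map<ToyThreadLocal<?>, Object> threadLocals; // null until first use, like Thread.threadLocals
}

// Hypothetical stand-in for ThreadLocal: it holds no data, it only acts as the key.
class ToyThreadLocal<T> {
    void set(ToyThread t, T value) {
        Map<ToyThreadLocal<?>, Object> map = t.threadLocals; // the getMap(t) step
        if (map != null) {
            map.put(this, value);
        } else {
            // the createMap(t, value) step
            t.threadLocals = new HashMap<>();
            t.threadLocals.put(this, value);
        }
    }

    @SuppressWarnings("unchecked")
    T get(ToyThread t) {
        Map<ToyThreadLocal<?>, Object> map = t.threadLocals;
        return map == null ? null : (T) map.get(this);
    }
}
```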

T get()

The get method returns the value of the threadLocal variable for the current thread. Again, let's look at the source:

/**
 * Returns the value in the current thread's copy of this
 * thread-local variable.  If the variable has no value for the
 * current thread, it is first initialized to the value returned
 * by an invocation of the {@link #initialValue} method.
 *
 * @return the current thread's value of this thread-local
 */
public T get() {
    // 1. Get the current thread instance
    Thread t = Thread.currentThread();
    // 2. Get the current thread's threadLocalMap
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        // 3. Look up the entry keyed by this threadLocal instance
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            // 4. The entry exists: return its value
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    // 5. The map or the entry is null: initialize through setInitialValue and return its result
    return setInitialValue();
}

Once the logic of set is clear, get can be read as its inverse: values are retrieved the same way they were stored. See the comments for the code logic. What does setInitialValue do?

/**
 * Variant of set() to establish initialValue. Used instead
 * of set() in case user has overridden the set() method.
 *
 * @return the initial value
 */
private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}

This method is almost identical to set. The part worth noting is the initialValue method:

/**
 * Returns the current thread's "initial value" for this
 * thread-local variable.  This method will be invoked the first
 * time a thread accesses the variable with the {@link #get}
 * method, unless the thread previously invoked the {@link #set}
 * method, in which case the {@code initialValue} method will not
 * be invoked for the thread.  Normally, this method is invoked at
 * most once per thread, but it may be invoked again in case of
 * subsequent invocations of {@link #remove} followed by {@link #get}.
 *
 * <p>This implementation simply returns {@code null}; if the
 * programmer desires thread-local variables to have an initial
 * value other than {@code null}, {@code ThreadLocal} must be
 * subclassed, and this method overridden.  Typically, an
 * anonymous inner class will be used.
 *
 * @return the initial value for this thread-local
 */
protected T initialValue() {
    return null;
}

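The method is protected, so a subclass of ThreadLocal can override it to supply a different initial value. To summarize get:

Get the threadLocalMap maintained by the current thread, then look up the Entry keyed by the current threadLocal instance. If the Entry is not null, return its value. If the threadLocalMap is null or the Entry is null, store the result of initialValue() (null by default) in the map under the current threadLocal as the key, and return it.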
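The Javadoc quoted above says that initialValue runs on the first get and may run again after a remove. The sketch below checks that by counting invocations; the class name InitialValueDemo and the counter are illustrative additions, not part of the JDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

class InitialValueDemo {
    // Counts how often initialValue() is invoked (illustration only).
    static final AtomicInteger INIT_CALLS = new AtomicInteger();

    // Anonymous subclass overriding initialValue, as the Javadoc suggests.
    static final ThreadLocal<StringBuilder> BUFFER = new ThreadLocal<StringBuilder>() {
        @Override
        protected StringBuilder initialValue() {
            INIT_CALLS.incrementAndGet();
            return new StringBuilder();
        }
    };

    // Returns the number of initialValue() calls made on the calling thread during the sequence.
    static int demo() {
        BUFFER.remove();   // start from a clean state on this thread
        INIT_CALLS.set(0);
        BUFFER.get();      // no entry yet: initialValue() runs
        BUFFER.get();      // entry exists: no call
        BUFFER.remove();
        BUFFER.get();      // entry was removed: initialValue() runs again
        BUFFER.remove();
        return INIT_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```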

void remove()

/**
 * Removes the current thread's value for this thread-local
 * variable.  If this thread-local variable is subsequently
 * {@linkplain #get read} by the current thread, its value will be
 * reinitialized by invoking its {@link #initialValue} method,
 * unless its value is {@linkplain #set set} by the current thread
 * in the interim.  This may result in multiple invocations of the
 * {@code initialValue} method in the current thread.
 *
 * @since 1.5
 */
public void remove() {
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
        m.remove(this);
}

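get and set store and read data; we also need to know how to delete it. Deletion removes the data from the map: get the threadLocalMap associated with the current thread, then remove the key-value pair keyed by this threadLocal instance.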
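Why remove matters in practice shows up most clearly with pooled threads. The following sketch (class and variable names are hypothetical) runs two tasks on a single-thread executor: without remove, the second task reads the value left behind by the first one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadCleanupDemo {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns what the second task reads.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                try {
                    USER.set("alice");
                } finally {
                    if (cleanUp) {
                        USER.remove(); // the recommended pattern: always clean up in finally
                    }
                }
            };
            Callable<String> second = () -> USER.get();
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // prints "alice": stale value from the first task
        System.out.println(secondTaskSees(true));  // prints "null": the value was removed
    }
}
```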

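ThreadLocalMap in detail

From the analysis above we know that the data all lives in the threadLocalMap, and that threadLocal's get, set, and remove are implemented by threadLocalMap's getEntry, set, and remove. To understand threadLocal thoroughly, we have to understand threadLocalMap as well.

A ThreadLocalMap belongs to a Thread: the reference to it is held in the Thread class. This also answers a question: why does ThreadLocalMap hold an array of Entry rather than a single Entry?

Because business code can create many ThreadLocal objects, each with its own job. Within one request, that is, within one thread, there is only one ThreadLocalMap, no matter how many ThreadLocal objects you create, since the reference to the map is held by the Thread. Its Entry array therefore holds the entries for all the ThreadLocal objects that have been used in that thread.

The core source is as follows: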

// ThreadLocal.get() calls this method; it returns the threadLocals reference of the current thread.
// The reference points to a ThreadLocalMap, a static nested class of ThreadLocal.
ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

public class Thread implements Runnable {
    // ThreadLocal.ThreadLocalMap
    ThreadLocal.ThreadLocalMap threadLocals = null;
}

A ThreadLocalMap stores Entry objects, and each Entry holds a key and a value.

As for why it is built this way, let's analyze ThreadLocalMap step by step.

Entry in ThreadLocalMap

ThreadLocalMap maintains a hash table whose slots hold Entry objects; put simply, each Entry stores a key and its value.

How is this implemented? Since the Entry objects live inside ThreadLocalMap, the map needs some container to hold them. Let's look at the ThreadLocalMap source:

/**
 * The table, resized as necessary.
 * table.length MUST always be a power of two.
 */
private Entry[] table;

As the comment says, the length of the table array is always a power of two. Next, let's see what Entry is:

/**
 * The entries in this hash map extend WeakReference, using
 * its main ref field as the key (which is always a
 * ThreadLocal object).  Note that null keys (i.e. entry.get()
 * == null) mean that the key is no longer referenced, so the
 * entry can be expunged from table.  Such entries are referred to
 * as "stale entries" in the code that follows.
 */
static class Entry extends WeakReference<ThreadLocal<?>> {
    /** The value associated with this ThreadLocal. */
    Object value;

    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}

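Entry is a key-value pair with a ThreadLocal as the key and an Object as the value. Note that the threadLocal key is a weak reference: Entry extends WeakReference, and the super(k) call in its constructor wraps the threadLocal instance in a WeakReference. The relationship between thread, threadLocal, threadLocalMap, and Entry can be pictured as follows:

[Image: references among Thread, ThreadLocal, ThreadLocalMap, and Entry]
In the figure, solid lines are strong references and dashed lines are weak references. Each thread instance reaches its threadLocalMap through threadLocals, and the threadLocalMap is essentially an Entry array with threadLocal instances as keys and arbitrary objects as values. Assigning a value to a threadLocal variable stores an Entry (current threadLocal instance as key, the value as value) in this threadLocalMap.

Note that the key in an Entry is a weak reference. Once the external strong reference to the threadLocal is set to null (threadLocalInstance = null), reachability analysis at the next GC finds no strong path to the threadLocal instance, so it is eligible for collection. The ThreadLocalMap is then left with an Entry whose key is null, and the value of that Entry can no longer be accessed. If the thread does not finish for a long time, the value stays on a strong reference chain, Thread Ref -> Thread -> ThreadLocalMap -> Entry -> value, and cannot be collected until the stale entry is expunged, which causes a memory leak. Of course, once the thread finishes, the threadLocal, threadLocalMap, and Entry are no longer reachable and are all reclaimed by garbage collection. In real projects, threads are usually created and reused through thread pools, for example a fixed-size pool, and pooled threads do not end on their own. The threadLocal memory leak is therefore something we need to think about and guard against.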
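The shape of a stale entry can be reproduced with a small sketch. The nested Entry class below only mirrors the structure of ThreadLocalMap.Entry (weak key, strong value); it is not the JDK class. Because real garbage collection timing is not deterministic, the sketch calls clear() to stand in for the collector clearing the weak key.

```java
import java.lang.ref.WeakReference;

class StaleEntryDemo {
    // Same shape as ThreadLocalMap.Entry: the key is held weakly, the value strongly.
    static class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    // Builds an entry and then simulates the GC clearing its weak key.
    static Entry makeStale() {
        Object key = new Object();
        Entry e = new Entry(key, new byte[1024]);
        e.clear(); // assumption: stands in for the collector reclaiming the key
        return e;
    }

    public static void main(String[] args) {
        Entry e = makeStale();
        System.out.println(e.get() == null);  // prints true: the key is gone
        System.out.println(e.value != null);  // prints true: the value is still strongly held
    }
}
```

As long as something keeps the Entry itself reachable, as a long-lived pooled thread does through its map, the 1 KB value stays alive even though nothing can look it up any more.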

The set method

Like ConcurrentHashMap, HashMap, and similar containers, threadLocalMap is implemented as a hash table. The source of set is:

/**
 * Set the value associated with key.
 *
 * @param key the thread local object
 * @param value the value to be set
 */
private void set(ThreadLocal<?> key, Object value) {

    // We don't use a fast path as with get() because it is at
    // least as common to use set() to create new entries as
    // it is to replace existing ones, in which case, a fast
    // path would fail more often than not.

    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);

    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;
            return;
        }

        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    tab[i] = new Entry(key, value);
    int sz = ++size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}

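ThreadLocalMap resolves hash collisions with open addressing, whereas HashMap uses separate chaining. The main reasons for the difference are that hash values in ThreadLocalMap are distributed very evenly, so collisions are rare, and that ThreadLocalMap frequently has to clean out stale entries, which is more convenient with a plain array.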
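The even distribution can be checked with a few lines. The constant 0x61c88647 below is not quoted in this article; it is the HASH_INCREMENT that OpenJDK's ThreadLocal adds for each new instance's threadLocalHashCode, so treat it as an outside detail. The slot is computed as in the source above, hash & (len - 1), which only works as a modulo because the table length is a power of two.

```java
import java.util.HashSet;
import java.util.Set;

class HashSpreadDemo {
    // Assumption: same value as the private HASH_INCREMENT constant in OpenJDK's ThreadLocal.
    static final int HASH_INCREMENT = 0x61c88647;

    // Same index computation as ThreadLocalMap: valid because length is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Simulates `tableLength` consecutive ThreadLocal hash codes and counts the distinct slots they hit.
    static int distinctSlots(int tableLength) {
        Set<Integer> slots = new HashSet<>();
        int hash = 0;
        for (int i = 0; i < tableLength; i++) {
            slots.add(indexFor(hash, tableLength));
            hash += HASH_INCREMENT; // int overflow wraps around, which is intended
        }
        return slots.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctSlots(16)); // prints 16: every slot hit exactly once
        System.out.println(distinctSlots(64)); // prints 64
    }
}
```

Collisions can still occur in a real map, because one thread rarely uses consecutively created ThreadLocal objects only, which is why the probing loop in set is still needed.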

The getEntry method

The source of getEntry is:

/**
 * Get the entry associated with key.  This method
 * itself handles only the fast path: a direct hit of existing
 * key. It otherwise relays to getEntryAfterMiss.  This is
 * designed to maximize performance for direct hits, in part
 * by making this method readily inlinable.
 *
 * @param  key the thread local object
 * @return the entry associated with key, or null if no such
 */
private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}

The remove method

The source of remove is:

/**
 * Remove the entry for key.
 */
private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        if (e.get() == key) {
            e.clear();
            expungeStaleEntry(i);
            return;
        }
    }
}

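The logic is simple: probe forward circularly until the entry with the given key is found, call clear to set its key to null, which turns it into a stale entry, then call expungeStaleEntry, which sets its value to null so that it can be garbage collected and sets table[i] to null.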
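The circular forward probe used by set and remove can be isolated in a toy. The sketch below uses a String array instead of Entry objects and invents the find helper; nextIndex follows the wrap-around behavior the loops above depend on.

```java
class LinearProbeDemo {
    // Moves to the next slot, wrapping from the last index back to 0.
    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    // Probes from `start` until the key or an empty slot is found; returns the index or -1.
    // Terminates only if the table has at least one empty slot, which a
    // load factor below 1 guarantees in a real open-addressing table.
    static int find(String[] table, int start, String key) {
        for (int i = start; table[i] != null; i = nextIndex(i, table.length)) {
            if (table[i].equals(key)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "c" collided at slot 2, probed past 3, and wrapped around to slot 0.
        String[] table = {"c", null, "a", "b"};
        System.out.println(find(table, 2, "c")); // prints 0
        System.out.println(find(table, 2, "z")); // prints -1: the probe stops at the empty slot 1
    }
}
```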

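ThreadLocal summary

The sections above analyzed the main methods in the source of ThreadLocal and threadLocalMap. Because threadLocalMap is a hash table like ConcurrentHashMap and HashMap, mechanisms such as resizing were not covered in depth. The analysis leads to the following picture.

ThreadLocal memory layout:

[Image: ThreadLocal memory layout]
Every Thread object holds a ThreadLocalMap field. Each ThreadLocalMap maintains N Entry nodes, that is, an Entry array. Each Entry is one complete key-value pair: the key is the ThreadLocal itself and the value is the ThreadLocal's generic value.

The core source is as follows: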

// java.lang.Thread holds the reference to the ThreadLocalMap
public class Thread implements Runnable {
    ThreadLocal.ThreadLocalMap threadLocals = null;
}

// java.lang.ThreadLocal has the static nested class ThreadLocalMap
public class ThreadLocal<T> {
    static class ThreadLocalMap {
        private Entry[] table;

        // ThreadLocalMap has the nested class Entry; the key of an Entry is the ThreadLocal itself, the value is the generic value
        static class Entry extends WeakReference<ThreadLocal<?>> {
            Object value;

            Entry(ThreadLocal<?> k, Object v) {
                super(k);
                value = v;
            }
        }
    }
}

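ThreadLocal related topics

The difference between ThreadLocal and synchronized

synchronized guarantees that several threads can operate on a shared variable at the same time and still produce the correct result. ThreadLocal cannot do that: it turns the shared variable into a thread-private one, so each thread has its own independent variable. A simple example is a website hit counter: wrapping count++ in synchronized solves it. ThreadLocal cannot produce the total; it can only count how many visits each individual thread recorded.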
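The counter example can be written out as a sketch. All names below are made up for illustration. The synchronized counter yields the site-wide total, while the ThreadLocal counter only ever knows the hits of its own thread.

```java
import java.util.Arrays;

class CounterComparison {
    private static int total = 0; // shared, guarded by the class lock
    private static final ThreadLocal<Integer> PER_THREAD = ThreadLocal.withInitial(() -> 0);

    private static synchronized void hitTotal() {
        total++;
    }

    // Returns an array: element 0 is the synchronized total,
    // elements 1..threads are the per-thread ThreadLocal counts.
    static int[] run(int threads, int hitsPerThread) {
        total = 0;
        int[] result = new int[threads + 1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int slot = i + 1;
            workers[i] = new Thread(() -> {
                try {
                    for (int n = 0; n < hitsPerThread; n++) {
                        hitTotal();                           // contributes to the shared total
                        PER_THREAD.set(PER_THREAD.get() + 1); // visible to this thread only
                    }
                    result[slot] = PER_THREAD.get();
                } finally {
                    PER_THREAD.remove();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        result[0] = total; // safe to read: join() orders it after all workers
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(4, 1000))); // prints [4000, 1000, 1000, 1000, 1000]
    }
}
```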

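ThreadLocal applications

ThreadLocal is used heavily in open-source middleware such as Spring, RocketMQ, Dubbo, and Zuul. Let's see how these projects use it.

Spring

One scenario is Spring transactions. A transaction is bound to a thread: when a transaction starts, Spring binds a JDBC Connection to the current thread, and every database operation in the transaction uses that thread-bound connection, so all operations in one transaction run on the same connection and transactions on different threads stay isolated from each other. Spring implements this isolation with ThreadLocal, as the code below shows: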


public abstract class TransactionSynchronizationManager {

    // Resources bound to the thread. For example, DataSourceTransactionManager binds a
    // Connection from a data source, and the same JDBC Connection is used for the whole transaction.
    private static final ThreadLocal<Map<Object, Object>> resources =
            new NamedThreadLocal<>("Transactional resources");

    // Transaction synchronizations registered for the transaction
    private static final ThreadLocal<Set<TransactionSynchronization>> synchronizations =
            new NamedThreadLocal<>("Transaction synchronizations");

    // Transaction name
    private static final ThreadLocal<String> currentTransactionName =
            new NamedThreadLocal<>("Current transaction name");

    // Read-only flag of the transaction
    private static final ThreadLocal<Boolean> currentTransactionReadOnly =
            new NamedThreadLocal<>("Current transaction read-only status");

    // Transaction isolation level
    private static final ThreadLocal<Integer> currentTransactionIsolationLevel =
            new NamedThreadLocal<>("Current transaction isolation level");

    // Whether an actual transaction is active
    private static final ThreadLocal<Boolean> actualTransactionActive =
            new NamedThreadLocal<>("Actual transaction active");
}
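The pattern behind the resources field, a per-thread map from a key such as a data source to the object bound for it, can be sketched without Spring. ThreadBoundResources and its bind / get / unbind methods below are hypothetical and only imitate the idea; they are not Spring's implementation.

```java
import java.util.HashMap;
import java.util.Map;

class ThreadBoundResources {
    // One map per thread, created lazily on first bind.
    private static final ThreadLocal<Map<Object, Object>> RESOURCES = new ThreadLocal<>();

    // Binds value to key for the current thread; rejects a second binding for the same key.
    static void bind(Object key, Object value) {
        Map<Object, Object> map = RESOURCES.get();
        if (map == null) {
            map = new HashMap<>();
            RESOURCES.set(map);
        }
        if (map.containsKey(key)) {
            throw new IllegalStateException("Already bound on this thread: " + key);
        }
        map.put(key, value);
    }

    // Returns the value bound to key on the current thread, or null.
    static Object get(Object key) {
        Map<Object, Object> map = RESOURCES.get();
        return map == null ? null : map.get(key);
    }

    // Removes the binding and drops the whole map once it is empty.
    static Object unbind(Object key) {
        Map<Object, Object> map = RESOURCES.get();
        if (map == null) {
            return null;
        }
        Object value = map.remove(key);
        if (map.isEmpty()) {
            RESOURCES.remove();
        }
        return value;
    }

    public static void main(String[] args) {
        bind("dataSource", "connection-1"); // strings stand in for a DataSource and a Connection
        try {
            System.out.println(get("dataSource")); // prints "connection-1"
        } finally {
            unbind("dataSource");
        }
        System.out.println(get("dataSource")); // prints "null"
    }
}
```

Code anywhere down the call stack can fetch the bound object by key without it being passed as a parameter, which is what lets every database call in one transaction reach the same connection.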

RocketMQ

The RocketMQ source also uses ThreadLocal:

import java.util.Random;

public class ThreadLocalIndex {
    private final ThreadLocal<Integer> threadLocalIndex = new ThreadLocal<Integer>();
    private final Random random = new Random();

    public int getAndIncrement() {
        Integer index = this.threadLocalIndex.get();
        if (null == index) {
            // First call on this thread: start from a random position
            index = Math.abs(random.nextInt());
            // Math.abs(Integer.MIN_VALUE) is still negative, so clamp to 0
            if (index < 0)
                index = 0;
            this.threadLocalIndex.set(index);
        }

        index = Math.abs(index + 1);
        if (index < 0)
            index = 0;

        this.threadLocalIndex.set(index);
        return index;
    }
}

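ThreadLocalIndex is mainly used when a producer sends messages. As RocketMQ users know, the producer first fetches the routing information for a Topic. A Topic has multiple MessageQueues, and sending a message requires picking one of them, usually by round-robin. Each producer thread needs its own round-robin position, and ThreadLocalIndex provides that.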
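How a per-thread index turns into round-robin selection can be sketched as follows. RoundRobinSelector and the queue names are hypothetical, and unlike ThreadLocalIndex this sketch starts at 0 instead of a random position to keep the output predictable; it is not RocketMQ's selection code.

```java
import java.util.Arrays;
import java.util.List;

class RoundRobinSelector {
    // Each thread keeps its own position, so threads never contend on a shared counter.
    private final ThreadLocal<Integer> position = ThreadLocal.withInitial(() -> 0);

    // Returns the next queue for the calling thread and advances that thread's position.
    <T> T select(List<T> queues) {
        int current = position.get();
        position.set((current + 1) & Integer.MAX_VALUE); // stay non-negative on overflow
        return queues.get(current % queues.size());
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector();
        List<String> queues = Arrays.asList("queue-0", "queue-1", "queue-2");
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.select(queues)); // queue-0, queue-1, queue-2, queue-0
        }
    }
}
```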

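Zuul

I have recently been studying how API gateways are implemented. Zuul 1.x is a fairly good gateway, and the RequestContext class in its source uses ThreadLocal to hold per-thread context information.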

public class RequestContext extends ConcurrentHashMap<String, Object> {

    protected static Class<? extends RequestContext> contextClass = RequestContext.class;

    private static RequestContext testContext = null;

    // Use ThreadLocal to hold the per-thread context
    protected static final ThreadLocal<? extends RequestContext> threadLocal = new ThreadLocal<RequestContext>() {
        @Override
        protected RequestContext initialValue() {
            try {
                return contextClass.newInstance();
            } catch (Throwable e) {
                throw new RuntimeException(e);
            }
        }
    };

    public static RequestContext getCurrentContext() {
        if (testContext != null) return testContext;

        RequestContext context = threadLocal.get();
        return context;
    }
    ...