[Android Application Source Code Analysis] SparseArray, Android's Alternative to HashMap: A Source Code Analysis

This content is licensed under the CC 4.0 BY-SA license.

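This article analyzes Android's SparseArray, an alternative to HashMap that saves memory by avoiding auto-boxing of keys and by keeping its mappings in plain arrays searched with binary search. It is slower when adding and removing elements, but it suits small data sets. On remove, SparseArray only marks the entry as deleted and later compacts the arrays in bulk in its gc method, which reduces the number of compaction passes. The takeaway: when the keys are integers and the data set is not large, using SparseArray saves memory.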

Please credit the source when reposting: http://blog.csdn.net/zhoubin1992/article/details/48269985

Preface

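When we declare

HashMap<Integer, E> hashMap = new HashMap<Integer, E>();

the IDE shows a warning: use SparseArray<E> instead for better performance. But HashMap's get and put run in O(1) time on average, so why would SparseArray perform better? I spent the last couple of days reading the source in detail and found that SparseArray's main advantage is that it saves memory, which makes it a very good fit for mobile devices.
SparseArray is a utility class in the android.util package. Let's first look at its method outline:
[Figure: method outline of the SparseArray class]

As you can see, there is not much to it: constructors plus methods for adding, removing, updating, and looking up entries.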

Source code analysis

The detailed analysis is in the comments:

package android.util;

import com.android.internal.util.ArrayUtils;
import com.android.internal.util.GrowingArrayUtils;

import libcore.util.EmptyArray;

/**
 * SparseArrays map integers to Objects.  Unlike a normal array of Objects,
 * there can be gaps in the indices.  It is intended to be more memory efficient
 * than using a HashMap to map Integers to Objects, both because it avoids
 * auto-boxing keys and its data structure doesn't rely on an extra entry object
 * for each mapping.
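 * Annotation: SparseArray maps ints to Objects. Unlike a plain Object array, the indices
 * (keys) may have gaps. Because keys are stored as primitive ints (no auto-boxing) and no
 * extra entry object is allocated per mapping, it uses memory more efficiently than HashMap.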
 * <p>Note that this container keeps its mappings in an array data structure,
 * using a binary search to find keys.  The implementation is not intended to be appropriate for
 * data structures
 * that may contain large numbers of items.  It is generally slower than a traditional
 * HashMap, since lookups require a binary search and adds and removes require inserting
 * and deleting entries in the array.  For containers holding up to hundreds of items,
 * the performance difference is not significant, less than 50%.</p>
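 * Annotation: keys are located by binary search, so this implementation is not suited to
 * large data sets. Lookups need a binary search, and adds and removes shift other array
 * elements, so SparseArray is generally slower than HashMap. For up to hundreds of items
 * the difference is not significant: less than 50%.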
 * <p>To help with performance, the container includes an optimization when removing
 * keys: instead of compacting its array immediately, it leaves the removed entry marked
 * as deleted.  The entry can then be re-used for the same key, or compacted later in
 * a single garbage collection step of all removed entries.  This garbage collection will
 * need to be performed at any time the array needs to be grown or the the map size or
 * entry values are retrieved.</p>
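 * Annotation: to help performance, remove does not compact the arrays immediately; it only
 * marks the entry as deleted. A marked entry is either re-used or squeezed out later, after
 * several removes, by a single gc pass. gc has to run before the arrays grow, before the
 * size is read, and before keys or values are read by index (see the code comments).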
 * <p>It is possible to iterate over the items in this container using
 * {@link #keyAt(int)} and {@link #valueAt(int)}. Iterating over the keys using
 * <code>keyAt(int)</code> with ascending values of the index will return the
 * keys in ascending order, or the values corresponding to the keys in ascending
 * order in the case of <code>valueAt(int)</code>.</p>
 * Annotation: keyAt and valueAt can be used to iterate over the container, in ascending key order.
 */
// E plays the role of HashMap's value type
public class SparseArray<E> implements Cloneable {
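    // Sentinel used to optimize removal: marks a slot whose entry has been removed
    private static final Object DELETED = new Object();
    // Also for the removal optimization: true when there are DELETED slots to compact
    private boolean mGarbage = false;
    // The int keys, kept sorted in ascending order
    private int[] mKeys;
    // The values; mValues[i] belongs to mKeys[i]
    private Object[] mValues;
    // Number of slots in use (may include DELETED slots until gc() runs)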
    private int mSize;

    /**
     * Creates a new SparseArray containing no mappings.
     */
    public SparseArray() {
        // The default initial capacity is 10 elements
        this(10);
    }

    /**
     * Creates a new SparseArray containing no mappings that will not
     * require any additional memory allocation to store the specified
     * number of mappings.  If you supply an initial capacity of 0, the
     * sparse array will be initialized with a light-weight representation
     * not requiring any additional array allocations.
     */
    public SparseArray(int initialCapacity) {
        if (initialCapacity == 0) {
            // mKeys starts out as an empty int[0] and mValues as an empty Object[0]
            mKeys = EmptyArray.INT;
            mValues = EmptyArray.OBJECT;
        } else {
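            // newUnpaddedObjectArray ends up in a native VMRuntime method that returns an array of
            // at least initialCapacity elements, possibly more. The extra length comes from using the
            // space that would otherwise be padding after the array; how much there is depends on the
            // component type and on the memory allocator implementation.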
            mValues = ArrayUtils.newUnpaddedObjectArray(initialCapacity);
            mKeys = new int[mValues.length];
        }
        mSize = 0;
    }

    @Override
    @SuppressWarnings("unchecked")
    public SparseArray<E> clone() {
        SparseArray<E> clone = null;
        try {
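            // The two backing arrays are copied, but the value objects themselves are shared
            // between the original and the clone (so this is not a full deep copy)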
            clone = (SparseArray<E>) super.clone();
            clone.mKeys = mKeys.clone();
            clone.mValues = mValues.clone();
        } catch (CloneNotSupportedException cnse) {
            /* ignore */
        }
        return clone;
    }

    /**
     * Gets the Object mapped from the specified key, or <code>null</code>
     * if no such mapping has been made.
     */
    /**
     * Annotation: returns the object mapped to the given key, or null if there is no such
     * mapping. Delegates to get(key, null).
     */
    public E get(int key) {
        return get(key, null);
    }

    /**
     * Gets the Object mapped from the specified key, or the specified Object
     * if no such mapping has been made.
     */
    @SuppressWarnings("unchecked")
    public E get(int key, E valueIfKeyNotFound) {
        // Binary search for the key
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
        // Not found, or the value is marked as deleted: return the default value
        if (i < 0 || mValues[i] == DELETED) {
            return valueIfKeyNotFound;
        } else {
            // i >= 0 and the slot is not marked as deleted: return mValues[i]
            return (E) mValues[i];
        }
    }

    /**
     * Removes the mapping from the specified key, if there was any.
     */
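    /**
     * Annotation: removes the mapping for the given key. The slot is only marked as
     * DELETED here; the arrays are compacted later in gc().
     */
    public void delete(int key) {
        // Binary search for the key
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
        // Found
        if (i >= 0) {
            // If the slot is not yet marked, mark it DELETED and set mGarbage = true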
            if (mValues[i] != DELETED) {
                mValues[i] = DELETED;
                mGarbage = true;
            }
        }
    }

    /**
     * Alias for {@link #delete(int)}.
     */
    public void remove(int key) {
        delete(key);
    }

    /**
     * Removes the mapping at the specified index.
     */
    // Removes the entry at a given position. Note that the argument is an index into mValues, not a key
    public void removeAt(int index) {
        // If the slot is not yet marked, mark it DELETED and set mGarbage = true
        if (mValues[index] != DELETED) {
            mValues[index] = DELETED;
            mGarbage = true;
        }
    }

    /**
     * Remove a range of mappings as a batch.
     *
     * @param index Index to begin at
     * @param size Number of mappings to remove
     */
    public void removeAtRange(int index, int size) {
        // Work out the end position (clamped to mSize)
        final int end = Math.min(mSize, index + size);
        // Remove one entry at a time, starting from index
        for (int i = index; i < end; i++) {
            removeAt(i);
        }
    }
    // Has a single purpose: compacting the arrays by squeezing out the DELETED entries
    private void gc() {
        // Log.e("SparseArray", "gc start with " + mSize);
        int n = mSize;
        int o = 0;
        int[] keys = mKeys;
        Object[] values = mValues;
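        // Walk the whole used range and drop every slot whose value is DELETED. The neat part is that
        // the removal and the shifting are done in place, on the same keys and values arrays.
        // Note that keys and mKeys refer to the same array (likewise values and mValues), so writing
        // through keys and values modifies mKeys and mValues.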
        for (int i = 0; i < n; i++) {
            Object val = values[i];

            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }
                o++;
            }
        }
        mGarbage = false;
        mSize = o; // the actual number of live entries

        // Log.e("SparseArray", "gc end with " + mSize);
    }

    /**
     * Adds a mapping from the specified key to the specified value,
     * replacing the previous mapping from the specified key if there
     * was one.
     */
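    /**
     * Annotation: adds a mapping from key to value, replacing the previous value if the
     * key is already mapped. Note that this may trigger gc().
     */
    public void put(int key, E value) {
        // Binary search first to find the position; this is what keeps the key array sorted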
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
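            // Found: just replace the value
            mValues[i] = value;
        } else {
            // Take the bitwise complement to get the index where the key should be inserted.
            // When the key is not found, i = -insertPoint - 1, so ~i is exactly insertPoint.
            i = ~i;
            // If i is within the used range and that slot happens to be marked DELETED, re-use it
            if (i < mSize && mValues[i] == DELETED) {
                mKeys[i] = key;
                mValues[i] = value;
                return;
            }
            // Otherwise i is past the used range, or the slot at i holds a live entry.
            // If there is garbage and the used size has reached the length of the key array
            if (mGarbage && mSize >= mKeys.length) {
                // Compact the arrays: drop all DELETED entries so the live ones are contiguous
                gc();
                // Search again because indices may have changed.
                i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
            }
            // Insert. If the array is full, a larger array is allocated and the contents are copied over
            // with the new element in place; otherwise System.arraycopy shifts everything from the
            // insertion position one slot to the right and the element is stored there.
            mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
            mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
            // The used size grows by 1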
            mSize++;
        }
    }

    /**
     * Returns the number of key-value mappings that this SparseArray
     * currently stores.
     */
    // Returns mSize. Note the gc() call.
    public int size() {
        if (mGarbage) {
            gc();
        }

        return mSize;
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, returns
     * the key from the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     *
     * <p>The keys corresponding to indices in ascending order are guaranteed to
     * be in ascending order, e.g., <code>keyAt(0)</code> will return the
     * smallest key and <code>keyAt(size()-1)</code> will return the largest
     * key.</p>
     */
    // Returns the key stored at position index in mKeys. Note the gc() call.
    public int keyAt(int index) {
        if (mGarbage) {
            gc();
        }

        return mKeys[index];
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, returns
     * the value from the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     *
     * <p>The values corresponding to indices in ascending order are guaranteed
     * to be associated with keys in ascending order, e.g.,
     * <code>valueAt(0)</code> will return the value associated with the
     * smallest key and <code>valueAt(size()-1)</code> will return the value
     * associated with the largest key.</p>
     */
    @SuppressWarnings("unchecked")
    // Returns the value stored at position index in mValues. Note the gc() call.
    public E valueAt(int index) {
        if (mGarbage) {
            gc();
        }

        return (E) mValues[index];
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, sets a new
     * value for the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     */
    // Sets the value stored at position index in mValues. Note the gc() call.
    public void setValueAt(int index, E value) {
        if (mGarbage) {
            gc();
        }

        mValues[index] = value;
    }

    /**
     * Returns the index for which {@link #keyAt} would return the
     * specified key, or a negative number if the specified
     * key is not mapped.
     */
    // Returns the index of key in mKeys (negative if absent). Note the gc() call.
    public int indexOfKey(int key) {
        if (mGarbage) {
            gc();
        }

        return ContainerHelpers.binarySearch(mKeys, mSize, key);
    }

    /**
     * Returns an index for which {@link #valueAt} would return the
     * specified key, or a negative number if no keys map to the
     * specified value.
     * <p>Beware that this is a linear search, unlike lookups by key,
     * and that multiple keys can map to the same value and this will
     * find only one of them.
     * <p>Note also that unlike most collections' {@code indexOf} methods,
     * this method compares values using {@code ==} rather than {@code equals}.
     */
    // Returns the index of value in mValues (-1 if absent). Note the gc() call.
    public int indexOfValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++)
            if (mValues[i] == value)
                return i;

        return -1;
    }

    /**
     * Removes all key-value mappings from this SparseArray.
     */
    // Empties the SparseArray
    public void clear() {
        int n = mSize;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            // Null out the slot so the JVM garbage collector can reclaim the value
            values[i] = null;
        }

        mSize = 0;
        mGarbage = false;
    }

    /**
     * Puts a key/value pair into the array, optimizing for the case where
     * the key is greater than all existing keys in the array.
     */
    // Adds a key/value pair to the SparseArray
    public void append(int key, E value) {
        // If key is less than or equal to the largest existing key, fall back to put
        if (mSize != 0 && key <= mKeys[mSize - 1]) {
            put(key, value);
            return;
        }

        if (mGarbage && mSize >= mKeys.length) {
            gc();
        }
        // If key is greater than every existing key, put's binary search can be skipped: append directly
        mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
        mValues = GrowingArrayUtils.append(mValues, mSize, value);
        mSize++;
    }

    /**
     * {@inheritDoc}
     *
     * <p>This implementation composes a string by iterating over its mappings. If
     * this map contains itself as a value, the string "(this Map)"
     * will appear in its place.
     */
    @Override
    public String toString() {
        if (size() <= 0) {
            return "{}";
        }

        StringBuilder buffer = new StringBuilder(mSize * 28);
        buffer.append('{');
        for (int i=0; i<mSize; i++) {
            if (i > 0) {
                buffer.append(", ");
            }
            int key = keyAt(i);
            buffer.append(key);
            buffer.append('=');
            Object value = valueAt(i);
            if (value != this) {
                buffer.append(value);
            } else {
                buffer.append("(this Map)");
            }
        }
        buffer.append('}');
        return buffer.toString();
    }
}
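
The listing relies on GrowingArrayUtils.insert, an Android-internal helper whose source is not reproduced in this article. As a rough illustration of the "shift the tail, grow when full" behaviour that the comments in put describe, here is a minimal plain-Java sketch. The class name InsertSketch and the doubling growth policy are assumptions made for this example; they are not taken from the Android sources.

```java
import java.util.Arrays;

// Hypothetical stand-in for an insert helper; not the Android implementation.
class InsertSketch {
    /**
     * Inserts element at index into the first currentSize slots of array.
     * Returns the same array when there is spare capacity, otherwise a
     * larger copy. The doubling policy is an assumption for illustration.
     */
    public static int[] insert(int[] array, int currentSize, int index, int element) {
        if (currentSize + 1 <= array.length) {
            // Enough room: shift the tail one slot to the right, in place.
            System.arraycopy(array, index, array, index + 1, currentSize - index);
            array[index] = element;
            return array;
        }
        // Full: allocate a bigger array and copy both halves around the new element.
        int[] grown = new int[Math.max(4, array.length * 2)];
        System.arraycopy(array, 0, grown, 0, index);
        grown[index] = element;
        System.arraycopy(array, index, grown, index + 1, currentSize - index);
        return grown;
    }

    public static void main(String[] args) {
        int[] keys = {2, 5, 9, 0};       // capacity 4, 3 slots in use
        keys = insert(keys, 3, 1, 3);    // room left: shifted in place
        System.out.println(Arrays.toString(keys)); // [2, 3, 5, 9]
        keys = insert(keys, 4, 4, 12);   // full: a larger array is returned
        System.out.println(keys.length); // 8
    }
}
```

The in-place branch is why an insert near the front of a large array is expensive: every later element has to move by one slot, which is the cost the class documentation warns about for containers with many items.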

Performance characteristics of SparseArray

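Having read the code above, here is my summary of SparseArray's performance characteristics:
1. SparseArray maps ints to Objects. The keys live in a plain int array, so they are never auto-boxed, and no extra per-entry data structure is needed to hold the key/value relationship. This saves memory and makes SparseArray more memory-efficient than HashMap.
2. SparseArray uses binary search in get and put, and adding or removing elements shifts other elements of the arrays, so SparseArray is generally slower than HashMap. With up to hundreds of items the difference is not very noticeable (less than 50%), but SparseArray is not suited to large data sets.
The binary search function (binarySearch) is shown below. Note that this listing takes four arguments (start and len), whereas the class above calls the three-argument ContainerHelpers.binarySearch(mKeys, mSize, key); the two appear to come from different versions of the source, but the return convention is the same.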

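private static int binarySearch(int[] a, int start, int len, int key) {
    // low and high start just outside the searched range [start, start + len)
    int high = start + len, low = start - 1, guess;

    while (high - low > 1) {
        guess = (high + low) / 2;

        if (a[guess] < key)
            low = guess;
        else
            high = guess;
    }

    // high is now the first position whose element is >= key
    if (high == start + len)
        return ~(start + len);  // key is larger than every element: insert at the end
    else if (a[high] == key)
        return high;            // found
    else
        return ~high;           // not found: complement of the insertion point
}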

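When the key is found, the return value is its index. When it is not found, the return value encodes the insertion point: i = -insertPoint - 1 < 0, and taking the bitwise complement (i = ~i) gives exactly insertPoint.
3. Because the key array has to stay sorted, every put is relatively expensive: it needs a binary search and then has to insert into (or delete from) the array. To compensate, SparseArray optimizes remove: removing an element does not compact the arrays immediately. Instead the element is marked as DELETED and the garbage flag mGarbage is set. The marked slot is either re-used later or squeezed out, after several removes, by a single gc pass.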
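
The "negative result encodes the insertion point" convention is the same one that java.util.Arrays.binarySearch uses, which makes it easy to try outside Android. The sketch below uses that standard-library method as a stand-in for ContainerHelpers.binarySearch (a substitution for illustration, not the Android code); the class name and the sample keys are made up.

```java
import java.util.Arrays;

// Illustrates the return convention only; Arrays.binarySearch replaces the Android helper here.
class BinarySearchConvention {
    /** Searches the first size slots of keys, which must be sorted in ascending order. */
    public static int search(int[] keys, int size, int key) {
        // On a miss, Arrays.binarySearch returns -(insertionPoint) - 1, i.e. ~insertionPoint.
        return Arrays.binarySearch(keys, 0, size, key);
    }

    public static void main(String[] args) {
        int[] keys = {2, 5, 9, 0, 0};  // only the first 3 slots are in use
        int hit = search(keys, 3, 5);
        int miss = search(keys, 3, 7);
        System.out.println(hit);    // 1: key 5 is stored at index 1
        System.out.println(miss);   // -3: not found
        System.out.println(~miss);  // 2: key 7 would be inserted at index 2
    }
}
```

Because a miss is always negative and a hit is always non-negative, a single int return value is enough for put to tell "replace at i" apart from "insert at ~i".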

About SparseArray's gc method

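Note that gc runs only in the following methods: size, keyAt, valueAt, setValueAt, indexOfKey and indexOfValue (whenever mGarbage is set), and put and append (only when mGarbage is set and the arrays are full). It never runs in remove, delete, removeAt or get. This is the remove optimization: what could have been many compaction passes is folded into one, as the next point explains.
The DELETED marker combined with the mGarbage flag reduces the number of gc passes. If every remove ran gc (a loop over the arrays that deletes and shifts elements), performance would suffer. With the optimization, several removes only set markers, and when gc is finally triggered a single loop is enough to compact the arrays.
gc has a single purpose: compacting the arrays by squeezing out the invalid entries. The code walks the whole used range and drops every slot whose value is DELETED. The neat part is that the removal and the shifting are done in place, on the same keys and values arrays. Note also that keys and mKeys refer to the same array (likewise values and mValues), so writing through keys and values modifies mKeys and mValues.
Every gc pass guarantees that the used range of the arrays (the first mSize slots) contains no invalid entries. This is the essence of the sparse array.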
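
To see the "many removes, one compaction" effect in isolation, here is a stripped-down plain-Java sketch of the marker-and-flag mechanism. It follows the shape of removeAt, size and gc from the listing, but the class LazyDeleteSketch, its constructor and the compactions counter are hypothetical additions for demonstration and are not part of SparseArray.

```java
// Hypothetical demonstration class; only the marking and compaction logic mirrors the listing.
class LazyDeleteSketch {
    private static final Object DELETED = new Object();

    int[] keys;
    Object[] values;
    int size;
    boolean garbage;
    int compactions; // counts compaction passes, for demonstration only

    LazyDeleteSketch(int[] keys, Object[] values) {
        this.keys = keys.clone();
        this.values = values.clone();
        this.size = keys.length;
    }

    /** Marks the slot as deleted; nothing is moved yet. */
    public void removeAt(int index) {
        if (values[index] != DELETED) {
            values[index] = DELETED;
            garbage = true;
        }
    }

    /** Runs at most one compaction pass, however many removals came before. */
    public int size() {
        if (garbage) {
            gc();
        }
        return size;
    }

    private void gc() {
        int o = 0; // next write position for a live entry
        for (int i = 0; i < size; i++) {
            Object val = values[i];
            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }
                o++;
            }
        }
        garbage = false;
        size = o;
        compactions++;
    }

    public static void main(String[] args) {
        LazyDeleteSketch s = new LazyDeleteSketch(
                new int[] {1, 3, 5, 7, 9},
                new Object[] {"a", "b", "c", "d", "e"});
        s.removeAt(1);
        s.removeAt(3);
        s.removeAt(4);
        System.out.println(s.size());      // 2: only keys 1 and 5 remain
        System.out.println(s.keys[1]);     // 5: moved down from index 2
        System.out.println(s.compactions); // 1: three removals, one pass
    }
}
```

Had each removeAt compacted eagerly, the three removals would have walked and shifted the arrays three times; deferring the work until size is called makes it a single pass.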

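Summary

Comparison with HashMap

For HashMap fundamentals and source code, see the post "JAVA 面向对象和集合知识点总结（6.15更新）" (a summary of Java object-oriented programming and collections topics, updated 6.15).
The price of HashMap's O(1) average-time get and put is heavy memory use: every Integer key is boxed and every mapping needs its own entry object.
HashMap: uses more memory, better performance.
SparseArray: saves memory, somewhat lower performance (with up to hundreds of items the difference is less than 50%).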