Java数据结构与集合学习笔记

常用集合与分类

Collection
- List：按顺序保存元素。
  - ArrayList：数组实现，随机访问性能好。
  - LinkedList：双向链表实现，插入删除性能好。
- Set：不添加重复元素。
  - HashSet：快速查找元素，由HashMap实现。元素须定义 hashCode() 和 equals() 方法。
  - LinkedHashSet：快速查找元素+链表维护读写顺序，由LinkedHashMap实现。元素须定义 hashCode() 和 equals() 方法。
  - TreeSet：有序Set，由TreeMap实现。元素须实现 Comparable 接口。
- Queue：队列，先进先出。
  - Deque：双端队列 (double-ended queue)。
    - ArrayDeque：数组实现，头尾指针在数组中循环移动，性能好。
    - LinkedList：双向链表实现。
  - PriorityQueue：插入后排序元素，使用堆实现，队列头部默认为最小的元素（自然顺序）。
Map
- HashMap：散列表实现，根据Key存取关联的Value。
- LinkedHashMap：HashMap+双向链表维护读写顺序。默认是维护最后写入的顺序，也可以指定为读取排序 (accessOrder)，使用LRU算法实现。
- TreeMap：根据key进行排序，使用红黑树实现。插入时找到key对应的节点并替换value，没找到则生成新节点。

补充：堆和二叉搜索树的区别（红黑树也是二叉搜索树）：

堆：为排序而设计。查找需要遍历，性能差。

二叉搜索树：为查找而设计。

Java 集合框架类图

黄色为接口，绿色为抽象类，蓝色为具体类。虚线箭头表示实现关系，实线箭头表示继承关系。

Collection

Map

ArrayList

ArrayList内部使用数组实现。

ArrayList<String> list = new ArrayList<>();

// 创建并指定初始容量
int capacity = 10;
ArrayList<String> list = new ArrayList<>(capacity);

// 从其他Collection创建
Collection<String> collection;
ArrayList<String> list = new ArrayList<>(collection);

// 添加item
list.add("item");
// 添加到指定位置
list.add(1, "item");

LinkedList：List、Stack、Deque

LinkedList内部使用链表实现，实现了List、Stack、Deque接口。

addFirst()
addLast() = add() = offer()
getFirst() = element()，返回元素或抛异常；peek()，返回元素或null。
removeFirst() = remove()，删除元素或抛异常；poll()，删除元素或返回null。
removeLast()

LinkedList<String> list;

// 添加到开头/末尾
list.add("last");
list.addFirst("first");
list.addLast("last");

// 从开头/末尾移除一个元素
String first = list.removeFirst();
String last = list.removeLast();

List转数组

List<Integer> list;
Integer[] array = list.toArray(new Integer[0]);

// 这里的int[]可以看成一种类型
List<int[]> list;
int[][] array1 = list.toArray(new int[0][]);
int[][] array2 = list.toArray(new int[0][0]);

Arrays

Arrays.asList

工具方法 Arrays.asList() 生成的是一个不可修改的List，其类型为 Arrays.ArrayList。

1 2	List list = Arrays.asList(1,2,3); //- list.add(4); // Unsupported Operation Exception

Arrays.sort与Comparator.compare

1
2
3

int[] array = new int[]{3,1,2};
// 默认为自然顺序，即升序排列，从小到大
Arrays.sort(array); // [1,2,3]

// 使用Comparator，升序
Integer[] array;
Arrays.sort(array, new Comparator<Integer>() {
    public int compare(Integer a, Integer b) {
        // 当a和b一个很大一个很小时，直接相减，整型可能会溢出，导致错误结果
        // return a - b;

        // 可使用Integer.compare，等价于 (a<b) ? -1 : ((a==b) ? 0 : 1)
        return Integer.compare(a, b);
    }
});

LinkedHashMap实现LRU

import java.util.LinkedHashMap;
import java.util.Map;

/**
 * 简单用LinkedHashMap来实现的LRU算法的缓存
 */
public class LRUCache<K, V> extends LinkedHashMap<K, V> {

    private int cacheSize;

    public LRUCache(int cacheSize) {
        super(16, (float) 0.75, true);
        this.cacheSize = cacheSize;
    }

    // 是否移除最古老的元素，主要是put时
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > cacheSize;
    }
}

Collections.synchronizedXxx

生成Collection的同步对象。

// collectiontopics/Synchronization.java
// Using the Collections.synchronized methods
import java.util.*;

public class Synchronization {
  public static void main(String[] args) {
    Collection<String> c = Collections.synchronizedCollection(new ArrayList<>());
    List<String> list = Collections.synchronizedList(new ArrayList<>());
    Set<String> s = Collections.synchronizedSet(new HashSet<>());
    Set<String> ss = Collections.synchronizedSortedSet(new TreeSet<>());
    Map<String,String> m = Collections.synchronizedMap(new HashMap<>());
    Map<String,String> sm = Collections.synchronizedSortedMap(new TreeMap<>());
  }
}

Fail Fast / ConcurrentModificationException

在使用Iterator读取过程中，不能修改Collection，否则会抛出ConcurrentModificationException异常。可以使用ConcurrentHashMap、CopyOnWriteArrayList、CopyOnWriteArraySet 避免异常。

// collectiontopics/FailFast.java
// Demonstrates the "fail-fast" behavior
import java.util.*;

public class FailFast {
  public static void main(String[] args) {
    Collection<String> c = new ArrayList<>();
    Iterator<String> it = c.iterator();
    c.add("An object");
    try {
      String s = it.next();
    } catch(ConcurrentModificationException e) {
      System.out.println(e);
    }
  }
}
/* Output:
java.util.ConcurrentModificationException
*/

equals和hashcode方法

1、使用自己的类作为HashMap的键，必须同时重载 hashCode() 和 equals()。

2、Map原理：Key ==> 散列函数 ==> key.hashCode() ==> Bucket数组 => Bucket (桶，List实现) ==> key.equals() ==> 找到Value。

3、hashCode() 设计：

速度快
分布均匀
对象未改变的情况下，多次调用返回值一致
可以重新创建新的Key，使其和之前Key的hashCode相同（不然Key对象被回收了之后就没法取出对应的Value了）

4、多个Object联合计算HashCode，可以使用Objects工具方法

@Override
public int hashCode() {
return Objects.hash(name, id);
}

public final class Objects {
    public static int hash(Object... values) {
        return Arrays.hashCode(values);
    }
}
public class Arrays {
    public static int hashCode(Object a[]) {
        if (a == null)
            return 0;

        int result = 1;

        for (Object element : a)
            result = 31 * result + (element == null ? 0 : element.hashCode());

        return result;
    }
}

HashMap原理与调优

HashMap内部有一个Node数组 Node<K,V>[] table，也称为桶，每个Node又是一个链表节点。

table [
    Node -> Node,
    (null),
    Node,
    Node -> Node -> Node,
    Node
]

参数

Capacity：容量，即桶的数量
Initial Capacity：初始容量
Size：尺寸
Load Factor：负载因子 = Size / Capacity

实现

HashMap的容量为2的次方（这样计算桶的Index时，可以用位运算代替除法）。
当负载因子超过0.75时，执行rehashing操作，将容量扩展为2倍。
轻负载，插入和查询性能好（但会减慢迭代器遍历，因为table太长）。

调优方法：

如果提前知道条目个数，直接创建容量合适的 HashMap。

参考资料： https://lingcoder.github.io/OnJava8/