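Problem
We recently hit an odd problem in production: a load test at only 20 QPS produced a large number of latency spikes. Our initial guess was that young GC (YGC) pauses were causing them.
The production environment is a Docker container with 4 cores and 8 GB of memory, running OpenJDK 8u.
Investigation
So I logged in to the production machine and checked the GC log, which showed GC Workers: 63. The load-test server has only 4 cores, so under normal conditions there should not be 63 GC threads.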
[GC pause (G1 Evacuation Pause) (young), 0.0054131 secs]
   [Parallel Time: 3.6 ms, GC Workers: 63]
      [GC Worker Start (ms): Min: 1315.3, Avg: 1315.4, Max: 1315.4, Diff: 0.1]
      [Ext Root Scanning (ms): Min: 0.3, Avg: 0.5, Max: 0.9, Diff: 0.6, Sum: 1.9]
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.6, Diff: 0.6, Sum: 0.9]
      [Object Copy (ms): Min: 2.5, Avg: 2.7, Max: 3.0, Diff: 0.5, Sum: 11.0]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Termination Attempts: Min: 1, Avg: 2.2, Max: 4, Diff: 3, Sum: 9]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
      [GC Worker Total (ms): Min: 3.4, Avg: 3.5, Max: 3.6, Diff: 0.1, Sum: 14.0]
      [GC Worker End (ms): Min: 1318.9, Avg: 1318.9, Max: 1318.9, Diff: 0.1]
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.1 ms]
   [Other: 1.6 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.9 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.1 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 204.0M(204.0M)->0.0B(200.0M) Survivors: 0.0B->4096.0K Heap: 204.0M(4096.0M)->3728.6K(4096.0M)]
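When there are many log files to check, the worker count can be pulled out of a "Parallel Time" line mechanically. The following is only a rough sketch: the class and method names are made up for illustration, and it assumes the G1 log line format shown in the log above.

```java
// Hypothetical helper, not part of the JDK: extracts the "GC Workers" value
// from a single G1 log line such as the "Parallel Time" line above.
final class GcWorkerCount {
    private static final java.util.regex.Pattern GC_WORKERS =
            java.util.regex.Pattern.compile("GC Workers: (\\d+)");

    // Returns the worker count found in the line, or -1 if the line has none.
    static int parse(String line) {
        java.util.regex.Matcher m = GC_WORKERS.matcher(line);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String line = "[Parallel Time: 3.6 ms, GC Workers: 63]";
        System.out.println(parse(line));
    }
}
```

Comparing the parsed value with the container's CPU limit makes the mismatch easy to spot.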
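Running jstack to dump the thread stacks also showed dozens of C1 and C2 compiler threads.
By default, ParallelGCThreads is derived from the number of CPUs N that the JVM detects. For N greater than 8 the formula is:
ParallelGCThreads = 8 + ((N - 8) * 5/8)
Substituting the thread count 63 into this formula gives N = 96, which happens to be exactly the number of cores on the host machine.
The conclusion is that the JVM is detecting the wrong number of available cores: it sees the host's core count rather than the number of cores available to the container.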
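The arithmetic can be double-checked with a few lines of Java. This is a minimal sketch rather than HotSpot's actual code: the class and method names are hypothetical, and it assumes integer division and that for N of 8 or fewer the default is simply N.

```java
// Hypothetical helper that reproduces the ParallelGCThreads formula above.
final class GcThreadEstimate {
    // Default worker count for a given CPU count: N itself when N <= 8
    // (assumption stated in the lead-in), otherwise 8 + (N - 8) * 5 / 8
    // with integer arithmetic.
    static int parallelGcThreads(int cpus) {
        if (cpus <= 8) {
            return cpus;
        }
        return 8 + (cpus - 8) * 5 / 8;
    }

    // Smallest CPU count up to maxCpus that yields the observed worker
    // count, or -1 if none does.
    static int cpusForWorkers(int workers, int maxCpus) {
        for (int n = 1; n <= maxCpus; n++) {
            if (parallelGcThreads(n) == workers) {
                return n;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(parallelGcThreads(4));    // container: 4 cores -> 4
        System.out.println(parallelGcThreads(96));   // host: 96 cores -> 63
        System.out.println(cpusForWorkers(63, 256)); // observed 63 workers -> 96
    }
}
```

With 4 cores the JVM would be expected to start 4 GC workers, so 63 workers only makes sense if it saw the host's 96 cores.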
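Source analysis of availableProcessors()
The availableProcessors method lives in the java.lang.Runtime class and is a native method, so we have to follow it into the HotSpot code.
// Runtime.java
// native method
// returns the number of cores available to the Java process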
public native int availableProcessors();
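A quick way to see what the JVM believes is to print this value from inside the container. The class below is an illustrative sketch (the class name is made up); it uses only the standard Runtime API.

```java
// Hypothetical check class: prints the CPU count the JVM reports.
final class CpuCheck {
    // Returns the value the JVM uses as its available processor count.
    static int detectedCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + detectedCpus());
    }
}
```

If this prints the host's core count rather than the container's limit, thread pools and GC worker counts sized from it will be inflated in the same way.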
Code before JDK 8u191
// os_linux.cpp
int os::active_processor_count() {
// Linux doesn't yet have a (official) notion of processor sets,
// so just return the number of online processors.
int online_cpus = ::sysconf(_SC_NPROCESSORS_ONLN);
assert(online_cpus > 0 && online_cpus <= processor_count(),





