Patching Frida JDK17 Support

TL;DR

In this post, I document my journey patching the Frida Java Bridge to fully support hooking JDK 17 (HotSpot) methods. While the GitHub issue #333 details the resultant solution, I wanted to document the process I underwent to find the solution, as well as the lessons learned along the way. Hopefully this will give the next hacker that undertakes porting Frida to newer versions of Java a head start.

Introduction

Back in late 2024, I set out to explore leveraging Frida's Java integration to identify security issues at runtime, kind of like an Interactive Application Security Testing (IAST) tool. The goal initially was to hook dangerous methods like those used in deserialization, log the inputs and inspect them for potential vulnerabilities. Frida had indicated "basic support for JDK 17" back in 2022, and at the time that was the minimum version I needed for the Java application I was targeting. Unfortunately, I found out very quickly that hooking any method in JDK 17 led to the program crashing whenever that hooked method was called, as I documented in GitHub issue #333.

Initially I thought if I documented the problem well enough and provided sample code, someone from the community would swoop in and save the day... After a couple of months of waiting, I realized the cavalry was NOT coming and that I could be the hero in this story. So, I got to work.

Getting Started

I started by creating some bare-bones sample code to work with. The Java code below calls a method named someTest(), and the objective is to change the return value of someTest() to true using Frida.

Frida Java Demo - main/src/main/java/com/frida/App.java

package com.frida;
import java.util.concurrent.TimeUnit;
public class App 
{
    public static void main( String[] args ) {
        System.out.println("[*] Checking status of someTest()...");
        try {
            while(true) {
                boolean result = someTest();
                if(result) { System.out.println("[*] Congrats, result has been changed!"); }
                TimeUnit.SECONDS.sleep(10);
            }
        } catch (InterruptedException e) { }
    }
    public static boolean someTest() { return false; }
}

The Frida instrumentation code I wrote to change the return value is below:

Frida Java Demo - instrumentation-script.js

Java.perform(function () {
    console.log("Is Java Available: " + Java.available)
    var MyClass = Java.use('com.frida.App');
    // Hook the "someTest" function
    MyClass.someTest.implementation = function () {
        console.log('someTest() function called');
        return true;
    };
});

(The full source code of the Java demo application and instrumentation script, along with build instructions, is located on my GitHub page here.)

First Big Break

Whenever the Frida-instrumented JDK17 application crashed, it produced the error message "Unable to make thread_from_jni_environment() helper for the current architecture" in the Frida console.

Frida CLI error when hooking JDK17 method

Is Java Available: true
Error: Unable to make thread_from_jni_environment() helper for the current architecture
    at <anonymous> (frida/node_modules/frida-java-bridge/lib/jvm.js:232)
    at <anonymous> (frida/node_modules/frida-java-bridge/lib/jvm.js:276)
    at <anonymous> (frida/node_modules/frida-java-bridge/lib/vm.js:12)
    at j (frida/node_modules/frida-java-bridge/lib/jvm.js:291)

I started by looking for that error message in the Frida source code. I cloned down the main Frida repo, and following the build instructions, executed make. The main Frida repo then cloned down all of the other supporting Frida repos (git submodules) and compiled. I used ripgrep (although normal grep would have worked) - with the -a flag to include binary files - to search through the main Frida code base, looking for that error message.

I found two instances of that error:

In the frida/build/subprojects/frida-gum/bindings/gumjs/runtime.bundle.p/node_modules/frida-java-bridge/lib/jvm.js JavaScript file
In the build/subprojects/frida-python/frida/_frida/_frida.abi3.so dynamic library

Inspecting the jvm.js file, the error message was undoubtedly coming from the makeThreadFromJniHelper function.

Frida Java Bridge - jvm.js

function makeThreadFromJniHelper (api) {
  let offset = null;

  const tryParse = threadOffsetParsers[Process.arch];
  if (tryParse !== undefined) {
    const vm = new VM(api);
    const findClassImpl = vm.perform(env => env.handle.readPointer().add(6 * pointerSize).readPointer());
    offset = parseInstructionsAt(findClassImpl, tryParse, { limit: 10 });
  }

  if (offset === null) {
    return () => {
      throw new Error('Unable to make thread_from_jni_environment() helper for the current architecture');
    };
  }
  ...

After some time spent analyzing the makeThreadFromJniHelper() function and its call graph, I figured out it was looking for a lea (Load Effective Address) assembly instruction within the first 10 instructions (the limit value) of the FindClass() function in the libjvm.so library; the base address of FindClass() is held in the findClassImpl variable in the Frida code. FindClass() is a JNI function that is responsible for finding and loading the Java class definition for a given class name. If no lea is found within those 10 instructions, Frida produces that error message.
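
For readers wondering where the 6 in add(6 * pointerSize) comes from: a JNIEnv pointer points at a table of function pointers, and in the JNI specification's table FindClass sits in slot 6, after four reserved slots, GetVersion and DefineClass. The sketch below models just those first slots in plain JavaScript (it is not a Frida script). The helper name slotOffset is mine, and the 8-byte pointer size assumes a 64-bit process.

```javascript
// First entries of the JNI function table, in specification order.
// Slots 0-3 are reserved (NULL) entries.
const jniTableHead = [
  'reserved0', 'reserved1', 'reserved2', 'reserved3',
  'GetVersion', 'DefineClass', 'FindClass'
];

// Hypothetical helper: byte offset of a named slot within the table.
function slotOffset(name, pointerSize) {
  const index = jniTableHead.indexOf(name);
  if (index === -1) throw new Error('unknown JNI function: ' + name);
  return index * pointerSize;
}

const pointerSize = 8; // assumption: 64-bit target such as x86-64
console.log('FindClass index:', jniTableHead.indexOf('FindClass'));
console.log('FindClass offset: 0x' + slotOffset('FindClass', pointerSize).toString(16));
```

So on a 64-bit target the expression reads the pointer stored 0x30 bytes into the table, which is the address of the FindClass() implementation.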

At this point, I had a couple of theories as to what was going wrong:

The FindClass() function pointer wasn't correct for JDK17's libjvm.so library
The lea assembly instruction is more than 10 instructions away from the FindClass() base address

To test these theories, I needed a way to inspect the Frida internal state during runtime. That meant modifying Frida so it would echo out the variables in jvm.js that contained the data I was interested in.

Making Changes to Frida

The jvm.js file I discovered earlier containing the crashing makeThreadFromJniHelper() function comes from the frida-java-bridge Git submodule. When the main Frida repo is built with make, it downloads all of the supporting Git Submodules and produces a _frida.abi3.so dynamic library. The _frida.abi3.so is leveraged by the Python Frida command, and is stored in a location like $HOME/.local/lib/python3.11/site-packages/frida/_frida.abi3.so on Linux. So by modifying the jvm.js file (located at frida/build/subprojects/frida-gum/bindings/gumjs/runtime.bundle.p/node_modules/frida-java-bridge/lib/jvm.js), running a make clean; make on the main Frida repo, I could produce a new _frida.abi3.so library and copy that to Python's site-packages folder to activate my changes!

Modifying the Frida Java Bridge

# Add some console.log() statements to echo out the internal state
$ vi frida/build/subprojects/frida-gum/bindings/gumjs/runtime.bundle.p/node_modules/frida-java-bridge/lib/jvm.js

# Build the main Frida repo, with our changes in the frida-java-bridge
$ cd frida ; make clean; make

# Activate the new _frida.abi3.so containing our changes by copying it to the Python site-packages folder
$ cp build/subprojects/frida-python/frida/_frida/_frida.abi3.so $HOME/.local/lib/python3.11/site-packages/frida/_frida.abi3.so

# Use the modified Frida as normal
$ frida -p <pid> -l <instrumentation-script>

Peeking Under the Hood

Once I figured out how to modify the frida-java-bridge, I started adding some console.log() statements to examine its internal state.

I started by checking whether the pointer to the base address of FindClass() was correct by adding a log statement to the makeThreadFromJniHelper() function:

Frida Java Bridge - jvm.js

function makeThreadFromJniHelper (api) {
  let offset = null;

  const tryParse = threadOffsetParsers[Process.arch];
  if (tryParse !== undefined) {
    const vm = new VM(api);
    const findClassImpl = vm.perform(env => env.handle.readPointer().add(6 * pointerSize).readPointer());
    offset = parseInstructionsAt(findClassImpl, tryParse, { limit: 10 });
    // Log the base address of the FindClass function
    console.log("findClassImpl: " + findClassImpl);
  }

After recompiling the Frida repo and activating the new _frida.abi3.so, I instrumented the Java Demo application once again, and was presented with the FindClass() base pointer:

findClassImpl: 0x7fc19d423f20

Great, now I could verify the pointer Frida gave me with GDB.

PwnDBG Optional

NOTE: I used the awesome GDB plug-in pwndbg to colorize the output and provide some nice quality-of-life enhancements. While not strictly necessary, I'd highly recommend!

$ gdb /opt/adoptium/jdk-17.0.12+7/lib/server/libjvm.so
...
pwndbg> disassemble 0x00007fc19d423f20
Dump of assembler code for function jni_FindClass:
   0x00007fc19d423f20 <+0>:     endbr64                 # 1
   0x00007fc19d423f24 <+4>:     push   rbp              # 2
   0x00007fc19d423f25 <+5>:     mov    rbp,rsp          # 3
   0x00007fc19d423f28 <+8>:     push   r15              # 4
   0x00007fc19d423f2a <+10>:    push   r14              # 5
   0x00007fc19d423f2c <+12>:    push   r13              # 6
   0x00007fc19d423f2e <+14>:    mov    r13,rdi          # 7
   0x00007fc19d423f31 <+17>:    push   r12              # 8
   0x00007fc19d423f33 <+19>:    mov    r12,rsi          # 9
   0x00007fc19d423f36 <+22>:    push   rbx              # 10
   0x00007fc19d423f37 <+23>:    lea    rbx,[rdi-0x2b0]  # 11

By using the disassemble command with the Frida-provided pointer in GDB, I was able to confirm the pointer is indeed accurate, as evidenced by the "Dump of assembler code for function jni_FindClass" output.

However, if you recall, Frida is looking for a lea instruction within the first 10 instructions of FindClass(). As can be seen from the GDB output above, the first lea is the 11th instruction in JDK17!

By increasing the limit to 11, the "Unable to make thread_from_jni_environment() helper for the current architecture" error wasn't being thrown anymore!
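
To make the off-by-one concrete, the sketch below replays the GDB listing above through a simplified scanner in plain JavaScript. It is a model of the idea, not Frida's actual parseInstructionsAt(): the instruction list is typed in by hand, and findFirstLea and its return shape are names I made up for illustration.

```javascript
// Mnemonics and operands copied from the jni_FindClass listing above.
const findClassPrologue = [
  { mnemonic: 'endbr64', opStr: '' },
  { mnemonic: 'push', opStr: 'rbp' },
  { mnemonic: 'mov', opStr: 'rbp,rsp' },
  { mnemonic: 'push', opStr: 'r15' },
  { mnemonic: 'push', opStr: 'r14' },
  { mnemonic: 'push', opStr: 'r13' },
  { mnemonic: 'mov', opStr: 'r13,rdi' },
  { mnemonic: 'push', opStr: 'r12' },
  { mnemonic: 'mov', opStr: 'r12,rsi' },
  { mnemonic: 'push', opStr: 'rbx' },
  { mnemonic: 'lea', opStr: 'rbx,[rdi-0x2b0]' }
];

// Hypothetical stand-in for the scan: look at no more than `limit`
// instructions and return the first lea found, or null if there is none.
function findFirstLea(instructions, limit) {
  const scanned = instructions.slice(0, limit);
  for (let i = 0; i < scanned.length; i++) {
    if (scanned[i].mnemonic === 'lea') {
      return { position: i + 1, opStr: scanned[i].opStr };
    }
  }
  return null;
}

console.log(findFirstLea(findClassPrologue, 10)); // null
console.log(findFirstLea(findClassPrologue, 11)); // { position: 11, opStr: 'rbx,[rdi-0x2b0]' }
```

With a limit of 10 the scan stops one instruction short of the lea, which is the situation that triggers the error.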

Frida Java Bridge - jvm.js

function makeThreadFromJniHelper (api) {
    ...
    // 10 -> 11
    offset = parseInstructionsAt(findClassImpl, tryParse, { limit: 11 });   
    ...
}

However, the demo application was still crashing whenever the Frida-modified someTest() function was called. To make things even more difficult, there was no error message this time around to help pinpoint where the crash was occurring. So much for a quick fix!

What Is Frida Doing?

By this time, I needed to better understand how the JVM manages Java objects internally, as well as how Frida was modifying Java methods in order to troubleshoot the crash.

It turns out the JVM stores the metadata of Java objects - like classes and methods - in a memory region called the "Metaspace". When a Java class or method is created, a data structure representing it is written to the Metaspace. These structures are defined in the JDK source code; for instance, the "Method" structure is defined in method.hpp.

When Frida hooks a Java method, it actually modifies those data structures in the Metaspace. By replacing key pointers in the InstanceKlass, ConstMethod and Method structures, Frida is able to force the JVM to execute Frida's JavaScript bridge - which then runs the user-provided instrumentation script - instead of the original targeted method.

Frida uses the Java Native Interface (JNI) to find the pointers to these structures in the Metaspace and then uses pre-calculated offsets to read/write to the various fields in those structures. How does Frida know what parts of the Metaspace to modify in order to hook a Java method? Simple(-ish)! It borrows some of the logic from the JVMTI RedefineClasses() function. As the official documentation states:

"This function is used to replace the definition of a class with a new definition, as might be needed in fix-and-continue debugging. Redefinition can cause new versions of methods to be installed."

Comparing the Frida Java Bridge code to the VM_RedefineClasses::doit() function gives a sense of how alike the two code bases are. In the example below, both Frida's and the JDK's code call the flush_dependent_code(), ClassLoaderDataGraph::classes_do() and ResolvedMethodTable::adjust_method_entries() functions.

Frida Java Bridge - installJvmMethod()

  function installJvmMethod (method, methodId, thread) {
  ...
  const mark = api['VM_RedefineClasses::mark_dependent_code'];
  const flush = api['VM_RedefineClasses::flush_dependent_code'];
  if (mark !== undefined) {
    mark(NULL, method.instanceKlass);
    flush();
  } else {
    flush(NULL, method.instanceKlass, thread);
  }

  const traceNamePrinted = Memory.alloc(1);
  traceNamePrinted.writeU8(1);
  api['ConstantPoolCache::adjust_method_entries'](method.cache, method.instanceKlass, traceNamePrinted);
  ...
  api['ClassLoaderDataGraph::classes_do'](klassClosure);
  ...
  }

JDK17 - src/hotspot/share/prims/jvmtiRedefineClasses.cpp

VM_RedefineClasses::doit() {
  ...
  flush_dependent_code();

  // Adjust constantpool caches and vtables for all classes
  // that reference methods of the evolved classes.
  // Have to do this after all classes are redefined and all methods that
  // are redefined are marked as old.
  AdjustAndCleanMetadata adjust_and_clean_metadata(current);
  ClassLoaderDataGraph::classes_do(&adjust_and_clean_metadata);

  // JSR-292 support
  if (_any_class_has_resolved_methods) {
    bool trace_name_printed = false;
    ResolvedMethodTable::adjust_method_entries(&trace_name_printed);
  }
  ...
}

My working assumption at this point was that something had changed in the RedefineClasses() function, or in its downstream call graph, between JDK16 and JDK17. But what?

Examining Memory Structures

Given what I knew about how the frida-java-bridge worked, I decided to compare the Method structure in the Metaspace before and after Frida modified it. First, I needed a pointer in the Metaspace to the original (aka old) and new (aka frida-modified) Methods. By adding an extra console.log() statement in frida-java-bridge's jvm.js file - specifically in the doManglers() function - we can extract that information.

Frida Java Bridge - jvm.js

function doManglers (vm) {
  ...
  vm.perform(env => {
    const api = getApi();
    const thread = api['JavaThread::thread_from_jni_environment'](env.handle);
    let force = false;

    withJvmThread(() => {
      localReplaceManglers.forEach(mangler => {
        const { method, originalMethod, impl, methodId, newMethod } = mangler;
        if (originalMethod === null) {
          mangler.originalMethod = fetchJvmMethod(method);
          mangler.newMethod = nativeJvmMethod(method, impl, thread);
          // This will print the pointers to the original (old) and new (frida-modified) method
          console.log("mangler: " + JSON.stringify(mangler));   
  ...

The output looked something like this (it has been truncated from its original 107 line length):

{
    "methodId": "0x7f3308174a18",
    "method": "0x7f32d44004a8",
    "newMethod": {
        "method": "0x7f323befe338",         // frida-modified-method pointer
        "methodSize": 104,
        "oldMethod": {
            "method": "0x7f32d44004a8",     // Original method pointer
            "methodSize": 104,
        }
    },
}

Now, we can use GDB's cast feature to inspect the structure of the target method (someTest() in this example) before-and-after applying Frida's hooks. A Java core dump file from the crashed sample app contains these data structures, so generate one of those before proceeding.

DWARF Symbols Required

At this point, I switched to using the JDK17 build from Debian's (technically Kali's) repositories (openjdk-17-dbg), as it contained the DWARF symbols required for casting. I had been using Adoptium's JDKs for the analysis above as they have the debugging symbols Frida needs, but without the DWARF symbols the GDB casting failed.

$ gdb /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so ./core.116453

pwndbg> set $newMethod = (Method*) 0x7f323befe338
pwndbg> p *$newMethod
$3 = {
  <Metadata> = {
    <MetaspaceObj> = {<No data fields>},
    members of Metadata:
    _vptr.Metadata = 0x7f330e51f6c8
  },
  members of Method:
  _constMethod = 0x7f323befe300,
  _method_data = 0x0,
  _method_counters = 0x0,
  _adapter = 0x0,
  _access_flags = {
    _flags = 234881289
  },
  _vtable_index = -2,
  _intrinsic_id = 0,
  _flags = 0,
  _trace_flags = {
    _flags = 0
  },
  _i2i_entry = 0x0,
  _from_compiled_entry = 0x0,
  _code = 0x0,
  _from_interpreted_entry = 0x0
}

pwndbg> set $oldMethod = (Method*) 0x7f32d44004a8
pwndbg> p *$oldMethod
$4 = {
  <Metadata> = {
    <MetaspaceObj> = {<No data fields>},
    members of Metadata:
    _vptr.Metadata = 0x7f330e51f6c8
  },
  members of Method:
  _constMethod = 0x7f32d4400470,
  _method_data = 0x0,
  _method_counters = 0x7f32d4400630,
  _adapter = 0x7f3308104b20,
  _access_flags = {
    _flags = 196617
  },
  _vtable_index = -2,
  _intrinsic_id = 0,
  _flags = 0,
  _trace_flags = {
    _flags = 0
  },
  _i2i_entry = 0x7f32f8a92540 "H\213S\b\017\267J,\017\267R*+с\372\365\001",
  _from_compiled_entry = 0x7f32f8ad6255 "D\213S(A\367\302\b",
  _code = 0x0,
  _from_interpreted_entry = 0x7f32f8a92540 "H\213S\b\017\267J,\017\267R*+с\372\365\001"
}

As seen above, a number of fields are blank in the frida-modified-method (new) data structure that are present in the original method (old) data structure. I asked ChatGPT for information on these missing fields, and here's what it told me:

_adapter - Pointer to the adapter code blob used to bridge between the caller's calling convention (e.g., from compiled or native) and the expected calling convention of the Java method
_from_interpreted_entry - JVM entry when called from interpreted code
_from_compiled_entry - Entry point for compiled code calling this method
_i2i_entry - Entry point for interpreted-to-interpreted calls
_method_counters - Pointer to method profiling info (e.g., invocation counts)
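
The same comparison can be done mechanically. The sketch below is plain JavaScript (not a Frida script) with the pointer fields from the two GDB dumps above copied in by hand as strings; nullInNewOnly is a hypothetical helper name. It lists the fields that are set in the original Method but null in the Frida-created one.

```javascript
// Pointer-valued fields copied from the two `p *$...Method` dumps above.
const oldMethod = {
  _constMethod: '0x7f32d4400470',
  _method_data: '0x0',
  _method_counters: '0x7f32d4400630',
  _adapter: '0x7f3308104b20',
  _i2i_entry: '0x7f32f8a92540',
  _from_compiled_entry: '0x7f32f8ad6255',
  _code: '0x0',
  _from_interpreted_entry: '0x7f32f8a92540'
};
const newMethod = {
  _constMethod: '0x7f323befe300',
  _method_data: '0x0',
  _method_counters: '0x0',
  _adapter: '0x0',
  _i2i_entry: '0x0',
  _from_compiled_entry: '0x0',
  _code: '0x0',
  _from_interpreted_entry: '0x0'
};

// Hypothetical helper: names of fields that are non-null in the old
// structure but null in the new one.
function nullInNewOnly(oldFields, newFields) {
  return Object.keys(oldFields).filter(name =>
    BigInt(oldFields[name]) !== 0n && BigInt(newFields[name]) === 0n);
}

console.log(nullInNewOnly(oldMethod, newMethod));
```

The result is exactly the five fields described above, which is what narrowed the search to code that sets up a method's entry points.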

Based on this new information, I searched the JDK16 and JDK17 code bases for places where these fields were set in JDK16 but not in JDK17. I concentrated my search on the code paths involved in a typical RedefineClasses() execution. Specifically, I focused on where _i2i_entry and _from_interpreted_entry were set, as the stack trace below indicated that the crash happened after calling jni_invoke_static(), and these fields are important for that function call.

--------------- T H R E A D ---------------

Current thread (0x00007f3308018790): JavaThread "main" [_thread_in_Java, id=116454, stack(0x00007f330cd5b000,0x00007f330ce5b000)]

Stack: [0x00007f330cd5b000,0x00007f330ce5b000], sp=0x00007f330ce59988, free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
j com.frida.App.main([Ljava/lang/String;)V+8
v ~StubRoutines::call_stub
V [libjvm.so+0x8593f2] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x302
V [libjvm.so+0x8f7a78] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, JavaThread*) [clone .constprop.1]+0x3a8
V [libjvm.so+0x909174] jni_CallStaticVoidMethod+0x254
C [libjli.so+0x47e3] JavaMain+0xdd3
C [libjli.so+0x813d] ThreadJavaMain+0xd

It didn't take long (relatively speaking) to find a potential culprit. The frida-java-bridge calls the Method::restore_unshareable_info() function from libjvm.so.

Frida Java Bridge - jvm.js

function nativeJvmMethod (method, impl, thread) {
  const api = getApi();
  ...
  api['Method::restore_unshareable_info'](newMethod.method, thread);

In JDK16, Method::restore_unshareable_info() called Method::link_method(), which is responsible for setting important fields in the Frida-modified method, like _i2i_entry and _from_interpreted_entry. JDK17's version no longer makes that call.

JDK 16 - src/hotspot/share/oops/method.cpp

void Method::restore_unshareable_info(TRAPS) {
  assert(is_method() && is_valid_method(this), "ensure C++ vtable is restored");

  // Since restore_unshareable_info can be called more than once for a method, don't
  // redo any work.
  if (adapter() == NULL) {
    methodHandle mh(THREAD, this);
    link_method(mh, CHECK);           // <-- link_method() call present in JDK16. it sets
  }                                   // the _i2i_entry and _from_interpreted_entry fields
}

JDK 17 - src/hotspot/share/oops/method.cpp

void Method::restore_unshareable_info(TRAPS) {
  assert(is_method() && is_valid_method(this), "ensure C++ vtable is restored");
}

So if we can manually call the link_method() function in libjvm.so, would that correctly configure the missing fields? Spoiler alert: it did! This turned out to be the last issue to get JDK17 support working. Now the easy part (again relatively speaking): implementing the fix.
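
Here is a toy model of that difference in plain JavaScript. Nothing in it touches a real JVM: the objects, the function names and the 'linked' placeholder values are all invented for illustration, and only the two JDK versions compared in this post are modelled. It simply mirrors the control flow of the two C++ snippets above, plus the extra link step.

```javascript
// A freshly created Method copy, as seen in the GDB dump: entry points unset.
function makeUnlinkedMethod() {
  return { adapter: null, i2iEntry: null, fromInterpretedEntry: null };
}

// Stand-in for Method::link_method(): fills in the entry points.
function linkMethod(method) {
  method.adapter = 'linked';
  method.i2iEntry = 'linked';
  method.fromInterpretedEntry = 'linked';
}

// Mirrors the two versions of Method::restore_unshareable_info() shown above.
function restoreUnshareableInfo(method, jdkVersion) {
  if (jdkVersion === 16 && method.adapter === null) {
    linkMethod(method); // JDK16: links the method as a side effect
  }
  // JDK17: the body only contains an assert, so nothing gets linked
}

function prepare(jdkVersion, applyFix) {
  const method = makeUnlinkedMethod();
  restoreUnshareableInfo(method, jdkVersion);
  if (applyFix && jdkVersion >= 17) {
    linkMethod(method); // the extra, explicit call
  }
  return method;
}

console.log('JDK16:', prepare(16, false).i2iEntry);
console.log('JDK17, unpatched:', prepare(17, false).i2iEntry);
console.log('JDK17, patched:', prepare(17, true).i2iEntry);
```

In the model, only the unpatched JDK17 path leaves the entry points null, matching what the GDB dump of the Frida-created Method showed.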

Patching frida-java-bridge

The pull request located here contains the complete fix, which is only 9 lines long, excluding the comments! There are two important pieces to the patch:

Registering the link_method() function
Calling the link_method() function

Registering the Function

The frida-java-bridge works in part by calling various functions from libjvm.so to manipulate the data structures in the Metaspace. In the jvm.js file, there's a functions object that lists all of the functions Frida will attempt to resolve so they can be called later on.

Frida Java Bridge - jvm.js

functions: {
...
          _ZN6Method24restore_unshareable_infoEP6Thread: ['Method::restore_unshareable_info', 'void', ['pointer', 'pointer']],
          // This is the mangled version of link_method() that I added
          _ZN6Method11link_methodERK12methodHandleP10JavaThread: ['Method::link_method', 'void', ['pointer', 'pointer', 'pointer']],
          _ZN6Method10jmethod_idEv: ['Method::jmethod_id', 'pointer', ['pointer']],

Those weird-looking entries above, like _ZN6Method11link_methodERK12methodHandleP10JavaThread, are mangled C++ symbols from libjvm.so. To look up a function's mangled name, like that of link_method(), I used the nm command to dump and search the symbol table. Run nm both with and without the -C flag: the demangled output shows which row you want, and the same row in the raw output gives you the mangled name.

$ nm -C /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so | grep 'link_method'
0000000000834130 t InstanceKlass::link_methods(JavaThread*)
0000000000ebd500 t SystemDictionary::link_method_handle_constant(Klass*, int, Klass*, Symbol*, Symbol*, JavaThread*)
0000000000c0a460 t Method::link_method(methodHandle const&, JavaThread*)  # Third from the top
0000000000c0a410 t Method::unlink_method()

$ nm /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so | grep 'link_method'
0000000000834130 t _ZN13InstanceKlass12link_methodsEP10JavaThread
0000000000ebd500 t _ZN16SystemDictionary27link_method_handle_constantEP5KlassiS1_P6SymbolS3_P10JavaThread
0000000000c0a460 t _ZN6Method11link_methodERK12methodHandleP10JavaThread   # Also third from the top
0000000000c0a410 t _ZN6Method13unlink_methodEv

Once you have the mangled name, you can add that symbol along with the parameter list of the function to the functions object in jvm.js.
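
The mangled names are less cryptic than they look. In the Itanium C++ ABI scheme used by GCC and Clang on Linux, a nested name starts with _ZN, continues with length-prefixed components, and ends with E before the parameter encoding. The sketch below is a deliberately tiny plain-JavaScript decoder for just that prefix (nestedName is my own helper name). It ignores qualifiers, templates and parameters, so stick with nm -C or c++filt for anything real.

```javascript
// Decode only the "_ZN <len><name> <len><name> ... E" prefix of a mangled symbol.
function nestedName(mangled) {
  if (!mangled.startsWith('_ZN')) {
    throw new Error('not a nested Itanium-mangled name: ' + mangled);
  }
  const parts = [];
  let pos = 3;
  while (pos < mangled.length && mangled[pos] !== 'E') {
    const match = /^\d+/.exec(mangled.slice(pos));
    if (match === null) {
      throw new Error('unsupported construct at offset ' + pos);
    }
    const length = parseInt(match[0], 10);
    pos += match[0].length;                 // skip the length digits
    parts.push(mangled.slice(pos, pos + length));
    pos += length;                          // skip the component itself
  }
  return parts.join('::');
}

console.log(nestedName('_ZN6Method11link_methodERK12methodHandleP10JavaThread'));
// Method::link_method
console.log(nestedName('_ZN6Method13unlink_methodEv'));
// Method::unlink_method
```

Reading the symbol this way also makes it easy to double-check that the human-readable name you register next to it ('Method::link_method' here) really matches the symbol.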

Calling the Function

Now we need to call link_method() from the frida-java-bridge. The function signature for link_method() looks like this:

void Method::link_method(const methodHandle& h_method, TRAPS) {

For this, I basically reverse engineered how other libjvm.so functions are called in frida-java-bridge. Here's the JavaScript code I ultimately came up with:

Frida Java Bridge - jvm.js

if (api.version >= 17) {
  // Only certain JDK versions of restore_unshareable_info() call
  // link_method(). Manually call if necessary.
  const methodHandle = Memory.alloc(2 * pointerSize);
  methodHandle.writePointer(newMethod.method);
  methodHandle.add(pointerSize).writePointer(thread);
  api['Method::link_method'](newMethod.method, methodHandle, thread);
}
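
To visualize what the Memory.alloc() and writePointer() lines build, the sketch below lays out the same two-pointer block in an ordinary ArrayBuffer instead of Frida-allocated memory. It runs in plain Node.js, assumes 8-byte little-endian pointers (x86-64), and reuses the newMethod pointer from the mangler output and the current-thread pointer from the crash log purely as sample values. The helper name buildMethodHandle is hypothetical, and reading the first slot as the handle's Method pointer and the second as its thread is my interpretation of the patch.

```javascript
const pointerSize = 8; // assumption: 64-bit process

// Hypothetical helper: build the two-slot block that is passed as the
// `const methodHandle&` argument.
function buildMethodHandle(methodPtr, threadPtr) {
  const view = new DataView(new ArrayBuffer(2 * pointerSize));
  view.setBigUint64(0, methodPtr, true);            // slot 0: Method pointer
  view.setBigUint64(pointerSize, threadPtr, true);  // slot 1: thread pointer
  return view;
}

// Sample values taken from the output shown earlier in this post.
const handle = buildMethodHandle(0x7f323befe338n, 0x7f3308018790n);
console.log('slot 0: 0x' + handle.getBigUint64(0, true).toString(16));
console.log('slot 1: 0x' + handle.getBigUint64(pointerSize, true).toString(16));
```

The real call then passes three pointers, which lines up with the three 'pointer' arguments registered earlier: the Method itself, the address of this block, and the thread.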

Conclusion

What initially was an attempt to implement an IAST tool turned into a nine-month journey of understanding how the frida-java-bridge worked and why method replacement in JDK17 was failing. Along the way I gained a greater appreciation for how impressive Frida is and the technical wizardry that goes into making it work. As JDK versions greater than 17 are currently unsupported, I hope this blog post reduces the time it takes the next hacker to pick up where I left off.

A big thank you to oleavr for reviewing and approving my Pull Request, as well as to all of the other contributors that put their time and energy into making this fantastic tool!