The Schema Was Not Supposed to Run Code

A schema feels harmless until it becomes JavaScript.

That is the uncomfortable bit in the new protobuf.js research from Cyera. The library is not a tiny corner dependency. Cyera says protobuf.js has more than 48 million weekly npm downloads. It sits inside Node services, build tools, SDKs, CLIs, and probably a few internal systems nobody has thought about since the last dependency audit.

Cyera found six vulnerabilities in protobuf.js and protobufjs-cli. The short version: crafted protobuf input can trigger denial of service, and on some paths it can lead to code execution. The fixed versions are protobuf.js 7.5.6 and 8.0.2, plus protobufjs-cli 1.2.1 and 2.0.2.

If you run JavaScript services that accept protobuf definitions, JSON descriptors, or generated protobuf code from outside a trusted release process, this is not an abstract library bug. It is an execution boundary problem.

The boring object becomes the dangerous one

The part worth paying attention to is not only the CVSS score.

Cyera describes one chain where attacker-controlled input first triggers a prototype pollution bug somewhere in a Node process. Later, the same process uses protobuf.js to encode or decode a message. Because protobuf.js resolves type names through plain property lookups, which also consult inherited properties, a polluted Object.prototype can make an attacker-controlled string look like a valid protobuf primitive.
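
To make the lookup problem concrete, here is a minimal sketch. The knownTypes table and both helper functions are hypothetical stand-ins, not protobuf.js internals; the point is only that a plain property lookup also sees whatever has been added to Object.prototype, while an own-property check does not.

```javascript
// Hypothetical lookup table standing in for a library's built-in type map.
// This is NOT protobuf.js's actual code.
const knownTypes = { int32: 0, string: 2 };

// Plain property lookup: falls through to Object.prototype when the key
// is not an own property of knownTypes.
function isKnownTypeUnsafe(name) {
  return knownTypes[name] !== undefined;
}

// Own-property lookup: ignores anything inherited from the prototype chain.
function isKnownTypeSafe(name) {
  return Object.prototype.hasOwnProperty.call(knownTypes, name);
}

const attackerName = "evil_type";
const before = isKnownTypeUnsafe(attackerName); // false

// Simulate a prototype pollution bug elsewhere in the same process.
Object.prototype[attackerName] = 1;
const unsafeAfter = isKnownTypeUnsafe(attackerName); // true
const safeAfter = isKnownTypeSafe(attackerName); // false

// Clean up so the demonstration does not leak into the rest of the process.
delete Object.prototype[attackerName];

console.log({ before, unsafeAfter, safeAfter });
```

A Map or an object created with Object.create(null) would avoid the inherited lookup in the same way.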

Then protobuf.js does the thing that makes this nasty: it generates an encoder or decoder function and compiles it with Function().

That is normal for performance. It is also exactly why schema-derived data needs to be treated with suspicion. Once data gets inserted into generated code, the boundary has moved. You are no longer just parsing a message. You are creating executable JavaScript based on names and shapes that may have come from somewhere else.
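
A toy generator shows how quickly a name turns into behavior. Everything below is an illustrative assumption (the function names, the identifier rule, the harmless payload); it is not how protobuf.js builds its encoders, only the general shape of the risk and one possible guard.

```javascript
// Toy code generator, NOT protobuf.js's implementation: builds a getter
// by pasting the field name straight into source text.
function compileGetterUnsafe(fieldName) {
  return new Function("msg", "return msg." + fieldName + ";");
}

// One possible guard: accept only plain identifiers, and embed the name as
// a quoted string literal instead of raw source.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;
function compileGetterChecked(fieldName) {
  if (typeof fieldName !== "string" || !IDENTIFIER.test(fieldName)) {
    throw new TypeError("refusing to generate code for " + JSON.stringify(fieldName));
  }
  return new Function("msg", "return msg[" + JSON.stringify(fieldName) + "];");
}

// A hostile "name" that is really an extra expression (harmless here).
const hostileName = "id, globalThis.__demoSideEffect = true";

compileGetterUnsafe(hostileName)({ id: 7 });
const sideEffectRan = globalThis.__demoSideEffect === true; // true
delete globalThis.__demoSideEffect;

let rejected = false;
try {
  compileGetterChecked(hostileName);
} catch (err) {
  rejected = err instanceof TypeError; // true
}

const safeValue = compileGetterChecked("id")({ id: 7 }); // 7
console.log({ sideEffectRan, rejected, safeValue });
```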

There is a second path in pbjs, the protobuf.js code-generation CLI. Cyera says crafted schema names can pass through into emitted JavaScript and run when the generated file is imported. That one does not need the prototype pollution prerequisite.
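
If schemas from outside do have to reach a generator, one defensive option is to reject suspicious names before the generator ever sees them. The sketch below assumes a JSON descriptor that uses "nested" and "fields" keys, the layout protobuf.js uses for JSON descriptors; the function names and the identifier rules are assumptions, and the walker deliberately ignores everything else (enum values, services, options), so treat it as a starting point rather than a complete check.

```javascript
// Assumed rules: names are plain identifiers, type references are dotted identifiers.
const NAME = /^[A-Za-z_][A-Za-z0-9_]*$/;
const TYPE_REF = /^\.?[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$/;

// Returns the entries of node[key] only when key is an own property holding an object.
function ownEntries(node, key) {
  const has = node !== null && typeof node === "object" &&
    Object.prototype.hasOwnProperty.call(node, key);
  const value = has ? node[key] : null;
  return value !== null && typeof value === "object" ? Object.entries(value) : [];
}

// Hypothetical helper: collects paths of names or field types that fail the rules.
function findSuspiciousNames(node, path = [], found = []) {
  for (const [name, child] of ownEntries(node, "nested")) {
    const here = [...path, name];
    if (!NAME.test(name)) found.push(here.join(" > "));
    findSuspiciousNames(child, here, found);
  }
  for (const [name, field] of ownEntries(node, "fields")) {
    const here = [...path, name].join(" > ");
    if (!NAME.test(name)) found.push(here);
    const type = field !== null && typeof field === "object" ? field.type : undefined;
    if (typeof type !== "string" || !TYPE_REF.test(type)) found.push(here + " (type)");
  }
  return found;
}

const descriptor = {
  nested: {
    demo: {
      nested: {
        User: { fields: { id: { type: "int32", id: 1 }, owner: { type: "demo.User", id: 2 } } },
        "Bad;name()": { fields: { x: { type: "int32); evil(", id: 1 } } },
      },
    },
  },
};

const suspicious = findSuspiciousNames(descriptor);
console.log(suspicious);
// [ 'demo > Bad;name()', 'demo > Bad;name() > x (type)' ]
```

A check like this complements patching; it does not replace it.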

This is the kind of bug that gets missed because everyone mentally files schemas under "configuration" instead of "code".

Why this keeps happening

JavaScript has a long history of turning structure into behavior.

Template engines compile templates. Validators compile rules. ORMs build query functions. Serializers generate fast encoders and decoders. Build tools transform config into executable bundles. Most of this is reasonable engineering. Nobody wants to interpret every field slowly on every request if they can compile a faster path once.

But the security model often lags behind the performance trick.

Teams will lock down direct user input, then casually allow user-controlled or partner-controlled schemas because schemas look like metadata. They will review application code, then import generated code from a build step without asking whether the generator handled hostile names correctly. They will block eval, then forget that new Function() in a dependency is the same family of problem.
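
The runtime already treats the two as one family. The sketch below uses Node's built-in vm module, whose createContext accepts a codeGeneration option; with strings set to false, both eval and the Function constructor throw inside that context. The blockedBy helper is hypothetical, and vm is not a security sandbox, so this only illustrates the shared mechanism.

```javascript
const vm = require("node:vm");

// Hypothetical helper: returns the error name if running `source` in
// `context` throws, otherwise null.
function blockedBy(source, context) {
  try {
    vm.runInContext(source, context);
    return null;
  } catch (err) {
    return err.name;
  }
}

// A context where code generation from strings is disabled, and a default one.
const locked = vm.createContext({}, { codeGeneration: { strings: false, wasm: false } });
const open = vm.createContext({});

const evalLocked = blockedBy('eval("1 + 1")', locked); // "EvalError"
const functionLocked = blockedBy('new Function("return 1 + 1")()', locked); // "EvalError"
const functionOpen = blockedBy('new Function("return 1 + 1")()', open); // null

console.log({ evalLocked, functionLocked, functionOpen });
```

Node also has a process-wide --disallow-code-generation-from-strings flag. It is likely to break libraries that rely on runtime code generation, so it is something to test rather than switch on blindly.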

This is not a reason to panic about every code generator. It is a reason to stop pretending generated code is somehow less real than handwritten code.

What to check now

Start with the simple inventory question: do you have protobufjs or protobufjs-cli anywhere in production code, build tooling, SDK generation, or CI?

If yes:

Update protobuf.js to 7.5.6 or 8.0.2, or a later release on the same line.
Update protobufjs-cli to 1.2.1 or 2.0.2, or a later release on the same line.
Check whether any service accepts protobuf schemas, descriptors, or .proto files from users, tenants, plugins, partners, or CI artifacts.
Treat generated protobuf files as code, not as inert build output.
Look for prototype pollution bugs in the same process. They can become much worse when a later dependency compiles generated functions.
Make dependency scanners look at dev and build dependencies too. CLI generators often sit outside the normal runtime threat model until they do not.
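
For the inventory step, a small script can classify whatever versions turn up (for example from npm ls protobufjs protobufjs-cli --all). This is a sketch: it assumes each fixed version listed above maps to the release line with the same major number, it ignores prerelease tags, and the patchStatus name and its return values are made up for illustration.

```javascript
// Fixed versions as stated above, keyed by package name and major line.
// Maps are used on purpose so lookups never touch Object.prototype.
const FIXED = new Map([
  ["protobufjs", new Map([[7, [7, 5, 6]], [8, [8, 0, 2]]])],
  ["protobufjs-cli", new Map([[1, [1, 2, 1]], [2, [2, 0, 2]]])],
]);

// Parses "major.minor.patch"; anything after the patch number is ignored.
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(String(version));
  return match ? match.slice(1, 4).map(Number) : null;
}

// Returns "patched", "vulnerable", "review" (no fixed version known for
// that major line), or "unknown" (unrecognized package or version string).
function patchStatus(pkg, version) {
  const lines = FIXED.get(pkg);
  const parsed = parseVersion(version);
  if (!lines || !parsed) return "unknown";
  const fixed = lines.get(parsed[0]);
  if (!fixed) return "review";
  for (let i = 0; i < 3; i++) {
    if (parsed[i] !== fixed[i]) return parsed[i] > fixed[i] ? "patched" : "vulnerable";
  }
  return "patched";
}

console.log(patchStatus("protobufjs", "7.4.0")); // vulnerable
console.log(patchStatus("protobufjs", "7.5.6")); // patched
console.log(patchStatus("protobufjs", "8.0.1")); // vulnerable
console.log(patchStatus("protobufjs", "6.11.4")); // review
console.log(patchStatus("protobufjs-cli", "2.0.2")); // patched
```

Run it against dev and build dependencies as well as runtime ones, since the CLI usually lives there.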

For most small teams, the practical fix is boring: patch, regenerate artifacts if needed, and make sure untrusted schema input is not flowing into production services or build steps.

The bigger lesson

The security boundary in modern apps is not just HTTP input anymore.

It is package install scripts. It is CI YAML. It is browser extensions. It is model prompts. It is generated SDKs. It is schemas that get compiled into functions because performance matters.

That does not mean everything is doomed. It means the old question, "is this code or data?", is often the wrong question. A better one is: "can this data influence code that will run later?"

If the answer is yes, treat it like code review territory. Pin it. Patch it. Keep it out of untrusted hands. And if it crosses a trust boundary, assume someone will eventually try to make it execute.
