May 17th, 2024

Code highlighting with Shiki & keeping your Cloudflare worker bundle small

Shiki is a powerful & flexible code highlighter that's fully ESM compatible, making it ideal for web use since it can be tree-shaken, loading only the necessary parts as needed.

Looks straightforward to me: npm install shiki, import codeToHtml, done!

We now have this basic Code component that looks something like this:

import { useEffect, useState } from 'react';
import { BundledLanguage, codeToHtml } from 'shiki';

export interface CodeProps {
  children?: string;
  lang: BundledLanguage;
}

export const Code = (props: CodeProps) => {
  const { children: code, lang } = props;

  const [highlightedCode, setHighlightedCode] = useState<string>();
  useEffect(() => {
    if (!code) {
      return;
    }

    codeToHtml(code, { lang, theme: 'dracula' }).then(setHighlightedCode);
  }, [code, lang]);

  if (!code) {
    return null;
  }

  return (
    <pre className="bg-slate-800 text-slate-400 p-4 rounded overflow-x-auto">
      {highlightedCode ? (
        <code dangerouslySetInnerHTML={{ __html: highlightedCode }} />
      ) : (
        <code>{code}</code>
      )}
    </pre>
  );
};

It can be used as follows: <Code lang="javascript">console.log('Hello world')</Code>.

It will first render the code as raw text, and once the client-side bundles are loaded & executing it will replace it with the highlighted HTML. Great, works perfectly fine (or so I thought). And I can see in the inspector it's only loading what it needs, that's just an additional 288 KB over the wire!

Cloudflare deploy successful, but nothing to see?

The CI showed successful deployment logs, but the page kept on showing "Nothing is here yet". The function logs in the Cloudflare console also didn't reveal much more besides "An unknown error occured. (Code: 8000068)". That's weird huh!?

After some time I came across this PR that would introduce better feedback in the wrangler CLI tool, which is also what I use to deploy. So I tried building locally and deploying with the updated version (3.57) and, surprise surprise, it did show an error!

🌍  Uploading... (87/286)
✨ Success! Uploaded 87 files (199 already uploaded) (2.06 sec)
✨ Uploading _headers
✨ Uploading _redirects
✨ Uploading Functions bundle

🌎 Deploying...
✘ [ERROR] Failed to publish your Function. Got error: Your Functions script is over the 1 MiB size limit (workers.api.error.script_too_large)

❌ Deployment failed!

I never really gave the worker bundle size a lot of thought, and focused mostly on keeping the client light. The worker bundle is the function & essentially our server side application. While the client only loads what it needs when it's used, all those files need to be bundled in the function in order to maybe serve them when they are requested.

The build folder size also increased substantially, from ~1 MB to ~10 MB. Important to note though: this is the full Vite build output containing all client & server files, not necessarily the worker bundle.

# without shiki implementation
➜ du -sh build
1.2M	build

# with shiki implementation
➜ du -sh build
9.7M	build
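
Since du measures the whole build folder, a closer check is to measure the worker script itself against the limit from the error. Below is a minimal Node.js sketch of that idea. The helper names measureScript and measureFile are hypothetical, the file path depends on your own build setup, and comparing the gzip-compressed size is an assumption about how the limit is counted, which is why the raw size is reported too.

```typescript
import { readFileSync } from 'node:fs';
import { gzipSync } from 'node:zlib';

// 1 MiB, the limit quoted in the wrangler error above.
const LIMIT_BYTES = 1024 * 1024;

interface SizeReport {
  rawBytes: number;
  gzipBytes: number;
  // Assumption: the limit is compared against the gzip-compressed size.
  overLimit: boolean;
}

// Measures a script's raw and gzip-compressed size against a byte limit.
const measureScript = (source: Buffer | string, limit: number = LIMIT_BYTES): SizeReport => {
  const raw = Buffer.isBuffer(source) ? source : Buffer.from(source, 'utf8');
  const gzipBytes = gzipSync(raw).byteLength;
  return { rawBytes: raw.byteLength, gzipBytes, overLimit: gzipBytes > limit };
};

// Convenience wrapper that reads the script from disk first.
const measureFile = (path: string, limit: number = LIMIT_BYTES): SizeReport =>
  measureScript(readFileSync(path), limit);
```

Calling measureFile with the path of your bundled worker script (wherever your setup writes it) gives a quick local answer before you deploy.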

A more fine-grained bundle

Going back to the docs I noticed a section about bundle sizes, more fine-grained bundles and even specifically Cloudflare Workers. That last section was more focused on generating highlighted HTML server side though, which is not exactly what we want to do here.

I changed the implementation to use the slimmed-down core highlighter and explicitly define which themes and languages we want to support. Not ideal if you work with a CMS and just want to be able to reference whatever you like without changing your code. But I figured I'd just include some sensible defaults that I assume I will be using mostly.

The Code component is now a bit more explicit and a bit less flexible, and looks like this:
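
One way to soften that trade-off with CMS content is to check the requested language against the bundled set and fall back to plain text otherwise. The sketch below is an illustration only: SUPPORTED_LANGS, isSupportedLang and resolveLang are hypothetical names, and the list simply mirrors the languages I chose to bundle.

```typescript
// The languages bundled in the fine-grained setup of this post.
const SUPPORTED_LANGS = ['javascript', 'typescript', 'json', 'shell', 'console'] as const;
type SupportedLang = (typeof SUPPORTED_LANGS)[number];

// Type guard: narrows an arbitrary string (e.g. from a CMS) to a bundled language.
const isSupportedLang = (lang: string): lang is SupportedLang =>
  (SUPPORTED_LANGS as readonly string[]).includes(lang);

// Returns the language to highlight with, or undefined to signal
// "skip highlighting and keep rendering the raw text".
const resolveLang = (lang: string): SupportedLang | undefined =>
  isSupportedLang(lang) ? lang : undefined;
```

A component could call resolveLang before highlighting and simply keep the raw-text rendering when it returns undefined.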

import { useEffect, useState } from 'react';
import { BundledLanguage } from 'shiki';

export interface CodeProps {
  children?: string;
  lang: BundledLanguage;
}

export const Code = (props: CodeProps) => {
  const { children: code, lang } = props;

  const [highlightedCode, setHighlightedCode] = useState<string>();
  useEffect(() => {
    if (!code) {
      return;
    }

    // https://shiki.style/guide/install#fine-grained-bundle
    import('shiki/core').then(async ({ getHighlighterCore }) => {
      const highlighter = await getHighlighterCore({
        themes: [
          import('shiki/themes/dracula.mjs'),
        ],
        langs: [
          import('shiki/langs/javascript.mjs'),
          import('shiki/langs/typescript.mjs'),
          import('shiki/langs/json.mjs'),
          import('shiki/langs/shell.mjs'),
          import('shiki/langs/console.mjs'),
        ],
        loadWasm: await import('shiki/wasm'),
      });

      setHighlightedCode(highlighter.codeToHtml(code, { lang, theme: 'dracula' }));
    });
  }, [code, lang]);

  if (!code) {
    return null;
  }

  return (
    <pre className="bg-slate-800 text-slate-400 p-4 rounded overflow-x-auto">
      {highlightedCode ? (
        <code dangerouslySetInnerHTML={{ __html: highlightedCode }} />
      ) : (
        <code>{code}</code>
      )}
    </pre>
  );
};

Awesome, still works! Let's deploy it and see how it looks on Cloudflare.

...

OK, let's try to build & deploy locally again.

✘ [ERROR] Failed to publish your Function. Got error: Your Functions script is over the 1 MiB size limit (workers.api.error.script_too_large)

# with fine-grained shiki implementation
➜ du -sh build
2.3M	build

The final build folder output decreased to ~2 MB, but the resulting worker bundle apparently was still too large. I tried removing all but one language, but the issue persisted. If I stripped enough code from the app and removed some other dependencies, I could maybe get it below the 1 MiB limit? Not really an option.

Using esm.sh CDN

The docs also mentioned using Shiki from a CDN such as esm.sh. But what is esm.sh?

It's designed to provide an easy way to use ECMAScript modules (ESM) from npm in web browsers. It acts as a CDN (Content Delivery Network) that allows developers to import JavaScript modules directly into their browser applications without needing to install them locally or bundle them using a tool like Webpack.
Key features and benefits of esm.sh include:
ESM Compatibility: esm.sh ensures that npm packages are transformed into ES modules that can be natively imported by modern browsers.
Versioning: Developers can specify which version of a package they want to use, similar to how npm works.
CDN Performance: As a CDN, esm.sh provides fast, cached access to modules, improving load times and reducing server load.
Automatic Transpilation: It handles the transpilation of CommonJS and other module formats to ESM, making it easier to use packages that aren't originally authored as ES modules.
according to Chad (GEE-PEE-TEE)

So not as in loading one big bloated file and accessing it through window, but really by embedding it as a module and relying on ESM imports using <script type="module">...</script>. That doesn't (really) work with React: even though you can render a script module within your component, you'll end up with a lot of unpredictable behaviour and code not executing when you expect it to.

However, the framework I'm using, Remix (any modern web framework really), also loads its client bundles as modules. So I guessed I could use a similar import within my React component itself, without trying to render a separate script module within it.

Attempt 3 & a refactor later, our Code component is once again super flexible, supporting any theme and any language, while having (virtually) 0 impact on our client & server bundles. It now looks like this:

import { useEffect, useState } from 'react';
import { BundledLanguage } from 'shiki';

export interface CodeProps {
  children?: string;
  lang: BundledLanguage;
}

export const Code = (props: CodeProps) => {
  const { children: code, lang } = props;

  const [highlightedCode, setHighlightedCode] = useState<string>();
  useEffect(() => {
    if (!code) {
      return;
    }

    // @ts-expect-error: import from esm.sh to avoid large worker bundle
    import('https://esm.sh/[email protected]').then(async ({ codeToHtml }) => {
      setHighlightedCode(await codeToHtml(code, { lang, theme: 'dracula' }));
    });
  }, [code, lang]);

  if (!code) {
    return null;
  }

  return (
    <pre className="bg-slate-800 text-slate-400 p-4 rounded overflow-x-auto">
      {highlightedCode ? (
        <code dangerouslySetInnerHTML={{ __html: highlightedCode }} />
      ) : (
        <code>{code}</code>
      )}
    </pre>
  );
};

It works, and this time it also deploys to Cloudflare! Our own server & client bundles remain unaffected and Shiki is loaded from a blazingly fast CDN, resulting in an even smaller size than when it was originally bundled as an ESM module, with only 279 KB extra over the wire.

The only downside is that you don't have types out of the box when using TypeScript with esm.sh.
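
One possible workaround is to describe just the part of the module you use with a small local interface and cast the dynamic import to it. This is a sketch under assumptions: ShikiModule, ModuleLoader, importFromUrl and loadShiki are illustrative names (not part of Shiki or esm.sh), the interface only covers the codeToHtml call used in this post, and nothing checks at compile time that the remote module really matches it.

```typescript
// Minimal local description of the one function this post uses.
interface ShikiModule {
  codeToHtml(code: string, options: { lang: string; theme: string }): Promise<string>;
}

type ModuleLoader = (url: string) => Promise<unknown>;

// Default loader: a native dynamic import, resolved at runtime.
// The @vite-ignore comment asks Vite not to warn about a non-analyzable import.
const importFromUrl: ModuleLoader = (url) => import(/* @vite-ignore */ url);

// Loads a module from a URL and gives it a type.
// The runtime check only verifies that codeToHtml exists and is a function.
const loadShiki = async (
  url: string,
  load: ModuleLoader = importFromUrl,
): Promise<ShikiModule> => {
  const mod = await load(url);
  if (typeof (mod as ShikiModule | undefined)?.codeToHtml !== 'function') {
    throw new Error(`Module at ${url} does not export codeToHtml`);
  }
  return mod as ShikiModule;
};
```

Inside the effect, loadShiki(url) would then replace the bare import() call and the @ts-expect-error comment. The loader parameter is only there so the function can be exercised without network access.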

Conclusion

For this type of package & use case I think it makes perfect sense to load it from a CDN. Note that this would not have been an issue if you deployed your web app as a Node container or used some other more classic way of hosting Node.js web apps.

But personally I like the idea of "not having a server" & deploying my website to a distributed cloud function with fast response times across the globe, regardless of users' geographical location. It's fast, free, lightweight, you have preview deploys and it requires 0 maintenance or follow-up.

Demo: https://3feb5f78.wouterds.pages.dev/example-code-block

If wrangler had failed fast to begin with, things would have been clearer earlier on. Luckily that PR is merged now and as of v3.57 it will fail fast.