Highlight

Tempest's highlighter is a package for server-side, high-performance, and flexible code highlighting.

Quickstart

Require tempest/highlight with composer:

composer require tempest/highlight

And highlight code like this:

$highlighter = new \Tempest\Highlight\Highlighter();

$code = $highlighter->parse($code, 'php');

Supported languages

All supported languages can be found in the GitHub repository.

Themes

This package ships with several themes. You can load one either by importing its CSS file into your project's stylesheet, or by copying the stylesheet manually.

@import "../../../../../vendor/tempest/highlight/src/Themes/Css/highlight-light-lite.css";

You can build your own CSS theme with just a couple of classes: copy over the base stylesheet and adjust it however you like. Note that styling for the pre tag isn't included in this package.

Inline themes

If you don't want to or can't load a CSS file, you can opt to use the InlineTheme class. This theme takes the path to a CSS file, and will parse it into inline styles:

$highlighter = new Highlighter(new InlineTheme(__DIR__ . '/../src/Themes/Css/solarized-dark.css'));

Terminal themes

Terminal themes are simpler because of their limited styling options. Right now one terminal theme is provided: LightTerminalTheme. More terminal themes are planned.

use Tempest\Highlight\Highlighter;
use Tempest\Highlight\Themes\LightTerminalTheme;

$highlighter = new Highlighter(new LightTerminalTheme());

echo $highlighter->parse($code, 'php');

Gutter

This package can render an optional gutter if needed.

$highlighter = (new Highlighter())->withGutter(startAt: 10);

The gutter will show additions and deletions, and can start at any given line number:

  10  public function before(TokenType $tokenType): string
  11  {
  12      $style = match ($tokenType) {
13 -          TokenType::KEYWORD => TerminalStyle::FG_DARK_BLUE,
14 -          TokenType::PROPERTY => TerminalStyle::FG_DARK_GREEN,
15 -          TokenType::TYPE => TerminalStyle::FG_DARK_RED,
16 +          TokenType::GENERIC => TerminalStyle::FG_DARK_CYAN,
  17          TokenType::VALUE => TerminalStyle::FG_BLACK,
  18          TokenType::COMMENT => TerminalStyle::FG_GRAY,
  19          TokenType::ATTRIBUTE => TerminalStyle::RESET,
  20      };
  21
  22      return TerminalStyle::ESC->value . $style->value;
  23  }
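
To make the numbering concrete, here is a minimal plain-PHP sketch of what a gutter does: it prefixes each line with a right-aligned line number, beginning at a chosen start. The function name renderGutter() is hypothetical and this is not the package's implementation (it ignores additions and deletions); it only illustrates the idea behind startAt.

```php
// Hypothetical helper, not part of tempest/highlight: prefixes every line
// with a right-aligned line number, starting at $startAt.
function renderGutter(string $code, int $startAt = 1): string
{
    $lines = explode("\n", $code);

    // Width of the largest line number, so all numbers line up.
    $width = strlen((string) ($startAt + count($lines) - 1));

    $numbered = [];

    foreach ($lines as $index => $line) {
        $number = str_pad((string) ($startAt + $index), $width, ' ', STR_PAD_LEFT);

        // rtrim() avoids trailing spaces on empty lines.
        $numbered[] = rtrim($number . '  ' . $line);
    }

    return implode("\n", $numbered);
}

echo renderGutter("function foo()\n{\n}", startAt: 9), PHP_EOL;
```

The numbers are padded to the width of the last line number, which is why a 9 and a 10 still line up in the same column.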

Finally, if you're using CommonMark code blocks, you can enable gutter rendering on the fly by appending {startAt} to the language definition:

```php{10}
echo 'hi!';
```

10  echo 'hi!';
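
The {startAt} suffix is part of the fence's info string. As an illustration only, a suffix like this could be split from the language name with a small regex. The function name parseInfoString() and the pattern are assumptions, not the package's actual parser.

```php
// Hypothetical sketch: split an info string such as "php{10}" into
// a language name and an optional gutter start line.
function parseInfoString(string $info): array
{
    $pattern = '/^(?<language>[\w-]+)(?:\{(?<startAt>\d+)\})?$/';

    if (! preg_match($pattern, trim($info), $matches)) {
        return ['language' => null, 'startAt' => null];
    }

    // The startAt group is missing or empty when there is no {n} suffix.
    $startAt = $matches['startAt'] ?? '';

    return [
        'language' => $matches['language'],
        'startAt' => $startAt !== '' ? (int) $startAt : null,
    ];
}

var_dump(parseInfoString('php{10}')); // language "php", startAt 10
var_dump(parseInfoString('php'));     // language "php", startAt null
```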

Special highlighting tags

This package offers a collection of special tags that you can use within your code snippets. These tags won't be shown in the final output, but rather adjust the highlighter's default styling. All of these tags can span multiple lines and will still render their wrapped content properly.

Note that highlight tags are not supported in terminal themes.

Emphasize, strong, and blur

You can add these tags within your code to emphasize or blur parts:

  • {_ content _} adds the .hl-em class
  • {* content *} adds the .hl-strong class
  • {~ content ~} adds the .hl-blur class

{_Emphasized text_}
{*Strong text*}
{~Blurred text~}

This is the end result:

{_Emphasized text_}
{*Strong text*}
{~Blurred text~}
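
If you're wondering what these tags boil down to, here is a rough plain-PHP sketch that rewrites them into spans carrying the classes listed above. The function name applyHighlightTags() is hypothetical, and the package's own handling differs (it also highlights and escapes the wrapped code); this only shows the tag-to-class mapping.

```php
// Hypothetical sketch, not the package's implementation: turn the
// {_ … _}, {* … *} and {~ … ~} tags into spans with the matching class.
function applyHighlightTags(string $code): string
{
    $classes = ['_' => 'hl-em', '*' => 'hl-strong', '~' => 'hl-blur'];

    foreach ($classes as $delimiter => $class) {
        $quoted = preg_quote($delimiter, '/');

        // The "s" modifier lets "." match newlines, so tags can span lines.
        $code = preg_replace(
            '/\{' . $quoted . '(.*?)' . $quoted . '\}/s',
            '<span class="' . $class . '">$1</span>',
            $code,
        );
    }

    return $code;
}

echo applyHighlightTags('{_Emphasized text_} and {~Blurred text~}'), PHP_EOL;
```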

Additions and deletions

You can use these two tags to mark lines as additions and deletions:

  • {+ content +} adds the .hl-addition class
  • {- content -} adds the .hl-deletion class

{-public class Foo {}-}
{+public class Bar {}+}

This is the end result:

public class Foo {}
public class Bar {}

As a reminder: all these tags work multi-line as well:

   1  public function before(TokenType $tokenType): string
   2  {
   3      $style = match ($tokenType) {
 4 -          TokenType::KEYWORD => TerminalStyle::FG_DARK_BLUE,
 5 -          TokenType::PROPERTY => TerminalStyle::FG_DARK_GREEN,
 6 -          TokenType::TYPE => TerminalStyle::FG_DARK_RED,
 7 -          TokenType::GENERIC => TerminalStyle::FG_DARK_CYAN,
 8 -          TokenType::VALUE => TerminalStyle::FG_BLACK,
 9 -          TokenType::COMMENT => TerminalStyle::FG_GRAY,
10 -          TokenType::ATTRIBUTE => TerminalStyle::RESET,
  11      };
  12
  13      return TerminalStyle::ESC->value . $style->value;
  14  }

Custom classes

You can add any class you'd like by using the {:classname: content :} tag:

<style>
.hl-a {
    background-color: #FFFF0077;
}

.hl-b {
    background-color: #FF00FF33;
}
</style>

```php
{:hl-a:public class Foo {}:}
{:hl-b:public class Bar {}:}
```

This is the end result:

public class Foo {}
public class Bar {}

Inline languages

Within inline Markdown code tags, you can specify the language by prepending it between curly brackets:

`{php}public function before(TokenType $tokenType): string`

This requires the CommonMark integration described in the next section.

CommonMark integration

If you're using league/commonmark, you can highlight code blocks and inline code like so:

use League\CommonMark\Environment\Environment;
use League\CommonMark\Extension\CommonMark\CommonMarkCoreExtension;
use League\CommonMark\MarkdownConverter;
use Tempest\Highlight\CommonMark\HighlightExtension;

$environment = new Environment();

$environment
    ->addExtension(new CommonMarkCoreExtension())
    ->addExtension(new HighlightExtension());

$markdown = new MarkdownConverter($environment);

Keep in mind that you need to manually install league/commonmark:

composer require league/commonmark

Implementing a custom language

Let's explain how tempest/highlight works by implementing a new language — Blade is a good candidate. It looks something like this:

@if(! empty($items))
    <div class="container">
        Items: {{ count($items) }}.
    </div>
@endif

In order to build such a new language, you need to understand three concepts of how code is highlighted: patterns, injections, and languages.

Patterns

A pattern represents a part of code that should be highlighted. It can target a single keyword like return or class, or any other part of code, such as a comment: /* this is a comment */ or an attribute: #[Get(uri: '/')].

Each pattern is represented by a simple class that provides a regex pattern, and a TokenType. The regex pattern is used to match relevant content to this specific pattern, while the TokenType is an enum value that will determine how that specific pattern is colored.

Here's an example of a simple pattern to match the namespace of a PHP file:

use Tempest\Highlight\IsPattern;
use Tempest\Highlight\Pattern;
use Tempest\Highlight\Tokens\TokenType;

final readonly class NamespacePattern implements Pattern
{
    use IsPattern;

    public function getPattern(): string
    {
        return 'namespace (?<match>[\w\\\\]+)';
    }

    public function getTokenType(): TokenType
    {
        return TokenType::TYPE;
    }
}

Note that each pattern must include a regex capture group that's named match. The content that matched within this group will be highlighted.

For example, the regex namespace (?<match>[\w\\\\]+) matches every occurrence of namespace followed by a space and a name, but only the part within the named group (?<match>…) will actually be colored. In practice, that means the namespace name matching [\w\\\\]+ will be colored.
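
You can try the named group yourself with plain preg_match(). This is a small sketch rather than how the highlighter runs patterns internally: wrapping the pattern in / delimiters is an assumption made here for illustration.

```php
// The pattern string returned by NamespacePattern::getPattern().
$pattern = 'namespace (?<match>[\w\\\\]+)';

$code = 'namespace App\Http\Controllers;';

// Assumption: "/" delimiters are added here only so preg_match() accepts the pattern.
preg_match('/' . $pattern . '/', $code, $matches, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE every group is a [text, offset] pair.
[$text, $offset] = $matches['match'];

echo $text . ' at offset ' . $offset . PHP_EOL; // App\Http\Controllers at offset 10
```

Only the text captured by the match group is reported; the leading namespace keyword is part of the overall match but not of the group.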

Yes, you'll need some basic knowledge of regex. Head over to https://regexr.com/ if you need help, or take a look at the existing patterns in this repository.

In summary:

  • Pattern classes provide a regex pattern that matches parts of code.
  • Those regexes should contain a group named match, written like so: (?<match>…). This group represents the code that will actually be highlighted.
  • Finally, a pattern provides a TokenType, which is used to determine the highlight style for the specific match.

Injections

Once you've understood patterns, the next step is to understand injections. Injections are used to highlight different languages within one code block. For example: HTML could contain CSS, which should be styled properly as well.

An injection will tell the highlighter that it should treat a block of code as a different language. For example:

<div>
    <x-slot name="styles">
        <style>
            body {
                background-color: red;
            }
        </style>
    </x-slot>
</div>

Everything within <style></style> tags should be treated as CSS. That's done by injection classes:

use Tempest\Highlight\Highlighter;
use Tempest\Highlight\Injection;
use Tempest\Highlight\IsInjection;
use Tempest\Highlight\ParsedInjection;

final readonly class CssInjection implements Injection
{
    use IsInjection;

    public function getPattern(): string
    {
        return '<style>(?<match>(.|\n)*)<\/style>';
    }

    public function parseContent(string $content, Highlighter $highlighter): ParsedInjection
    {
        return new ParsedInjection(
            content: $highlighter->parse($content, 'css')
        );
    }
}

Just like patterns, an injection must provide a pattern. This pattern, for example, will match anything between style tags: <style>(?<match>(.|\n)*)<\/style>.

The second step in providing an injection is to parse the matched content into another language. That's what the parseContent() method is for. In this case, we'll get all code between the style tags that was matched with the named (?<match>…) group, and parse that content as CSS instead of whatever language we're currently dealing with.

In summary:

  • Injections provide a regex that matches a block of language A code embedded within language B.
  • Just like patterns, injection regexes should contain a group named match, which is written like so: (?<match>…).
  • Finally, an injection will use the highlighter to parse its matched content into another language.
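
To see what ends up in $content, you can run the injection's regex by hand. This sketch uses plain preg_match() with / delimiters added for illustration; it doesn't call the highlighter at all.

```php
// The injection pattern from CssInjection, wrapped in "/" delimiters for preg_match().
$pattern = '/<style>(?<match>(.|\n)*)<\/style>/';

$html = <<<'HTML'
<div>
    <style>
        body { background-color: red; }
    </style>
</div>
HTML;

preg_match($pattern, $html, $matches);

// This is the string an injection would receive as $content
// and hand over to the highlighter to parse as CSS.
$content = $matches['match'];

echo trim($content), PHP_EOL; // body { background-color: red; }
```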

Languages

The last concept to understand: languages are classes that bring patterns and injections together. Take a look at the HtmlLanguage, for example:

class HtmlLanguage extends BaseLanguage
{
    public function getName(): string
    {
        return 'html';
    }

    public function getAliases(): array
    {
        return ['htm', 'xhtml'];
    }

    public function getInjections(): array
    {
        return [
            ...parent::getInjections(),
            new PhpInjection(),
            new PhpShortEchoInjection(),
            new CssInjection(),
            new CssAttributeInjection(),
        ];
    }

    public function getPatterns(): array
    {
        return [
            ...parent::getPatterns(),
            new OpenTagPattern(),
            new CloseTagPattern(),
            new TagAttributePattern(),
            new HtmlCommentPattern(),
        ];
    }
}

This HtmlLanguage class specifies the following things:

  • PHP can be injected within HTML, both with the short echo tag <?= and longer <?php tags
  • CSS can be injected as well; JavaScript support is still a work in progress
  • There are a bunch of patterns to highlight HTML tags properly

On top of that, it extends from BaseLanguage. This is a language class that adds a bunch of cross-language injections, such as blurs and highlights. Your language doesn't need to extend from BaseLanguage and could implement Language directly if you want to.

With these three concepts in place, let's bring everything together to explain how you can add your own languages.

Adding custom languages

So we're adding Blade support. We could create a new language class and start from scratch, but it's probably easier to extend an existing language; HtmlLanguage is the best fit. Let's create a new BladeLanguage class that extends from HtmlLanguage:

class BladeLanguage extends HtmlLanguage
{
    public function getName(): string
    {
        return 'blade';
    }

    public function getAliases(): array
    {
        return [];
    }

    public function getInjections(): array
    {
        return [
            ...parent::getInjections(),
        ];
    }

    public function getPatterns(): array
    {
        return [
            ...parent::getPatterns(),
        ];
    }
}

With this class in place, we can start adding our own patterns and injections. Let's start with a pattern that matches all Blade keywords, which are always prefixed with the @ sign:

final readonly class BladeKeywordPattern implements Pattern
{
    use IsPattern;

    public function getPattern(): string
    {
        return '(?<match>\@[\w]+)\b';
    }

    public function getTokenType(): TokenType
    {
        return TokenType::KEYWORD;
    }
}

And register it in our BladeLanguage class:

    public function getPatterns(): array
    {
        return [
            ...parent::getPatterns(),
            new BladeKeywordPattern(),
        ];
    }
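
As a quick sanity check, you can run the keyword regex against a Blade snippet with plain preg_match_all(). The / delimiters are added here for illustration only.

```php
// The regex from BladeKeywordPattern, wrapped in "/" delimiters.
$pattern = '/(?<match>\@[\w]+)\b/';

$blade = <<<'BLADE'
@if(! empty($items))
    <div class="container">
        Items: {{ count($items) }}.
    </div>
@endif
BLADE;

preg_match_all($pattern, $blade, $matches);

print_r($matches['match']); // the two keywords: @if and @endif
```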

Next, there are a couple of places within Blade where you can write PHP code: within the @php keyword, as well as within keyword brackets: @if (count(…)). Let's write two injections for that:

final readonly class BladePhpInjection implements Injection
{
    use IsInjection;

    public function getPattern(): string
    {
        return '\@php(?<match>(.|\n)*?)\@endphp';
    }

    public function parseContent(string $content, Highlighter $highlighter): ParsedInjection
    {
        return new ParsedInjection(
            content: $highlighter->parse($content, 'php')
        );
    }
}

final readonly class BladeKeywordInjection implements Injection
{
    use IsInjection;

    public function getPattern(): string
    {
        return '(\@[\w]+)\s?\((?<match>.*)\)';
    }

    public function parseContent(string $content, Highlighter $highlighter): ParsedInjection
    {
        return new ParsedInjection(
            content: $highlighter->parse($content, 'php')
        );
    }
}

Let's add these to our BladeLanguage class as well:

    public function getInjections(): array
    {
        return [
            ...parent::getInjections(),
            new BladePhpInjection(),
            new BladeKeywordInjection(),
        ];
    }
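
Here is what both injection regexes capture when run by hand with preg_match(). As before, the / delimiters are an addition for this sketch, and the sample Blade strings are made up.

```php
// The regexes from BladePhpInjection and BladeKeywordInjection.
$phpBlock = '/\@php(?<match>(.|\n)*?)\@endphp/';
$keywordArguments = '/(\@[\w]+)\s?\((?<match>.*)\)/';

preg_match($phpBlock, "@php\n    \$total = count(\$items);\n@endphp", $block);
preg_match($keywordArguments, '@if (count($items) > 0)', $arguments);

echo trim($block['match']), PHP_EOL; // $total = count($items);
echo $arguments['match'], PHP_EOL;   // count($items) > 0
```

Note how the greedy .* in the second regex runs up to the last closing bracket on the line, which is what keeps nested calls such as count($items) intact.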

Next, you can write {{ … }} and {!! … !!} to echo output. Whatever is between these brackets is also considered PHP, so, one more injection:

final readonly class BladeEchoInjection implements Injection
{
    use IsInjection;

    public function getPattern(): string
    {
        return '({{|{!!)(?<match>.*)(}}|!!})';
    }

    public function parseContent(string $content, Highlighter $highlighter): ParsedInjection
    {
        return new ParsedInjection(
            content: $highlighter->parse($content, 'php')
        );
    }
}
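
One thing to be aware of: .* is greedy, so on a line with two echo statements the regex on its own captures everything between the first opening and the last closing brackets. The sketch below shows that behavior with plain preg_match(); the lazy variant is a hypothetical alternative, not something the package is known to use, and the / delimiters are added for illustration.

```php
// The regex from BladeEchoInjection, wrapped in "/" delimiters.
$greedy = '/({{|{!!)(?<match>.*)(}}|!!})/';

// Hypothetical lazy variant: ".*?" stops at the first closing brackets.
$lazy = '/({{|{!!)(?<match>.*?)(}}|!!})/';

$line = 'Name: {{ $name }}, email: {{ $email }}';

preg_match($greedy, $line, $greedyMatches);
preg_match($lazy, $line, $lazyMatches);

var_dump($greedyMatches['match']); // " $name }}, email: {{ $email "
var_dump($lazyMatches['match']);   // " $name "
```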

And, finally, you can write Blade comments like so: {{-- --}}. This can be a simple pattern:

final readonly class BladeCommentPattern implements Pattern
{
    use IsPattern;

    public function getPattern(): string
    {
        return '(?<match>\{\{\-\-(.|\n)*?\-\-\}\})';
    }

    public function getTokenType(): TokenType
    {
        return TokenType::COMMENT;
    }
}

With all of that in place (and BladeEchoInjection and BladeCommentPattern registered in BladeLanguage like the others), the only thing left to do is to add our language to the highlighter:

$highlighter->addLanguage(new BladeLanguage());

And we're done! Blade support with just a handful of patterns and injections!

Adding tokens

Some people or projects might want more fine-grained control over how specific words are colored. Common examples are null, true, and false in JSON files. By default, tempest/highlight will treat those values as normal text, and won't apply any special highlighting to them:

{
  "null-property": null,
  "value-property": "value"
}

However, it's straightforward to add your own extended styling for these kinds of tokens. Start by adding a custom language; let's call it ExtendedJsonLanguage:

use Tempest\Highlight\Languages\Json\JsonLanguage;

class ExtendedJsonLanguage extends JsonLanguage
{
    public function getPatterns(): array
    {
        return [
            ...parent::getPatterns(),
        ];
    }
}

Next, let's add a pattern that matches null:

use Tempest\Highlight\IsPattern;
use Tempest\Highlight\Pattern;
use Tempest\Highlight\Tokens\DynamicTokenType;
use Tempest\Highlight\Tokens\TokenType;

final readonly class JsonNullPattern implements Pattern
{
    use IsPattern;

    public function getPattern(): string
    {
        return '\: (?<match>null)';
    }

    public function getTokenType(): TokenType
    {
        return new DynamicTokenType('hl-null');
    }
}

Note how we return a DynamicTokenType from the getTokenType() method. The value passed into this object will be used as the classname for this token.
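
To get a feel for the result, here is a plain-PHP sketch that applies the same regex and wraps the captured null in a span with the hl-null class. The exact markup the highlighter emits is an assumption here, and preg_replace_callback() merely stands in for its rendering step.

```php
// The regex from JsonNullPattern, wrapped in "/" delimiters.
$pattern = '/\: (?<match>null)/';

$json = <<<'JSON'
{
    "null-property": null,
    "value-property": "value"
}
JSON;

// Hypothetical rendering step: wrap the captured text in a span whose class
// is the name passed to DynamicTokenType.
$html = preg_replace_callback(
    $pattern,
    fn (array $m) => ': <span class="hl-null">' . $m['match'] . '</span>',
    $json,
);

echo $html, PHP_EOL;
```

Because the regex expects a colon and a single space before null, values written as "key":null or null inside an array would not be matched by this pattern.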

Next, let's add this pattern in our newly created ExtendedJsonLanguage:

class ExtendedJsonLanguage extends JsonLanguage
{
    public function getPatterns(): array
    {
        return [
            ...parent::getPatterns(),
            new JsonNullPattern(),
        ];
    }
}

Finally, register ExtendedJsonLanguage into the highlighter:

$highlighter->addLanguage(new ExtendedJsonLanguage());

Note that, because we extended JsonLanguage, this language will target all code blocks tagged as json. You could provide a different name, if you want to make a distinction between the default implementation and yours (this is what's happening on this page):

class ExtendedJsonLanguage extends JsonLanguage
{
    public function getName(): string
    {
        return 'json_extended';
    }

    // …
}

There we have it!

{
    "null-property": null,
    "value-property": "value"
}

You can add as many patterns as you like. You can even make your own TokenType implementation if you don't want to rely on DynamicTokenType:

enum ExtendedTokenType: string implements TokenType
{
    case VALUE_NULL = 'null';
    case VALUE_TRUE = 'true';
    case VALUE_FALSE = 'false';

    public function getValue(): string
    {
        return $this->value;
    }

    public function canContain(TokenType $other): bool
    {
        return false;
    }
}

Opt-in features

tempest/highlight has a couple of opt-in features, if you need them.

Markdown support

composer require league/commonmark

use League\CommonMark\Environment\Environment;
use League\CommonMark\Extension\CommonMark\CommonMarkCoreExtension;
use League\CommonMark\MarkdownConverter;
use Tempest\Highlight\CommonMark\HighlightExtension;

$environment = new Environment();

$environment
    ->addExtension(new CommonMarkCoreExtension())
    ->addExtension(new HighlightExtension(/* You can manually pass in a configured highlighter as well */));

$markdown = new MarkdownConverter($environment);

Word complexity

Ellison is a simple library that helps identify complex sentences and poor word choices. It uses similar heuristics to Hemingway, but it doesn't include any calls to third-party APIs or LLMs. Just a bit of PHP:

The app highlights lengthy, complex sentences and common errors; if you see a yellow sentence, shorten or split it. If you see a red highlight, your sentence is so dense and complicated that your readers will get lost trying to follow its meandering, splitting logic — try editing this sentence to remove the red. 

You can utilize a shorter word in place of a purple one. Click on highlights to fix them. 

Adverbs and weakening phrases are helpfully shown in blue. Get rid of them and pick words with force, perhaps. 

Phrases in green have been marked to show passive voice. 

You can enable Ellison support by installing assertchris/ellison:

composer require assertchris/ellison

You'll have to add some additional CSS classes to your stylesheet as well:

.hl-moderate-sentence {
    background-color: #fef9c3;
}

.hl-complex-sentence {
    background-color: #fee2e2;
}

.hl-adverb-phrase {
    background-color: #e0f2fe;
}

.hl-passive-phrase {
    background-color: #dcfce7;
}

.hl-complex-phrase {
    background-color: #f3e8ff;
}

.hl-qualified-phrase {
    background-color: #f1f5f9;
}

pre[data-lang="ellison"] {
    text-wrap: wrap;
}

The ellison language is now available:

```ellison
Hello world!
```

You can play around with it here.