evfmt is an opinionated formatter for Unicode presentation selectors (U+FE0E and U+FE0F).
The name stands for “emoji variation formatter”. It is a command-line formatter and a Rust library.
This project was developed with AI assistance, guided by detailed design documents and substantial testing.
evfmt is ready for normal formatter use. Core commands such as evfmt format and evfmt check, documented exit codes, and the formatter's hard invariants are intended to remain stable.
The Rust library APIs and some advanced CLI use are still experimental. evfmt follows Cargo's SemVer compatibility conventions.
Many Unicode characters have dual presentations, text and emoji:
| text presentation | emoji presentation | |
|---|---|---|
| U+0023 | # | #️ |
| U+00A9 | ©︎ | ©️ |
| U+26A0 | ⚠︎ | |
| U+2764 | ❤︎ | ❤️ |
Unicode provides invisible presentation selectors (U+FE0E for text, U+FE0F for emoji) to request a specific presentation (though platforms may not always honor the request). These two characters (U+FE0E and U+FE0F) are variation selectors. Unicode Technical Standard #51 also calls them presentation selectors in the emoji context, and this document follows that convention. (The ev in evfmt stands for emoji variation, after the emoji variation sequences that these selectors produce.)
Each character can therefore appear in three forms: bare (no selector), text (U+FE0E), or emoji (U+FE0F). Without explicit selectors, certain dual-presentation characters may look different on different platforms. On the other hand, selectors are considered redundant or even defective in other contexts. evfmt normalizes these selectors for you.
The emoji selector U+FE0F also appears in multi-character emoji sequences such as keycaps and Emoji ZWJ sequences (where multiple emoji are joined into one). evfmt normalizes selector usage in these sequences as well.
Different platforms can render the same character differently when presentation selectors are missing or ambiguous. evfmt produces a canonical source spelling that reduces this cross-platform inconsistency:
- Chooses a deterministic form—bare, text, or emoji—for each character with dual presentations
- Preserves all emoji sequences that are recommended for general interchange (RGI)
- Removes stray selectors in unsupported positions
- Respects
.gitignoreand.evfmtignore
Hard invariants: evfmt is idempotent, deterministic, and only modifies presentation selectors—no other content is touched.
Install the CLI from crates.io:
cargo install evfmt --lockedAdd the library to another crate with cargo add:
cargo add evfmtMinimum supported Rust version (MSRV): Rust 1.88
If you are working from a local checkout, you can also install it with:
cargo install --path evfmt# Format one file.
evfmt format README.markdown# Format a group of files.
evfmt format docs/*.md# Format files under the current directory recursively.
evfmt format .# A bare heart (U+2764) becomes the emoji-form heart by default.
# The first command prints the same string as the second command: Love ❤️
printf '%b' 'Love \u2764' | evfmt format
printf '%b' 'Love \u2764\ufe0f'Use - as an explicit stdin operand when mixing stdin with files. A path such as ./- still means a file literally named -. Repeating - is allowed and reads the same stdin stream again from its current position; with piped input, the first - normally consumes the stream.
evfmt format a.txt - b.txt
evfmt format ./-
printf '%b' 'Love \u2764' | evfmt check -# Check without modifying (exits 1 if changes are needed)
evfmt check .With no file operands, evfmt check checks stdin.
# If a file name looks like an option, add `--` before file operands.
evfmt format -- --set-ignore
evfmt check -- --set-ignore| Code | Meaning |
|---|---|
0 |
Success (or no changes in check mode) |
1 |
Changes needed (check mode only) |
2 |
Error (I/O, invalid UTF-8, usage) |
Emoji ZWJ sequences are sequences of multiple emoji characters joined by the zero-width joiner (ZWJ; U+200D). For example, the rainbow flag 🏳️🌈 joins the white flag 🏳️ and the rainbow 🌈. evfmt normalizes each component in a ZWJ sequence as if that component appeared without the surrounding ZWJ links.
Keycap sequences combine a base character (0–9, #, or *) with the combining enclosing keycap (U+20E3) to produce keycap buttons like 1️⃣ and #️⃣. The base character can appear bare, with a text selector (U+FE0E), or with an emoji selector (U+FE0F) before the keycap mark. Historically, bare keycap sequences were used for both text and emoji presentations, which made them ambiguous. evfmt normalizes bare keycap sequences to explicit text forms by default. Explicit text and emoji selectors in keycap forms are preserved.
The options below are for projects that need to tune traversal or presentation policy beyond the default formatter behavior.
By default, evfmt enables all ignore filters: it skips files ignored by Git, files matched by .evfmtignore, and hidden files or directories. Change the enabled ignore filters only when you have a specific reason to include or exclude one of those classes.
| Option | Effect |
|---|---|
--set-ignore=<filter>[,<filter>...] |
Replace the set |
--add-ignore=<filter>[,<filter>...] |
Add to the set |
--remove-ignore=<filter>[,<filter>...] |
Remove from the set |
Ignore flags take one or more comma-separated filter labels:
| Label | Meaning |
|---|---|
git |
Files matched by Git ignore rules |
evfmt |
Files matched by .evfmtignore |
hidden |
Hidden files and directories |
Use commas to combine labels: git,evfmt,hidden.
Use all by itself to select every ignore filter; this is the default. For example, --remove-ignore=all formats everything reachable from the operands, including Git-ignored, .evfmtignore-ignored, and hidden files. Using none by itself with --set-ignore also disables all ignore filters.
Use this when you want to format hidden files while still honoring Git ignore rules and .evfmtignore:
evfmt format --remove-ignore=hidden .By default, evfmt leaves bare ASCII characters and Unicode emoji-default characters alone, while text-default non-ASCII characters with dual presentations get an explicit emoji selector. For example, # and bare sparkles (U+2728) stay bare, while a bare copyright sign (U+00A9) normalizes to ©️ (U+00A9 U+FE0F).
evfmt is a formatter, not a presentation editor. If you want to change how the copyright sign looks on your platform—say, switching it from emoji presentation to text presentation—do that in your editor by adding or removing the presentation selector (U+FE0E or U+FE0F). Run evfmt only after you are happy with how your document renders.
Use these recipes when the default policy is close to what you want, but a small class of symbols needs different handling.
Use this when you want explicit emoji selectors on emoji-default characters for more uniform rendering across platforms:
evfmt format \
--remove-prefer-bare-for=emoji-default \
README.markdownWith that option, bare emoji-default characters normalize to explicit emoji forms. Explicit text forms such as ⚡︎ stay text.
Use this when rights marks, arrows, and card suits already render as text-style symbols, and you want explicit text selectors for portability:
evfmt format \
--add-bare-as-text=rights-marks,arrows,card-suits \
README.markdownWith that option, bare rights marks, arrows, and card suits normalize to explicit text forms such as ©︎, ®︎, ™︎, ➡︎, and ♠︎. Explicit emoji-form symbols such as ©️, ➡️, and ♠️ stay emoji.
Use this when copyright and trademark-style marks already look like text on your reference platform, and you want their text presentation to stay bare in your files:
evfmt format \
--add-bare-as-text=rights-marks \
--add-prefer-bare=rights-marks \
README.markdownWith those options, bare or text-form copyright-style marks normalize to bare copyright-style marks. Explicit emoji-form marks such as ©️ stay emoji.
Use this when existing text contains bare keycap sequences such as 1 + U+20E3 and # + U+20E3, and you want them treated as emoji keycaps rather than text-style keycaps. Some older emoji mappings used these bare keycap sequences as emoji.
evfmt format \
--remove-bare-as-text=keycap-emojis \
README.markdownWith that option, bare keycap sequences normalize to explicit emoji forms such as 1️⃣ and #️⃣. Explicit text-form keycaps such as 1︎⃣ stay text.
The policy is shaped by two choices: how bare characters render on your reference platform, and which bare characters are stable enough to keep bare in your files. A reference platform is the environment whose bare-character rendering you are using as your baseline, usually the editor, terminal, or browser where you review the formatted text.
The CLI exposes those choices as two mutable sets:
bare-as-text: Which variation positions the reference platform shows as text when bare. Many modern platforms show bare non-ASCII, non-keycap characters as emoji, so the default set isascii,keycap-chars.prefer-bare: Among characters that can stay bare without changing their appearance on the reference platform, which ones should stay bare rather than getting an explicit selector. The default set isascii,emoji-defaults, so characters with default emoji presentation in Unicode stay bare, while text-default non-ASCII characters still get explicit selectors.
To choose the right policy, first decide whether a character's bare form looks like text or emoji on your reference platform. Put it in bare-as-text if the bare form looks like text. Then decide whether the character should stay bare in the files you publish, as long as doing so preserves the intended presentation. Put it in prefer-bare if bare spelling is stable enough for your target platforms.
The two choices determine how evfmt repairs each ambiguous standalone variation position:
| If a character is... | evfmt does this |
|---|---|
in bare-as-text and prefer-bare |
changes explicit text to bare; leaves others alone |
in bare-as-text only |
changes bare to explicit text; leaves others alone |
in prefer-bare only |
changes explicit emoji to bare; leaves others alone |
| in neither set | changes bare to explicit emoji; leaves others alone |
With the default sets bare-as-text = ascii,keycap-chars and prefer-bare = ascii,emoji-defaults, ASCII bare forms and emoji-default bare forms stay bare, text-default non-ASCII bare forms get explicit emoji selectors, and bare keycap-character forms get explicit text selectors.
Use these flags to update the policy sets. Each flag takes one or more comma-separated variation sets:
To update the bare-as-text set:
| Option | Effect |
|---|---|
--set-bare-as-text=<set>[,<set>...] |
Replace the set |
--add-bare-as-text=<set>[,<set>...] |
Add to the set |
--remove-bare-as-text=<set>[,<set>...] |
Remove from the set |
To update the prefer-bare set:
| Option | Effect |
|---|---|
--set-prefer-bare=<set>[,<set>...] |
Replace the set |
--add-prefer-bare=<set>[,<set>...] |
Add to the set |
--remove-prefer-bare=<set>[,<set>...] |
Remove from the set |
The policy sets start as prefer-bare = ascii,emoji-defaults and bare-as-text = ascii,keycap-chars, and flags are processed from left to right. set-* replaces the current set, add-* unions items into it, and remove-* subtracts items from it.
Each <set> in the list can be either one of the named sets below, one code point with u(HEX), such as u(00A9), or one character, such as #, *, or ©️. Presentation selectors are ignored when matching a single character. Except for sets whose names start with keycap-, named sets apply only to ordinary non-keycap positions.
| Set | Meaning |
|---|---|
ascii |
Dual-presentation characters in the ASCII range, such as #, *, and digits |
text-defaults |
Dual-presentation characters whose bare form defaults to text presentation in Unicode |
emoji-defaults |
Dual-presentation characters whose bare form defaults to emoji presentation in Unicode |
rights-marks |
Copyright and registered/trademark-style marks with dual presentations |
arrows |
Arrow symbols with dual presentations |
card-suits |
Card suit symbols with dual presentations |
keycap-chars |
Dual-presentation characters in keycap-character positions |
non-keycap-chars |
Dual-presentation characters in ordinary non-keycap positions |
keycap-emojis |
Emoji keycap bases (#, *, 0–9) in keycap-character positions |
The meaning of named sets may change as Unicode adds or revises dual-presentation characters.
Use all by itself to select every variation position evfmt can format. For example, --remove-prefer-bare=all makes every format-supported position require an explicit selector. Use none by itself with --set-* policy flags to clear that policy set. For example, --set-prefer-bare=none stops keeping any character bare just because it was in prefer-bare; with the default bare-as-text set, bare ASCII and bare keycap-character forms then normalize to explicit text form and ordinary bare non-ASCII normalizes to explicit emoji form.
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or https://opensource.org/licenses/MIT)
at your option.
Please report bugs, regressions, and feature requests in the issue tracker.