Yosina 1.1.1
dotnet add package Yosina --version 1.1.1
NuGet\Install-Package Yosina -Version 1.1.1
<PackageReference Include="Yosina" Version="1.1.1" />
<PackageVersion Include="Yosina" Version="1.1.1" />
<PackageReference Include="Yosina" />
paket add Yosina --version 1.1.1
#r "nuget: Yosina, 1.1.1"
#:package Yosina@1.1.1
#addin nuget:?package=Yosina&version=1.1.1
#tool nuget:?package=Yosina&version=1.1.1
Yosina C#
A C# port of the Yosina Japanese text transliteration library.
Overview
Yosina is a transliteration library that specifically deals with the letters and symbols used in Japanese writing.
Usage
Using Recipes (Recommended)
using Yosina;
// Create a recipe with desired transformations
var recipe = new TransliterationRecipe
{
KanjiOldNew = true,
ReplaceSpaces = true,
ReplaceSuspiciousHyphensToProlongedSoundMarks = true,
ReplaceCircledOrSquaredCharacters = TransliterationRecipe.ReplaceCircledOrSquaredCharactersOptions.Enabled,
ReplaceCombinedCharacters = true,
HiraKata = "hira-to-kata", // Convert hiragana to katakana
ReplaceJapaneseIterationMarks = true, // Expand iteration marks
ToFullwidth = TransliterationRecipe.ToFullwidthOptions.Enabled
};
// Create the transliterator
var transliterator = Entrypoint.MakeTransliterator(recipe);
// Use it with various special characters
var input = "①②③ ⒶⒷⒸ ㍿㍑㌠㋿"; // circled numbers, letters, ideographic space, combined characters
var result = transliterator(input);
Console.WriteLine(result); // "(1)(2)(3) (A)(B)(C) 株式会社リットルサンチーム令和"
// Convert old kanji to new
var oldKanji = "舊字體";
result = transliterator(oldKanji);
Console.WriteLine(result); // "旧字体"
// Convert half-width katakana to full-width
var halfWidth = "テストモジレツ";
result = transliterator(halfWidth);
Console.WriteLine(result); // "テストモジレツ"
// Demonstrate hiragana to katakana conversion with iteration marks
var mixedText = "学問のすゝめ";
result = transliterator(mixedText);
Console.WriteLine(result); // "学問ノススメ"
Using Direct Configuration
using Yosina;
// Configure with direct transliterator configs
var configs = new[]
{
new TransliteratorConfig("kanji-old-new"),
new TransliteratorConfig("spaces"),
new TransliteratorConfig("prolonged-sound-marks"),
new TransliteratorConfig("circled-or-squared"),
new TransliteratorConfig("combined")
};
var transliterator = Entrypoint.MakeTransliterator(configs);
var result = transliterator("some japanese text");
Console.WriteLine(result);
Advanced Usage with Options
using Yosina;
using Yosina.Transliterators;
// Use with specific options
var jisTransliterator = Entrypoint.MakeTransliterator(
new TransliteratorConfig("jisx0201-and-alike", new Jisx0201AndAlikeTransliterator.Options
{
FullwidthToHalfwidth = true,
ConvertGL = true
})
);
var jisResult = jisTransliterator("Hello World!"); // Full-width ASCII
Console.WriteLine(jisResult); // "Hello World!"
Available Transliterators
1. Circled or Squared (circled-or-squared)
Converts circled or squared characters to their plain equivalents.
- Options:
templates(custom rendering),includeEmojis(include emoji characters) - Example:
①②③→(1)(2)(3),㊙㊗→(秘)(祝)
2. Combined (combined)
Expands combined characters into their individual character sequences.
- Example:
㍻(Heisei era) →平成,㈱→(株)
3. Hiragana-Katakana Composition (hira-kata-composition)
Combines decomposed hiraganas and katakanas into composed equivalents.
- Options:
composeNonCombiningMarks(compose non-combining marks) - Example:
か + ゙→が,ヘ + ゜→ペ
4. Hiragana-Katakana (hira-kata)
Converts between hiragana and katakana scripts bidirectionally.
- Options:
mode("hira-to-kata" or "kata-to-hira") - Example:
ひらがな→ヒラガナ(hira-to-kata)
5. Hyphens (hyphens)
Replaces various dash/hyphen symbols with common ones used in Japanese.
- Options:
precedence(mapping priority order) - Available mappings: "ascii", "jisx0201", "jisx0208_90", "jisx0208_90_windows", "jisx0208_verbatim"
- Example:
2019—2020(em dash) →2019-2020
6. Ideographic Annotations (ideographic-annotations)
Replaces ideographic annotations used in traditional Chinese-to-Japanese translation.
- Example:
㆖㆘→上下
7. IVS-SVS Base (ivs-svs-base)
Handles Ideographic and Standardized Variation Selectors.
- Options:
charset,mode("ivs-or-svs" or "base"),preferSVS,dropSelectorsAltogether - Example:
葛󠄀(葛 + IVS) →葛
8. Japanese Iteration Marks (japanese-iteration-marks)
Expands iteration marks by repeating the preceding character.
- Example:
時々→時時,いすゞ→いすず
9. JIS X 0201 and Alike (jisx0201-and-alike)
Handles half-width/full-width character conversion.
- Options:
fullwidthToHalfwidth,convertGL(alphanumerics/symbols),convertGR(katakana),u005cAsYenSign - Example:
ABC123→ABC123,カタカナ→カタカナ
10. Kanji Old-New (kanji-old-new)
Converts old-style kanji (旧字体) to modern forms (新字体).
- Example:
舊字體の變換→旧字体の変換
11. Mathematical Alphanumerics (mathematical-alphanumerics)
Normalizes mathematical alphanumeric symbols to plain ASCII.
- Example:
𝐀𝐁𝐂(mathematical bold) →ABC
12. Prolonged Sound Marks (prolonged-sound-marks)
Handles contextual conversion between hyphens and prolonged sound marks.
- Options:
skipAlreadyTransliteratedChars,allowProlongedHatsuon,allowProlongedSokuon,replaceProlongedMarksFollowingAlnums - Example:
イ−ハト−ヴォ(with hyphen) →イーハトーヴォ(prolonged mark)
13. Radicals (radicals)
Converts CJK radical characters to their corresponding ideographs.
- Example:
⾔⾨⾷(Kangxi radicals) →言門食
14. Spaces (spaces)
Normalizes various Unicode space characters to standard ASCII space.
- Example:
A B(ideographic space) →A B
15. Roman Numerals (roman-numerals)
Converts Unicode Roman numeral characters to their ASCII letter equivalents.
- Example:
Ⅰ Ⅱ Ⅲ→I II III,ⅰ ⅱ ⅲ→i ii iii
16. Small Hirakatas (small-hirakatas)
Converts small hiragana and katakana characters to their ordinary-sized equivalents.
- Example:
ぁぃぅ→あいう,ァィゥ→アイウ
17. Archaic Hirakatas (archaic-hirakatas)
Converts archaic kana (hentaigana) to their modern hiragana or katakana equivalents.
- Example:
𛀁→え
18. Historical Hirakatas (historical-hirakatas)
Converts historical hiragana and katakana characters to their modern equivalents.
- Options:
hiraganas("simple", "decompose", or "skip"),katakanas("simple", "decompose", or "skip"),voicedKatakanas("decompose" or "skip") - Example:
ゐ→い(simple),ゐ→うぃ(decompose),ヰ→イ(simple)
Development
Projects
- Yosina: Main library containing core transliterators and generated code
- Yosina.Tests: Unit tests for the library
- Yosina.Codegen: Code generation tool for creating transliterators from JSON data
Code Generation
The C# implementation includes a code generation system that creates transliterator classes from JSON data files in the ../data directory.
Running Code Generation
dotnet run --project src/Yosina.Codegen
Building and Testing
Build the solution:
dotnet build
Run tests:
dotnet test
Create NuGet package:
dotnet pack
Format Code
dotnet format
Linting
dotnet format --verify-no-changes
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
-
net8.0
- No dependencies.
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.