Introduces input macros #399

Merged (7 commits) on Dec 11, 2023
4 changes: 4 additions & 0 deletions McBopomofo.xcodeproj/project.pbxproj
@@ -46,6 +46,7 @@
D427F7B4279086DC004A2160 /* InputSourceHelper in Frameworks */ = {isa = PBXBuildFile; productRef = D427F7B3279086DC004A2160 /* InputSourceHelper */; };
D427F7B6279086F6004A2160 /* InputSourceHelper in Frameworks */ = {isa = PBXBuildFile; productRef = D427F7B5279086F6004A2160 /* InputSourceHelper */; };
D427F7C127908EFC004A2160 /* OpenCCBridge in Frameworks */ = {isa = PBXBuildFile; productRef = D427F7C027908EFC004A2160 /* OpenCCBridge */; };
D43FC40B2B23788400ED5A1C /* InputMacro.swift in Sources */ = {isa = PBXBuildFile; fileRef = D43FC40A2B23788400ED5A1C /* InputMacro.swift */; };
D44FB74527915565003C80A6 /* Preferences.swift in Sources */ = {isa = PBXBuildFile; fileRef = D44FB74427915555003C80A6 /* Preferences.swift */; };
D44FB74A2791B829003C80A6 /* VXHanConvert in Frameworks */ = {isa = PBXBuildFile; productRef = D44FB7492791B829003C80A6 /* VXHanConvert */; };
D44FB74D2792189A003C80A6 /* PhraseReplacementMap.cpp in Sources */ = {isa = PBXBuildFile; fileRef = D44FB74B2792189A003C80A6 /* PhraseReplacementMap.cpp */; };
@@ -167,6 +168,7 @@
D427F7AC27907B7E004A2160 /* NotifierUI */ = {isa = PBXFileReference; lastKnownFileType = wrapper; name = NotifierUI; path = Packages/NotifierUI; sourceTree = "<group>"; };
D427F7B2279086B5004A2160 /* InputSourceHelper */ = {isa = PBXFileReference; lastKnownFileType = wrapper; name = InputSourceHelper; path = Packages/InputSourceHelper; sourceTree = "<group>"; };
D427F7BF27908EAC004A2160 /* OpenCCBridge */ = {isa = PBXFileReference; lastKnownFileType = wrapper; name = OpenCCBridge; path = Packages/OpenCCBridge; sourceTree = "<group>"; };
D43FC40A2B23788400ED5A1C /* InputMacro.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = InputMacro.swift; sourceTree = "<group>"; };
D44FB74427915555003C80A6 /* Preferences.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = Preferences.swift; sourceTree = "<group>"; };
D44FB7482791B346003C80A6 /* VXHanConvert */ = {isa = PBXFileReference; lastKnownFileType = wrapper; name = VXHanConvert; path = Packages/VXHanConvert; sourceTree = "<group>"; };
D44FB74B2792189A003C80A6 /* PhraseReplacementMap.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; path = PhraseReplacementMap.cpp; sourceTree = "<group>"; };
@@ -278,6 +280,7 @@
D4E569DB27A34CC100AC2CEF /* KeyHandler.mm */,
6AE215102A2849BB005A6A02 /* UTF8Helper.cpp */,
6AE2150F2A2849BB005A6A02 /* UTF8Helper.h */,
D43FC40A2B23788400ED5A1C /* InputMacro.swift */,
D456576D279E4F7B00DF6BC9 /* KeyHandlerInput.swift */,
D461B791279DAC010070E734 /* InputState.swift */,
D427F76B278CA1BA004A2160 /* AppDelegate.swift */,
@@ -671,6 +674,7 @@
6A0D4F4515FC0EB100ABF4B3 /* Mandarin.cpp in Sources */,
6ACC3D452793701600F1B140 /* ParselessLM.cpp in Sources */,
D41355DE278EA3ED005E5CBD /* UserPhrasesLM.cpp in Sources */,
D43FC40B2B23788400ED5A1C /* InputMacro.swift in Sources */,
6AE215112A2849BB005A6A02 /* UTF8Helper.cpp in Sources */,
6ACC3D3F27914F2400F1B140 /* KeyValueBlobReader.cpp in Sources */,
B058C5272AC9DF51002EDD66 /* ServiceProvider.swift in Sources */,
39 changes: 39 additions & 0 deletions Source/Data/Macros.txt
@@ -0,0 +1,39 @@
今天日期 ㄐㄧㄣ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TODAY_SHORT ㄐㄧㄣ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TODAY_MEDIUM ㄐㄧㄣ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TODAY_MEDIUM_ROC ㄐㄧㄣ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TODAY_MEDIUM_CHINESE ㄐㄧㄣ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
昨天日期 ㄗㄨㄛˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_YESTERDAY_SHORT ㄗㄨㄛˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_YESTERDAY_MEDIUM ㄗㄨㄛˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_YESTERDAY_MEDIUM_ROC ㄗㄨㄛˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_YESTERDAY_MEDIUM_CHINESE ㄗㄨㄛˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
明天日期 ㄇㄧㄥˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TOMORROW_SHORT ㄇㄧㄥˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TOMORROW_MEDIUM ㄇㄧㄥˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TOMORROW_MEDIUM_ROC ㄇㄧㄥˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
MACRO@DATE_TOMORROW_MEDIUM_CHINESE ㄇㄧㄥˊ-ㄊㄧㄢ-ㄖˋ-ㄑㄧˊ -8
現在時刻 ㄒㄧㄢˋ-ㄗㄞˋ-ㄕˊ-ㄎㄜˋ -8
MACRO@TIME_NOW_SHORT ㄒㄧㄢˋ-ㄗㄞˋ-ㄕˊ-ㄎㄜˋ -8
MACRO@TIME_NOW_MEDIUM ㄒㄧㄢˋ-ㄗㄞˋ-ㄕˊ-ㄎㄜˋ -8
現在時間 ㄒㄧㄢˋ-ㄗㄞˋ-ㄕˊ-ㄐㄧㄢ -8
MACRO@TIME_NOW_SHORT ㄒㄧㄢˋ-ㄗㄞˋ-ㄕˊ-ㄐㄧㄢ -8
MACRO@TIME_NOW_MEDIUM ㄒㄧㄢˋ-ㄗㄞˋ-ㄕˊ-ㄐㄧㄢ -8
目前時區 ㄇㄨˋ-ㄑㄧㄢˊ-ㄕˊ-ㄑㄩ -8
MACRO@TIMEZONE_STANDARD ㄇㄨˋ-ㄑㄧㄢˊ-ㄕˊ-ㄑㄩ -8
MACRO@TIMEZONE_GENERIC_SHORT ㄇㄨˋ-ㄑㄧㄢˊ-ㄕˊ-ㄑㄩ -8
所在時區 ㄙㄨㄛˇ-ㄗㄞˋ-ㄕˊ-ㄑㄩ -8
MACRO@TIMEZONE_STANDARD ㄙㄨㄛˇ-ㄗㄞˋ-ㄕˊ-ㄑㄩ -8
MACRO@TIMEZONE_GENERIC_SHORT ㄙㄨㄛˇ-ㄗㄞˋ-ㄕˊ-ㄑㄩ -8
今年干支 ㄐㄧㄣ-ㄋㄧㄢˊ-ㄍㄢ-ㄓ -8
MACRO@THIS_YEAR_GANZHI ㄐㄧㄣ-ㄋㄧㄢˊ-ㄍㄢ-ㄓ -8
去年干支 ㄑㄩˋ-ㄋㄧㄢˊ-ㄍㄢ-ㄓ -8
MACRO@LAST_YEAR_GANZHI ㄑㄩˋ-ㄋㄧㄢˊ-ㄍㄢ-ㄓ -8
明年干支 ㄇㄧㄥˊ-ㄋㄧㄢˊ-ㄍㄢ-ㄓ -8
MACRO@NEXT_YEAR_GANZHI ㄇㄧㄥˊ-ㄋㄧㄢˊ-ㄍㄢ-ㄓ -8
今年生肖 ㄐㄧㄣ-ㄋㄧㄢˊ-ㄕㄥ-ㄒㄧㄠˋ -8
MACRO@THIS_YEAR_CHINESE_ZODIAC ㄐㄧㄣ-ㄋㄧㄢˊ-ㄕㄥ-ㄒㄧㄠˋ -8
去年生肖 ㄑㄩˋ-ㄋㄧㄢˊ-ㄕㄥ-ㄒㄧㄠˋ -8
MACRO@LAST_YEAR_CHINESE_ZODIAC ㄑㄩˋ-ㄋㄧㄢˊ-ㄕㄥ-ㄒㄧㄠˋ -8
明年生肖 ㄇㄧㄥˊ-ㄋㄧㄢˊ-ㄕㄥ-ㄒㄧㄠˋ -8
MACRO@NEXT_YEAR_CHINESE_ZODIAC ㄇㄧㄥˊ-ㄋㄧㄢˊ-ㄕㄥ-ㄒㄧㄠˋ -8
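
Each row in Macros.txt has three space-separated fields: a value (either a literal phrase or a `MACRO@…` placeholder), a Bopomofo reading, and a score. The code that expands the placeholders lives in InputMacro.swift, which is not shown in this excerpt. As a rough sketch only, the Python below splits such a row and expands the year-based macros with the usual sexagenary-cycle arithmetic. The names `parse_macro_row`, `ganzhi`, `chinese_zodiac`, and `expand_year_macro` are hypothetical. The sketch also assumes the Gregorian year is used directly (ignoring the lunar new year boundary) and that the output is the bare characters; the actual Swift implementation may format things differently.

```python
# Hypothetical sketch; not the project's InputMacro.swift.
HEAVENLY_STEMS = "甲乙丙丁戊己庚辛壬癸"
EARTHLY_BRANCHES = "子丑寅卯辰巳午未申酉戌亥"
ZODIAC_ANIMALS = "鼠牛虎兔龍蛇馬羊猴雞狗豬"


def parse_macro_row(line):
    """Split one data row into (value, reading, score)."""
    row = line.rstrip().split(" ")
    if len(row) != 3:
        raise ValueError(f"expected 3 fields, got {row!r}")
    value, reading, score = row
    return value, reading, float(score)


def ganzhi(year):
    """Stem-branch name of a Gregorian year (4 CE counts as 甲子)."""
    offset = year - 4
    return HEAVENLY_STEMS[offset % 10] + EARTHLY_BRANCHES[offset % 12]


def chinese_zodiac(year):
    """Zodiac animal of a Gregorian year, aligned with the earthly branch."""
    return ZODIAC_ANIMALS[(year - 4) % 12]


# Assumed mapping: macro name -> (year offset, formatter).
YEAR_MACROS = {
    "MACRO@THIS_YEAR_GANZHI": (0, ganzhi),
    "MACRO@LAST_YEAR_GANZHI": (-1, ganzhi),
    "MACRO@NEXT_YEAR_GANZHI": (1, ganzhi),
    "MACRO@THIS_YEAR_CHINESE_ZODIAC": (0, chinese_zodiac),
    "MACRO@LAST_YEAR_CHINESE_ZODIAC": (-1, chinese_zodiac),
    "MACRO@NEXT_YEAR_CHINESE_ZODIAC": (1, chinese_zodiac),
}


def expand_year_macro(value, this_year):
    """Expand a known year macro; return any other value unchanged."""
    if value not in YEAR_MACROS:
        return value
    year_offset, formatter = YEAR_MACROS[value]
    return formatter(this_year + year_offset)
```

Passing the current year in as a parameter, rather than reading the clock inside the function, keeps the sketch deterministic and easy to test.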
6 changes: 3 additions & 3 deletions Source/Data/Makefile
@@ -8,9 +8,9 @@ data-plain-bpmf.txt: bin/cook-plain-bpmf.py BPMFBase.txt BPMFPunctuations.txt
bin/cook-plain-bpmf.py BPMFBase.txt BPMFPunctuations.txt data-plain-bpmf.txt

data.txt: bin/cook.py BPMFBase.txt BPMFMappings.txt BPMFPunctuations.txt \
- PhraseFreq.txt phrase.occ Symbols.txt \
- heterophony1.list heterophony2.list heterophony3.list
- bin/cook.py PhraseFreq.txt BPMFMappings.txt BPMFBase.txt BPMFPunctuations.txt Symbols.txt $@
+ PhraseFreq.txt phrase.occ Symbols.txt Macros.txt\
+ heterophony1.list heterophony2.list heterophony3.list
+ bin/cook.py PhraseFreq.txt BPMFMappings.txt BPMFBase.txt BPMFPunctuations.txt Symbols.txt Macros.txt $@

PhraseFreq.txt: bin/buildFreq.py phrase.occ exclusion.txt
bin/buildFreq.py
2 changes: 1 addition & 1 deletion Source/Data/Symbols.txt
@@ -18,8 +18,8 @@
☢ ㄈㄨˊ-ㄕㄜˋ-ㄒㄧㄥˋ -8
≥ ㄉㄚˋ-ㄩˊ-ㄉㄥˇ-ㄩˊ -8
🉐 ㄉㄜˊ -8
▼ ㄉㄠˋ-ㄙㄢ-ㄐㄧㄠˇ -8
℡ ㄉㄧㄢˋ-ㄏㄨㄚˋ -8
▼ ㄉㄠˋ-ㄙㄢ-ㄐㄧㄠˇ -8
㏒ ㄉㄨㄟˋ-ㄕㄨˋ -8
♏ ㄊㄧㄢ-ㄒㄧㄝ -8
♎ ㄊㄧㄢ-ㄔㄥˋ -8
6 changes: 6 additions & 0 deletions Source/Data/bin/cook.py
@@ -200,6 +200,12 @@
assert len(row) == 3, row
output.append(tuple(row))

with open(sys.argv[6]) as macro_file:
for line in macro_file:
row = line.rstrip().split(" ")
assert len(row) == 3, row
output.append(tuple(row))

output = convert_vks_rows_to_sorted_kvs_rows(output)
with open(sys.argv[-1], "w") as fout:
fout.write(HEADER)
11 changes: 10 additions & 1 deletion Source/Engine/McBopomofoLM.cpp
@@ -176,6 +176,10 @@ void McBopomofoLM::setExternalConverter(std::function<std::string(std::string)>
m_externalConverter = externalConverter;
}

void McBopomofoLM::setMacroConverter(std::function<std::string(std::string)> macroConverter) {
m_macroConverter = macroConverter;
}

std::vector<Formosa::Gramambular2::LanguageModel::Unigram> McBopomofoLM::filterAndTransformUnigrams(const std::vector<Formosa::Gramambular2::LanguageModel::Unigram> unigrams, const std::unordered_set<std::string>& excludedValues, std::unordered_set<std::string>& insertedValues)
{
std::vector<Formosa::Gramambular2::LanguageModel::Unigram> results;
@@ -189,17 +193,22 @@ std::vector<Formosa::Gramambular2::LanguageModel::Unigram> McBopomofoLM::filterA
}

std::string value = originalValue;

if (m_phraseReplacementEnabled) {
std::string replacement = m_phraseReplacement.valueForKey(value);
if (!replacement.empty()) {
value = replacement;
}
}
if (m_macroConverter) {
std::string replacement = m_macroConverter(value);
value = replacement;
}
if (m_externalConverterEnabled && m_externalConverter) {
std::string replacement = m_externalConverter(value);
value = replacement;
}
- if (insertedValues.find(value) == insertedValues.end()) {
+ if (!value.empty() && insertedValues.find(value) == insertedValues.end()) {
results.emplace_back(value, unigram.score());
insertedValues.insert(value);
}
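
The hunk above applies three optional transformations in order (phrase replacement, macro conversion, external conversion) and then keeps a value only if it is non-empty and not already inserted. The following is a small Python model of that flow, not the C++ implementation: the function and parameter names are hypothetical, the exclusion check at the top is inferred from the `excludedValues` parameter, and the demo converter that returns an empty string for an unknown macro is an assumption made to show why the new empty-value check matters. The date string in the demo is made up.

```python
# Hypothetical Python model of the filter-and-transform step.
def filter_and_transform(unigrams, excluded_values, inserted_values,
                         phrase_replacement=None,
                         macro_converter=None,
                         external_converter=None):
    """Return (value, score) pairs after conversion, dropping empties and duplicates."""
    results = []
    for original_value, score in unigrams:
        if original_value in excluded_values:
            continue
        value = original_value
        # 1. Phrase replacement only applies when a non-empty mapping exists.
        if phrase_replacement is not None:
            replacement = phrase_replacement.get(value, "")
            if replacement:
                value = replacement
        # 2. The macro converter always overwrites the value with its result.
        if macro_converter is not None:
            value = macro_converter(value)
        # 3. So does the external converter.
        if external_converter is not None:
            value = external_converter(value)
        # 4. Skip empty results and values that were already emitted.
        if value and value not in inserted_values:
            results.append((value, score))
            inserted_values.add(value)
    return results


def demo_macro_converter(value):
    """Assumed behavior: known macros expand, unknown macros become ''."""
    table = {"MACRO@DATE_TODAY_SHORT": "2023/12/11"}  # made-up expansion
    if value.startswith("MACRO@"):
        return table.get(value, "")
    return value


seen = set()
candidates = filter_and_transform(
    [("今天日期", -8.0), ("MACRO@DATE_TODAY_SHORT", -8.0), ("MACRO@UNKNOWN", -8.0)],
    excluded_values=set(),
    inserted_values=seen,
    macro_converter=demo_macro_converter,
)
```

In this model the unknown macro is dropped by the emptiness check instead of showing up as a blank candidate.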
3 changes: 3 additions & 0 deletions Source/Engine/McBopomofoLM.h
@@ -101,6 +101,8 @@ class McBopomofoLM : public Formosa::Gramambular2::LanguageModel {
bool externalConverterEnabled() const;
/// Sets a lambda that converts the values of unigrams.
void setExternalConverter(std::function<std::string(std::string)> externalConverter);
/// Sets a lambda to convert the macros.
void setMacroConverter(std::function<std::string(std::string)> macroConverter);

const std::vector<std::string> associatedPhrasesForKey(const std::string& key);
bool hasAssociatedPhrasesForKey(const std::string& key);
@@ -128,6 +130,7 @@ class McBopomofoLM : public Formosa::Gramambular2::LanguageModel {
bool m_phraseReplacementEnabled;
bool m_externalConverterEnabled;
std::function<std::string(std::string)> m_externalConverter;
std::function<std::string(std::string)> m_macroConverter;
};
};
