Video: https://youtu.be/wuXPPO2vNgU
GitHub: https://github.com/LexingtonWhalen/KanjiStrokesAnalysis
(I think you guys should be able to view the jupyter notebook file. Let me know!)
Purpose:
I wanted to analyze stroke counts (a measure of kanji complexity) versus the frequency of words using each kanji. I couldn't find any existing data on this, so I did it myself! I thought it might shed some light on how languages balance expressive range against human memory limits (i.e., 8 strokes can form more distinct characters than 1 stroke, but they are harder to remember).
The GitHub repo contains PNGs of the graphs shown. They are pretty neat!
Features:
* Analysis of Kanji stroke patterns by overall word frequency!
* Weighs each kanji by its frequency of use in natural language, then finds the most common stroke counts in everyday words!
* Graphs! (Bar and Pie right now).
* Finds the mean and standard deviation of the frequency-weighted stroke counts!
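The core idea of the weighting step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the `strokes` dictionary and `word_freq` list here are tiny hypothetical samples (a real run would load a full kanji dictionary and a corpus frequency list), and the function names are my own.

```python
from collections import Counter
import math

# Hypothetical sample data: kanji -> stroke count.
# (A real analysis would load these from a kanji dictionary file.)
strokes = {"日": 4, "本": 5, "語": 14, "学": 8}

# Hypothetical word-frequency list: (word, occurrences per million words).
word_freq = [("日本", 500.0), ("日本語", 120.0), ("学", 80.0)]

def stroke_weights(strokes, word_freq):
    """Accumulate word frequency onto the stroke count of each kanji in the word."""
    weights = Counter()
    for word, freq in word_freq:
        for ch in word:
            if ch in strokes:  # skip kana and unknown characters
                weights[strokes[ch]] += freq
    return weights

def weighted_mean_std(weights):
    """Frequency-weighted mean and standard deviation of stroke counts."""
    total = sum(weights.values())
    mean = sum(count * w for count, w in weights.items()) / total
    var = sum(w * (count - mean) ** 2 for count, w in weights.items()) / total
    return mean, math.sqrt(var)

w = stroke_weights(strokes, word_freq)
mean, std = weighted_mean_std(w)
print(f"weighted mean strokes: {mean:.2f}, std: {std:.2f}")
```

The resulting `weights` Counter (stroke count -> total frequency mass) is exactly the kind of distribution the bar and pie charts visualize, e.g. via `matplotlib.pyplot.bar(weights.keys(), weights.values())`.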
Modules / Packages:
* numpy: https://numpy.org/devdocs/contents.html
* pandas: https://pandas.pydata.org/pandas-docs/stable/index.html
* re: https://docs.python.org/3/library/re.html
* matplotlib: https://matplotlib.org/
* scipy: https://www.scipy.org/
* math: https://docs.python.org/3/library/math.html
* random: https://docs.python.org/3/library/random.html
* collections: https://docs.python.org/3/library/collections.html
from Language Learning https://ift.tt/3uvUXL7