Studying Japanese kanji (part I)

28 May 2017 | Tags: meta

Kanji are Chinese characters that have been adopted in Japanese, and are used to write the language alongside two syllabaries: hiragana and katakana. Although there are tens of thousands of Chinese characters, the ones used in Japanese form a much smaller subset. The Jōyō kanji, an official list that compiles the most used characters, contains a bit over 2,000 kanji.

As a student, I've struggled with kanji. When I was taking language lessons at university, we would study them by brute force: writing them over and over again. This was enough to learn around 100 of them, but it's not a very efficient approach.

I later discovered J. Heisig's book "Remembering the Kanji", where he outlines the principles of a method to learn to write the jōyō kanji. The fundamental claims made by Heisig are:

Every kanji is made up of "components" drawn from a limited set.
It's easier to learn a kanji if you memorize which components make the whole character, instead of memorizing it stroke by stroke.
To increase efficiency of memorization, mnemonics are used.
It is more efficient to learn kanji first, and not worry about their pronunciation or vocabulary words using them. To do that, each kanji is assigned a unique keyword, usually one that is related to the meaning the kanji conveys.

The concept of components is the most interesting part, and most modern methods are based on it. Components can be either full kanji, or some strokes grouped together.

For instance, let's say that we already know the following kanji: 口 (mouth) and 千 (thousand). From that, learning 舌 (tongue) should be easy if we think in terms of 舌 = 口 + 千, instead of focusing on memorizing the individual strokes. For a simple kanji like this one it might not seem like such a big deal, but for more complex kanji, this is priceless. Some examples:

愛 (love): ⺤ + 冂 + 心 + 夂
勇 (courage): マ + 男
様 (honorific suffix -sama): 木 + 羊 + 水
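As a rough illustration (not taken from Heisig's book), these breakdowns can be thought of as a lookup table from a character to its components. The sketch below assumes a hypothetical `COMPONENTS` table and a hypothetical `primitives` helper; the entry for 男 (田 + 力) is added here only to show that a component can itself be a kanji with its own breakdown.

```python
# Hypothetical table: each character maps to the components it is built from.
# Anything missing from the table is treated as a primitive that is not broken down.
COMPONENTS = {
    "舌": ["口", "千"],
    "勇": ["マ", "男"],
    "男": ["田", "力"],  # added for illustration: a component that is also a kanji
}


def primitives(char, table=COMPONENTS):
    """Expand a character recursively until only primitives remain."""
    parts = table.get(char)
    if parts is None:
        return [char]
    result = []
    for part in parts:
        result.extend(primitives(part, table))
    return result


print(primitives("舌"))  # components of 舌, as listed in the table
print(primitives("勇"))  # 男 is expanded further into 田 and 力
```

When studying, you would normally stop at the first level (勇 = マ + 男), because 男 is already a known kanji with its own keyword; the full expansion is only there to show how the pieces nest.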

Heisig would also build a mnemonic out of the individual components. A mnemonic is a memorization technique. In this case, a story is constructed that uses the individual components, usually in an absurd, silly way, which aids recall. For instance, to remember the kanji 舌, its mnemonic would feature a short story in which the keywords mouth (口) and thousand (千) would appear.

That completely made sense to me, but I didn't put this method into practice. The book suggested writing out the kanji, their components and the keyword on index cards, and using those to review. Carrying that stack of cards around is not practical, and at that time in my life I was oblivious to flashcard apps. I started to get really busy with my computing studies and put Japanese aside.

Fast forward a few years, I decided to resume my Japanese studies. I found out about Spaced Repetition Software (SRS) and flashcard apps that implement it, like Anki. I signed up to WaniKani, an SRS that would teach the kanji, their most common readings, and associated vocabulary words. I got familiar with around 1000 kanji with this method, but I found a lot of problems with it, and progression was getting harder and harder:

WaniKani doesn't let you study ahead of time. For instance, if I had 30 minutes of free time at some moment, sometimes I couldn't study because my next batch of reviews/new flashcards would be scheduled for two hours later.
The website required a network connection to do the reviews and study (I believe they have mitigated this issue now). For me, that meant no study while commuting on the tube or traveling.
There was inconsistency in the mnemonics. For instance, to remember pronunciation, sometimes "rooster" would be used for "kaku", sometimes for "koku".
Some of the keywords assigned to kanji were really hard to recall, because they were not common English words (English is just my third language).
At that time, I was using another SRS (Anki) to study grammar and sentences, and I found that dealing with two different SRSs is very painful. For instance, Anki would limit the maximum number of reviews per day, or the maximum number of new cards to study. If you use an extra SRS on top of that, the limits don't make sense anymore. I found that most days, I could only study with one SRS… and reviews would pile up on both. It was quite demoralizing and stressful.

I cancelled my WaniKani subscription and tried a different approach: using a pre-made Anki deck to study kanji (there are some available online, like versions of RTK decks, and other alternatives like Kanji Damage).

Most of the struggles I had with WaniKani were gone when I switched to a kanji-only deck for Anki, but I still had the problem of some characters or mnemonics being very hard to recall: I had to look up words such as "spindle", "halberd", or "bestow" in a dictionary, and a lot of the keywords were synonyms whose nuances I didn't know.

The solution seemed obvious: start to translate those keywords and make up my own mnemonics on the go. But it turned out that later on I would find yet another synonym or keyword that would clash with the Spanish translation that I had made. I have a Spanish version of Remembering the Kanji, but the keywords used in there are just terrible. A lot of synonyms that mean the same thing, mnemonics that don't work for me…

Another common problem I've had across all these systems is that when I was reviewing the cards, sometimes I would make a different decomposition than the author. For instance, the character 元 could be broken down in multiple ways:

元 = 一 + ⺎
元 = 二 + 儿
元 = 二 + ⺎

For me the third option was the worst, because it would "merge" component lines when a different decomposition that did not require this merging was available! I needed consistency, and I was failing to recall properly because the breakdown I was doing in my mind didn't match the author's.

I ended up quitting Japanese, again.

Now I've resumed my studies and I'm on round #3 to learn kanji. And I decided to find a method that would work for me. My requirements were:

My own keyword set in Spanish, that minimizes the use of synonyms with very slight nuances.
My own components. For instance, in the Spanish translation of RTK, the kanji 十 (ten) would take up the meaning of "needle" when acting as a component. My imagination has a hard time devising mnemonics with "needle". But I can do fine if I switch it for "cross".
A consistent way of breaking down characters into components, always following the same criteria.
A sorting order that puts the most common kanji first, since I plan to spend around 6 months studying the characters and I'll be learning vocabulary and sentences in the meantime.

Long story short, I input the characters, keywords, and component breakdowns into a spreadsheet and coded a script in Node that would do the sorting. I ended up with a spreadsheet that I could import as a deck into Anki.
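To give an idea of what such a sorting step could look like, here is a minimal sketch in Python. It is not the Node script mentioned above, whose algorithm is not described in this post. It assumes each row carries a frequency rank (1 = most common) and a list of components, and it orders the kanji by frequency while making sure that any component that is itself in the list comes before the kanji that use it. The `study_order` name, the row layout and the rank numbers are all made up for illustration.

```python
def study_order(entries):
    """Return kanji ordered by frequency, with components placed first.

    entries: dict mapping a kanji to a (frequency_rank, components) tuple,
    where a lower rank means a more common character. Hypothetical layout.
    """
    ordered = []
    seen = set()

    def visit(kanji):
        # Skip characters already placed, and components that have no row
        # of their own (plain stroke groups rather than full kanji).
        if kanji in seen or kanji not in entries:
            return
        seen.add(kanji)
        _, components = entries[kanji]
        for component in components:
            visit(component)
        ordered.append(kanji)

    # Walk the list from most to least common.
    for kanji in sorted(entries, key=lambda k: entries[k][0]):
        visit(kanji)
    return ordered


# Made-up frequency ranks, for illustration only.
rows = {
    "人": (1, []),
    "舌": (3, ["口", "千"]),
    "千": (5, []),
    "口": (10, []),
}
print(study_order(rows))
```

With these rows, 舌 is more common than 口 and 千, but it still comes after them because it is built from them. A plain sort by frequency alone would introduce 舌 before its components had been studied.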

I can't share the spreadsheet as it is because some of the radicals are taken from different sources/methods (for instance, I took ⺌ as WaniKani's "triceratops", and Kanji Damage's "George Michael" for 冂), and I want to respect their intellectual property. But I'm going to share the sorting and the Spanish keywords I've used for the kanji, in case it might be of help.

Behold Benko's kanji list!

In an upcoming post I will share the algorithm I coded, as well as the process I followed to create the list.

一緒に勉強しましょう〜！ (Let's study together!)