several queue ideas

jww1066   August 30th, 2009 11:09a.m.

I use the queue a lot and have a couple of ideas to make it a little more efficient to add characters and phrases, and a little more helpful in terms of finding things to add.

One basic thing would be to have the pinyin be a drop-down much like the traditional characters. There are only so many options for how a given set of characters can be pronounced. If the user wants to override one of the normal possibilities, which I think would be less than 5% of the time, let them do it but don't make people type in the pinyin every time.
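A minimal sketch of how the drop-down options could be generated: take every reading of each character and combine them. The `READINGS` table and the function name are made up for illustration; a real table would come from a dictionary, and this says nothing about how Skritter stores pronunciations.

```python
from itertools import product

# Hypothetical per-character readings, written with tone numbers.
READINGS = {
    "长": ["chang2", "zhang3"],
    "大": ["da4", "dai4"],
}

def pinyin_options(word):
    """Return every combination of per-character readings for `word`.

    Returns an empty list if the word is empty or contains a character
    that is missing from the table, so the caller can fall back to
    free-text entry.
    """
    if not word:
        return []
    per_char = [READINGS.get(ch, []) for ch in word]
    if not all(per_char):
        return []
    return [" ".join(combo) for combo in product(*per_char)]
```

For 长大 this gives four candidates (chang2 da4, chang2 dai4, zhang3 da4, zhang3 dai4). Ranking them so the likely reading comes first would need word-level dictionary data.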

Another thing would be to retrieve the definition automatically from MDBG or whatever. The Anki Mandarin toolkit uses Google Translate, for example. The user should be able to edit whatever is retrieved, of course.

Another dictionary integration that we've talked about before would be to find the components of each character, and if they're not already in the user's known set, add them to the queue.

A related idea which would be TOTALLY AWESOME would be to queue up words (not characters) that use the new character together with characters that the user already knows. For example, if the user already knows 年 and 上 and then learns 半, we could have a "add related words" button that would add 半年 (half a year) and 上半年 (the first half of the year).
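Roughly, the lookup could work like this sketch. The word list is a stand-in for whatever dictionary is available, and `related_words` is a made-up name, not an existing Skritter function.

```python
def related_words(new_char, known_chars, dictionary):
    """Return multi-character words that contain `new_char` and are
    otherwise built only from characters the user already knows."""
    allowed = set(known_chars) | {new_char}
    return [
        word
        for word in dictionary
        if len(word) > 1 and new_char in word and set(word) <= allowed
    ]

# Hypothetical data for the example in the post.
known = {"年", "上"}
words = ["半年", "上半年", "半天", "年年", "半"]
suggestions = related_words("半", known, words)
```

Here `suggestions` is `["半年", "上半年"]`: 半天 is skipped because 天 is not known yet, and 年年 is skipped because it does not use the new character.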

Finally, if there could be a way to make the Queue accept a GET URL to queue up a character, then we could make a bookmark toolbar item up at the top of the browser so we could drag characters/phrases we run into while browsing the Web onto that bookmark and have the items added to our queue. We'd still need to go back to Skritter to check the pinyin and definition, of course.
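As a sketch of what such a URL might look like, the snippet below builds and parses one. The endpoint and the `word` parameter are invented for illustration; no such Skritter URL is known to exist.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, not a real Skritter URL.
QUEUE_ENDPOINT = "https://example.com/queue/add"

def queue_url(items):
    """Build a GET URL carrying one `word` parameter per item."""
    return QUEUE_ENDPOINT + "?" + urlencode([("word", item) for item in items])

def words_from_url(url):
    """Recover the list of words from a URL built by queue_url."""
    return parse_qs(urlsplit(url).query).get("word", [])
```

`urlencode` percent-encodes the characters as UTF-8, so the link survives being stored in a bookmark, and the receiving side gets the original characters back with `words_from_url`.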

James

jww1066   August 30th, 2009 12:19p.m.

Oh yeah I forgot one. MDBG and many other dictionaries (and Skritter itself, now that I think about it) display pinyin with tone marks, but the queue doesn't accept pinyin with tone marks, only with tone numbers. So when you copy and paste from those other sites and the pinyin is something like zháo, it chokes and you have to change it to zhao2.
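For a single syllable the conversion can be sketched with Unicode decomposition, as below. This is only an illustration, not Skritter's parser: it assumes the input is exactly one syllable, and splitting a run of pinyin into syllables is the harder part. Pasted text can also carry invisible characters such as a zero-width space, so the sketch strips those first.

```python
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}

def syllable_to_numbered(syllable):
    """Convert one tone-marked syllable to tone-number form.

    A syllable with no tone mark is returned without a number.
    """
    cleaned = syllable.replace("\u200b", "").strip()
    tone = ""
    letters = []
    # NFD splits each accented vowel into base letter + combining marks.
    for ch in unicodedata.normalize("NFD", cleaned):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # NFC recombines u + diaeresis back into a single "ü".
    return unicodedata.normalize("NFC", "".join(letters)) + tone
```

With this, `syllable_to_numbered("zháo")` gives `"zhao2"`, and a syllable like lǜ keeps its ü and becomes `"lü4"`.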

James

marchey   August 30th, 2009 12:49p.m.

The queue is a great idea, but honestly it is a bit bothersome right now. I would like to see a parser integrated into it. This is how it could work:

1. In an edit box you paste whatever text you want (could be an e-mail you got from a friend, a text from the internet, part of a book you are reading... whatever). Maybe there should be a limit on the number of characters to keep the application light.

2. After clicking a button the text would be parsed (some very good free parsers exist, so technically it should be no problem). The list of words is then matched up with the characters/words that you already know and only the characters/words that are 'new' are returned.

3. You get the opportunity to edit this list, more or less as it is now.

4. Finally you queue the result.

The advantage of this system is that it makes this website a lot more generic to use. Whatever method or book you use to study Chinese, you can easily queue up the words you want to learn.
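The steps above can be sketched as follows. The segmenter here is a plain greedy longest-match against a word list, which is much cruder than the free parsers mentioned in step 2; the vocabulary, the known set, and the 2000-character limit are all made-up values for illustration.

```python
def segment(text, vocabulary, max_len=4):
    """Greedy forward longest-match segmentation against `vocabulary`.

    Any character that starts no known word is emitted on its own.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words

def is_chinese(word):
    """True if every character is in the basic CJK Unified Ideographs block."""
    return all("\u4e00" <= ch <= "\u9fff" for ch in word)

def new_items(text, vocabulary, known, max_chars=2000):
    """Steps 1-2: cap the pasted text, segment it, keep only unknown items."""
    result = []
    seen = set()
    for word in segment(text[:max_chars], vocabulary):
        if is_chinese(word) and word not in known and word not in seen:
            seen.add(word)
            result.append(word)
    return result

# Hypothetical data: the user knows 我 and 很.
candidates = new_items("我上半年很忙。", {"上半年", "半年"}, {"我", "很"})
```

Here `candidates` is `["上半年", "忙"]`, which is the list the user would then edit (step 3) and queue (step 4).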

Marc

mike_thatguy   August 31st, 2009 9:41a.m.

I've gotta say, I do like the TOTALLY AWESOME idea that James mentioned. Of course, the number of combinations could be a bit out of hand when adding some very common characters...

nick   August 31st, 2009 5:50p.m.

I've been resisting doing the combinatorial pronunciation dropdown because then we lose people's best efforts on doing proper pinyin spacing, capitalization, etc. However, those best efforts have turned out to be very 乱 anyway, so I guess we can manually edit them. Will confer with Scott on the dropdown; if we do it, it'll probably be in a while.

MDBG definitions are often kind of whack. I love MDBG, but I'm a bit worried that users will just accept what shows up there (sometimes including all 9001 senses of a word) instead of putting in the definition they were thinking of when they saw the word. But I could be off base on that. I mean, we could seed Skritter with all of MDBG right now, but we'd just have to do all the editing ourselves instead of getting you guys to help us with the ones you think are important.

I had only been thinking about the component-adding system from the practice page, but I guess we could probably do it from anywhere. That's way up in the air but will be sweet.

The "related words" button sounds pretty cool--definitely something we'd have to build a bunch of infrastructure for, though. Doing that kind of stuff in web app requests can get tricky. I've added it to my list, though.

The add-words-from-elsewhere URL sounds great. I think it might be a lot easier at first to have it silently add those words that are in the database and skip the ones that aren't, rather than hooking up a new system for queueing up words for you to create. But we might be able to swing that, too, eventually.

The pinyin parsing algorithm should be accepting most words with tone marks (zháo works in my test), but as I was finishing writing it, I found some edge cases which doomed my whole approach. Rewriting the fatty pinyin parser to take all words with tone marks properly is there on the list, but not high priority. If there were an open-source implementation lying around that could do it really robustly, that'd be great. Otherwise we can open-source this one once I've got it fixed up.

Marc, the Chinese text segmentation engine integration is good, and we've been wanting to do it for a while. We haven't started working on it, so I don't know what segmenter we'd use, but I do know they're out there. I'm looking forward to getting that, because yeah, it will work exactly as you describe, and that will rock.

Nicki   August 31st, 2009 11:29p.m.

Sorry about the 乱.

nick   September 1st, 2009 9:52a.m.

没问题, the 乱 is to be expected. Pinyin formatting is hard, and you guys are all here to learn some 字, not to Romanize stuff.
