Trace Id is missing

Copilot is your AI companion

Always by your side, ready to support you whenever and wherever you need it.
A picture of a dog with a pop up bubble saying Talk to Copilot and asking What celeb does my dog look like?

Microsoft Research IME Corpus

This download consists of data only: it provides a test data set for the task of Japanese character conversion for text input. Last published: December 21, 2005.

Important! Selecting a language below will dynamically change the complete page content to that language.

Download
  • Version:

    1.0

    Date Published:

    7/15/2024

    File Name:

    MSRIMECorpus.zip

    File Size:

    4.3 MB

    This download consists of data only: it provides a test data set for the task of Japanese character conversion for text input. The data set consists of: (1) reference files, which consist of Japanese sentences that are randomly extracted from news articles (no more than one sentence has been extracted per news article); (2) reading files, which consist of corresponding kana readings for the sentences in the reference files; (3) n-best files, which contain 100-best conversion candidates for each sentence in the reading files. More detailed information about the corpus is found in the technical report, Microsoft Research IME Corpus, MSR-TR-2005-168.
  • Supported Operating Systems

    Windows 10, Windows 7, Windows 8

    • Windows 7, Windows 8, or Windows 10
    • Click Download and follow the instructions.

Follow Microsoft